Add pipeline tag, library name, project page and MIT License
This PR adds the `image-feature-extraction` pipeline tag and `transformers` library name to the model card to improve discoverability. It also adds a link to the project page and updates the license to MIT.
README.md
CHANGED
@@ -1,21 +1,25 @@
 ---
-license:
+license: mit
 tags:
 - Vision
 - Multi-model
 - Vision-Language
 - Remote-sensing
 widget:
-- src:
-    https://huggingface.co/datasets/mishig/sample_images/resolve/main/cat-dog-music.png
+- src: https://huggingface.co/datasets/mishig/sample_images/resolve/main/cat-dog-music.png
   candidate_labels: playing music, playing sports
   example_title: Cat & Dog
+pipeline_tag: image-feature-extraction
+library_name: transformers
 ---
 
 # Git-RSCLIP-base
 
 [[Git-RSCLIP]](https://arxiv.org/pdf/2501.00895) is pre-trained on the Git-10M dataset (a global-scale remote sensing image-text pair dataset, consisting of 10 million image-text pairs) at size 256x256, first released in [this repository](https://github.com/chen-yang-liu/Text2Earth). It employs a similar structure to [[google/siglip-base-patch16-224](https://huggingface.co/google/siglip-base-patch16-224)].
 
+Project page: https://chen-yang-liu.github.io/Text2Earth/
+Code: https://github.com/Chen-Yang-Liu/Text2Earth
+
 This is a **base version**, the **large version** is here: [[**Git-RSCLIP-large**](https://huggingface.co/lcybuaa/Git-RSCLIP)]
 
 ## Intended uses & limitations
 
@@ -101,4 +105,7 @@ Texts are tokenized and padded to the same length (64 tokens).
 primaryClass={cs.CV},
 url={https://arxiv.org/abs/2501.00895},
 }
-```
+```
+
+## License
+This repo is distributed under [MIT License](https://github.com/Chen-Yang-Liu/Change-Agent/blob/main/LICENSE.txt). The code can be used for academic purposes only.
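
To sanity-check the metadata this PR touches, the following is a minimal sketch that reads the top-level keys from a model card's YAML front matter and confirms that `license`, `pipeline_tag`, and `library_name` are set. The helper `front_matter_keys` and the `CARD` sample string are hypothetical and not part of the PR. The parser only handles flat `key: value` lines, so a real check would more likely use a proper YAML library.

```python
# Hypothetical helper, not part of this PR: a stdlib-only reader for the
# top-level scalar keys in a model card's YAML front matter.
def front_matter_keys(card_text: str) -> dict[str, str]:
    """Return top-level `key: value` pairs found between the first two `---` lines."""
    lines = card_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at the top of the card
    result = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the front matter block
        # Skip nested (indented) entries and list items; keep top-level keys only.
        if line.startswith((" ", "-")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        result[key.strip()] = value.strip()
    return result


# Abbreviated sample of the front matter as it looks after this PR (assumed shape).
CARD = """\
---
license: mit
tags:
- Vision
- Remote-sensing
pipeline_tag: image-feature-extraction
library_name: transformers
---

# Git-RSCLIP-base
"""

meta = front_matter_keys(CARD)
print(meta["license"], meta["pipeline_tag"], meta["library_name"])
```

Keys that introduce a list, such as `tags`, come back with an empty string as their value, which is enough for a presence check like this one.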