Copy from mknolan/internvl25-image-analyzer
README.md
---
title: InternVL2.5 Image Analyzer
emoji: 🖼️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 3.50.0
app_file: app.py
pinned: false
---

# InternVL2.5 Image Analyzer

This Hugging Face Space demonstrates [InternVL2.5-8B](https://huggingface.co/OpenGVLab/InternVL2_5-8B), a multimodal model that analyzes images and answers questions about them.

## Features

- Upload your own images for analysis
- Choose from predefined prompts or create your own
- Detailed image understanding and description
- Text recognition in images
- Visual reasoning capabilities

## Model Details

This Space uses InternVL2.5-8B, a multimodal large language model (MLLM) with approximately 8.1 billion parameters. The model was developed by OpenGVLab and handles a range of visual understanding tasks, including image description, text recognition, and visual reasoning.

### Architecture

InternVL2.5 combines a vision encoder (based on the InternViT architecture) with a language model, allowing it to process both visual and textual information.

## Example Prompts

Here are some prompts you can try:

1. Describe this image in detail.
2. What can you tell me about this image?
3. Is there any text in this image? If so, can you read it?
4. What is the main subject of this image?
5. What emotions or feelings does this image convey?
6. Describe the composition and visual elements of this image.
7. Summarize what you see in this image in one paragraph.

## Usage

1. Upload an image using the file uploader
2. Select a prompt from the dropdown or write your own
3. Click "Submit" to get the analysis
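
The steps above come down to choosing between a predefined prompt and a custom one before the image is sent to the model. A minimal sketch of that selection logic follows. `PREDEFINED_PROMPTS` and `resolve_prompt` are hypothetical names used for illustration and are not taken from this Space's `app.py`; the rule that a non-empty custom prompt overrides the dropdown is also an assumption.

```python
from typing import Optional

# Hypothetical list mirroring the example prompts in this README.
PREDEFINED_PROMPTS = [
    "Describe this image in detail.",
    "What can you tell me about this image?",
    "Is there any text in this image? If so, can you read it?",
    "What is the main subject of this image?",
    "What emotions or feelings does this image convey?",
    "Describe the composition and visual elements of this image.",
    "Summarize what you see in this image in one paragraph.",
]


def resolve_prompt(selected: Optional[str], custom: Optional[str] = "") -> str:
    """Return the prompt to send along with the uploaded image.

    Assumed rule: a non-empty custom prompt wins over the dropdown choice.
    """
    custom = (custom or "").strip()
    if custom:
        return custom
    if selected in PREDEFINED_PROMPTS:
        return selected
    # Nothing valid was chosen, so fall back to the first predefined prompt.
    return PREDEFINED_PROMPTS[0]
```

With this sketch, a custom prompt that contains only whitespace is ignored and the dropdown selection is used instead.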

## Credits

This application uses the InternVL2.5 model by OpenGVLab. For more information about the model, check out:
- [OpenGVLab/InternVL Repository](https://github.com/OpenGVLab/InternVL)
- [InternVL Documentation](https://internvl.readthedocs.io/en/latest/)

## License

The InternVL2.5 model is licensed under the MIT License.