mknolan committed on
Commit 8e6ddeb · verified · 1 Parent(s): 565626f

Copy from mknolan/internvl25-image-analyzer

Files changed (1)
  1. README.md +54 -6
README.md CHANGED
@@ -1,10 +1,58 @@
  ---
- title: Internvl25 Image Analyzer Clean
- emoji: 👀
- colorFrom: red
- colorTo: yellow
- sdk: docker
  pinned: false
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
+ title: InternVL2.5 Image Analyzer
+ emoji: 🖼️
+ colorFrom: blue
+ colorTo: purple
+ sdk: gradio
+ sdk_version: 3.50.0
+ app_file: app.py
  pinned: false
  ---
 
+ # InternVL2.5 Image Analyzer
+
+ This Hugging Face Space demonstrates [InternVL2.5-8B](https://huggingface.co/OpenGVLab/InternVL2_5-8B), a multimodal model that analyzes images and answers questions about them.
+
+ ## Features
+
+ - Upload your own images for analysis
+ - Choose from predefined prompts or create your own
+ - Detailed image understanding and description
+ - Text recognition in images
+ - Visual reasoning capabilities
+
+ ## Model Details
+
+ This Space uses InternVL2.5-8B, a multimodal large language model (MLLM) with approximately 8.1 billion parameters, developed by OpenGVLab. It handles visual understanding tasks such as image description, text recognition, and visual reasoning.
+
+ ### Architecture
+
+ InternVL2.5 combines a vision encoder (based on the InternViT architecture) with a language model, allowing it to process both visual and textual information.
+
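The vision-plus-language pairing described above can be exercised through the `transformers` library. The following is a minimal sketch, not the Space's actual `app.py`: it assumes a CUDA GPU, assumes `torch`, `torchvision`, and `transformers` are installed, and feeds the model a single 448x448 tile rather than any multi-tile preprocessing. The helper names `build_question`, `load_model`, and `describe` are hypothetical.

```python
# Sketch only: loading InternVL2.5-8B through Hugging Face transformers.
MODEL_ID = "OpenGVLab/InternVL2_5-8B"  # the model linked in this README
IMAGE_SIZE = 448  # assumption: one 448x448 tile, no dynamic tiling
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def build_question(prompt: str) -> str:
    """Prefix the prompt with the image placeholder used by the chat interface."""
    return "<image>\n" + prompt


def load_model():
    """Load model and tokenizer. Downloads several GB and needs a CUDA GPU."""
    import torch
    from transformers import AutoModel, AutoTokenizer

    model = AutoModel.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,
        low_cpu_mem_usage=True,
        trust_remote_code=True,  # chat() is defined in the model repo's custom code
    ).eval().cuda()
    tokenizer = AutoTokenizer.from_pretrained(
        MODEL_ID, trust_remote_code=True, use_fast=False
    )
    return model, tokenizer


def describe(model, tokenizer, image, prompt: str) -> str:
    """Run one image (a PIL.Image) and one text prompt through the model."""
    import torch
    import torchvision.transforms as T

    transform = T.Compose([
        T.Resize((IMAGE_SIZE, IMAGE_SIZE)),
        T.ToTensor(),
        T.Normalize(mean=IMAGENET_MEAN, std=IMAGENET_STD),
    ])
    # Shape (1, 3, 448, 448): a batch holding a single image tile.
    pixel_values = transform(image.convert("RGB")).unsqueeze(0)
    pixel_values = pixel_values.to(torch.bfloat16).cuda()
    generation_config = dict(max_new_tokens=512, do_sample=False)
    return model.chat(tokenizer, pixel_values, build_question(prompt), generation_config)
```

Imports are kept inside the functions so the module can be read and imported without the heavy dependencies; a call such as `describe(*load_model(), image, "Describe this image in detail.")` is where the download and GPU work would actually happen.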
+ ## Example Prompts
+
+ Here are some prompts you can try:
+
+ 1. Describe this image in detail.
+ 2. What can you tell me about this image?
+ 3. Is there any text in this image? If so, can you read it?
+ 4. What is the main subject of this image?
+ 5. What emotions or feelings does this image convey?
+ 6. Describe the composition and visual elements of this image.
+ 7. Summarize what you see in this image in one paragraph.
+
+ ## Usage
+
+ 1. Upload an image using the file uploader
+ 2. Select a prompt from the dropdown or write your own
+ 3. Click "Submit" to get the analysis
+
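The three usage steps above could be wired up in Gradio roughly as follows. This is a hedged sketch under stated assumptions, not the contents of the Space's `app.py`: the function names `resolve_prompt`, `analyze`, and `build_demo` are hypothetical, the rule that a non-empty custom prompt overrides the dropdown is an assumption, and `analyze` is a stub that returns a placeholder instead of calling the model.

```python
# Prompts copied from the "Example Prompts" list in this README.
EXAMPLE_PROMPTS = [
    "Describe this image in detail.",
    "What can you tell me about this image?",
    "Is there any text in this image? If so, can you read it?",
    "What is the main subject of this image?",
    "What emotions or feelings does this image convey?",
    "Describe the composition and visual elements of this image.",
    "Summarize what you see in this image in one paragraph.",
]


def resolve_prompt(selected, custom):
    """Prefer a non-empty custom prompt; otherwise use the dropdown choice."""
    custom = (custom or "").strip()
    if custom:
        return custom
    return selected or EXAMPLE_PROMPTS[0]


def analyze(image, selected, custom):
    """Hypothetical stub: a real app would pass image and prompt to the model."""
    if image is None:
        return "Please upload an image first."
    prompt = resolve_prompt(selected, custom)
    return f"[model output for prompt: {prompt}]"


def build_demo():
    """Build the UI: image upload, prompt dropdown, custom prompt, Submit."""
    import gradio as gr  # imported lazily so the helpers work without Gradio

    with gr.Blocks(title="InternVL2.5 Image Analyzer") as demo:
        image = gr.Image(type="pil", label="Image")
        selected = gr.Dropdown(
            choices=EXAMPLE_PROMPTS, value=EXAMPLE_PROMPTS[0], label="Prompt"
        )
        custom = gr.Textbox(label="Custom prompt (optional)")
        submit = gr.Button("Submit")
        output = gr.Textbox(label="Analysis")
        submit.click(analyze, inputs=[image, selected, custom], outputs=output)
    return demo
```

Calling `build_demo().launch()` would start the interface locally. Keeping prompt resolution in a plain function makes that piece checkable without starting a server or loading the model.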
+ ## Credits
+
+ This application uses the InternVL2.5 model by OpenGVLab. For more information about the model, check out:
+ - [OpenGVLab/InternVL Repository](https://github.com/OpenGVLab/InternVL)
+ - [InternVL Documentation](https://internvl.readthedocs.io/en/latest/)
+
+ ## License
+
+ The InternVL2.5 model is licensed under the MIT License.