Amarthya7 committed on
Commit
6a51ba5
·
verified ·
1 Parent(s): 51bb50b

Upload README.md

Files changed (1)
  1. README.md +58 -0
README.md ADDED
@@ -0,0 +1,58 @@
+ ---
+ title: Multi-Modal AI Demo
+ emoji: 🤖
+ colorFrom: blue
+ colorTo: purple
+ sdk: gradio
+ sdk_version: 3.50.2
+ app_file: app.py
+ pinned: false
+ ---
+
+ # Multi-Modal AI Demo
+
+ This project demonstrates multi-modal AI capabilities using pretrained Hugging Face models. The application provides the following features:
+
+ 1. **Image Captioning**: Generate descriptive captions for images
+ 2. **Visual Question Answering**: Answer questions about the content of images
+ 3. **Sentiment Analysis**: Analyze the sentiment of text inputs
+
+ ## Requirements
+
+ - Python 3.8+
+ - Dependencies listed in `requirements.txt`
+
+ ## Installation
+
+ 1. Clone this repository
+ 2. Install dependencies and set up the application:
+    ```
+    python run.py
+    ```
+    Then select option 5 to perform the full setup (install requirements, fix dependencies, and download sample images).
+
+ ## Known Issues and Solutions
+
+ If you encounter package compatibility errors (for example from Pydantic, FastAPI, or Gradio), run:
+ ```
+ python fix_dependencies.py
+ ```
+ This installs compatible versions of the dependencies so that the application runs correctly.
+
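Before running the fix script, it can help to see which versions of the affected packages are currently installed. The snippet below is a minimal standard-library sketch and is not part of this repository; the function name `installed_versions` is illustrative, and the versions that `fix_dependencies.py` actually installs are not shown here.

```python
from importlib import metadata

# Packages named in the compatibility note above.
PACKAGES = ("pydantic", "fastapi", "gradio")


def installed_versions(packages=PACKAGES):
    """Return {package name: installed version, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Including this output in a bug report makes a compatibility problem easier to reproduce.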
+ ## Usage
+
+ Run the web interface:
+ ```
+ python app.py
+ ```
+
+ Then open your browser and navigate to the URL shown in the terminal (typically http://127.0.0.1:7860).
+
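The contents of `app.py` are not shown in this README. As a rough illustration of how a Gradio app exposes a Python function as a web form, here is a minimal sketch; `describe_text` and `build_demo` are hypothetical names, and the handler is a placeholder that does not call any model.

```python
def describe_text(text: str) -> str:
    """Placeholder handler; a real app would call a model here."""
    text = text.strip()
    if not text:
        return "Please enter some text."
    return f"Received {len(text.split())} word(s)."


def build_demo():
    """Wrap the placeholder handler in a single Gradio interface."""
    # Imported lazily so the handler can be used without Gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=describe_text,
        inputs=gr.Textbox(label="Text"),
        outputs=gr.Textbox(label="Result"),
        title="Multi-Modal AI Demo",
    )
```

Calling `build_demo().launch()` starts a local server and prints its URL in the terminal.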
+ ## Models Used
+
+ This demo uses the following pretrained models from Hugging Face:
+ - Image Captioning: `nlpconnect/vit-gpt2-image-captioning`
+ - Visual Question Answering: `nlpconnect/vit-gpt2-image-captioning` (simplified: reuses the captioning model)
+ - Sentiment Analysis: `distilbert-base-uncased-finetuned-sst-2-english`
+
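As a rough sketch of how the listed models can be loaded with the `transformers` `pipeline` API: the model IDs come from the list above, while the `MODELS` mapping, the helper names, and the `sample.jpg` path are assumptions for illustration. How the app turns the captioning model's output into answers for visual questions is not described here, so that feature is left out.

```python
# Model IDs as listed in this README, keyed by transformers pipeline task name.
MODELS = {
    "image-to-text": "nlpconnect/vit-gpt2-image-captioning",
    "sentiment-analysis": "distilbert-base-uncased-finetuned-sst-2-english",
}


def load_pipeline(task: str):
    """Create a transformers pipeline for one of the tasks in MODELS."""
    # Imported lazily so MODELS can be inspected without transformers installed.
    from transformers import pipeline

    return pipeline(task, model=MODELS[task])


def run_examples(image_path: str = "sample.jpg"):
    """Caption a local image and classify a sentence (downloads models on first use)."""
    # "sample.jpg" is a hypothetical path; pass any local image file.
    captioner = load_pipeline("image-to-text")
    classifier = load_pipeline("sentiment-analysis")
    print(captioner(image_path))
    print(classifier("I really enjoyed this demo."))
```

Calling `run_examples()` requires `transformers`, a backend such as PyTorch, and network access for the first model download.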