UniquePratham committed on
Commit 5f2abc6 · verified · 1 Parent(s): 8c35d87

Update README.md

Files changed (1)
  1. README.md +10 -150
README.md CHANGED
@@ -1,150 +1,10 @@
- # 🔍 DualTextOCRFusion
-
- **DualTextOCRFusion** is a web-based Optical Character Recognition (OCR) application that allows users to upload images containing both Hindi and English text, extract the text, and search for keywords within the extracted text. The app uses advanced models like **ColPali’s Byaldi + Qwen2-VL** or **General OCR Theory (GOT)** for multilingual text extraction.
-
- ## Features
-
- - **Multilingual OCR**: Extract text from images containing both **Hindi** and **English**.
- - **Keyword Search**: Search for specific keywords in the extracted text.
- - **User-Friendly Interface**: Simple, intuitive interface for easy image uploading and searching.
- - **Deployed Online**: Accessible through a live URL for easy use.
-
- ## Technologies Used
-
- - **Python**: Backend logic.
- - **Streamlit**: For building the web interface.
- - **Huggingface Transformers**: For integrating OCR models (Qwen2-VL or GOT).
- - **PyTorch**: For deep learning inference.
- - **Pytesseract**: Optional OCR engine.
- - **OpenCV**: For image preprocessing.
-
- ## Project Structure
-
- ```
- DualTextOCRFusion/
- │
- ├── app.py              # Main Streamlit application
- ├── ocr.py              # Handles OCR extraction using the selected model
- ├── .gitignore          # Files and directories to ignore in Git
- ├── .streamlit/
- │   └── config.toml     # Streamlit theme configuration
- ├── requirements.txt    # Dependencies for the project
- └── README.md           # This file
- ```
-
- ## How to Run Locally
-
- ### Prerequisites
-
- - Python 3.8 or above installed on your machine.
- - Tesseract installed for using `pytesseract` (optional if using Huggingface models). You can download Tesseract from [here](https://github.com/tesseract-ocr/tesseract).
-
- ### Steps
-
- 1. **Clone the Repository**:
-
- ```bash
- git clone https://github.com/yourusername/dual-text-ocr-fusion.git
- cd dual-text-ocr-fusion
- ```
-
- 2. **Install Dependencies**:
-
- Make sure you have the required dependencies by running the following:
-
- ```bash
- pip install -r requirements.txt
- ```
-
- 3. **Run the Application**:
-
- Start the Streamlit app by running the following command:
-
- ```bash
- streamlit run app.py
- ```
-
- 4. **Open the App**:
-
- Once the server starts, the app will be available in your browser at:
-
- ```
- http://localhost:8501
- ```
-
- ### Usage
-
- 1. **Upload an Image**: Upload an image containing Hindi and English text in formats like JPG, JPEG, or PNG.
- 2. **View Extracted Text**: The app will extract and display the text from the image.
- 3. **Search for Keywords**: Enter any keyword to search within the extracted text.
-
- ## Deployment
-
- The app is deployed on **Streamlit Sharing** and can be accessed via the live URL:
-
- **[Live Application](https://your-app-link.streamlit.app)**
-
- ## Customization
-
- ### Changing the OCR Model
-
- By default, the app uses the **Qwen2-VL** model, but you can switch to the **General OCR Theory (GOT)** model by editing the `ocr.py` file.
-
- - **For Qwen2-VL**:
-
- ```python
- from ocr import extract_text_byaldi
- ```
-
- - **For General OCR Theory (GOT)**:
-
- ```python
- from ocr import extract_text_got
- ```
-
- ### Custom UI Theme
-
- You can customize the look and feel of the application by modifying the `.streamlit/config.toml` file. Adjust colors, fonts, and layout options to suit your preferences.
-
- ## Example Images
-
- Here are some sample images you can use to test the OCR functionality:
-
- 1. **Sample 1**: A document with mixed Hindi and English text.
- 2. **Sample 2**: An image with only Hindi text for multilingual OCR testing.
-
- ## Contributing
-
- If you'd like to contribute to this project, feel free to fork the repository and submit a pull request. Follow these steps:
-
- 1. Fork the project.
- 2. Create a feature branch:
-
- ```bash
- git checkout -b feature-branch
- ```
-
- 3. Commit your changes:
-
- ```bash
- git commit -am 'Add new feature'
- ```
-
- 4. Push to the branch:
-
- ```bash
- git push origin feature-branch
- ```
-
- 5. Open a pull request.
-
- ## License
-
- This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
-
- ## Credits
-
- - **Streamlit**: For the easy-to-use web interface.
- - **Huggingface Transformers**: For the powerful OCR models.
- - **Tesseract**: For optional OCR functionality.
- - **ColPali & GOT Models**: For the multilingual OCR support.
 
+ ---
+ title: "DualTextOCRFusion"
+ emoji: "🔍"
+ colorFrom: gray
+ colorTo: blue
+ sdk: streamlit
+ sdk_version: "{{2.1.1}}"
+ app_file: app.py
+ pinned: false
+ ---
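
The added front matter sets `sdk_version` to `"{{2.1.1}}"`, which looks like an unfilled template placeholder rather than a plain version number. A minimal sketch of a pre-commit check for this is shown below. The helper names `read_front_matter` and `has_plain_sdk_version` are hypothetical, the parser handles only flat `key: value` lines, and the rule that a version is digits separated by dots is an assumption, not something stated in this commit.

```python
import re

# Assumption: a plain version string is digits separated by dots, e.g. 1.2.3.
VERSION_RE = re.compile(r"\d+(\.\d+)*")


def read_front_matter(text):
    """Return the flat key/value pairs between the first two '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            # Drop surrounding whitespace and double quotes from the value.
            fields[key.strip()] = value.strip().strip('"')
    return fields


def has_plain_sdk_version(text):
    """True when sdk_version is present and matches VERSION_RE entirely."""
    version = read_front_matter(text).get("sdk_version", "")
    return VERSION_RE.fullmatch(version) is not None


# A shortened copy of the front matter added in this commit.
readme = '''---
title: "DualTextOCRFusion"
sdk: streamlit
sdk_version: "{{2.1.1}}"
app_file: app.py
---
'''

# The curly braces keep the value from matching, so the check fails here.
print(has_plain_sdk_version(readme))
```

Running such a check before pushing would flag the braces without guessing which version the author intended.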