Spaces: winamnd / ocr-llm-test

winamnd committed on Feb 16

Commit 37d5823 (verified) · 1 Parent(s): 032c5a4

Update README.md

README.md +97 -0

@@ -11,3 +11,100 @@ short_description: Technical Assessment
 ---
 Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+# OCR LLM Classifier
+This project provides a simple interface for Optical Character Recognition (OCR) and spam classification using deep learning models. It supports three OCR methods (PaddleOCR, EasyOCR, and KerasOCR) and uses a DistilBERT model for classifying the extracted text as "Spam" or "Not Spam."
+## Features
+- Extract text from images using OCR.
+- Classify extracted text as either "Spam" or "Not Spam."
+- Save the extracted text and classification results to local JSON and CSV files.
+## How It Works
+1. **OCR**: The app uses one of the three OCR methods to extract text from the uploaded image:
+   - **PaddleOCR**
+   - **EasyOCR**
+   - **KerasOCR**
+2. **Classification**: The extracted text is passed to a pre-trained DistilBERT model that classifies the text as either "Spam" or "Not Spam."
+3. **Save Results**: The extracted text and classification results are saved locally in both JSON and CSV formats, allowing easy retrieval and review.
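The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the function `run_pipeline`, the injected `ocr_fn` and `classify_fn` callables, and the file names `results.json` and `results.csv` are all hypothetical. In the real app the callables would wrap one of the OCR engines and the DistilBERT classifier.

```python
import csv
import json
import tempfile
from pathlib import Path
from typing import Callable


def run_pipeline(
    image_path: str,
    ocr_fn: Callable[[str], str],
    classify_fn: Callable[[str], str],
    out_dir: str = ".",
) -> dict:
    """Extract text, classify it, and append the result to JSON and CSV files."""
    text = ocr_fn(image_path)    # step 1: OCR (any engine wrapped as a callable)
    label = classify_fn(text)    # step 2: expected to be "Spam" or "Not Spam"
    record = {"text": text, "label": label}

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Step 3a: the JSON file holds a list of all records seen so far.
    json_path = out / "results.json"  # hypothetical file name
    if json_path.exists():
        records = json.loads(json_path.read_text(encoding="utf-8"))
    else:
        records = []
    records.append(record)
    json_path.write_text(json.dumps(records, indent=2), encoding="utf-8")

    # Step 3b: the CSV file gets one row per record, with a header on first write.
    csv_path = out / "results.csv"  # hypothetical file name
    write_header = not csv_path.exists()
    with csv_path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["text", "label"])
        if write_header:
            writer.writeheader()
        writer.writerow(record)

    return record


if __name__ == "__main__":
    # Stub callables stand in for the real OCR engine and classifier.
    with tempfile.TemporaryDirectory() as tmp:
        print(run_pipeline("demo.png", lambda p: "WIN A PRIZE", lambda t: "Spam", tmp))
```

Passing the OCR and classification steps in as callables keeps the sketch independent of any particular engine, which mirrors how the app lets the user pick among three OCR methods.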
+## Installation
+To get started with this project, follow these steps:
+### 1. Clone the Repository
+```bash
+git clone https://github.com/yourusername/ocr-llm-test.git
+cd ocr-llm-test
+```
+### 2. Install Dependencies
+You can install the required dependencies using pip:
+```bash
+pip install -r requirements.txt
+```
+### 3. Run the App
+To run the Gradio interface locally, execute:
+```bash
+python app.py
+```
+Once the app is running, it will be accessible through your web browser at [http://localhost:7860](http://localhost:7860).
+## API Documentation
+### 1. API Endpoint
+The main endpoint for this API is `/predict`.
+### 2. API Call Example
+#### Install the Python Client
+If you don't already have it installed, run the following command:
+```bash
+pip install gradio_client
+```
+#### Make an API Call
+```python
+from gradio_client import Client, handle_file
+client = Client("winamnd/ocr-llm-test")
+result = client.predict(
+    method="PaddleOCR",
+    img=handle_file('https://raw.githubusercontent.com/gradio-app/gradio/main/test/test_files/bus.png'),
+    api_name="/predict"
+)
+print(result)
+```
+### 3. Parameters
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| `method` | `Literal['PaddleOCR', 'EasyOCR', 'KerasOCR']` | The OCR method used for text extraction. Defaults to `"PaddleOCR"`. |
+| `img` | `dict` | The image input, which can be provided as a URL, a path, or a base64-encoded image. |
+#### Image Input Details
+- **path**: Path to a local file.
+- **url**: Publicly available URL for the image.
+- **size**: The size of the image (in bytes).
+- **orig_name**: Original filename.
+- **mime_type**: MIME type of the image.
+- **is_stream**: Always set to `False`.
+- **meta**: Metadata.
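To make the field list above concrete, here is a small sketch that assembles such a dictionary for a local file. The helper `build_image_input` is hypothetical and only illustrates the shape of the fields. In practice the `handle_file` helper shown in the API call example prepares this value for you, and the contents of `meta` are not specified here, so the sketch leaves it empty.

```python
import mimetypes
import os
import tempfile


def build_image_input(path: str) -> dict:
    """Assemble a dictionary with the fields listed above for a local file."""
    mime_type, _ = mimetypes.guess_type(path)  # guessed from the file extension
    return {
        "path": path,                           # path to a local file
        "url": None,                            # no public URL for a local file
        "size": os.path.getsize(path),          # size in bytes
        "orig_name": os.path.basename(path),    # original filename
        "mime_type": mime_type,                 # e.g. "image/png"
        "is_stream": False,                     # always False
        "meta": {},                             # assumption: contents unspecified, left empty
    }


if __name__ == "__main__":
    # Create a throwaway file so the example runs anywhere.
    with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
        tmp.write(b"\x89PNG")
    try:
        print(build_image_input(tmp.name))
    finally:
        os.remove(tmp.name)
```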
+### 4. Returns
+The API returns a tuple with two elements:
+- **Extracted Text (`str`)**: The text extracted from the image.
+- **Spam Classification (`str`)**: The classification result ("Spam" or "Not Spam").
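A caller might unpack the returned pair along these lines. This is a sketch that assumes the result is exactly the two-element sequence described above; the `Prediction` type and `parse_result` helper are hypothetical names introduced for illustration and are not part of the app or of `gradio_client`.

```python
from typing import NamedTuple, Sequence


class Prediction(NamedTuple):
    text: str   # text extracted from the image
    label: str  # "Spam" or "Not Spam"


def parse_result(result: Sequence[str]) -> Prediction:
    """Unpack the (extracted text, classification) pair and validate the label."""
    text, label = result  # raises ValueError unless there are exactly two elements
    if label not in ("Spam", "Not Spam"):
        raise ValueError(f"unexpected classification label: {label!r}")
    return Prediction(text=text, label=label)


if __name__ == "__main__":
    # A hand-written value stands in for the output of client.predict(...).
    prediction = parse_result(("Congratulations, you won!", "Spam"))
    print(prediction.text, "->", prediction.label)
```

Validating the label up front means a change in the app's output format surfaces as an explicit error instead of a silently wrong value downstream.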