Commit cd3d1c6 · Parent(s): 423a25e

Sync from GitHub
Files changed:
- .huggingface.yaml +7 -0
- README.md +184 -73
- app.py +453 -295
- hf_space/README.md +276 -13
- hf_space/hf_space/.github/workflows/docker-build-push.yml +26 -0
- hf_space/hf_space/.github/workflows/hf-space-sync.yml +30 -0
- hf_space/hf_space/Dockerfile +13 -0
- hf_space/hf_space/LICENSE +21 -0
- hf_space/hf_space/app.py +384 -0
- hf_space/hf_space/hf_space/.gitattributes +35 -0
- hf_space/hf_space/hf_space/README.md +13 -0
- hf_space/hf_space/requirements.txt +8 -0
- requirements.txt +4 -1
.huggingface.yaml
ADDED
@@ -0,0 +1,7 @@
sdk: gradio
python_version: 3.11
app_file: app.py
title: Object Detection App
subtitle: Real-time object detection in images using Gradio
hardware: cpu-basic
license: mit
README.md
CHANGED
@@ -1,128 +1,209 @@
-* **hustvl/yolos-base**: Strikes a balance between speed and accuracy.
-* **URL Input**: Input an image URL for detection through Gradio or the FastAPI.
-* **Model Selection**: Choose between DETR and YOLOS models for object detection or segmentation.
-* **Object Detection**: Detects objects and shows bounding boxes with confidence scores.
-* **Panoptic Segmentation**: Available with certain models for detailed scene segmentation using colored masks.
-* **Image Info**: Displays metadata such as size, format, aspect ratio, and file size.
-* **API Access**: Use the FastAPI `/detect` endpoint for programmatic processing.
-* `pip` to install dependencies
-git clone https://github.com/NeerajCodz/ObjectDetection.git
-cd ObjectDetection
-#### Run the Application
-python app.py
-#### Access the Application
-### 2. **Running
-cd ObjectDetection
-docker build -t objectdetection:
-docker run -p
-Access the
-### 3. **
-[**Object Detection Demo**](https://huggingface.co/spaces/NeerajCodz/ObjectDetection)
-* **image_url** *(optional)*: URL of the image.
-* **model_name** *(optional)*: Choose the model (e.g., `facebook/detr-resnet-50`, `hustvl/yolos-tiny`, etc.).
@@ -134,14 +215,20 @@ Example Response:
@@ -151,15 +238,39 @@ cd ObjectDetection
-3. Run
-Contributions are
# 🚀 Object Detection with Transformer Models

This project provides a robust object detection system leveraging state-of-the-art transformer models, including **DETR (DEtection TRansformer)** and **YOLOS (You Only Look One-level Series)**. The system supports object detection and panoptic segmentation from uploaded images or image URLs. It features a user-friendly **Gradio** web interface for interactive use and a **FastAPI** endpoint for programmatic access.

Try the online demo on Hugging Face Spaces: [Object Detection Demo](https://huggingface.co/spaces/JaishnaCodz/ObjectDetection).

## Models Supported

The application supports the following models, each tailored for specific detection or segmentation tasks:

- **DETR (DEtection TRansformer)**:
  - `facebook/detr-resnet-50`: Fast and accurate object detection with a ResNet-50 backbone.
  - `facebook/detr-resnet-101`: Higher-accuracy object detection with a ResNet-101 backbone, slower than ResNet-50.
  - `facebook/detr-resnet-50-panoptic`: Panoptic segmentation with ResNet-50 (note: may have stability issues).
  - `facebook/detr-resnet-101-panoptic`: Panoptic segmentation with ResNet-101 (note: may have stability issues).

- **YOLOS (You Only Look One-level Series)**:
  - `hustvl/yolos-tiny`: Lightweight and fast, ideal for resource-constrained environments.
  - `hustvl/yolos-base`: Balances speed and accuracy for object detection.
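For orientation, this is roughly what the app does under the hood for the detection checkpoints; a minimal sketch using the same `transformers` calls as `app.py` (the file name `example.jpg` is a placeholder):

```python
import torch
from PIL import Image
from transformers import DetrForObjectDetection, DetrImageProcessor

# Load one of the supported checkpoints and its matching processor.
model_name = "facebook/detr-resnet-50"
model = DetrForObjectDetection.from_pretrained(model_name)
processor = DetrImageProcessor.from_pretrained(model_name)

# Run inference on a local image.
image = Image.open("example.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert raw outputs into labeled boxes; target_sizes is (height, width).
target_sizes = torch.tensor([image.size[::-1]])
results = processor.post_process_object_detection(outputs, target_sizes=target_sizes)[0]
for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    if score > 0.5:
        print(model.config.id2label[label.item()], f"{score:.2f}", box.tolist())
```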
## Features

- **Image Upload**: Upload images via the Gradio interface for object detection.
- **URL Input**: Provide image URLs for detection through the Gradio interface or API.
- **Model Selection**: Choose between DETR and YOLOS models for detection or panoptic segmentation.
- **Object Detection**: Highlights detected objects with bounding boxes and confidence scores.
- **Panoptic Segmentation**: Supports scene segmentation with colored masks (DETR panoptic models).
- **Image Properties**: Displays metadata like format, size, aspect ratio, file size, and color statistics.
- **API Access**: Programmatically process images via the FastAPI `/detect` endpoint.
- **Flexible Deployment**: Run locally, in Docker, or in cloud environments like Google Colab.

## How to Use

### 1. **Local Setup (Git Clone)**

Follow these steps to set up the application locally:

#### Prerequisites

- Python 3.8 or higher
- `pip` for installing dependencies
- Git for cloning the repository

#### Clone the Repository

```bash
git clone https://github.com/JaishnaCodz/ObjectDetection
cd ObjectDetection
```

#### Install Dependencies

Install required packages from `requirements.txt`:

```bash
pip install -r requirements.txt
```

#### Run the Application

Launch the Gradio interface:

```bash
python app.py
```

To enable the FastAPI server:

```bash
python app.py --enable-fastapi
```

#### Access the Application

- **Gradio**: Open the URL displayed in the console (typically `http://127.0.0.1:7860`).
- **FastAPI**: Navigate to `http://localhost:8000` for the API or Swagger UI (if enabled).

### 2. **Running with Docker**

Use Docker for a containerized setup.

#### Prerequisites

- Docker installed on your machine. Download from [Docker's official site](https://www.docker.com/get-started).

#### Pull the Docker Image

Pull the pre-built image from Docker Hub:

```bash
docker pull jaishnacodz/objectdetection:latest
```

#### Run the Docker Container

Run the application on port 8080:

```bash
docker run -d -p 8080:80 jaishnacodz/objectdetection:latest
```

Access the interface at `http://localhost:8080`.

#### Build and Run the Docker Image

To build the Docker image locally:

1. Ensure you have a `Dockerfile` in the repository root (example provided in the repository).
2. Build the image:

```bash
docker build -t objectdetection:local .
```

3. Run the container:

```bash
docker run -d -p 8080:80 objectdetection:local
```

Access the interface at `http://localhost:8080`.

### 3. **Demo**

Try the demo on Hugging Face Spaces:

[Object Detection Demo](https://huggingface.co/spaces/JaishnaCodz/ObjectDetection)

## Command-Line Arguments

The `app.py` script supports the following command-line arguments:

- `--gradio-port <port>`: Specify the port for the Gradio UI (default: 7860).
  - Example: `python app.py --gradio-port 7870`
- `--enable-fastapi`: Enable the FastAPI server (disabled by default).
  - Example: `python app.py --enable-fastapi`
- `--fastapi-port <port>`: Specify the port for the FastAPI server (default: 8000).
  - Example: `python app.py --enable-fastapi --fastapi-port 8001`
- `--confidence-threshold <float-value>`: Confidence threshold for detection (Range: 0 - 1) (default: 0.5).
  - Example: `python app.py --confidence-threshold 0.75`
You can combine arguments:

```bash
python app.py --gradio-port 7870 --enable-fastapi --fastapi-port 8001 --confidence-threshold 0.75
```

Alternatively, set the `GRADIO_SERVER_PORT` environment variable:

```bash
export GRADIO_SERVER_PORT=7870
python app.py
```
## Using the API

**Note**: The FastAPI API is currently unstable and may require additional configuration for production use.

The `/detect` endpoint allows programmatic image processing.

### Running the FastAPI Server

Enable FastAPI when launching the script:

```bash
python app.py --enable-fastapi
```

Or run FastAPI separately with Uvicorn (the FastAPI instance is `app` inside `app.py`):

```bash
uvicorn app:app --host 0.0.0.0 --port 8000
```

Access the Swagger UI at `http://localhost:8000/docs` for interactive testing.
### Endpoint Details

- **Endpoint**: `POST /detect`
- **Parameters**:
  - `file`: (optional) Image file (must be `image/*` type).
  - `image_url`: (optional) URL of the image.
  - `model_name`: (optional) Model name (e.g., `facebook/detr-resnet-50`, `hustvl/yolos-tiny`).
- **Content-Type**: `multipart/form-data` for file uploads, `application/json` for URL inputs.

### Example Requests

#### Using `curl` with an Image URL

```bash
curl -X POST "http://localhost:8000/detect" \
  -H "Content-Type: application/json" \
  -d '{"image_url": "https://example.com/image.jpg", "model_name": "facebook/detr-resnet-50"}'
```

#### Using `curl` with an Image File

```bash
curl -X POST "http://localhost:8000/detect" \
  -F "file=@/path/to/image.jpg" \
  -F "model_name=facebook/detr-resnet-50"
```
### Response Format

The response includes a base64-encoded image with detections and detection details:

```json
{
  …
}
```

### Notes

- Ensure only one of `file` or `image_url` is provided.
- The API may experience instability with panoptic models; use object detection models for reliability.
- Test the API using the Swagger UI for easier debugging.
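As a programmatic sanity check, a minimal Python client along these lines should work (a sketch; note that `app.py` declares `image_url` and `model_name` as form fields, so this sends form-encoded data):

```python
import base64
import requests

# Call the /detect endpoint with an image URL (server started with --enable-fastapi).
resp = requests.post(
    "http://localhost:8000/detect",
    data={"image_url": "https://example.com/image.jpg", "model_name": "facebook/detr-resnet-50"},
)
resp.raise_for_status()
payload = resp.json()

# detected_objects and confidence_scores are parallel lists.
for name, score in zip(payload["detected_objects"], payload["confidence_scores"]):
    print(f"{name}: {score:.2f}")

# image_url is a data URL; strip the "data:image/png;base64," prefix and decode.
png_bytes = base64.b64decode(payload["image_url"].split(",", 1)[1])
with open("annotated.png", "wb") as f:
    f.write(png_bytes)
```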
## Development Setup

To contribute or modify the application:

1. Clone the repository:

```bash
git clone https://github.com/JaishnaCodz/ObjectDetection
cd ObjectDetection
```

2. Install dependencies:

```bash
pip install -r requirements.txt
```

3. Run the application:

```bash
python app.py
```

Or run FastAPI:

```bash
uvicorn app:app --host 0.0.0.0 --port 8000
```

4. Access at `http://localhost:7860` (Gradio) or `http://localhost:8000` (FastAPI).

## Contributing

Contributions are welcome! To contribute:
1. Fork the repository.
2. Create a feature or bugfix branch (`git checkout -b feature/your-feature`).
3. Commit changes (`git commit -m "Add your feature"`).
4. Push to the branch (`git push origin feature/your-feature`).
5. Open a pull request on the [GitHub repository](https://github.com/JaishnaCodz/ObjectDetection).

Please include tests and documentation for new features. Report issues via GitHub Issues.

## Troubleshooting

- **Port Conflicts**: If port 7860 is in use, specify a different port with `--gradio-port` or set `GRADIO_SERVER_PORT`.
  - Example: `python app.py --gradio-port 7870`
- **Colab Asyncio Error**: If you encounter `RuntimeError: asyncio.run() cannot be called from a running event loop` in Colab, the application now uses `nest_asyncio` to handle this. Ensure `nest_asyncio` is installed (`pip install nest_asyncio`); see the sketch after this list.
- **Panoptic Model Bugs**: Avoid `detr-resnet-*-panoptic` models until stability issues are resolved.
- **API Instability**: Test with smaller images and object detection models first.
- **FastAPI Not Starting**: Ensure `--enable-fastapi` is used, and check that the specified `--fastapi-port` (default: 8000) is available.
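The Colab workaround amounts to patching the already-running event loop before launch, which is what `app.py` does internally; a minimal sketch for a notebook cell:

```python
# Patch the notebook's running event loop so Gradio/uvicorn can start inside it.
import nest_asyncio

nest_asyncio.apply()
# After this, launching the app in the same process no longer raises
# "asyncio.run() cannot be called from a running event loop".
```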
For further assistance, open an issue on the [GitHub repository](https://github.com/JaishnaCodz/ObjectDetection).
app.py
CHANGED
@@ -1,88 +1,153 @@
-import
-import torch
-from transformers import DetrImageProcessor, DetrForObjectDetection
-from transformers import YolosImageProcessor, YolosForObjectDetection
-from transformers import DetrForSegmentation
-from PIL import Image, ImageDraw, ImageStat
-import requests
-from io import BytesIO
-from collections import Counter
-from fastapi import FastAPI, File, UploadFile, HTTPException, Form
-from fastapi.responses import JSONResponse
-import uvicorn
-import pandas as pd
-import traceback
-CONFIDENCE_THRESHOLD = 0.5
-VALID_MODELS = [
-    "hustvl/yolos-base"
-MODEL_DESCRIPTIONS = {
-    "facebook/detr-resnet-50": "DETR with ResNet-50
-    "facebook/detr-resnet-101": "DETR with ResNet-101
-    "facebook/detr-resnet-50-panoptic": "DETR with ResNet-50 for panoptic segmentation.
-    "facebook/detr-resnet-101-panoptic": "DETR with ResNet-101 for panoptic segmentation.
-    "hustvl/yolos-tiny": "YOLOS Tiny
-    "hustvl/yolos-base": "YOLOS Base
-def
-        # Load model and processor
-        if model_name not in models:
-            logger.info(f"Loading model: {model_name}")
-            if "yolos" in model_name:
-                models[model_name] = YolosForObjectDetection.from_pretrained(model_name)
-                processors[model_name] = YolosImageProcessor.from_pretrained(model_name)
-            elif "panoptic" in model_name:
-                models[model_name] = DetrForSegmentation.from_pretrained(model_name)
-                processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
-            else:
-                models[model_name] = DetrForObjectDetection.from_pretrained(model_name)
-                processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
-        model, processor = models[model_name], processors[model_name]
-        inputs = processor(images=image, return_tensors="pt")
-        object_names = []
-        confidence_scores = []
@@ -93,292 +158,385 @@ def process(image, model_name):
-            if score > CONFIDENCE_THRESHOLD:
-            if score >
-                draw.rectangle([x, y, x2, y2], outline="#32CD32", width=2)
-                # Place text at top-right corner, outside the box, with smaller size
-        if hasattr(image, "fp") and image.fp is not None:
-            buffered = BytesIO()
-            image.save(buffered, format="PNG")
-            file_size = f"{len(buffered.getvalue()) / 1024:.2f} KB"
-        # Color statistics
-        try:
-            stat = ImageStat.Stat(image)
-            color_stats = {
-                "mean": [f"{m:.2f}" for m in stat.mean],
-                "stddev": [f"{s:.2f}" for s in stat.stddev]
-            }
-        except Exception as e:
-            logger.error(f"Error calculating color statistics: {str(e)}")
-            color_stats = {"mean": "Error", "stddev": "Error"}
-        properties = {
-            "Aspect Ratio": f"{round(image.width / image.height, 2) if image.height != 0 else
-            "File Size":
-            "Mean (R,G,B)": "
-            "StdDev (R,G,B)": "
-    file: UploadFile = File(None),
-    image_url: str = Form(None),
-    model_name: str = Form(VALID_MODELS[0])
-)
-            raise HTTPException(status_code=400, detail="Provide either an image file or an image URL,
-        detected_image, detected_objects, detected_confidences, unique_objects, unique_confidences, _ = process(image, model_name)
-        buffered = BytesIO()
-        detected_image.save(buffered, format="PNG")
-        img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
-        img_url = f"data:image/png;base64,{img_base64}"
-        return JSONResponse(content={
-            "image_url": img_url,
-            "detected_objects": detected_objects,
-            "confidence_scores": detected_confidences,
-            "unique_objects": unique_objects,
-            "unique_confidence_scores": unique_confidences
-        })
-            interactive=False
-        )
-        with gr.Row():
-            objects_output = gr.DataFrame(
-                label="📋 Detected Objects",
-                interactive=False,
-                value=None
-    demo.launch()
-    # To run FastAPI, use: uvicorn object_detection:app --host 0.0.0.0 --port 8000

import argparse
import base64
import logging
import os
import sys
import threading
import traceback  # used by the error handlers below
from collections import Counter
from io import BytesIO
from typing import Any, Dict, List, Optional, Tuple, Union

import gradio as gr
import pandas as pd
import requests
import torch
import uvicorn
from fastapi import FastAPI, File, Form, HTTPException, UploadFile
from fastapi.responses import JSONResponse
from PIL import Image, ImageDraw, ImageStat
from transformers import (
    DetrForObjectDetection,
    DetrForSegmentation,
    DetrImageProcessor,
    YolosForObjectDetection,
    YolosImageProcessor,
)
import nest_asyncio
# ------------------------------
# Configuration
# ------------------------------

# Configure logging for debugging and monitoring
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
logger = logging.getLogger(__name__)

# Define constants for model and server configuration
CONFIDENCE_THRESHOLD: float = 0.5  # Default threshold for object detection confidence
VALID_MODELS: List[str] = [
    "facebook/detr-resnet-50",
    "facebook/detr-resnet-101",
    "facebook/detr-resnet-50-panoptic",
    "facebook/detr-resnet-101-panoptic",
    "hustvl/yolos-tiny",
    "hustvl/yolos-base",
]
MODEL_DESCRIPTIONS: Dict[str, str] = {
    "facebook/detr-resnet-50": "DETR with ResNet-50 for object detection. Fast and accurate.",
    "facebook/detr-resnet-101": "DETR with ResNet-101 for object detection. More accurate, slower.",
    "facebook/detr-resnet-50-panoptic": "DETR with ResNet-50 for panoptic segmentation.",
    "facebook/detr-resnet-101-panoptic": "DETR with ResNet-101 for panoptic segmentation.",
    "hustvl/yolos-tiny": "YOLOS Tiny. Lightweight and fast.",
    "hustvl/yolos-base": "YOLOS Base. Balances speed and accuracy.",
}
DEFAULT_GRADIO_PORT: int = 7860  # Default port for Gradio UI
DEFAULT_FASTAPI_PORT: int = 8000  # Default port for FastAPI server
PORT_RANGE: range = range(7860, 7870)  # Range of ports to try for Gradio
MAX_PORT_ATTEMPTS: int = 10  # Maximum attempts to find an available port

# Thread-safe storage for lazy-loaded models and processors
models: Dict[str, Any] = {}
processors: Dict[str, Any] = {}
model_lock = threading.Lock()
# ------------------------------
# Image Processing
# ------------------------------

def process_image(
    image: Optional[Image.Image],
    url: Optional[str],
    model_name: str,
    for_json: bool = False,
    confidence_threshold: float = CONFIDENCE_THRESHOLD,
) -> Union[Dict, Tuple[Optional[Image.Image], Optional[pd.DataFrame], Optional[pd.DataFrame], Optional[pd.DataFrame], str]]:
    """
    Process an image for object detection or panoptic segmentation, handling Gradio and FastAPI inputs.

    Args:
        image: PIL Image object from file upload (optional).
        url: URL of the image to process (optional).
        model_name: Name of the model to use (must be in VALID_MODELS).
        for_json: If True, return JSON dict for API/JSON tab; else, return tuple for Gradio Home tab.
        confidence_threshold: Minimum confidence score for detection (default: 0.5).

    Returns:
        For JSON: Dict with base64-encoded image, detected objects, and confidence scores.
        For Gradio: Tuple of (annotated image, objects DataFrame, unique objects DataFrame, properties DataFrame, error message).
    """
    try:
        # Validate input: ensure exactly one of image or URL is provided
        if image is None and not url:
            return {"error": "Please provide an image or URL"} if for_json else (None, None, None, None, "Please provide an image or URL")
        if image and url:
            return {"error": "Provide either an image or URL, not both"} if for_json else (None, None, None, None, "Provide either an image or URL, not both")
        if model_name not in VALID_MODELS:
            error_msg = f"Invalid model: {model_name}. Choose from: {VALID_MODELS}"
            return {"error": error_msg} if for_json else (None, None, None, None, error_msg)

        # Calculate margin threshold: (1 - confidence_threshold) / 2 + confidence_threshold
        margin_threshold = (1 - confidence_threshold) / 2 + confidence_threshold
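        # Worked example: with the default confidence_threshold of 0.5,
        # margin_threshold = (1 - 0.5) / 2 + 0.5 = 0.75, so detections scoring in
        # [0.5, 0.75) are drawn in yellow and those at or above 0.75 in green
        # (see the color choice in the detection branch below).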
        # Load image from URL if provided
        if url:
            response = requests.get(url, timeout=10)
            response.raise_for_status()
            image = Image.open(BytesIO(response.content)).convert("RGB")

        # Load model and processor thread-safely
        with model_lock:
            if model_name not in models:
                logger.info(f"Loading model: {model_name}")
                try:
                    # Select appropriate model and processor based on model name
                    if "yolos" in model_name:
                        models[model_name] = YolosForObjectDetection.from_pretrained(model_name)
                        processors[model_name] = YolosImageProcessor.from_pretrained(model_name)
                    elif "panoptic" in model_name:
                        models[model_name] = DetrForSegmentation.from_pretrained(model_name)
                        processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
                    else:
                        models[model_name] = DetrForObjectDetection.from_pretrained(model_name)
                        processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
                except Exception as e:
                    error_msg = f"Failed to load model: {str(e)}"
                    logger.error(error_msg)
                    return {"error": error_msg} if for_json else (None, None, None, None, error_msg)
        model, processor = models[model_name], processors[model_name]

        # Prepare image for model processing
        inputs = processor(images=image, return_tensors="pt")
        with torch.no_grad():
            outputs = model(**inputs)

        # Initialize drawing context for annotations
        draw = ImageDraw.Draw(image)
        object_names: List[str] = []
        confidence_scores: List[float] = []
        object_counter = Counter()
        target_sizes = torch.tensor([image.size[::-1]])

        # Process results based on model type (panoptic or object detection)
        if "panoptic" in model_name:
            # Handle panoptic segmentation
            processed_sizes = torch.tensor([[inputs["pixel_values"].shape[2], inputs["pixel_values"].shape[3]]])
            results = processor.post_process_panoptic(outputs, target_sizes=target_sizes, processed_sizes=processed_sizes)[0]
            for segment in results["segments_info"]:
                label = segment["label_id"]
                label_name = model.config.id2label.get(label, "Unknown")
                score = segment.get("score", 1.0)
                # Apply segmentation mask if available
                if "masks" in results and segment["id"] < len(results["masks"]):
                    mask = results["masks"][segment["id"]].cpu().numpy()
                    if mask.shape[0] > 0 and mask.shape[1] > 0:
                        # ... (mask colorization lines unchanged; elided in this diff view)
                        mask_draw.bitmap((0, 0), mask_image, fill=(r, g, b, 128))
                        image = Image.alpha_composite(image.convert("RGBA"), colored_mask).convert("RGB")
                        draw = ImageDraw.Draw(image)
                if score > confidence_threshold:
                    object_names.append(label_name)
                    confidence_scores.append(float(score))
                    object_counter[label_name] = float(score)
        else:
            # Handle object detection
            results = processor.post_process_object_detection(outputs, target_sizes=target_sizes)[0]
            for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
                if score > confidence_threshold:
                    x, y, x2, y2 = box.tolist()
                    label_name = model.config.id2label.get(label.item(), "Unknown")
                    text = f"{label_name}: {score:.2f}"
                    text_bbox = draw.textbbox((0, 0), text)
                    text_width, text_height = text_bbox[2] - text_bbox[0], text_bbox[3] - text_bbox[1]
                    # Use yellow for confidence_threshold <= score < margin_threshold, green for >= margin_threshold
                    color = "#FFFF00" if score < margin_threshold else "#32CD32"
                    draw.rectangle([x, y, x2, y2], outline=color, width=2)
                    draw.text((x2 - text_width - 2, y - text_height - 2), text, fill=color)
                    object_names.append(label_name)
                    confidence_scores.append(float(score))
                    object_counter[label_name] = float(score)

        # Compile unique objects and their highest confidence scores
        unique_objects = list(object_counter.keys())
        unique_confidences = [object_counter[obj] for obj in unique_objects]

        # Calculate image properties (metadata)
        properties: Dict[str, str] = {
            "Format": image.format if hasattr(image, "format") and image.format else "Unknown",
            "Size": f"{image.width}x{image.height}",
            "Width": f"{image.width} px",
            "Height": f"{image.height} px",
            "Mode": image.mode,
            "Aspect Ratio": f"{round(image.width / image.height, 2)}" if image.height != 0 else "Undefined",
            "File Size": "Unknown",
            "Mean (R,G,B)": "Unknown",
            "StdDev (R,G,B)": "Unknown",
        }
        try:
            # Compute file size
            buffered = BytesIO()
            image.save(buffered, format="PNG")
            properties["File Size"] = f"{len(buffered.getvalue()) / 1024:.2f} KB"
            # Compute color statistics
            stat = ImageStat.Stat(image)
            properties["Mean (R,G,B)"] = ", ".join(f"{m:.2f}" for m in stat.mean)
            properties["StdDev (R,G,B)"] = ", ".join(f"{s:.2f}" for s in stat.stddev)
        except Exception as e:
            logger.error(f"Error calculating image stats: {str(e)}")

        # Prepare output based on request type
        if for_json:
            # Return JSON with base64-encoded image
            buffered = BytesIO()
            image.save(buffered, format="PNG")
            img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
            return {
                "image_url": f"data:image/png;base64,{img_base64}",
                "detected_objects": object_names,
                "confidence_scores": confidence_scores,
                "unique_objects": unique_objects,
                "unique_confidence_scores": unique_confidences,
            }
        else:
            # Return tuple for Gradio Home tab with DataFrames
            objects_df = (
                pd.DataFrame({"Object": object_names, "Confidence Score": [f"{score:.2f}" for score in confidence_scores]})
                if object_names else pd.DataFrame(columns=["Object", "Confidence Score"])
            )
            unique_objects_df = (
                pd.DataFrame({"Unique Object": unique_objects, "Confidence Score": [f"{score:.2f}" for score in unique_confidences]})
                if unique_objects else pd.DataFrame(columns=["Unique Object", "Confidence Score"])
            )
            properties_df = pd.DataFrame([properties]) if properties else pd.DataFrame(columns=properties.keys())
            return image, objects_df, unique_objects_df, properties_df, ""

    except requests.RequestException as e:
        # Handle URL fetch errors
        error_msg = f"Error fetching image from URL: {str(e)}"
        logger.error(f"{error_msg}\n{traceback.format_exc()}")
        return {"error": error_msg} if for_json else (None, None, None, None, error_msg)
    except Exception as e:
        # Handle general processing errors
        error_msg = f"Error processing image: {str(e)}"
        logger.error(f"{error_msg}\n{traceback.format_exc()}")
        return {"error": error_msg} if for_json else (None, None, None, None, error_msg)
# ------------------------------
# FastAPI Setup
# ------------------------------

app = FastAPI(title="Object Detection API")

@app.post("/detect")
async def detect_objects_endpoint(
    file: Optional[UploadFile] = File(None),
    image_url: Optional[str] = Form(None),
    model_name: str = Form(VALID_MODELS[0]),
    confidence_threshold: float = Form(CONFIDENCE_THRESHOLD),
) -> JSONResponse:
    """
    FastAPI endpoint to detect objects in an image from file upload or URL.

    Args:
        file: Uploaded image file (optional).
        image_url: URL of the image (optional).
        model_name: Model to use for detection (default: first VALID_MODELS entry).
        confidence_threshold: Confidence threshold for detection (default: 0.5).

    Returns:
        JSONResponse with base64-encoded image, detected objects, and confidence scores.

    Raises:
        HTTPException: For invalid inputs or processing errors.
    """
    try:
        # Validate input: ensure exactly one of file or URL
        if (file is None and not image_url) or (file is not None and image_url):
            raise HTTPException(status_code=400, detail="Provide either an image file or an image URL, not both.")
        # Validate confidence threshold
        if not 0 <= confidence_threshold <= 1:
            raise HTTPException(status_code=400, detail="Confidence threshold must be between 0 and 1.")
        # Load image from file if provided
        image = None
        if file:
            if not file.content_type.startswith("image/"):
                raise HTTPException(status_code=400, detail="File must be an image")
            contents = await file.read()
            image = Image.open(BytesIO(contents)).convert("RGB")
        # Process image with specified parameters
        result = process_image(image, image_url, model_name, for_json=True, confidence_threshold=confidence_threshold)
        if "error" in result:
            raise HTTPException(status_code=400, detail=result["error"])
        return JSONResponse(content=result)
    except HTTPException:
        raise
    except Exception as e:
        logger.error(f"Error in FastAPI endpoint: {str(e)}\n{traceback.format_exc()}")
        raise HTTPException(status_code=500, detail=f"Error processing image: {str(e)}")
# ------------------------------
# Gradio UI Setup
# ------------------------------

def create_gradio_ui() -> gr.Blocks:
    """
    Create and configure the Gradio UI for object detection with Home, JSON, and Help tabs.

    Returns:
        Gradio Blocks object representing the UI.

    Raises:
        RuntimeError: If UI creation fails.
    """
    try:
        # Initialize Gradio Blocks with a custom theme
        with gr.Blocks(theme=gr.themes.Default(primary_hue="blue", secondary_hue="gray")) as demo:
            # Display app header
            gr.Markdown(
                f"""
                # 🚀 Object Detection App
                Upload an image or provide a URL to detect objects using transformer models (DETR, YOLOS).
                Running on port: {os.getenv('GRADIO_SERVER_PORT', 'auto-selected')}
                """
            )

            # Create tabbed interface
            with gr.Tabs():
                # Home tab (formerly Image Upload)
                with gr.Tab("🏠 Home"):
                    with gr.Row():
                        # Left column for inputs
                        with gr.Column(scale=1):
                            gr.Markdown("### Input")
                            # Model selection dropdown
                            model_choice = gr.Dropdown(choices=VALID_MODELS, value=VALID_MODELS[0], label="🔎 Select Model")
                            model_info = gr.Markdown(f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}")
                            # Image upload input
                            image_input = gr.Image(type="pil", label="📷 Upload Image")
                            # Image URL input
                            image_url_input = gr.Textbox(label="🔗 Image URL", placeholder="https://example.com/image.jpg")
                            # Buttons for submission and clearing
                            with gr.Row():
                                submit_btn = gr.Button("✨ Detect", variant="primary")
                                clear_btn = gr.Button("🗑️ Clear", variant="secondary")

                            # Update model info when model changes
                            model_choice.change(
                                fn=lambda model_name: f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}",
                                inputs=model_choice,
                                outputs=model_info,
                            )

                        # Right column for results
                        with gr.Column(scale=2):
                            gr.Markdown("### Results")
                            # Error display (hidden by default)
                            error_output = gr.Textbox(label="⚠️ Errors", visible=False, lines=3, max_lines=5)
                            # Annotated image output
                            output_image = gr.Image(type="pil", label="🎯 Detected Image", interactive=False)
                            # Detected and unique objects tables
                            with gr.Row():
                                objects_output = gr.DataFrame(label="📋 Detected Objects", interactive=False)
                                unique_objects_output = gr.DataFrame(label="🔍 Unique Objects", interactive=False)
                            # Image properties table
                            properties_output = gr.DataFrame(label="📄 Image Properties", interactive=False)

                    # Process image when Detect button is clicked
                    submit_btn.click(
                        fn=process_image,
                        inputs=[image_input, image_url_input, model_choice],
                        outputs=[output_image, objects_output, unique_objects_output, properties_output, error_output],
                    )

                    # Clear all inputs and outputs (one value per output component)
                    clear_btn.click(
                        fn=lambda: [None, "", None, None, None, None, ""],
                        inputs=None,
                        outputs=[image_input, image_url_input, output_image, objects_output, unique_objects_output, properties_output, error_output],
                    )

                # JSON tab for API-like output
                with gr.Tab("🔗 JSON"):
                    with gr.Row():
                        # Left column for inputs
                        with gr.Column(scale=1):
                            gr.Markdown("### Process Image for JSON")
                            # Model selection dropdown
                            url_model_choice = gr.Dropdown(choices=VALID_MODELS, value=VALID_MODELS[0], label="🔎 Select Model")
                            url_model_info = gr.Markdown(f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}")
                            # Image upload input
                            image_input_json = gr.Image(type="pil", label="📷 Upload Image")
                            # Image URL input
                            image_url_input_json = gr.Textbox(label="🔗 Image URL", placeholder="https://example.com/image.jpg")
                            # Process button
                            url_submit_btn = gr.Button("🔄 Process", variant="primary")

                            # Update model info when model changes
                            url_model_choice.change(
                                fn=lambda model_name: f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}",
                                inputs=url_model_choice,
                                outputs=url_model_info,
                            )

                        # Right column for JSON output
                        with gr.Column(scale=1):
                            # JSON output display
                            url_output = gr.JSON(label="API Response")

                    # Process image and return JSON when Process button is clicked
                    url_submit_btn.click(
                        fn=lambda img, url, model: process_image(img, url, model, for_json=True),
                        inputs=[image_input_json, image_url_input_json, url_model_choice],
                        outputs=[url_output],
                    )

                # Help tab with usage instructions
                with gr.Tab("ℹ️ Help"):
                    gr.Markdown(
                        """
                        ## How to Use
                        - **Home**: Select a model, upload an image or provide a URL, click "Detect" to see results.
                        - **JSON**: Select a model, upload an image or enter a URL, click "Process" for JSON output.
                        - **Models**: Choose DETR (detection or panoptic) or YOLOS (lightweight detection).
                        - **Clear**: Reset inputs/outputs in Home tab.
                        - **Errors**: Check error box (Home) or JSON response (JSON) for issues.

                        ## Tips
                        - Use high-quality images for better results.
                        - Panoptic models provide segmentation masks for complex scenes.
                        - YOLOS-Tiny is faster for resource-constrained devices.
                        """
                    )

        return demo

    except Exception as e:
        logger.error(f"Error creating Gradio UI: {str(e)}\n{traceback.format_exc()}")
        raise RuntimeError(f"Failed to create Gradio UI: {str(e)}")
# ------------------------------
# Launcher
# ------------------------------

def parse_args() -> argparse.Namespace:
    """
    Parse command-line arguments for configuring the application.

    Returns:
        Parsed arguments as a Namespace object.
    """
    parser = argparse.ArgumentParser(description="Object Detection App with Gradio and FastAPI.")
    # Gradio port argument
    parser.add_argument("--gradio-port", type=int, default=DEFAULT_GRADIO_PORT, help=f"Gradio port (default: {DEFAULT_GRADIO_PORT}).")
    # FastAPI enable flag
    parser.add_argument("--enable-fastapi", action="store_true", help="Enable FastAPI server.")
    # FastAPI port argument
    parser.add_argument("--fastapi-port", type=int, default=DEFAULT_FASTAPI_PORT, help=f"FastAPI port (default: {DEFAULT_FASTAPI_PORT}).")
    # Confidence threshold argument
    parser.add_argument("--confidence-threshold", type=float, default=CONFIDENCE_THRESHOLD, help="Confidence threshold for detection (default: 0.5).")
    # Parse known arguments, ignoring unrecognized ones
    args, _ = parser.parse_known_args()
    # Validate confidence threshold
    if not 0 <= args.confidence_threshold <= 1:
        parser.error("Confidence threshold must be between 0 and 1.")
    return args

def find_available_port(start_port: int, port_range: range, max_attempts: int) -> Optional[int]:
    """
    Find an available port within the specified range.

    Args:
        start_port: Initial port to try.
        port_range: Range of ports to attempt.
        max_attempts: Maximum number of ports to try.

    Returns:
        Available port number, or None if no port is found.
    """
    import socket
    # Check environment variable for port override
    port = int(os.getenv("GRADIO_SERVER_PORT", start_port))
    attempts = 0
    while attempts < max_attempts:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            try:
                # Attempt to bind to the port
                s.bind(("0.0.0.0", port))
                logger.debug(f"Port {port} is available")
                return port
            except OSError as e:
                if e.errno == 98:  # Port in use (EADDRINUSE on Linux)
                    logger.debug(f"Port {port} is in use")
                    port = port + 1 if port < max(port_range) else min(port_range)
                    attempts += 1
                else:
                    raise
    logger.error(f"No available port in range {min(port_range)}-{max(port_range)}")
    return None
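# Example walk of find_available_port: with start_port=7860 and
# PORT_RANGE=range(7860, 7870), an occupied 7860 leads to trying 7861, 7862,
# and so on, wrapping back to 7860 after 7869, for at most MAX_PORT_ATTEMPTS
# attempts before giving up.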
def main() -> None:
    """
    Launch the Gradio UI and optional FastAPI server.

    Raises:
        SystemExit: On interruption or critical errors.
    """
    try:
        # Apply nest_asyncio for compatibility with Jupyter/Colab
        nest_asyncio.apply()
        # Parse command-line arguments
        args = parse_args()
        logger.info(f"Parsed arguments: {args}")
        # Find available port for Gradio
        gradio_port = find_available_port(args.gradio_port, PORT_RANGE, MAX_PORT_ATTEMPTS)
        if gradio_port is None:
            logger.error("Failed to find an available port for Gradio UI")
            sys.exit(1)

        # Start FastAPI server in a thread if enabled
        if args.enable_fastapi:
            logger.info(f"Starting FastAPI on port {args.fastapi_port}")
            fastapi_thread = threading.Thread(
                target=lambda: uvicorn.run(app, host="0.0.0.0", port=args.fastapi_port),
                daemon=True,
            )
            fastapi_thread.start()

        # Launch Gradio UI
        logger.info(f"Starting Gradio UI on port {gradio_port}")
        demo = create_gradio_ui()
        demo.launch(server_port=gradio_port, server_name="0.0.0.0")

    except KeyboardInterrupt:
        logger.info("Application terminated by user.")
        sys.exit(0)
    except Exception as e:
        logger.error(f"Error: {str(e)}\n{traceback.format_exc()}")
        sys.exit(1)

if __name__ == "__main__":
    main()
hf_space/README.md
CHANGED
@@ -1,13 +1,276 @@
1 |
+
# 🚀 Object Detection with Transformer Models
|
2 |
+
|
3 |
+
This project provides a robust object detection system leveraging state-of-the-art transformer models, including **DETR (DEtection TRansformer)** and **YOLOS (You Only Look One-level Series)**. The system supports object detection and panoptic segmentation from uploaded images or image URLs. It features a user-friendly **Gradio** web interface for interactive use and a **FastAPI** endpoint for programmatic access.
|
4 |
+
|
5 |
+
Try the online demo on Hugging Face Spaces: [Object Detection Demo](https://huggingface.co/spaces/JaishnaCodz/ObjectDetection).
|
6 |
+
|
7 |
+
## Models Supported
|
8 |
+
|
9 |
+
The application supports the following models, each tailored for specific detection or segmentation tasks:
|
10 |
+
|
11 |
+
- **DETR (DEtection TRansformer)**:
|
12 |
+
- `facebook/detr-resnet-50`: Fast and accurate object detection with a ResNet-50 backbone.
|
13 |
+
- `facebook/detr-resnet-101`: Higher accuracy object detection with a ResNet-101 backbone, slower than ResNet-50.
|
14 |
+
- `facebook/detr-resnet-50-panoptic`: Panoptic segmentation with ResNet-50 (note: may have stability issues).
|
15 |
+
- `facebook/detr-resnet-101-panoptic`: Panoptic segmentation with ResNet-101 (note: may have stability issues).
|
16 |
+
|
17 |
+
- **YOLOS (You Only Look One-level Series)**:
|
18 |
+
- `hustvl/yolos-tiny`: Lightweight and fast, ideal for resource-constrained environments.
|
19 |
+
- `hustvl/yolos-base`: Balances speed and accuracy for object detection.
|
20 |
+
|
21 |
+
## Features
|
22 |
+
|
23 |
+
- **Image Upload**: Upload images via the Gradio interface for object detection.
|
24 |
+
- **URL Input**: Provide image URLs for detection through the Gradio interface or API.
|
25 |
+
- **Model Selection**: Choose between DETR and YOLOS models for detection or panoptic segmentation.
|
26 |
+
- **Object Detection**: Highlights detected objects with bounding boxes and confidence scores.
|
27 |
+
- **Panoptic Segmentation**: Supports scene segmentation with colored masks (DETR panoptic models).
|
28 |
+
- **Image Properties**: Displays metadata like format, size, aspect ratio, file size, and color statistics.
|
29 |
+
- **API Access**: Programmatically process images via the FastAPI `/detect` endpoint.
|
30 |
+
- **Flexible Deployment**: Run locally, in Docker, or in cloud environments like Google Colab.
|
31 |
+
|
32 |
+
## How to Use
|
33 |
+
|
34 |
+
### 1. **Local Setup (Git Clone)**
|
35 |
+
|
36 |
+
Follow these steps to set up the application locally:
|
37 |
+
|
38 |
+
#### Prerequisites
|
39 |
+
|
40 |
+
- Python 3.8 or higher
|
41 |
+
- `pip` for installing dependencies
|
42 |
+
- Git for cloning the repository
|
43 |
+
|
44 |
+
#### Clone the Repository
|
45 |
+
|
46 |
+
```bash
|
47 |
+
git clone https://github.com/JaishnaCodz/ObjectDetection
|
48 |
+
cd ObjectDetection
|
49 |
+
```
|
50 |
+
|
51 |
+
#### Install Dependencies
|
52 |
+
|
53 |
+
Install required packages from `requirements.txt`:
|
54 |
+
|
55 |
+
```bash
|
56 |
+
pip install -r requirements.txt
|
57 |
+
```
|
58 |
+
|
59 |
+
#### Run the Application
|
60 |
+
|
61 |
+
Launch the Gradio interface:
|
62 |
+
|
63 |
+
```bash
|
64 |
+
python app.py
|
65 |
+
```
|
66 |
+
|
67 |
+
To enable the FastAPI server:
|
68 |
+
|
69 |
+
```bash
|
70 |
+
python app.py --enable-fastapi
|
71 |
+
```
|
72 |
+
|
73 |
+
#### Access the Application
|
74 |
+
|
75 |
+
- **Gradio**: Open the URL displayed in the console (typically `http://127.0.0.1:7860`).
|
76 |
+
- **FastAPI**: Navigate to `http://localhost:8000` for the API or Swagger UI (if enabled).
|
77 |
+
|
78 |
+
### 2. **Running with Docker**
|
79 |
+
|
80 |
+
Use Docker for a containerized setup.
|
81 |
+
|
82 |
+
#### Prerequisites
|
83 |
+
|
84 |
+
- Docker installed on your machine. Download from [Docker's official site](https://www.docker.com/get-started).
|
85 |
+
|
86 |
+
#### Pull the Docker Image
|
87 |
+
|
88 |
+
Pull the pre-built image from Docker Hub:
|
89 |
+
|
90 |
+
```bash
|
91 |
+
docker pull JaishnaCodz/objectdetection:latest
|
92 |
+
```
|
93 |
+
|
94 |
+
#### Run the Docker Container
|
95 |
+
|
96 |
+
Run the application on port 8080:
|
97 |
+
|
98 |
+
```bash
|
99 |
+
docker run -d -p 8080:80 JaishnaCodz/objectdetection:latest
|
100 |
+
```

Access the interface at `http://localhost:8080`.

#### Build and Run the Docker Image

To build the Docker image locally:

1. Ensure you have a `Dockerfile` in the repository root (example provided in the repository).
2. Build the image:

```bash
docker build -t objectdetection:local .
```

3. Run the container:

```bash
docker run -d -p 8080:80 objectdetection:local
```

Access the interface at `http://localhost:8080`.

### 3. **Demo**

Try the demo on Hugging Face Spaces:

[Object Detection Demo](https://huggingface.co/spaces/JaishnaCodz/ObjectDetection)

## Command-Line Arguments

The `app.py` script supports the following command-line arguments:

- `--gradio-port <port>`: Specify the port for the Gradio UI (default: 7860).
  - Example: `python app.py --gradio-port 7870`
- `--enable-fastapi`: Enable the FastAPI server (disabled by default).
  - Example: `python app.py --enable-fastapi`
- `--fastapi-port <port>`: Specify the port for the FastAPI server (default: 8000).
  - Example: `python app.py --enable-fastapi --fastapi-port 8001`
- `--confidence-threshold <float>`: Set the confidence threshold for detections (range: 0–1; default: 0.5).
  - Example: `python app.py --confidence-threshold 0.75`

You can combine arguments:

```bash
python app.py --gradio-port 7870 --enable-fastapi --fastapi-port 8001 --confidence-threshold 0.75
```

Alternatively, set the `GRADIO_SERVER_PORT` environment variable:

```bash
export GRADIO_SERVER_PORT=7870
python app.py
```

## Using the API

**Note**: The FastAPI server is currently unstable and may require additional configuration for production use.

The `/detect` endpoint allows programmatic image processing.

### Running the FastAPI Server

Enable FastAPI when launching the script:

```bash
python app.py --enable-fastapi
```

Or run FastAPI separately with Uvicorn:

```bash
uvicorn app:app --host 0.0.0.0 --port 8000
```

Access the Swagger UI at `http://localhost:8000/docs` for interactive testing.

### Endpoint Details

- **Endpoint**: `POST /detect`
- **Parameters**:
  - `file`: (optional) Image file (must be an `image/*` MIME type).
  - `image_url`: (optional) URL of the image.
  - `model_name`: (optional) Model name (e.g., `facebook/detr-resnet-50`, `hustvl/yolos-tiny`).
- **Content-Type**: `multipart/form-data` for file uploads, `application/json` for URL inputs.

### Example Requests

#### Using `curl` with an Image URL

```bash
curl -X POST "http://localhost:8000/detect" \
  -H "Content-Type: application/json" \
  -d '{"image_url": "https://example.com/image.jpg", "model_name": "facebook/detr-resnet-50"}'
```

#### Using `curl` with an Image File

```bash
curl -X POST "http://localhost:8000/detect" \
  -F "file=@/path/to/image.jpg" \
  -F "model_name=facebook/detr-resnet-50"
```
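
The same calls can be made from Python with the `requests` package (already in `requirements.txt`). A minimal sketch, assuming the server is running locally on port 8000; note that the bundled `app.py` reads `image_url` and `model_name` as form fields, so plain form encoding also works for the URL variant:

```python
# Minimal sketch: call the /detect endpoint from Python.
# Assumes a local server on port 8000; paths and URLs are illustrative.
import requests

API = "http://localhost:8000/detect"

# Variant 1: pass an image URL as a form field.
resp = requests.post(API, data={
    "image_url": "https://example.com/image.jpg",
    "model_name": "facebook/detr-resnet-50",
})
resp.raise_for_status()
print(resp.json()["detected_objects"])

# Variant 2: upload a local file as multipart/form-data.
with open("image.jpg", "rb") as f:
    resp = requests.post(
        API,
        files={"file": ("image.jpg", f, "image/jpeg")},
        data={"model_name": "facebook/detr-resnet-50"},
    )
resp.raise_for_status()
print(resp.json()["unique_objects"])
```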

### Response Format

The response includes a base64-encoded image with detections and detection details:

```json
{
  "image_url": "data:image/png;base64,...",
  "detected_objects": ["person", "car"],
  "confidence_scores": [0.95, 0.87],
  "unique_objects": ["person", "car"],
  "unique_confidence_scores": [0.95, 0.87]
}
```
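
Since `image_url` is a data URI, the annotated image can be recovered by stripping the prefix and decoding the remainder. A minimal sketch, with `response_json` standing in for the parsed JSON from the example above:

```python
# Minimal sketch: save the annotated image from a /detect response.
import base64

data_uri = response_json["image_url"]    # "data:image/png;base64,..."
b64_payload = data_uri.split(",", 1)[1]  # drop the "data:image/png;base64," prefix
with open("detected.png", "wb") as out:
    out.write(base64.b64decode(b64_payload))
```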

### Notes

- Ensure only one of `file` or `image_url` is provided.
- The API may experience instability with panoptic models; use object detection models for reliability.
- Test the API using the Swagger UI for easier debugging.

## Development Setup

To contribute or modify the application:

1. Clone the repository:

```bash
git clone https://github.com/JaishnaCodz/ObjectDetection
cd ObjectDetection
```

2. Install dependencies:

```bash
pip install -r requirements.txt
```

3. Run the application:

```bash
python app.py
```

Or run FastAPI:

```bash
uvicorn app:app --host 0.0.0.0 --port 8000
```

4. Access at `http://localhost:7860` (Gradio) or `http://localhost:8000` (FastAPI).

## Contributing

Contributions are welcome! To contribute:

1. Fork the repository.
2. Create a feature or bugfix branch (`git checkout -b feature/your-feature`).
3. Commit changes (`git commit -m "Add your feature"`).
4. Push to the branch (`git push origin feature/your-feature`).
5. Open a pull request on the [GitHub repository](https://github.com/JaishnaCodz/ObjectDetection).

Please include tests and documentation for new features. Report issues via GitHub Issues.

## Troubleshooting

- **Port Conflicts**: If port 7860 is in use, specify a different port with `--gradio-port` or set `GRADIO_SERVER_PORT`.
  - Example: `python app.py --gradio-port 7870`
- **Colab Asyncio Error**: If you encounter `RuntimeError: asyncio.run() cannot be called from a running event loop` in Colab, the application uses `nest_asyncio` to handle this. Ensure `nest_asyncio` is installed (`pip install nest_asyncio`); see the snippet after this list.
- **Panoptic Model Bugs**: Avoid `detr-resnet-*-panoptic` models until stability issues are resolved.
- **API Instability**: Test with smaller images and object detection models first.
- **FastAPI Not Starting**: Ensure `--enable-fastapi` is used, and check that the specified `--fastapi-port` (default: 8000) is available.
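
For reference, the `nest_asyncio` workaround amounts to two lines executed before the app starts (e.g., at the top of a Colab cell):

```python
# Patch the already-running event loop so asyncio.run() can be called again
# (needed in Colab/Jupyter, which run their own event loop).
import nest_asyncio
nest_asyncio.apply()
```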

For further assistance, open an issue on the [GitHub repository](https://github.com/JaishnaCodz/ObjectDetection).
hf_space/hf_space/.github/workflows/docker-build-push.yml
ADDED
@@ -0,0 +1,26 @@
name: Build and Push Docker Image to Docker Hub

on:
  push:
    branches:
      - main

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Log in to Docker Hub
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PAT }}

      - name: Build and push Docker image
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ${{ secrets.DOCKER_USERNAME }}/objectdetection:latest
hf_space/hf_space/.github/workflows/hf-space-sync.yml
ADDED
@@ -0,0 +1,30 @@
name: Sync to Hugging Face Space

on:
  push:
    branches: [ main ]

jobs:
  deploy-to-hf-space:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout Repository
        uses: actions/checkout@v3

      - name: Install Git
        run: sudo apt-get install git

      - name: Push to Hugging Face Space
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: |
          git config --global user.email "[email protected]"
          git config --global user.name "JaishnaCodz"

          git clone https://JaishnaCodz:[email protected]/spaces/JaishnaCodz/ObjectDetection hf_space
          rsync -av --exclude='.git' ./ hf_space/
          cd hf_space
          git add .
          git commit -m "Sync from GitHub"
          git push
hf_space/hf_space/Dockerfile
ADDED
@@ -0,0 +1,13 @@
FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .

RUN pip install --no-cache-dir -r requirements.txt

COPY app.py .

EXPOSE 5000

CMD ["python", "app.py"]
hf_space/hf_space/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 Jaishna S

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
hf_space/hf_space/app.py
ADDED
@@ -0,0 +1,384 @@
import gradio as gr
import torch
from transformers import DetrImageProcessor, DetrForObjectDetection
from transformers import YolosImageProcessor, YolosForObjectDetection
from transformers import DetrForSegmentation
from PIL import Image, ImageDraw, ImageStat
import requests
from io import BytesIO
import base64
from collections import Counter
import logging
from fastapi import FastAPI, File, UploadFile, HTTPException, Form
from fastapi.responses import JSONResponse
import uvicorn
import pandas as pd
import traceback
import os

# Set up logging
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
logger = logging.getLogger(__name__)

# Constants
CONFIDENCE_THRESHOLD = 0.5
VALID_MODELS = [
    "facebook/detr-resnet-50",
    "facebook/detr-resnet-101",
    "facebook/detr-resnet-50-panoptic",
    "facebook/detr-resnet-101-panoptic",
    "hustvl/yolos-tiny",
    "hustvl/yolos-base"
]
MODEL_DESCRIPTIONS = {
    "facebook/detr-resnet-50": "DETR with ResNet-50 backbone for object detection. Fast and accurate for general use.",
    "facebook/detr-resnet-101": "DETR with ResNet-101 backbone for object detection. More accurate but slower than ResNet-50.",
    "facebook/detr-resnet-50-panoptic": "DETR with ResNet-50 for panoptic segmentation. Detects objects and segments scenes.",
    "facebook/detr-resnet-101-panoptic": "DETR with ResNet-101 for panoptic segmentation. High accuracy for complex scenes.",
    "hustvl/yolos-tiny": "YOLOS Tiny model. Lightweight and fast, ideal for resource-constrained environments.",
    "hustvl/yolos-base": "YOLOS Base model. Balances speed and accuracy for object detection."
}

# Lazy model loading
models = {}
processors = {}

def process(image, model_name):
    """Process an image and return detected image, objects, confidences, unique objects, unique confidences, and properties."""
    try:
        if model_name not in VALID_MODELS:
            raise ValueError(f"Invalid model: {model_name}. Choose from: {VALID_MODELS}")

        # Load model and processor on first use, then cache them
        if model_name not in models:
            logger.info(f"Loading model: {model_name}")
            if "yolos" in model_name:
                models[model_name] = YolosForObjectDetection.from_pretrained(model_name)
                processors[model_name] = YolosImageProcessor.from_pretrained(model_name)
            elif "panoptic" in model_name:
                models[model_name] = DetrForSegmentation.from_pretrained(model_name)
                processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
            else:
                models[model_name] = DetrForObjectDetection.from_pretrained(model_name)
                processors[model_name] = DetrImageProcessor.from_pretrained(model_name)

        model, processor = models[model_name], processors[model_name]
        inputs = processor(images=image, return_tensors="pt")

        with torch.no_grad():
            outputs = model(**inputs)

        target_sizes = torch.tensor([image.size[::-1]])
        draw = ImageDraw.Draw(image)
        object_names = []
        confidence_scores = []
        object_counter = Counter()

        if "panoptic" in model_name:
            processed_sizes = torch.tensor([[inputs["pixel_values"].shape[2], inputs["pixel_values"].shape[3]]])
            results = processor.post_process_panoptic(outputs, target_sizes=target_sizes, processed_sizes=processed_sizes)[0]

            for segment in results["segments_info"]:
                label = segment["label_id"]
                label_name = model.config.id2label.get(label, "Unknown")
                score = segment.get("score", 1.0)

                if "masks" in results and segment["id"] < len(results["masks"]):
                    mask = results["masks"][segment["id"]].cpu().numpy()
                    if mask.shape[0] > 0 and mask.shape[1] > 0:
                        mask_image = Image.fromarray((mask * 255).astype("uint8"))
                        colored_mask = Image.new("RGBA", image.size, (0, 0, 0, 0))
                        mask_draw = ImageDraw.Draw(colored_mask)
                        r, g, b = (segment["id"] * 50) % 255, (segment["id"] * 100) % 255, (segment["id"] * 150) % 255
                        mask_draw.bitmap((0, 0), mask_image, fill=(r, g, b, 128))
                        image = Image.alpha_composite(image.convert("RGBA"), colored_mask).convert("RGB")
                        draw = ImageDraw.Draw(image)

                if score > CONFIDENCE_THRESHOLD:
                    object_names.append(label_name)
                    confidence_scores.append(float(score))
                    object_counter[label_name] = float(score)
        else:
            results = processor.post_process_object_detection(outputs, target_sizes=target_sizes)[0]

            for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
                if score > CONFIDENCE_THRESHOLD:
                    x, y, x2, y2 = box.tolist()
                    draw.rectangle([x, y, x2, y2], outline="#32CD32", width=2)
                    label_name = model.config.id2label.get(label.item(), "Unknown")
                    # Place text at top-right corner, outside the box, with smaller size
                    text = f"{label_name}: {score:.2f}"
                    text_bbox = draw.textbbox((0, 0), text)
                    text_width, text_height = text_bbox[2] - text_bbox[0], text_bbox[3] - text_bbox[1]
                    draw.text((x2 - text_width - 2, y - text_height - 2), text, fill="#32CD32")
                    object_names.append(label_name)
                    confidence_scores.append(float(score))
                    object_counter[label_name] = float(score)

        unique_objects = list(object_counter.keys())
        unique_confidences = [object_counter[obj] for obj in unique_objects]

        # Image properties
        file_size = "Unknown"
        if hasattr(image, "fp") and image.fp is not None:
            buffered = BytesIO()
            image.save(buffered, format="PNG")
            file_size = f"{len(buffered.getvalue()) / 1024:.2f} KB"

        # Color statistics
        try:
            stat = ImageStat.Stat(image)
            color_stats = {
                "mean": [f"{m:.2f}" for m in stat.mean],
                "stddev": [f"{s:.2f}" for s in stat.stddev]
            }
        except Exception as e:
            logger.error(f"Error calculating color statistics: {str(e)}")
            color_stats = {"mean": "Error", "stddev": "Error"}

        properties = {
            "Format": image.format if hasattr(image, "format") and image.format else "Unknown",
            "Size": f"{image.width}x{image.height}",
            "Width": f"{image.width} px",
            "Height": f"{image.height} px",
            "Mode": image.mode,
            "Aspect Ratio": f"{round(image.width / image.height, 2) if image.height != 0 else 'Undefined'}",
            "File Size": file_size,
            "Mean (R,G,B)": ", ".join(color_stats["mean"]) if isinstance(color_stats["mean"], list) else color_stats["mean"],
            "StdDev (R,G,B)": ", ".join(color_stats["stddev"]) if isinstance(color_stats["stddev"], list) else color_stats["stddev"]
        }

        return image, object_names, confidence_scores, unique_objects, unique_confidences, properties
    except Exception as e:
        logger.error(f"Error in process: {str(e)}\n{traceback.format_exc()}")
        raise

# FastAPI Setup
app = FastAPI(title="Object Detection API")

@app.post("/detect")
async def detect_objects_endpoint(
    file: UploadFile = File(None),
    image_url: str = Form(None),
    model_name: str = Form(VALID_MODELS[0])
):
    """FastAPI endpoint to detect objects in an image from file or URL."""
    try:
        if (file is None and not image_url) or (file is not None and image_url):
            raise HTTPException(status_code=400, detail="Provide either an image file or an image URL, but not both.")

        if file:
            if not file.content_type.startswith("image/"):
                raise HTTPException(status_code=400, detail="File must be an image")
            contents = await file.read()
            image = Image.open(BytesIO(contents)).convert("RGB")
        else:
            response = requests.get(image_url, timeout=10)
            response.raise_for_status()
            image = Image.open(BytesIO(response.content)).convert("RGB")

        if model_name not in VALID_MODELS:
            raise HTTPException(status_code=400, detail=f"Invalid model. Choose from: {VALID_MODELS}")

        detected_image, detected_objects, detected_confidences, unique_objects, unique_confidences, _ = process(image, model_name)

        buffered = BytesIO()
        detected_image.save(buffered, format="PNG")
        img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
        img_url = f"data:image/png;base64,{img_base64}"

        return JSONResponse(content={
            "image_url": img_url,
            "detected_objects": detected_objects,
            "confidence_scores": detected_confidences,
            "unique_objects": unique_objects,
            "unique_confidence_scores": unique_confidences
        })
    except Exception as e:
        logger.error(f"Error in FastAPI endpoint: {str(e)}\n{traceback.format_exc()}")
        raise HTTPException(status_code=500, detail=f"Error processing image: {str(e)}")

# Gradio UI
def create_gradio_ui():
    with gr.Blocks(theme=gr.themes.Default(primary_hue="blue", secondary_hue="gray")) as demo:
        gr.Markdown(
            """
            # 🚀 Object Detection App
            Upload an image or provide a URL to detect objects using state-of-the-art transformer models (DETR, YOLOS).
            """
        )

        with gr.Tabs():
            with gr.Tab("📷 Image Upload"):
                with gr.Row():
                    with gr.Column(scale=1):
                        gr.Markdown("### Input")
                        model_choice = gr.Dropdown(
                            choices=VALID_MODELS,
                            value=VALID_MODELS[0],
                            label="🔎 Select Model",
                            info="Choose a model for object detection or panoptic segmentation."
                        )
                        model_info = gr.Markdown(
                            f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
                            visible=True
                        )
                        image_input = gr.Image(type="pil", label="📷 Upload Image")
                        image_url_input = gr.Textbox(
                            label="🔗 Image URL",
                            placeholder="https://example.com/image.jpg"
                        )
                        with gr.Row():
                            submit_btn = gr.Button("✨ Detect", variant="primary")
                            clear_btn = gr.Button("🗑️ Clear", variant="secondary")

                        model_choice.change(
                            fn=lambda model_name: f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}",
                            inputs=model_choice,
                            outputs=model_info
                        )

                    with gr.Column(scale=2):
                        gr.Markdown("### Results")
                        error_output = gr.Textbox(
                            label="⚠️ Errors",
                            visible=False,
                            lines=3,
                            max_lines=5
                        )
                        output_image = gr.Image(
                            type="pil",
                            label="🎯 Detected Image",
                            interactive=False
                        )
                        with gr.Row():
                            objects_output = gr.DataFrame(
                                label="📋 Detected Objects",
                                interactive=False,
                                value=None
                            )
                            unique_objects_output = gr.DataFrame(
                                label="🔍 Unique Objects",
                                interactive=False,
                                value=None
                            )
                            properties_output = gr.DataFrame(
                                label="📄 Image Properties",
                                interactive=False,
                                value=None
                            )

                def process_for_gradio(image, url, model_name):
                    try:
                        if image is None and not url:
                            return None, None, None, None, "Please provide an image or URL"
                        if image and url:
                            return None, None, None, None, "Please provide either an image or URL, not both"

                        if url:
                            response = requests.get(url, timeout=10)
                            response.raise_for_status()
                            image = Image.open(BytesIO(response.content)).convert("RGB")

                        detected_image, objects, scores, unique_objects, unique_scores, properties = process(image, model_name)
                        objects_df = pd.DataFrame({
                            "Object": objects,
                            "Confidence Score": [f"{score:.2f}" for score in scores]
                        }) if objects else pd.DataFrame(columns=["Object", "Confidence Score"])
                        unique_objects_df = pd.DataFrame({
                            "Unique Object": unique_objects,
                            "Confidence Score": [f"{score:.2f}" for score in unique_scores]
                        }) if unique_objects else pd.DataFrame(columns=["Unique Object", "Confidence Score"])
                        properties_df = pd.DataFrame([properties]) if properties else pd.DataFrame(columns=properties.keys())
                        return detected_image, objects_df, unique_objects_df, properties_df, ""
                    except Exception as e:
                        error_msg = f"Error processing image: {str(e)}"
                        logger.error(f"{error_msg}\n{traceback.format_exc()}")
                        return None, None, None, None, error_msg

                submit_btn.click(
                    fn=process_for_gradio,
                    inputs=[image_input, image_url_input, model_choice],
                    outputs=[output_image, objects_output, unique_objects_output, properties_output, error_output]
                )

                # Reset all seven outputs: two inputs, image, three tables, error box
                clear_btn.click(
                    fn=lambda: [None, "", None, None, None, None, ""],
                    inputs=None,
                    outputs=[image_input, image_url_input, output_image, objects_output, unique_objects_output, properties_output, error_output]
                )

            with gr.Tab("🔗 URL Input"):
                gr.Markdown("### Process Image from URL")
                image_url_input = gr.Textbox(
                    label="🔗 Image URL",
                    placeholder="https://example.com/image.jpg"
                )
                url_model_choice = gr.Dropdown(
                    choices=VALID_MODELS,
                    value=VALID_MODELS[0],
                    label="🔎 Select Model"
                )
                url_model_info = gr.Markdown(
                    f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
                    visible=True
                )
                url_submit_btn = gr.Button("🔄 Process URL", variant="primary")
                url_output = gr.JSON(label="API Response")

                url_model_choice.change(
                    fn=lambda model_name: f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}",
                    inputs=url_model_choice,
                    outputs=url_model_info
                )

                def process_url_for_gradio(url, model_name):
                    try:
                        response = requests.get(url, timeout=10)
                        response.raise_for_status()
                        image = Image.open(BytesIO(response.content)).convert("RGB")
                        detected_image, objects, scores, unique_objects, unique_scores, _ = process(image, model_name)
                        buffered = BytesIO()
                        detected_image.save(buffered, format="PNG")
                        img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
                        return {
                            "image_url": f"data:image/png;base64,{img_base64}",
                            "detected_objects": objects,
                            "confidence_scores": scores,
                            "unique_objects": unique_objects,
                            "unique_confidence_scores": unique_scores
                        }
                    except Exception as e:
                        error_msg = f"Error processing URL: {str(e)}"
                        logger.error(f"{error_msg}\n{traceback.format_exc()}")
                        return {"error": error_msg}

                url_submit_btn.click(
                    fn=process_url_for_gradio,
                    inputs=[image_url_input, url_model_choice],
                    outputs=[url_output]
                )

            with gr.Tab("ℹ️ Help"):
                gr.Markdown(
                    """
                    ## How to Use
                    - **Image Upload**: Select a model, upload an image or provide a URL, and click "Detect" to see detected objects and image properties.
                    - **URL Input**: Enter an image URL, select a model, and click "Process URL" to get results in JSON format.
                    - **Models**: Choose from DETR (object detection or panoptic segmentation) or YOLOS (lightweight detection).
                    - **Clear**: Reset all inputs and outputs using the "Clear" button.
                    - **Errors**: Check the error box for any processing issues.

                    ## Tips
                    - Use high-quality images for better detection results.
                    - Panoptic models (e.g., DETR-ResNet-50-panoptic) provide segmentation masks for complex scenes.
                    - For faster processing, try YOLOS-Tiny on resource-constrained devices.
                    """
                )

    return demo

if __name__ == "__main__":
    demo = create_gradio_ui()
    demo.launch()
    # To run FastAPI, use: uvicorn app:app --host 0.0.0.0 --port 8000
hf_space/hf_space/hf_space/.gitattributes
ADDED
@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
hf_space/hf_space/hf_space/README.md
ADDED
@@ -0,0 +1,13 @@
---
title: ObjectDetection
emoji: 🐢
colorFrom: red
colorTo: blue
sdk: gradio
sdk_version: 5.29.0
app_file: app.py
pinned: false
license: mit
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
hf_space/hf_space/requirements.txt
ADDED
@@ -0,0 +1,8 @@
transformers
torch
tensorflow
gradio
pillow
timm
fastapi
requests
requirements.txt
CHANGED
@@ -5,4 +5,7 @@ gradio
pillow
timm
fastapi
requests
uvicorn
pandas
nest_asyncio