NeerajCodz committed on
Commit e581bf6 · 1 Parent(s): 074874b

Sync from GitHub
README.md CHANGED
@@ -1,54 +1,56 @@
1
  # 🚀 Object Detection with Transformer Models
2
 
3
- This project provides an object detection system using state-of-the-art transformer models, such as **DETR (DEtection TRansformer)** and **YOLOS (You Only Look One-level Series)**. The system can detect objects from uploaded images or image URLs, and it supports different models for detection and segmentation tasks. It includes a Gradio-based web interface and a FastAPI-based API for programmatic access.
4
 
5
- You can try the demo online on Hugging Face: [Demo Link](https://huggingface.co/spaces/NeerajCodz/ObjectDetection).
6
 
7
  ## Models Supported
8
 
9
- The following models are supported, as defined in the application:
10
 
11
  - **DETR (DEtection TRansformer)**:
12
- - `facebook/detr-resnet-50`: DETR with ResNet-50 backbone for object detection. Fast and accurate for general use.
13
- - `facebook/detr-resnet-101`: DETR with ResNet-101 backbone for object detection. More accurate but slower than ResNet-50.
14
- - `facebook/detr-resnet-50-panoptic`(currently has bugs): DETR with ResNet-50 for panoptic segmentation. Detects objects and segments scenes.
15
- - `facebook/detr-resnet-101-panoptic`(currently has bugs): DETR with ResNet-101 for panoptic segmentation. High accuracy for complex scenes.
16
 
17
  - **YOLOS (You Only Look One-level Series)**:
18
- - `hustvl/yolos-tiny`: YOLOS Tiny model. Lightweight and fast, ideal for resource-constrained environments.
19
- - `hustvl/yolos-base`: YOLOS Base model. Balances speed and accuracy for object detection.
20
 
21
  ## Features
22
 
23
- - **Image Upload**: Upload images from your device for object detection via the Gradio interface.
24
- - **URL Input**: Input an image URL for detection through the Gradio interface or API.
25
  - **Model Selection**: Choose between DETR and YOLOS models for detection or panoptic segmentation.
26
- - **Object Detection**: Detects objects and highlights them with bounding boxes and confidence scores.
27
- - **Panoptic Segmentation**: Some models (e.g., DETR panoptic variants) support detailed scene segmentation with colored masks.
28
- - **Image Properties**: Displays image metadata such as format, size, aspect ratio, file size, and color statistics.
29
- - **API Access**: Use the FastAPI endpoint `/detect` to programmatically process images and retrieve detection results.
 
30
 
31
  ## How to Use
32
 
33
- ### 1. **Normal Git Clone Method**
34
 
35
  Follow these steps to set up the application locally:
36
 
37
  #### Prerequisites
38
 
39
  - Python 3.8 or higher
40
- - Install dependencies using `pip`
 
41
 
42
  #### Clone the Repository
43
 
44
  ```bash
45
- git clone https://github.com/NeerajCodz/ObjectDetection.git
46
  cd ObjectDetection
47
  ```
48
 
49
  #### Install Dependencies
50
 
51
- Install the required dependencies from `requirements.txt`:
52
 
53
  ```bash
54
  pip install -r requirements.txt
@@ -56,34 +58,34 @@ pip install -r requirements.txt
56
 
57
  #### Run the Application
58
 
59
- Start the FastAPI server using uvicorn:
60
 
61
  ```bash
62
- uvicorn objectdetection:app --reload
63
  ```
64
 
65
- Alternatively, launch the Gradio interface by running the main script:
66
 
67
  ```bash
68
- python app.py
69
  ```
70
 
71
  #### Access the Application
72
 
73
- - For FastAPI: Open your browser and navigate to `http://localhost:8000` to use the API or view the Swagger UI.
74
- - For Gradio: The Gradio interface URL will be displayed in the console (typically `http://127.0.0.1:7860`).
75
 
76
  ### 2. **Running with Docker**
77
 
78
- If you prefer to use Docker to set up and run the application, follow these steps:
79
 
80
  #### Prerequisites
81
 
82
- - Docker installed on your machine. If you don’t have Docker, download and install it from [here](https://www.docker.com/get-started).
83
 
84
- #### Download the docker Image
85
 
86
- First, Pull the docker Image:
87
 
88
  ```bash
89
  docker pull neerajcodz/objectdetection:latest
@@ -91,47 +93,115 @@ docker pull neerajcodz/objectdetection:latest
91
 
92
  #### Run the Docker Container
93
 
94
- Once the image is built, run the application using this command:
95
 
96
  ```bash
97
  docker run -d -p 8080:80 neerajcodz/objectdetection:latest
98
  ```
99
 
100
- This will start the application on port 8080.
101
- Open your browser and go to `http://localhost:8080` to access the interface.
102
 
103
  ### 3. **Demo**
104
 
105
- You can try the demo directly online through Hugging Face's Spaces:
106
 
107
  [Object Detection Demo](https://huggingface.co/spaces/NeerajCodz/ObjectDetection)
108
 
109
- ## Using the API (Instable)
110
 
111
- You can interact with the application via the FastAPI `/detect` endpoint to send images and get detection results.
112
 
113
- **Endpoint**: `/detect`
114
 
115
- **POST**: `/detect`
116
 
117
- **Parameters**:
118
 
119
- - `file`: (optional) Image file (must be of type `image/*`).
120
- - `image_url`: (optional) URL of the image.
121
- - `model_name`: (optional) Choose from `facebook/detr-resnet-50`, `hustvl/yolos-tiny`, etc.
122
 
123
- **Example Request Body**:
124
 
125
- ```json
126
- {
127
- "image_url": "https://example.com/image.jpg",
128
- "model_name": "facebook/detr-resnet-50"
129
- }
130
  ```
131
 
132
- **Response**:
133
 
134
- The response includes a base64-encoded image with detections, detected objects, confidence scores, and unique objects with their scores.
135
 
136
  ```json
137
  {
@@ -143,14 +213,20 @@ The response includes a base64-encoded image with detections, detected objects,
143
  }
144
  ```
145
146
  ## Development Setup
147
 
148
- If you'd like to contribute or modify the application:
149
 
150
  1. Clone the repository:
151
 
152
  ```bash
153
- git clone https://github.com/NeerajCodz/ObjectDetection.git
154
  cd ObjectDetection
155
  ```
156
 
@@ -160,20 +236,37 @@ cd ObjectDetection
160
  pip install -r requirements.txt
161
  ```
162
 
163
- 3. Run the FastAPI server or Gradio interface:
164
 
165
  ```bash
166
- uvicorn objectdetection:app --reload
167
  ```
168
 
169
- or
170
 
171
  ```bash
172
- python app.py
173
  ```
174
 
175
- 4. Open your browser and navigate to `http://localhost:8000` (FastAPI) or the Gradio URL (typically `http://127.0.0.1:7860`).
176
 
177
  ## Contributing
178
 
179
- Contributions are welcome! Feel free to open issues or submit pull requests for bug fixes or new features on the [GitHub repository](https://github.com/NeerajCodz/ObjectDetection).
1
  # 🚀 Object Detection with Transformer Models
2
 
3
+ This project provides a robust object detection system leveraging state-of-the-art transformer models, including **DETR (DEtection TRansformer)** and **YOLOS (You Only Look One-level Series)**. The system supports object detection and panoptic segmentation from uploaded images or image URLs. It features a user-friendly **Gradio** web interface for interactive use and a **FastAPI** endpoint for programmatic access.
4
 
5
+ Try the online demo on Hugging Face Spaces: [Object Detection Demo](https://huggingface.co/spaces/NeerajCodz/ObjectDetection).
6
 
7
  ## Models Supported
8
 
9
+ The application supports the following models, each tailored for specific detection or segmentation tasks:
10
 
11
  - **DETR (DEtection TRansformer)**:
12
+ - `facebook/detr-resnet-50`: Fast and accurate object detection with a ResNet-50 backbone.
13
+ - `facebook/detr-resnet-101`: Higher accuracy object detection with a ResNet-101 backbone, slower than ResNet-50.
14
+ - `facebook/detr-resnet-50-panoptic`: Panoptic segmentation with ResNet-50 (note: may have stability issues).
15
+ - `facebook/detr-resnet-101-panoptic`: Panoptic segmentation with ResNet-101 (note: may have stability issues).
16
 
17
  - **YOLOS (You Only Look One-level Series)**:
18
+ - `hustvl/yolos-tiny`: Lightweight and fast, ideal for resource-constrained environments.
19
+ - `hustvl/yolos-base`: Balances speed and accuracy for object detection.
20
 
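+ These IDs are standard Hugging Face Hub checkpoints, so they can also be loaded directly with the `transformers` library. A minimal loading sketch (model choice and variable names are illustrative):
+
+ ```python
+ # Load a DETR detection checkpoint with the transformers library.
+ # YOLOS checkpoints use YolosForObjectDetection / YolosImageProcessor instead.
+ from transformers import DetrForObjectDetection, DetrImageProcessor
+
+ model_name = "facebook/detr-resnet-50"
+ processor = DetrImageProcessor.from_pretrained(model_name)
+ model = DetrForObjectDetection.from_pretrained(model_name)
+ ```
+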
21
  ## Features
22
 
23
+ - **Image Upload**: Upload images via the Gradio interface for object detection.
24
+ - **URL Input**: Provide image URLs for detection through the Gradio interface or API.
25
  - **Model Selection**: Choose between DETR and YOLOS models for detection or panoptic segmentation.
26
+ - **Object Detection**: Highlights detected objects with bounding boxes and confidence scores.
27
+ - **Panoptic Segmentation**: Supports scene segmentation with colored masks (DETR panoptic models).
28
+ - **Image Properties**: Displays metadata like format, size, aspect ratio, file size, and color statistics.
29
+ - **API Access**: Programmatically process images via the FastAPI `/detect` endpoint.
30
+ - **Flexible Deployment**: Run locally, in Docker, or in cloud environments like Google Colab.
31
 
32
  ## How to Use
33
 
34
+ ### 1. **Local Setup (Git Clone)**
35
 
36
  Follow these steps to set up the application locally:
37
 
38
  #### Prerequisites
39
 
40
  - Python 3.8 or higher
41
+ - `pip` for installing dependencies
42
+ - Git for cloning the repository
43
 
44
  #### Clone the Repository
45
 
46
  ```bash
47
+ git clone https://github.com/NeerajCodz/ObjectDetection
48
  cd ObjectDetection
49
  ```
50
 
51
  #### Install Dependencies
52
 
53
+ Install required packages from `requirements.txt`:
54
 
55
  ```bash
56
  pip install -r requirements.txt
 
58
 
59
  #### Run the Application
60
 
61
+ Launch the Gradio interface:
62
 
63
  ```bash
64
+ python app.py
65
  ```
66
 
67
+ To enable the FastAPI server:
68
 
69
  ```bash
70
+ python app.py --enable-fastapi
71
  ```
72
 
73
  #### Access the Application
74
 
75
+ - **Gradio**: Open the URL displayed in the console (typically `http://127.0.0.1:7860`).
76
+ - **FastAPI**: Navigate to `http://localhost:8000` for the API or Swagger UI (if enabled).
77
 
78
  ### 2. **Running with Docker**
79
 
80
+ Use Docker for a containerized setup.
81
 
82
  #### Prerequisites
83
 
84
+ - Docker installed on your machine. Download from [Docker's official site](https://www.docker.com/get-started).
85
 
86
+ #### Pull the Docker Image
87
 
88
+ Pull the pre-built image from Docker Hub:
89
 
90
  ```bash
91
  docker pull neerajcodz/objectdetection:latest
 
93
 
94
  #### Run the Docker Container
95
 
96
+ Run the application on port 8080:
97
 
98
  ```bash
99
  docker run -d -p 8080:80 neerajcodz/objectdetection:latest
100
  ```
101
 
102
+ Access the interface at `http://localhost:8080`.
103
+
104
+ #### Build and Run the Docker Image
105
+
106
+ To build the Docker image locally:
107
+
108
+ 1. Ensure you have a `Dockerfile` in the repository root (example provided in the repository).
109
+ 2. Build the image:
110
+
111
+ ```bash
112
+ docker build -t objectdetection:local .
113
+ ```
114
+
115
+ 3. Run the container:
116
+
117
+ ```bash
118
+ docker run -d -p 8080:80 objectdetection:local
119
+ ```
120
+
121
+ Access the interface at `http://localhost:8080`.
122
 
123
  ### 3. **Demo**
124
 
125
+ Try the demo on Hugging Face Spaces:
126
 
127
  [Object Detection Demo](https://huggingface.co/spaces/NeerajCodz/ObjectDetection)
128
 
129
+ ## Command-Line Arguments
130
 
131
+ The `app.py` script supports the following command-line arguments:
132
 
133
+ - `--gradio-port <port>`: Specify the port for the Gradio UI (default: 7860).
134
+ - Example: `python app.py --gradio-port 7870`
135
+ - `--enable-fastapi`: Enable the FastAPI server (disabled by default).
136
+ - Example: `python app.py --enable-fastapi`
137
+ - `--fastapi-port <port>`: Specify the port for the FastAPI server (default: 8000).
138
+ - Example: `python app.py --enable-fastapi --fastapi-port 8001`
139
 
140
+ You can combine arguments:
141
 
142
+ ```bash
143
+ python app.py --gradio-port 7870 --enable-fastapi --fastapi-port 8001
144
+ ```
145
 
146
+ Alternatively, set the `GRADIO_SERVER_PORT` environment variable:
 
 
147
 
148
+ ```bash
149
+ export GRADIO_SERVER_PORT=7870
150
+ python app.py
151
+ ```
152
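+ For reference, `app.py` resolves the Gradio port roughly like this sketch (simplified; the real launcher also probes a range of ports for a free one):
+
+ ```python
+ import os
+
+ def resolve_gradio_port(cli_port: int = 7860) -> int:
+     # GRADIO_SERVER_PORT, when set to a number, takes precedence over the CLI flag.
+     env_port = os.getenv("GRADIO_SERVER_PORT")
+     return int(env_port) if env_port and env_port.isdigit() else cli_port
+ ```
+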
 
153
+ ## Using the API
154
+
155
+ **Note**: The FastAPI server is currently unstable and may require additional configuration for production use.
156
+
157
+ The `/detect` endpoint allows programmatic image processing.
158
+
159
+ ### Running the FastAPI Server
160
+
161
+ Enable FastAPI when launching the script:
162
+
163
+ ```bash
164
+ python app.py --enable-fastapi
165
+ ```
166
+
167
+ Or run FastAPI separately with Uvicorn:
168
+
169
+ ```bash
170
+ uvicorn app:app --host 0.0.0.0 --port 8000  # `app` is the FastAPI instance defined in app.py
171
+ ```
172
+
173
+ Access the Swagger UI at `http://localhost:8000/docs` for interactive testing.
174
+
175
+ ### Endpoint Details
176
+
177
+ - **Endpoint**: `POST /detect`
178
+ - **Parameters**:
179
+ - `file`: (optional) Image file (must be `image/*` type).
180
+ - `image_url`: (optional) URL of the image.
181
+ - `model_name`: (optional) Model name (e.g., `facebook/detr-resnet-50`, `hustvl/yolos-tiny`).
182
+ - **Content-Type**: `multipart/form-data`; both `file` and `image_url` are form fields, so JSON request bodies are not accepted.
183
+
184
+ ### Example Requests
185
+
186
+ #### Using `curl` with an Image URL
187
+
188
+ ```bash
189
+ curl -X POST "http://localhost:8000/detect" \
190
+ -H "Content-Type: application/json" \
191
+ -d '{"image_url": "https://example.com/image.jpg", "model_name": "facebook/detr-resnet-50"}'
192
  ```
193
 
194
+ #### Using `curl` with an Image File
195
 
196
+ ```bash
197
+ curl -X POST "http://localhost:8000/detect" \
198
+ -F "file=@/path/to/image.jpg" \
199
+ -F "model_name=facebook/detr-resnet-50"
200
+ ```
201
+
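+ #### Using Python `requests`
+
+ The same call from Python (a sketch; assumes the server is running locally on port 8000):
+
+ ```python
+ import requests
+
+ # Both fields are sent as form data, mirroring the curl examples above.
+ resp = requests.post(
+     "http://localhost:8000/detect",
+     data={
+         "image_url": "https://example.com/image.jpg",
+         "model_name": "facebook/detr-resnet-50",
+     },
+     timeout=120,
+ )
+ resp.raise_for_status()
+ result = resp.json()
+ print(result["unique_objects"], result["unique_confidence_scores"])
+ ```
+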
202
+ ### Response Format
203
+
204
+ The response includes a base64-encoded image with detections and detection details:
205
 
206
  ```json
207
  {
 
213
  }
214
  ```
215
 
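+ The `image_url` field is a `data:image/png;base64,...` URL. A sketch of decoding it back to an image file (`result` is the parsed JSON response, as in the request example above):
+
+ ```python
+ import base64
+
+ b64_data = result["image_url"].split(",", 1)[1]  # strip the data-URL prefix
+ with open("detections.png", "wb") as f:
+     f.write(base64.b64decode(b64_data))
+ ```
+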
216
+ ### Notes
217
+
218
+ - Ensure only one of `file` or `image_url` is provided.
219
+ - The API may experience instability with panoptic models; use object detection models for reliability.
220
+ - Test the API using the Swagger UI for easier debugging.
221
+
222
  ## Development Setup
223
 
224
+ To contribute or modify the application:
225
 
226
  1. Clone the repository:
227
 
228
  ```bash
229
+ git clone https://github.com/NeerajCodz/ObjectDetection
230
  cd ObjectDetection
231
  ```
232
 
 
236
  pip install -r requirements.txt
237
  ```
238
 
239
+ 3. Run the application:
240
 
241
  ```bash
242
+ python app.py
243
  ```
244
 
245
+ Or run FastAPI:
246
 
247
  ```bash
248
+ uvicorn app:app --host 0.0.0.0 --port 8000
249
  ```
250
 
251
+ 4. Access at `http://localhost:7860` (Gradio) or `http://localhost:8000` (FastAPI).
252
 
253
  ## Contributing
254
 
255
+ Contributions are welcome! To contribute:
256
+
257
+ 1. Fork the repository.
258
+ 2. Create a feature or bugfix branch (`git checkout -b feature/your-feature`).
259
+ 3. Commit changes (`git commit -m "Add your feature"`).
260
+ 4. Push to the branch (`git push origin feature/your-feature`).
261
+ 5. Open a pull request on the [GitHub repository](https://github.com/NeerajCodz/ObjectDetection).
262
+
263
+ Please include tests and documentation for new features. Report issues via GitHub Issues.
264
+
265
+ ## Troubleshooting
266
+
267
+ - **Port Conflicts**: If port 7860 is in use, specify a different port with `--gradio-port` or set `GRADIO_SERVER_PORT` (see the port-check sketch below).
268
+ - **Colab Issues**: Use the `--gradio-port` argument or environment variable to avoid port conflicts in Google Colab.
269
+ - **Panoptic Model Bugs**: Avoid `detr-resnet-*-panoptic` models until stability issues are resolved.
270
+ - **API Instability**: Test with smaller images and object detection models first.
271
+
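+ A quick way to check whether a port is free before launching (standard-library sketch):
+
+ ```python
+ import socket
+
+ def port_is_free(port: int) -> bool:
+     # connect_ex returns 0 when something is already listening on the port.
+     with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
+         return s.connect_ex(("127.0.0.1", port)) != 0
+
+ print(port_is_free(7860))
+ ```
+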
272
+ For further assistance, open an issue on the [GitHub repository](https://github.com/NeerajCodz/ObjectDetection).
app.py CHANGED
@@ -1,79 +1,166 @@
1
- import gradio as gr
2
- import torch
3
- from transformers import DetrImageProcessor, DetrForObjectDetection
4
- from transformers import YolosImageProcessor, YolosForObjectDetection
5
- from transformers import DetrForSegmentation
6
- from PIL import Image, ImageDraw, ImageStat
7
- import requests
8
- from io import BytesIO
9
  import base64
10
- from collections import Counter
11
  import logging
12
- from fastapi import FastAPI, File, UploadFile, HTTPException, Form
13
- from fastapi.responses import JSONResponse
14
- import uvicorn
15
- import pandas as pd
16
- import traceback
17
  import os
18
 
19
- # Set up logging
20
- logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
21
  logger = logging.getLogger(__name__)
22
 
23
- # Constants
24
- CONFIDENCE_THRESHOLD = 0.5
25
- VALID_MODELS = [
26
  "facebook/detr-resnet-50",
27
  "facebook/detr-resnet-101",
28
  "facebook/detr-resnet-50-panoptic",
29
  "facebook/detr-resnet-101-panoptic",
30
  "hustvl/yolos-tiny",
31
- "hustvl/yolos-base"
32
  ]
33
- MODEL_DESCRIPTIONS = {
34
- "facebook/detr-resnet-50": "DETR with ResNet-50 backbone for object detection. Fast and accurate for general use.",
35
- "facebook/detr-resnet-101": "DETR with ResNet-101 backbone for object detection. More accurate but slower than ResNet-50.",
36
- "facebook/detr-resnet-50-panoptic": "DETR with ResNet-50 for panoptic segmentation. Detects objects and segments scenes.",
37
- "facebook/detr-resnet-101-panoptic": "DETR with ResNet-101 for panoptic segmentation. High accuracy for complex scenes.",
38
- "hustvl/yolos-tiny": "YOLOS Tiny model. Lightweight and fast, ideal for resource-constrained environments.",
39
- "hustvl/yolos-base": "YOLOS Base model. Balances speed and accuracy for object detection."
40
  }
41
 
42
- # Lazy model loading
43
- models = {}
44
- processors = {}
 
 
45
 
46
- def process(image, model_name):
47
- """Process an image and return detected image, objects, confidences, unique objects, unique confidences, and properties."""
48
- try:
49
- if model_name not in VALID_MODELS:
50
- raise ValueError(f"Invalid model: {model_name}. Choose from: {VALID_MODELS}")
51
 
52
- # Load model and processor
53
  if model_name not in models:
54
  logger.info(f"Loading model: {model_name}")
55
- if "yolos" in model_name:
56
- models[model_name] = YolosForObjectDetection.from_pretrained(model_name)
57
- processors[model_name] = YolosImageProcessor.from_pretrained(model_name)
58
- elif "panoptic" in model_name:
59
- models[model_name] = DetrForSegmentation.from_pretrained(model_name)
60
- processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
61
- else:
62
- models[model_name] = DetrForObjectDetection.from_pretrained(model_name)
63
- processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
64
-
65
- model, processor = models[model_name], processors[model_name]
66
- inputs = processor(images=image, return_tensors="pt")
67
68
  with torch.no_grad():
69
  outputs = model(**inputs)
70
 
71
- target_sizes = torch.tensor([image.size[::-1]])
72
  draw = ImageDraw.Draw(image)
73
- object_names = []
74
- confidence_scores = []
75
  object_counter = Counter()
 
76
 
 
77
  if "panoptic" in model_name:
78
  processed_sizes = torch.tensor([[inputs["pixel_values"].shape[2], inputs["pixel_values"].shape[3]]])
79
  results = processor.post_process_panoptic(outputs, target_sizes=target_sizes, processed_sizes=processed_sizes)[0]
@@ -83,6 +170,7 @@ def process(image, model_name):
83
  label_name = model.config.id2label.get(label, "Unknown")
84
  score = segment.get("score", 1.0)
85
 
 
86
  if "masks" in results and segment["id"] < len(results["masks"]):
87
  mask = results["masks"][segment["id"]].cpu().numpy()
88
  if mask.shape[0] > 0 and mask.shape[1] > 0:
@@ -106,7 +194,6 @@ def process(image, model_name):
106
  x, y, x2, y2 = box.tolist()
107
  draw.rectangle([x, y, x2, y2], outline="#32CD32", width=2)
108
  label_name = model.config.id2label.get(label.item(), "Unknown")
109
- # Place text at top-right corner, outside the box, with smaller size
110
  text = f"{label_name}: {score:.2f}"
111
  text_bbox = draw.textbbox((0, 0), text)
112
  text_width, text_height = text_bbox[2] - text_bbox[0], text_bbox[3] - text_bbox[1]
@@ -115,58 +202,82 @@ def process(image, model_name):
115
  confidence_scores.append(float(score))
116
  object_counter[label_name] = float(score)
117
 
 
118
  unique_objects = list(object_counter.keys())
119
  unique_confidences = [object_counter[obj] for obj in unique_objects]
120
 
121
- # Image properties
122
- file_size = "Unknown"
123
- if hasattr(image, "fp") and image.fp is not None:
124
- buffered = BytesIO()
125
- image.save(buffered, format="PNG")
126
- file_size = f"{len(buffered.getvalue()) / 1024:.2f} KB"
127
-
128
- # Color statistics
129
- try:
130
- stat = ImageStat.Stat(image)
131
- color_stats = {
132
- "mean": [f"{m:.2f}" for m in stat.mean],
133
- "stddev": [f"{s:.2f}" for s in stat.stddev]
134
- }
135
- except Exception as e:
136
- logger.error(f"Error calculating color statistics: {str(e)}")
137
- color_stats = {"mean": "Error", "stddev": "Error"}
138
-
139
- properties = {
140
  "Format": image.format if hasattr(image, "format") and image.format else "Unknown",
141
  "Size": f"{image.width}x{image.height}",
142
  "Width": f"{image.width} px",
143
  "Height": f"{image.height} px",
144
  "Mode": image.mode,
145
- "Aspect Ratio": f"{round(image.width / image.height, 2) if image.height != 0 else 'Undefined'}",
146
- "File Size": file_size,
147
- "Mean (R,G,B)": ", ".join(color_stats["mean"]) if isinstance(color_stats["mean"], list) else color_stats["mean"],
148
149
  }
150
151
  return image, object_names, confidence_scores, unique_objects, unique_confidences, properties
 
152
  except Exception as e:
153
  logger.error(f"Error in process: {str(e)}\n{traceback.format_exc()}")
154
- raise
155
 
 
156
  # FastAPI Setup
 
 
157
  app = FastAPI(title="Object Detection API")
158
 
159
  @app.post("/detect")
160
  async def detect_objects_endpoint(
161
- file: UploadFile = File(None),
162
- image_url: str = Form(None),
163
- model_name: str = Form(VALID_MODELS[0])
164
- ):
165
- """FastAPI endpoint to detect objects in an image from file or URL."""
166
  try:
 
167
  if (file is None and not image_url) or (file is not None and image_url):
168
- raise HTTPException(status_code=400, detail="Provide either an image file or an image URL, but not both.")
169
 
 
170
  if file:
171
  if not file.content_type.startswith("image/"):
172
  raise HTTPException(status_code=400, detail="File must be an image")
@@ -178,207 +289,454 @@ async def detect_objects_endpoint(
178
  image = Image.open(BytesIO(response.content)).convert("RGB")
179
 
180
  if model_name not in VALID_MODELS:
181
- raise HTTPException(status_code=400, detail=f"Invalid model. Choose from: {VALID_MODELS}")
182
 
183
- detected_image, detected_objects, detected_confidences, unique_objects, unique_confidences, _ = process(image, model_name)
184
 
 
185
  buffered = BytesIO()
186
  detected_image.save(buffered, format="PNG")
187
  img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
188
  img_url = f"data:image/png;base64,{img_base64}"
189
 
190
- return JSONResponse(content={
191
- "image_url": img_url,
192
- "detected_objects": detected_objects,
193
- "confidence_scores": detected_confidences,
194
- "unique_objects": unique_objects,
195
- "unique_confidence_scores": unique_confidences
196
- })
197
  except Exception as e:
198
  logger.error(f"Error in FastAPI endpoint: {str(e)}\n{traceback.format_exc()}")
199
  raise HTTPException(status_code=500, detail=f"Error processing image: {str(e)}")
200
 
201
- # Gradio UI
202
- def create_gradio_ui():
203
- with gr.Blocks(theme=gr.themes.Default(primary_hue="blue", secondary_hue="gray")) as demo:
204
- gr.Markdown(
205
- """
206
- # 🚀 Object Detection App
207
- Upload an image or provide a URL to detect objects using state-of-the-art transformer models (DETR, YOLOS).
208
- """
209
- )
210
-
211
- with gr.Tabs():
212
- with gr.Tab("📷 Image Upload"):
213
- with gr.Row():
214
- with gr.Column(scale=1):
215
- gr.Markdown("### Input")
216
- model_choice = gr.Dropdown(
217
- choices=VALID_MODELS,
218
- value=VALID_MODELS[0],
219
- label="🔎 Select Model",
220
- info="Choose a model for object detection or panoptic segmentation."
221
- )
222
- model_info = gr.Markdown(
223
- f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
224
- visible=True
225
- )
226
- image_input = gr.Image(type="pil", label="📷 Upload Image")
227
- image_url_input = gr.Textbox(
228
- label="🔗 Image URL",
229
- placeholder="https://example.com/image.jpg"
230
- )
231
- with gr.Row():
232
- submit_btn = gr.Button("✨ Detect", variant="primary")
233
- clear_btn = gr.Button("🗑️ Clear", variant="secondary")
234
-
235
- model_choice.change(
236
- fn=lambda model_name: f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}",
237
- inputs=model_choice,
238
- outputs=model_info
239
- )
240
-
241
- with gr.Column(scale=2):
242
- gr.Markdown("### Results")
243
- error_output = gr.Textbox(
244
- label="⚠️ Errors",
245
- visible=False,
246
- lines=3,
247
- max_lines=5
248
- )
249
- output_image = gr.Image(
250
- type="pil",
251
- label="🎯 Detected Image",
252
- interactive=False
253
- )
254
- with gr.Row():
255
- objects_output = gr.DataFrame(
256
- label="📋 Detected Objects",
257
  interactive=False,
258
- value=None
259
  )
260
- unique_objects_output = gr.DataFrame(
261
- label="🔍 Unique Objects",
262
  interactive=False,
263
- value=None
264
  )
265
- properties_output = gr.DataFrame(
266
- label="📄 Image Properties",
267
- interactive=False,
268
- value=None
269
- )
270
-
271
- def process_for_gradio(image, url, model_name):
272
- try:
273
- if image is None and not url:
274
- return None, None, None, None, "Please provide an image or URL"
275
- if image and url:
276
- return None, None, None, None, "Please provide either an image or URL, not both"
277
-
278
- if url:
279
- response = requests.get(url, timeout=10)
280
- response.raise_for_status()
281
- image = Image.open(BytesIO(response.content)).convert("RGB")
282
 
283
- detected_image, objects, scores, unique_objects, unique_scores, properties = process(image, model_name)
284
- objects_df = pd.DataFrame({
285
- "Object": objects,
286
- "Confidence Score": [f"{score:.2f}" for score in scores]
287
- }) if objects else pd.DataFrame(columns=["Object", "Confidence Score"])
288
- unique_objects_df = pd.DataFrame({
289
- "Unique Object": unique_objects,
290
- "Confidence Score": [f"{score:.2f}" for score in unique_scores]
291
- }) if unique_objects else pd.DataFrame(columns=["Unique Object", "Confidence Score"])
292
- properties_df = pd.DataFrame([properties]) if properties else pd.DataFrame(columns=properties.keys())
293
- return detected_image, objects_df, unique_objects_df, properties_df, ""
294
- except Exception as e:
295
- error_msg = f"Error processing image: {str(e)}"
296
- logger.error(f"{error_msg}\n{traceback.format_exc()}")
297
- return None, None, None, None, error_msg
298
-
299
- submit_btn.click(
300
- fn=process_for_gradio,
301
- inputs=[image_input, image_url_input, model_choice],
302
- outputs=[output_image, objects_output, unique_objects_output, properties_output, error_output]
303
- )
304
-
305
- clear_btn.click(
306
- fn=lambda: [None, "", None, None, None, None],
307
- inputs=None,
308
- outputs=[image_input, image_url_input, output_image, objects_output, unique_objects_output, properties_output, error_output]
309
- )
310
-
311
- with gr.Tab("🔗 URL Input"):
312
- gr.Markdown("### Process Image from URL")
313
- image_url_input = gr.Textbox(
314
- label="🔗 Image URL",
315
- placeholder="https://example.com/image.jpg"
316
- )
317
- url_model_choice = gr.Dropdown(
318
- choices=VALID_MODELS,
319
- value=VALID_MODELS[0],
320
- label="🔎 Select Model"
321
- )
322
- url_model_info = gr.Markdown(
323
- f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
324
- visible=True
325
- )
326
- url_submit_btn = gr.Button("🔄 Process URL", variant="primary")
327
- url_output = gr.JSON(label="API Response")
328
-
329
- url_model_choice.change(
330
- fn=lambda model_name: f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}",
331
- inputs=url_model_choice,
332
- outputs=url_model_info
333
- )
334
-
335
- def process_url_for_gradio(url, model_name):
336
- try:
337
- response = requests.get(url, timeout=10)
338
- response.raise_for_status()
339
- image = Image.open(BytesIO(response.content)).convert("RGB")
340
- detected_image, objects, scores, unique_objects, unique_scores, _ = process(image, model_name)
341
- buffered = BytesIO()
342
- detected_image.save(buffered, format="PNG")
343
- img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
344
- return {
345
- "image_url": f"data:image/png;base64,{img_base64}",
346
- "detected_objects": objects,
347
- "confidence_scores": scores,
348
- "unique_objects": unique_objects,
349
- "unique_confidence_scores": unique_scores
350
- }
351
- except Exception as e:
352
- error_msg = f"Error processing URL: {str(e)}"
353
- logger.error(f"{error_msg}\n{traceback.format_exc()}")
354
- return {"error": error_msg}
355
-
356
- url_submit_btn.click(
357
- fn=process_url_for_gradio,
358
- inputs=[image_url_input, url_model_choice],
359
- outputs=[url_output]
360
- )
361
-
362
- with gr.Tab("ℹ️ Help"):
363
- gr.Markdown(
364
- """
365
- ## How to Use
366
- - **Image Upload**: Select a model, upload an image or provide a URL, and click "Detect" to see detected objects and image properties.
367
- - **URL Input**: Enter an image URL, select a model, and click "Process URL" to get results in JSON format.
368
- - **Models**: Choose from DETR (object detection or panoptic segmentation) or YOLOS (lightweight detection).
369
- - **Clear**: Reset all inputs and outputs using the "Clear" button.
370
- - **Errors**: Check the error box for any processing issues.
371
-
372
- ## Tips
373
- - Use high-quality images for better detection results.
374
- - Panoptic models (e.g., DETR-ResNet-50-panoptic) provide segmentation masks for complex scenes.
375
- - For faster processing, try YOLOS-Tiny on resource-constrained devices.
376
- """
377
- )
378
-
379
- return demo
 
380
 
381
  if __name__ == "__main__":
382
- demo = create_gradio_ui()
383
- demo.launch()
384
- # To run FastAPI, use: uvicorn object_detection:app --host 0.0.0.0 --port 8000
 
1
+ import argparse
2
  import base64
 
3
  import logging
4
  import os
5
+ import sys
6
+ import traceback
7
+ import threading
8
+ from collections import Counter
9
+ from io import BytesIO
10
+ from typing import Any, Dict, List, Optional, Tuple
11
+
12
+ import gradio as gr
13
+ import pandas as pd
14
+ import requests
15
+ import torch
16
+ import uvicorn
17
+ from fastapi import FastAPI, File, Form, HTTPException, UploadFile
18
+ from fastapi.responses import JSONResponse
19
+ from PIL import Image, ImageDraw, ImageStat
20
+ from transformers import (
21
+ DetrForObjectDetection,
22
+ DetrForSegmentation,
23
+ DetrImageProcessor,
24
+ YolosForObjectDetection,
25
+ YolosImageProcessor,
26
+ )
27
+ import nest_asyncio
28
+
29
+ # ------------------------------
30
+ # Configuration
31
+ # ------------------------------
32
 
33
+ # Logging configuration
34
+ logging.basicConfig(
35
+ level=logging.INFO,
36
+ format="%(asctime)s - %(levelname)s - %(message)s",
37
+ )
38
  logger = logging.getLogger(__name__)
39
 
40
+ # Model and processing constants
41
+ CONFIDENCE_THRESHOLD: float = 0.5
42
+ VALID_MODELS: List[str] = [
43
  "facebook/detr-resnet-50",
44
  "facebook/detr-resnet-101",
45
  "facebook/detr-resnet-50-panoptic",
46
  "facebook/detr-resnet-101-panoptic",
47
  "hustvl/yolos-tiny",
48
+ "hustvl/yolos-base",
49
  ]
50
+ MODEL_DESCRIPTIONS: Dict[str, str] = {
51
+ "facebook/detr-resnet-50": (
52
+ "DETR with ResNet-50 backbone for object detection. Fast and accurate for general use."
53
+ ),
54
+ "facebook/detr-resnet-101": (
55
+ "DETR with ResNet-101 backbone for object detection. More accurate but slower than ResNet-50."
56
+ ),
57
+ "facebook/detr-resnet-50-panoptic": (
58
+ "DETR with ResNet-50 for panoptic segmentation. Detects objects and segments scenes."
59
+ ),
60
+ "facebook/detr-resnet-101-panoptic": (
61
+ "DETR with ResNet-101 for panoptic segmentation. High accuracy for complex scenes."
62
+ ),
63
+ "hustvl/yolos-tiny": (
64
+ "YOLOS Tiny model. Lightweight and fast, ideal for resource-constrained environments."
65
+ ),
66
+ "hustvl/yolos-base": (
67
+ "YOLOS Base model. Balances speed and accuracy for object detection."
68
+ ),
69
  }
70
 
71
+ # Port configuration
72
+ DEFAULT_GRADIO_PORT: int = 7860
73
+ DEFAULT_FASTAPI_PORT: int = 8000
74
+ PORT_RANGE: range = range(7860, 7870) # Try ports 7860-7869
75
+ MAX_PORT_ATTEMPTS: int = 10
76
 
77
+ # Thread-safe storage for lazy-loaded models and processors
78
+ models: Dict[str, Any] = {}
79
+ processors: Dict[str, Any] = {}
80
+ model_lock = threading.Lock()
 
81
 
82
+ # ------------------------------
83
+ # Model Loading
84
+ # ------------------------------
85
+
86
+ def load_model_and_processor(model_name: str) -> Tuple[Any, Any]:
87
+ """
88
+ Load and cache the specified model and processor thread-safely.
89
+
90
+ Args:
91
+ model_name: Name of the model to load (must be in VALID_MODELS).
92
+
93
+ Returns:
94
+ Tuple containing the loaded model and processor.
95
+
96
+ Raises:
97
+ ValueError: If the model_name is invalid or loading fails.
98
+ """
99
+ with model_lock:
100
  if model_name not in models:
101
  logger.info(f"Loading model: {model_name}")
102
+ try:
103
+ if "yolos" in model_name:
104
+ models[model_name] = YolosForObjectDetection.from_pretrained(model_name)
105
+ processors[model_name] = YolosImageProcessor.from_pretrained(model_name)
106
+ elif "panoptic" in model_name:
107
+ models[model_name] = DetrForSegmentation.from_pretrained(model_name)
108
+ processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
109
+ else:
110
+ models[model_name] = DetrForObjectDetection.from_pretrained(model_name)
111
+ processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
112
+ logger.debug(f"Model {model_name} loaded successfully")
113
+ except Exception as e:
114
+ logger.error(f"Failed to load model {model_name}: {str(e)}")
115
+ raise ValueError(f"Failed to load model: {str(e)}")
116
+ return models[model_name], processors[model_name]
117
+
118
+ # ------------------------------
119
+ # Image Processing
120
+ # ------------------------------
121
+
122
+ def process(image: Image.Image, model_name: str) -> Tuple[Image.Image, List[str], List[float], List[str], List[float], Dict[str, str]]:
123
+ """
124
+ Process an image for object detection or panoptic segmentation.
125
+
126
+ Args:
127
+ image: PIL Image to process.
128
+ model_name: Name of the model to use (must be in VALID_MODELS).
129
+
130
+ Returns:
131
+ Tuple containing:
132
+ - Annotated image (PIL Image).
133
+ - List of detected object names.
134
+ - List of confidence scores for detected objects.
135
+ - List of unique object names.
136
+ - List of confidence scores for unique objects.
137
+ - Dictionary of image properties (format, size, etc.).
138
 
139
+ Raises:
140
+ ValueError: If the model_name is invalid.
141
+ RuntimeError: If processing fails due to model or image issues.
142
+ """
143
+ if model_name not in VALID_MODELS:
144
+ raise ValueError(f"Invalid model: {model_name}. Choose from: {VALID_MODELS}")
145
+
146
+ try:
147
+ # Load model and processor
148
+ model, processor = load_model_and_processor(model_name)
149
+ logger.debug(f"Processing image with model: {model_name}")
150
+
151
+ # Prepare image for processing
152
+ inputs = processor(images=image, return_tensors="pt")
153
  with torch.no_grad():
154
  outputs = model(**inputs)
155
 
156
+ # Initialize drawing context
157
  draw = ImageDraw.Draw(image)
158
+ object_names: List[str] = []
159
+ confidence_scores: List[float] = []
160
  object_counter = Counter()
161
+ target_sizes = torch.tensor([image.size[::-1]])
162
 
163
+ # Process panoptic segmentation or object detection
164
  if "panoptic" in model_name:
165
  processed_sizes = torch.tensor([[inputs["pixel_values"].shape[2], inputs["pixel_values"].shape[3]]])
166
  results = processor.post_process_panoptic(outputs, target_sizes=target_sizes, processed_sizes=processed_sizes)[0]
 
170
  label_name = model.config.id2label.get(label, "Unknown")
171
  score = segment.get("score", 1.0)
172
 
173
+ # Apply segmentation mask if available
174
  if "masks" in results and segment["id"] < len(results["masks"]):
175
  mask = results["masks"][segment["id"]].cpu().numpy()
176
  if mask.shape[0] > 0 and mask.shape[1] > 0:
 
194
  x, y, x2, y2 = box.tolist()
195
  draw.rectangle([x, y, x2, y2], outline="#32CD32", width=2)
196
  label_name = model.config.id2label.get(label.item(), "Unknown")
 
197
  text = f"{label_name}: {score:.2f}"
198
  text_bbox = draw.textbbox((0, 0), text)
199
  text_width, text_height = text_bbox[2] - text_bbox[0], text_bbox[3] - text_bbox[1]
 
202
  confidence_scores.append(float(score))
203
  object_counter[label_name] = float(score)
204
 
205
+ # Compile unique objects and confidences
206
  unique_objects = list(object_counter.keys())
207
  unique_confidences = [object_counter[obj] for obj in unique_objects]
208
 
209
+ # Calculate image properties
210
+ properties: Dict[str, str] = {
211
  "Format": image.format if hasattr(image, "format") and image.format else "Unknown",
212
  "Size": f"{image.width}x{image.height}",
213
  "Width": f"{image.width} px",
214
  "Height": f"{image.height} px",
215
  "Mode": image.mode,
216
+ "Aspect Ratio": (
217
+ f"{round(image.width / image.height, 2)}" if image.height != 0 else "Undefined"
218
+ ),
219
+ "File Size": "Unknown",
220
+ "Mean (R,G,B)": "Unknown",
221
+ "StdDev (R,G,B)": "Unknown",
222
  }
223
 
224
+ # Compute file size
225
+ try:
226
+ buffered = BytesIO()
227
+ image.save(buffered, format="PNG")
228
+ properties["File Size"] = f"{len(buffered.getvalue()) / 1024:.2f} KB"
229
+ except Exception as e:
230
+ logger.error(f"Error calculating file size: {str(e)}")
231
+
232
+ # Compute color statistics
233
+ try:
234
+ stat = ImageStat.Stat(image)
235
+ properties["Mean (R,G,B)"] = ", ".join(f"{m:.2f}" for m in stat.mean)
236
+ properties["StdDev (R,G,B)"] = ", ".join(f"{s:.2f}" for s in stat.stddev)
237
+ except Exception as e:
238
+ logger.error(f"Error calculating color statistics: {str(e)}")
239
+
240
  return image, object_names, confidence_scores, unique_objects, unique_confidences, properties
241
+
242
  except Exception as e:
243
  logger.error(f"Error in process: {str(e)}\n{traceback.format_exc()}")
244
+ raise RuntimeError(f"Failed to process image: {str(e)}")
245
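+ # Usage sketch (hypothetical call; names illustrative):
+ #   annotated, names, scores, uniq, uniq_scores, props = process(
+ #       Image.open("photo.jpg").convert("RGB"), "facebook/detr-resnet-50")
+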
 
246
+ # ------------------------------
247
  # FastAPI Setup
248
+ # ------------------------------
249
+
250
  app = FastAPI(title="Object Detection API")
251
 
252
  @app.post("/detect")
253
  async def detect_objects_endpoint(
254
+ file: Optional[UploadFile] = File(None),
255
+ image_url: Optional[str] = Form(None),
256
+ model_name: str = Form(VALID_MODELS[0]),
257
+ ) -> JSONResponse:
258
+ """
259
+ FastAPI endpoint to detect objects in an image from file upload or URL.
260
+
261
+ Args:
262
+ file: Uploaded image file (optional).
263
+ image_url: URL of the image (optional).
264
+ model_name: Model to use for detection (default: first VALID_MODELS entry).
265
+
266
+ Returns:
267
+ JSONResponse containing the processed image (base64), detected objects, and confidences.
268
+
269
+ Raises:
270
+ HTTPException: If input validation fails or processing errors occur.
271
+ """
272
  try:
273
+ # Validate input
274
  if (file is None and not image_url) or (file is not None and image_url):
275
+ raise HTTPException(
276
+ status_code=400,
277
+ detail="Provide either an image file or an image URL, not both.",
278
+ )
279
 
280
+ # Load image
281
  if file:
282
  if not file.content_type.startswith("image/"):
283
  raise HTTPException(status_code=400, detail="File must be an image")
 
289
  image = Image.open(BytesIO(response.content)).convert("RGB")
290
 
291
  if model_name not in VALID_MODELS:
292
+ raise HTTPException(
293
+ status_code=400,
294
+ detail=f"Invalid model. Choose from: {VALID_MODELS}",
295
+ )
296
 
297
+ # Process image
298
+ detected_image, detected_objects, detected_confidences, unique_objects, unique_confidences, _ = process(
299
+ image, model_name
300
+ )
301
 
302
+ # Encode image as base64
303
  buffered = BytesIO()
304
  detected_image.save(buffered, format="PNG")
305
  img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
306
  img_url = f"data:image/png;base64,{img_base64}"
307
 
308
+ return JSONResponse(
309
+ content={
310
+ "image_url": img_url,
311
+ "detected_objects": detected_objects,
312
+ "confidence_scores": detected_confidences,
313
+ "unique_objects": unique_objects,
314
+ "unique_confidence_scores": unique_confidences,
315
+ }
316
+ )
317
+
318
+ except requests.RequestException as e:
319
+ logger.error(f"Error fetching image from URL: {str(e)}")
320
+ raise HTTPException(status_code=400, detail=f"Failed to fetch image: {str(e)}")
321
  except Exception as e:
322
  logger.error(f"Error in FastAPI endpoint: {str(e)}\n{traceback.format_exc()}")
323
  raise HTTPException(status_code=500, detail=f"Error processing image: {str(e)}")
324
 
325
+ # ------------------------------
326
+ # Gradio UI Setup
327
+ # ------------------------------
328
+
329
+ def create_gradio_ui() -> gr.Blocks:
330
+ """
331
+ Create and configure the Gradio UI for object detection.
332
+
333
+ Returns:
334
+ Gradio Blocks object representing the UI.
335
+
336
+ Raises:
337
+ RuntimeError: If UI creation fails.
338
+ """
339
+ try:
340
+ with gr.Blocks(theme=gr.themes.Default(primary_hue="blue", secondary_hue="gray")) as app:
341
+ gr.Markdown(
342
+ f"""
343
+ # 🚀 Object Detection App
344
+ Upload an image or provide a URL to detect objects using state-of-the-art transformer models (DETR, YOLOS).
345
+ Running on port: {os.getenv('GRADIO_SERVER_PORT', 'auto-selected')}
346
+ """
347
+ )
348
+
349
+ with gr.Tabs():
350
+ with gr.Tab("📷 Image Upload"):
351
+ with gr.Row():
352
+ with gr.Column(scale=1):
353
+ gr.Markdown("### Input")
354
+ model_choice = gr.Dropdown(
355
+ choices=VALID_MODELS,
356
+ value=VALID_MODELS[0],
357
+ label="🔎 Select Model",
358
+ info="Choose a model for object detection or panoptic segmentation.",
359
+ )
360
+ model_info = gr.Markdown(
361
+ f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
362
+ visible=True,
363
+ )
364
+ image_input = gr.Image(type="pil", label="📷 Upload Image")
365
+ image_url_input = gr.Textbox(
366
+ label="🔗 Image URL",
367
+ placeholder="https://example.com/image.jpg",
368
+ )
369
+ with gr.Row():
370
+ submit_btn = gr.Button("✨ Detect", variant="primary")
371
+ clear_btn = gr.Button("🗑️ Clear", variant="secondary")
372
+
373
+ model_choice.change(
374
+ fn=lambda model_name: (
375
+ f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}"
376
+ ),
377
+ inputs=model_choice,
378
+ outputs=model_info,
379
+ )
380
+
381
+ with gr.Column(scale=2):
382
+ gr.Markdown("### Results")
383
+ error_output = gr.Textbox(
384
+ label="⚠️ Errors",
385
+ visible=False,
386
+ lines=3,
387
+ max_lines=5,
388
+ )
389
+ output_image = gr.Image(
390
+ type="pil",
391
+ label="🎯 Detected Image",
392
  interactive=False,
 
393
  )
394
+ with gr.Row():
395
+ objects_output = gr.DataFrame(
396
+ label="📋 Detected Objects",
397
+ interactive=False,
398
+ value=None,
399
+ )
400
+ unique_objects_output = gr.DataFrame(
401
+ label="🔍 Unique Objects",
402
+ interactive=False,
403
+ value=None,
404
+ )
405
+ properties_output = gr.DataFrame(
406
+ label="📄 Image Properties",
407
  interactive=False,
408
+ value=None,
409
  )
410
+
411
+ def process_for_gradio(image: Optional[Image.Image], url: Optional[str], model_name: str) -> Tuple[
412
+ Optional[Image.Image], Optional[pd.DataFrame], Optional[pd.DataFrame], Optional[pd.DataFrame], str
413
+ ]:
414
+ """
415
+ Process image for Gradio UI and return results.
416
+
417
+ Args:
418
+ image: Uploaded PIL Image (optional).
419
+ url: Image URL (optional).
420
+ model_name: Model to use for detection.
421
+
422
+ Returns:
423
+ Tuple of detected image, objects DataFrame, unique objects DataFrame, properties DataFrame, and error message.
424
+ """
425
+ try:
426
+ if image is None and not url:
427
+ return None, None, None, None, "Please provide an image or URL"
428
+ if image and url:
429
+ return None, None, None, None, "Please provide either an image or URL, not both"
430
+
431
+ if url:
432
+ response = requests.get(url, timeout=10)
433
+ response.raise_for_status()
434
+ image = Image.open(BytesIO(response.content)).convert("RGB")
435
+
436
+ detected_image, objects, scores, unique_objects, unique_scores, properties = process(
437
+ image, model_name
438
+ )
439
+ objects_df = (
440
+ pd.DataFrame(
441
+ {
442
+ "Object": objects,
443
+ "Confidence Score": [f"{score:.2f}" for score in scores],
444
+ }
445
+ )
446
+ if objects
447
+ else pd.DataFrame(columns=["Object", "Confidence Score"])
448
+ )
449
+ unique_objects_df = (
450
+ pd.DataFrame(
451
+ {
452
+ "Unique Object": unique_objects,
453
+ "Confidence Score": [f"{score:.2f}" for score in unique_scores],
454
+ }
455
+ )
456
+ if unique_objects
457
+ else pd.DataFrame(columns=["Unique Object", "Confidence Score"])
458
+ )
459
+ properties_df = (
460
+ pd.DataFrame([properties])
461
+ if properties
462
+ else pd.DataFrame(columns=properties.keys())
463
+ )
464
+ return detected_image, objects_df, unique_objects_df, properties_df, ""
465
+
466
+ except requests.RequestException as e:
467
+ error_msg = f"Error fetching image from URL: {str(e)}"
468
+ logger.error(f"{error_msg}\n{traceback.format_exc()}")
469
+ return None, None, None, None, error_msg
470
+ except Exception as e:
471
+ error_msg = f"Error processing image: {str(e)}"
472
+ logger.error(f"{error_msg}\n{traceback.format_exc()}")
473
+ return None, None, None, None, error_msg
474
+
475
+ submit_btn.click(
476
+ fn=process_for_gradio,
477
+ inputs=[image_input, image_url_input, model_choice],
478
+ outputs=[output_image, objects_output, unique_objects_output, properties_output, error_output],
479
+ )
480
+
481
+ clear_btn.click(
482
+ fn=lambda: [None, "", None, None, None, None, ""],
483
+ inputs=None,
484
+ outputs=[
485
+ image_input,
486
+ image_url_input,
487
+ output_image,
488
+ objects_output,
489
+ unique_objects_output,
490
+ properties_output,
491
+ error_output,
492
+ ],
493
+ )
494
+
495
+ with gr.Tab("🔗 JSON Output"):
496
+ gr.Markdown("### Process Image for JSON Output")
497
+ image_input_json = gr.Image(type="pil", label="📷 Upload Image")
498
+ image_url_input_json = gr.Textbox(
499
+ label="🔗 Image URL",
500
+ placeholder="https://example.com/image.jpg",
501
+ )
502
+ url_model_choice = gr.Dropdown(
503
+ choices=VALID_MODELS,
504
+ value=VALID_MODELS[0],
505
+ label="🔎 Select Model",
506
+ )
507
+ url_model_info = gr.Markdown(
508
+ f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
509
+ visible=True,
510
+ )
511
+ url_submit_btn = gr.Button("🔄 Process", variant="primary")
512
+ url_output = gr.JSON(label="API Response")
513
+
514
+ url_model_choice.change(
515
+ fn=lambda model_name: (
516
+ f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}"
517
+ ),
518
+ inputs=url_model_choice,
519
+ outputs=url_model_info,
520
+ )
521
+
522
+ def process_url_for_gradio(image: Optional[Image.Image], url: Optional[str], model_name: str) -> Dict:
523
+ """
524
+ Process image from file or URL for Gradio UI and return JSON response.
525
+
526
+ Args:
527
+ image: Uploaded PIL Image (optional).
528
+ url: Image URL (optional).
529
+ model_name: Model to use for detection.
530
+
531
+ Returns:
532
+ Dictionary with processed image (base64), detected objects, and confidences.
533
+ """
534
+ try:
535
+ if image is None and not url:
536
+ return {"error": "Please provide an image or URL"}
537
+ if image and url:
538
+ return {"error": "Please provide either an image or URL, not both"}
539
+
540
+ if url:
541
+ response = requests.get(url, timeout=10)
542
+ response.raise_for_status()
543
+ image = Image.open(BytesIO(response.content)).convert("RGB")
544
+
545
+ detected_image, objects, scores, unique_objects, unique_scores, _ = process(
546
+ image, model_name
547
+ )
548
+ buffered = BytesIO()
549
+ detected_image.save(buffered, format="PNG")
550
+ img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
551
+ return {
552
+ "image_url": f"data:image/png;base64,{img_base64}",
553
+ "detected_objects": objects,
554
+ "confidence_scores": scores,
555
+ "unique_objects": unique_objects,
556
+ "unique_confidence_scores": unique_scores,
557
+ }
558
+ except requests.RequestException as e:
559
+ error_msg = f"Error fetching image from URL: {str(e)}"
560
+ logger.error(f"{error_msg}\n{traceback.format_exc()}")
561
+ return {"error": error_msg}
562
+ except Exception as e:
563
+ error_msg = f"Error processing image: {str(e)}"
564
+ logger.error(f"{error_msg}\n{traceback.format_exc()}")
565
+ return {"error": error_msg}
566
+
567
+ url_submit_btn.click(
568
+ fn=process_url_for_gradio,
569
+ inputs=[image_input_json, image_url_input_json, url_model_choice],
570
+ outputs=[url_output],
571
+ )
572
+
573
+ with gr.Tab("ℹ️ Help"):
574
+ gr.Markdown(
575
+ """
576
+ ## How to Use
577
+ - **Image Upload**: Select a model, upload an image or provide a URL, and click "Detect" to see detected objects and image properties.
578
+ - **JSON Output**: Upload an image or enter a URL, select a model, and click "Process" to get results in JSON format.
579
+ - **Models**: Choose from DETR (object detection or panoptic segmentation) or YOLOS (lightweight detection).
580
+ - **Clear**: Reset all inputs and outputs using the "Clear" button in the Image Upload tab.
581
+ - **Errors**: Check the error box (Image Upload) or JSON response (JSON Output) for issues.
582
 
583
+ ## Tips
584
+ - Use high-quality images for better detection results.
585
+ - Panoptic models (e.g., DETR-ResNet-50-panoptic) provide segmentation masks for complex scenes.
586
+ - For faster processing, try YOLOS-Tiny on resource-constrained devices.
587
+ """
588
+ )
589
+
590
+ return app
591
+
592
+ except Exception as e:
593
+ logger.error(f"Error creating Gradio UI: {str(e)}\n{traceback.format_exc()}")
594
+ raise RuntimeError(f"Failed to create Gradio UI: {str(e)}")
595
+
596
+ # ------------------------------
597
+ # Launcher
598
+ # ------------------------------
599
+
600
+ def parse_args() -> argparse.Namespace:
601
+ """
602
+ Parse command-line arguments with defaults and ignore unrecognized arguments.
603
+
604
+ Returns:
605
+ Parsed arguments as a Namespace object.
606
+
607
+ Raises:
608
+ SystemExit: If argument parsing fails (handled by argparse).
609
+ """
610
+ parser = argparse.ArgumentParser(
611
+ description="Launcher for Object Detection App with Gradio UI and optional FastAPI server."
612
+ )
613
+ parser.add_argument(
614
+ "--gradio-port",
615
+ type=int,
616
+ default=DEFAULT_GRADIO_PORT,
617
+ help=f"Port for the Gradio UI (default: {DEFAULT_GRADIO_PORT}).",
618
+ )
619
+ parser.add_argument(
620
+ "--enable-fastapi",
621
+ action="store_true",
622
+ default=False,
623
+ help="Enable the FastAPI server (disabled by default).",
624
+ )
625
+ parser.add_argument(
626
+ "--fastapi-port",
627
+ type=int,
628
+ default=DEFAULT_FASTAPI_PORT,
629
+ help=f"Port for the FastAPI server if enabled (default: {DEFAULT_FASTAPI_PORT}).",
630
+ )
631
+
632
+ # Parse known arguments and ignore unrecognized ones (e.g., Jupyter kernel args)
633
+ args, _ = parser.parse_known_args()
634
+ return args
635
+
636
+ def find_available_port(start_port: int, port_range: range, max_attempts: int) -> Optional[int]:
637
+ """
638
+ Find an available port within the specified range.
639
+
640
+ Args:
641
+ start_port: Initial port to try (e.g., from args or environment).
642
+ port_range: Range of ports to attempt.
643
+ max_attempts: Maximum number of ports to try.
644
+
645
+ Returns:
646
+ Available port number, or None if no port is found.
647
+
648
+ Raises:
649
+ OSError: If port binding fails for reasons other than port in use.
650
+ """
651
+ import socket
+ import errno
652
+
653
+ port = start_port
654
+ attempts = 0
655
+
656
+ # Check environment variable GRADIO_SERVER_PORT
657
+ env_port = os.getenv("GRADIO_SERVER_PORT")
658
+ if env_port and env_port.isdigit():
659
+ port = int(env_port)
660
+ logger.info(f"Using GRADIO_SERVER_PORT from environment: {port}")
661
+
662
+ while attempts < max_attempts:
663
+ with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
664
+ try:
665
+ s.bind(("0.0.0.0", port))
666
+ logger.debug(f"Port {port} is available")
667
+ return port
668
+ except OSError as e:
669
+ if e.errno == errno.EADDRINUSE:  # Port in use (portable across platforms)
670
+ logger.debug(f"Port {port} is in use")
671
+ port = port + 1 if port < max(port_range) else min(port_range)
672
+ attempts += 1
673
+ else:
674
+ raise
675
+ except Exception as e:
676
+ logger.error(f"Error checking port {port}: {str(e)}")
677
+ raise
678
+ logger.error(f"No available port found in range {min(port_range)}-{max(port_range)} after {max_attempts} attempts")
679
+ return None
680
+
681
+ def run_fastapi_server(host: str, port: int) -> None:
682
+ """
683
+ Run the FastAPI server using Uvicorn.
684
+
685
+ Args:
686
+ host: Host address for the FastAPI server.
687
+ port: Port for the FastAPI server.
688
+ """
689
+ try:
690
+ uvicorn.run(app, host=host, port=port)
691
+ except Exception as e:
692
+ logger.error(f"Error running FastAPI server: {str(e)}\n{traceback.format_exc()}")
693
+ sys.exit(1)
694
+
695
+ def main() -> None:
696
+ """
697
+ Main function to launch Gradio UI and optional FastAPI server.
698
+
699
+ Raises:
700
+ SystemExit: If the application is interrupted or encounters an error.
701
+ """
702
+ try:
703
+ # Apply nest_asyncio to allow nested event loops in Jupyter/Colab
704
+ nest_asyncio.apply()
705
+
706
+ # Parse command-line arguments
707
+ args = parse_args()
708
+ logger.info(f"Parsed arguments: {args}")
709
+
710
+ # Find available port for Gradio
711
+ gradio_port = find_available_port(args.gradio_port, PORT_RANGE, MAX_PORT_ATTEMPTS)
712
+ if gradio_port is None:
713
+ logger.error("Failed to find an available port for Gradio UI")
714
+ sys.exit(1)
715
+
716
+ # Launch FastAPI server in a separate thread if enabled
717
+ if args.enable_fastapi:
718
+ logger.info(f"Starting FastAPI server on port {args.fastapi_port}")
719
+ fastapi_thread = threading.Thread(
720
+ target=run_fastapi_server,
721
+ args=("0.0.0.0", args.fastapi_port),
722
+ daemon=True
723
+ )
724
+ fastapi_thread.start()
725
+
726
+ # Launch Gradio UI
727
+ logger.info(f"Starting Gradio UI on port {gradio_port}")
728
+ app = create_gradio_ui()
729
+ app.launch(server_port=gradio_port, server_name="0.0.0.0")
730
+
731
+ except KeyboardInterrupt:
732
+ logger.info("Application terminated by user.")
733
+ sys.exit(0)
734
+ except OSError as e:
735
+ logger.error(f"Port binding error: {str(e)}")
736
+ sys.exit(1)
737
+ except Exception as e:
738
+ logger.error(f"Error running application: {str(e)}\n{traceback.format_exc()}")
739
+ sys.exit(1)
740
 
741
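+ # Invocation sketch (flags documented in the README):
+ #   python app.py --gradio-port 7870 --enable-fastapi --fastapi-port 8001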
  if __name__ == "__main__":
742
+ main()
 
 
hf_space/README.md CHANGED
@@ -1,54 +1,56 @@
# 🚀 Object Detection with Transformer Models

This project provides a robust object detection system leveraging state-of-the-art transformer models, including **DETR (DEtection TRansformer)** and **YOLOS (You Only Look One-level Series)**. The system supports object detection and panoptic segmentation on uploaded images or image URLs. It features a user-friendly **Gradio** web interface for interactive use and a **FastAPI** endpoint for programmatic access.

Try the online demo on Hugging Face Spaces: [Object Detection Demo](https://huggingface.co/spaces/NeerajCodz/ObjectDetection).

## Models Supported

The application supports the following models, each tailored to a specific detection or segmentation task:

- **DETR (DEtection TRansformer)**:
  - `facebook/detr-resnet-50`: Fast and accurate object detection with a ResNet-50 backbone.
  - `facebook/detr-resnet-101`: Higher-accuracy object detection with a ResNet-101 backbone; slower than ResNet-50.
  - `facebook/detr-resnet-50-panoptic`: Panoptic segmentation with ResNet-50 (note: may have stability issues).
  - `facebook/detr-resnet-101-panoptic`: Panoptic segmentation with ResNet-101 (note: may have stability issues).

- **YOLOS (You Only Look One-level Series)**:
  - `hustvl/yolos-tiny`: Lightweight and fast, ideal for resource-constrained environments.
  - `hustvl/yolos-base`: Balances speed and accuracy for object detection.
 
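For quick experimentation outside the app, the snippet below is a minimal, standalone sketch of loading one of these checkpoints with the `transformers` library. The checkpoint name and sample image URL are illustrative; the application wraps equivalent logic in its own `process` function.

```python
import requests
import torch
from PIL import Image
from transformers import DetrForObjectDetection, DetrImageProcessor

# Illustrative choices -- any supported checkpoint and RGB image will do
model_name = "facebook/detr-resnet-50"
image_url = "http://images.cocodataset.org/val2017/000000039769.jpg"

processor = DetrImageProcessor.from_pretrained(model_name)
model = DetrForObjectDetection.from_pretrained(model_name)

image = Image.open(requests.get(image_url, stream=True).raw).convert("RGB")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert raw model outputs into thresholded (label, score) pairs
target_sizes = torch.tensor([image.size[::-1]])
results = processor.post_process_object_detection(
    outputs, target_sizes=target_sizes, threshold=0.5
)[0]
for score, label in zip(results["scores"], results["labels"]):
    print(f"{model.config.id2label[label.item()]}: {score:.2f}")
```
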
## Features

- **Image Upload**: Upload images via the Gradio interface for object detection.
- **URL Input**: Provide image URLs for detection through the Gradio interface or API.
- **Model Selection**: Choose between DETR and YOLOS models for detection or panoptic segmentation.
- **Object Detection**: Highlights detected objects with bounding boxes and confidence scores.
- **Panoptic Segmentation**: Supports scene segmentation with colored masks (DETR panoptic models).
- **Image Properties**: Displays metadata such as format, size, aspect ratio, file size, and color statistics.
- **API Access**: Process images programmatically via the FastAPI `/detect` endpoint.
- **Flexible Deployment**: Run locally, in Docker, or in cloud environments such as Google Colab.

## How to Use

### 1. **Local Setup (Git Clone)**

Follow these steps to set up the application locally:

#### Prerequisites

- Python 3.8 or higher
- `pip` for installing dependencies
- Git for cloning the repository

#### Clone the Repository

```bash
git clone https://github.com/NeerajCodz/ObjectDetection
cd ObjectDetection
```

#### Install Dependencies

Install the required packages from `requirements.txt`:

```bash
pip install -r requirements.txt
```

#### Run the Application

Launch the Gradio interface:

```bash
python app.py
```

To enable the FastAPI server as well:

```bash
python app.py --enable-fastapi
```

#### Access the Application

- **Gradio**: Open the URL displayed in the console (typically `http://127.0.0.1:7860`).
- **FastAPI**: Navigate to `http://localhost:8000` for the API or Swagger UI (if enabled).

### 2. **Running with Docker**

Use Docker for a containerized setup.

#### Prerequisites

- Docker installed on your machine. Download it from [Docker's official site](https://www.docker.com/get-started).

#### Pull the Docker Image

Pull the pre-built image from Docker Hub:

```bash
docker pull neerajcodz/objectdetection:latest
```

#### Run the Docker Container

Run the application on port 8080:

```bash
docker run -d -p 8080:80 neerajcodz/objectdetection:latest
```

Access the interface at `http://localhost:8080`.

#### Build and Run the Docker Image

To build the Docker image locally:

1. Ensure there is a `Dockerfile` in the repository root (an example is provided in the repository).
2. Build the image:

```bash
docker build -t objectdetection:local .
```

3. Run the container:

```bash
docker run -d -p 8080:80 objectdetection:local
```

Access the interface at `http://localhost:8080`.

### 3. **Demo**

Try the demo on Hugging Face Spaces:

[Object Detection Demo](https://huggingface.co/spaces/NeerajCodz/ObjectDetection)

## Command-Line Arguments

The `app.py` script supports the following command-line arguments; a sketch of the corresponding parser follows the list.

- `--gradio-port <port>`: Port for the Gradio UI (default: 7860).
  - Example: `python app.py --gradio-port 7870`
- `--enable-fastapi`: Enable the FastAPI server (disabled by default).
  - Example: `python app.py --enable-fastapi`
- `--fastapi-port <port>`: Port for the FastAPI server (default: 8000).
  - Example: `python app.py --enable-fastapi --fastapi-port 8001`

Arguments can be combined:

```bash
python app.py --gradio-port 7870 --enable-fastapi --fastapi-port 8001
```

Alternatively, set the `GRADIO_SERVER_PORT` environment variable:

```bash
export GRADIO_SERVER_PORT=7870
python app.py
```

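For reference, this is a minimal `argparse` sketch consistent with the options above; the actual `parse_args` in `app.py` may differ in details.

```python
import argparse

def parse_args() -> argparse.Namespace:
    """Parse the launcher options described above (illustrative sketch)."""
    parser = argparse.ArgumentParser(description="Object detection app launcher")
    parser.add_argument("--gradio-port", type=int, default=7860,
                        help="Port for the Gradio UI")
    parser.add_argument("--enable-fastapi", action="store_true",
                        help="Also start the FastAPI server")
    parser.add_argument("--fastapi-port", type=int, default=8000,
                        help="Port for the FastAPI server")
    return parser.parse_args()
```
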
## Using the API

**Note**: The FastAPI API is currently unstable and may require additional configuration for production use.

The `/detect` endpoint allows programmatic image processing.

### Running the FastAPI Server

Enable FastAPI when launching the script:

```bash
python app.py --enable-fastapi
```

Or run the FastAPI app directly with Uvicorn:

```bash
uvicorn app:app --host 0.0.0.0 --port 8000
```

Access the Swagger UI at `http://localhost:8000/docs` for interactive testing.

### Endpoint Details

- **Endpoint**: `POST /detect`
- **Parameters**:
  - `file`: (optional) Image file (must be an `image/*` type).
  - `image_url`: (optional) URL of the image.
  - `model_name`: (optional) Model name (e.g., `facebook/detr-resnet-50`, `hustvl/yolos-tiny`).
- **Content-Type**: `multipart/form-data` or URL-encoded form data; all parameters, including `image_url`, are read as form fields.

### Example Requests

#### Using `curl` with an Image URL

```bash
curl -X POST "http://localhost:8000/detect" \
  -F "image_url=https://example.com/image.jpg" \
  -F "model_name=facebook/detr-resnet-50"
```

#### Using `curl` with an Image File

```bash
curl -X POST "http://localhost:8000/detect" \
  -F "file=@/path/to/image.jpg" \
  -F "model_name=facebook/detr-resnet-50"
```

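The same request can be made from Python; here is a minimal sketch using the `requests` library (the server address and image URL are placeholders):

```python
import requests

# Placeholders: point these at a reachable server and image
API_URL = "http://localhost:8000/detect"
payload = {
    "image_url": "https://example.com/image.jpg",
    "model_name": "facebook/detr-resnet-50",
}

# The endpoint reads form fields, so send data= (not json=)
response = requests.post(API_URL, data=payload, timeout=60)
response.raise_for_status()
result = response.json()
print(result["detected_objects"], result["confidence_scores"])
```
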
### Response Format

The response includes a base64-encoded image with the detections drawn in, plus the detection details:

```json
{
  "image_url": "data:image/png;base64,...",
  "detected_objects": ["person", "car"],
  "confidence_scores": [0.95, 0.87],
  "unique_objects": ["person", "car"],
  "unique_confidence_scores": [0.95, 0.87]
}
```
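
To save the annotated image locally, decode the `data:` URL from the response; a small sketch, assuming `result` is the parsed JSON from the Python example above:

```python
import base64

# Strip the "data:image/png;base64," prefix, then decode the payload
_, encoded = result["image_url"].split(",", 1)
with open("detections.png", "wb") as f:
    f.write(base64.b64decode(encoded))
```
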
### Notes

- Provide exactly one of `file` or `image_url`, not both.
- The API may be unstable with the panoptic models; prefer the object detection models for reliability.
- Test the API through the Swagger UI for easier debugging.

## Development Setup

To contribute or modify the application:

1. Clone the repository:

```bash
git clone https://github.com/NeerajCodz/ObjectDetection
cd ObjectDetection
```

2. Install dependencies:

```bash
pip install -r requirements.txt
```

3. Run the application:

```bash
python app.py
```

Or run the FastAPI server:

```bash
uvicorn app:app --host 0.0.0.0 --port 8000
```

4. Access the app at `http://localhost:7860` (Gradio) or `http://localhost:8000` (FastAPI).

## Contributing

Contributions are welcome! To contribute:

1. Fork the repository.
2. Create a feature or bugfix branch (`git checkout -b feature/your-feature`).
3. Commit your changes (`git commit -m "Add your feature"`).
4. Push the branch (`git push origin feature/your-feature`).
5. Open a pull request on the [GitHub repository](https://github.com/NeerajCodz/ObjectDetection).

Please include tests and documentation for new features. Report issues via GitHub Issues.

## Troubleshooting

- **Port Conflicts**: If port 7860 is in use, specify a different port with `--gradio-port` or set `GRADIO_SERVER_PORT`.
- **Colab Issues**: Use the `--gradio-port` argument or the environment variable to avoid port conflicts in Google Colab.
- **Panoptic Model Bugs**: Avoid the `detr-resnet-*-panoptic` models until their stability issues are resolved.
- **API Instability**: Test with smaller images and the object detection models first.

For further assistance, open an issue on the [GitHub repository](https://github.com/NeerajCodz/ObjectDetection).
hf_space/hf_space/README.md CHANGED
@@ -1,12 +1,185 @@
# 🚀 Object Detection with Transformer Models

This project provides an object detection system using state-of-the-art transformer models, such as **DETR (DEtection TRansformer)** and **YOLOS (You Only Look One-level Series)**. The system can detect objects from uploaded images or image URLs, and it supports different models for detection and segmentation tasks. It includes a Gradio-based web interface and a FastAPI-based API for programmatic access.

You can try the demo online on Hugging Face: [Demo Link](https://huggingface.co/spaces/NeerajCodz/ObjectDetection).

## Models Supported

The following models are supported, as defined in the application:

- **DETR (DEtection TRansformer)**:
  - `facebook/detr-resnet-50`: DETR with ResNet-50 backbone for object detection. Fast and accurate for general use.
  - `facebook/detr-resnet-101`: DETR with ResNet-101 backbone for object detection. More accurate but slower than ResNet-50.
  - `facebook/detr-resnet-50-panoptic` (currently has bugs): DETR with ResNet-50 for panoptic segmentation. Detects objects and segments scenes.
  - `facebook/detr-resnet-101-panoptic` (currently has bugs): DETR with ResNet-101 for panoptic segmentation. High accuracy for complex scenes.

- **YOLOS (You Only Look One-level Series)**:
  - `hustvl/yolos-tiny`: YOLOS Tiny model. Lightweight and fast, ideal for resource-constrained environments.
  - `hustvl/yolos-base`: YOLOS Base model. Balances speed and accuracy for object detection.

## Features

- **Image Upload**: Upload images from your device for object detection via the Gradio interface.
- **URL Input**: Input an image URL for detection through the Gradio interface or API.
- **Model Selection**: Choose between DETR and YOLOS models for detection or panoptic segmentation.
- **Object Detection**: Detects objects and highlights them with bounding boxes and confidence scores.
- **Panoptic Segmentation**: Some models (e.g., DETR panoptic variants) support detailed scene segmentation with colored masks.
- **Image Properties**: Displays image metadata such as format, size, aspect ratio, file size, and color statistics.
- **API Access**: Use the FastAPI endpoint `/detect` to programmatically process images and retrieve detection results.

## How to Use

### 1. **Normal Git Clone Method**

Follow these steps to set up the application locally:

#### Prerequisites

- Python 3.8 or higher
- Install dependencies using `pip`

#### Clone the Repository

```bash
git clone https://github.com/NeerajCodz/ObjectDetection.git
cd ObjectDetection
```

#### Install Dependencies

Install the required dependencies from `requirements.txt`:

```bash
pip install -r requirements.txt
```

#### Run the Application

Start the FastAPI server using uvicorn:

```bash
uvicorn objectdetection:app --reload
```

Alternatively, launch the Gradio interface by running the main script:

```bash
python app.py
```

#### Access the Application

- For FastAPI: Open your browser and navigate to `http://localhost:8000` to use the API or view the Swagger UI.
- For Gradio: The Gradio interface URL will be displayed in the console (typically `http://127.0.0.1:7860`).

### 2. **Running with Docker**

If you prefer to use Docker to set up and run the application, follow these steps:

#### Prerequisites

- Docker installed on your machine. If you don’t have Docker, download and install it from [here](https://www.docker.com/get-started).

#### Build the Docker Image

First, clone the repository (if you haven't already):

```bash
git clone https://github.com/NeerajCodz/ObjectDetection.git
cd ObjectDetection
```

Now, build the Docker image:

```bash
docker build -t objectdetection:latest .
```

#### Run the Docker Container

Once the image is built, run the application using this command:

```bash
docker run -p 5000:5000 objectdetection:latest
```

This will start the application on port 5000. Open your browser and go to `http://localhost:5000` to access the FastAPI interface.

### 3. **Demo**

You can try the demo directly online through Hugging Face's Spaces:

[Object Detection Demo](https://huggingface.co/spaces/NeerajCodz/ObjectDetection)

## Using the API

You can interact with the application via the FastAPI `/detect` endpoint to send images and get detection results.

**Endpoint**: `POST /detect`

**Parameters**:

- `file`: (optional) Image file (must be of type `image/*`).
- `image_url`: (optional) URL of the image.
- `model_name`: (optional) Choose from `facebook/detr-resnet-50`, `hustvl/yolos-tiny`, etc.

**Example Request Body**:

```json
{
  "image_url": "https://example.com/image.jpg",
  "model_name": "facebook/detr-resnet-50"
}
```

**Response**:

The response includes a base64-encoded image with detections, detected objects, confidence scores, and unique objects with their scores.

```json
{
  "image_url": "data:image/png;base64,...",
  "detected_objects": ["person", "car"],
  "confidence_scores": [0.95, 0.87],
  "unique_objects": ["person", "car"],
  "unique_confidence_scores": [0.95, 0.87]
}
```

## Development Setup

If you'd like to contribute or modify the application:

1. Clone the repository:

```bash
git clone https://github.com/NeerajCodz/ObjectDetection.git
cd ObjectDetection
```

2. Install dependencies:

```bash
pip install -r requirements.txt
```

3. Run the FastAPI server or Gradio interface:

```bash
uvicorn objectdetection:app --reload
```

or

```bash
python app.py
```

4. Open your browser and navigate to `http://localhost:8000` (FastAPI) or the Gradio URL (typically `http://127.0.0.1:7860`).

## Contributing

Contributions are welcome! Feel free to open issues or submit pull requests for bug fixes or new features on the [GitHub repository](https://github.com/NeerajCodz/ObjectDetection).
hf_space/hf_space/hf_space/hf_space/hf_space/.huggingface.yaml ADDED
@@ -0,0 +1,7 @@
sdk: gradio
python_version: 3.10
app_file: app.py
title: Object Detection App
subtitle: Real-time object detection in images using Gradio
hardware: cpu-basic
license: mit
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.github/workflows/docker-build-push.yml ADDED
@@ -0,0 +1,26 @@
name: Build and Push Docker Image to Docker Hub

on:
  push:
    branches:
      - main

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Log in to Docker Hub
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PAT }}

      - name: Build and push Docker image
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ${{ secrets.DOCKER_USERNAME }}/objectdetection:latest
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.github/workflows/hf-space-sync.yml ADDED
@@ -0,0 +1,36 @@
name: Sync to Hugging Face Space

on:
  push:
    branches: [ main ]

jobs:
  deploy-to-hf-space:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout Repository
        uses: actions/checkout@v3

      - name: Install Git
        run: sudo apt-get install git

      - name: Push to Hugging Face Space
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
          HF_USERNAME: ${{ secrets.HF_USERNAME }}
          EMAIL: ${{ secrets.EMAIL }}
        run: |
          git config --global user.email $EMAIL
          git config --global user.name $HF_USERNAME

          git clone https://$HF_USERNAME:$HF_TOKEN@huggingface.co/spaces/$HF_USERNAME/ObjectDetection hf_space
          rsync -av --exclude='.git' ./ hf_space/
          cd hf_space
          git add .
          if git diff --cached --quiet; then
            echo "✅ No changes to commit."
          else
            git commit -m "Sync from GitHub"
            git push
          fi
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.gitignore ADDED
@@ -0,0 +1,5 @@
__pycache__/
venv/
*.pyc
.DS_Store
.env
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/Dockerfile ADDED
@@ -0,0 +1,13 @@
FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .

RUN pip install --no-cache-dir -r requirements.txt

COPY app.py .

EXPOSE 5000

CMD ["python", "app.py"]
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/LICENSE ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 Neeraj Sathish Kumar

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/app.py ADDED
@@ -0,0 +1,384 @@
import gradio as gr
import torch
from transformers import DetrImageProcessor, DetrForObjectDetection
from transformers import YolosImageProcessor, YolosForObjectDetection
from transformers import DetrForSegmentation
from PIL import Image, ImageDraw, ImageStat
import requests
from io import BytesIO
import base64
from collections import Counter
import logging
from fastapi import FastAPI, File, UploadFile, HTTPException, Form
from fastapi.responses import JSONResponse
import uvicorn
import pandas as pd
import traceback
import os

# Set up logging
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
logger = logging.getLogger(__name__)

# Constants
CONFIDENCE_THRESHOLD = 0.5
VALID_MODELS = [
    "facebook/detr-resnet-50",
    "facebook/detr-resnet-101",
    "facebook/detr-resnet-50-panoptic",
    "facebook/detr-resnet-101-panoptic",
    "hustvl/yolos-tiny",
    "hustvl/yolos-base"
]
MODEL_DESCRIPTIONS = {
    "facebook/detr-resnet-50": "DETR with ResNet-50 backbone for object detection. Fast and accurate for general use.",
    "facebook/detr-resnet-101": "DETR with ResNet-101 backbone for object detection. More accurate but slower than ResNet-50.",
    "facebook/detr-resnet-50-panoptic": "DETR with ResNet-50 for panoptic segmentation. Detects objects and segments scenes.",
    "facebook/detr-resnet-101-panoptic": "DETR with ResNet-101 for panoptic segmentation. High accuracy for complex scenes.",
    "hustvl/yolos-tiny": "YOLOS Tiny model. Lightweight and fast, ideal for resource-constrained environments.",
    "hustvl/yolos-base": "YOLOS Base model. Balances speed and accuracy for object detection."
}

# Lazy model loading
models = {}
processors = {}

def process(image, model_name):
    """Process an image and return detected image, objects, confidences, unique objects, unique confidences, and properties."""
    try:
        if model_name not in VALID_MODELS:
            raise ValueError(f"Invalid model: {model_name}. Choose from: {VALID_MODELS}")

        # Load model and processor on first use
        if model_name not in models:
            logger.info(f"Loading model: {model_name}")
            if "yolos" in model_name:
                models[model_name] = YolosForObjectDetection.from_pretrained(model_name)
                processors[model_name] = YolosImageProcessor.from_pretrained(model_name)
            elif "panoptic" in model_name:
                models[model_name] = DetrForSegmentation.from_pretrained(model_name)
                processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
            else:
                models[model_name] = DetrForObjectDetection.from_pretrained(model_name)
                processors[model_name] = DetrImageProcessor.from_pretrained(model_name)

        model, processor = models[model_name], processors[model_name]
        inputs = processor(images=image, return_tensors="pt")

        with torch.no_grad():
            outputs = model(**inputs)

        target_sizes = torch.tensor([image.size[::-1]])
        draw = ImageDraw.Draw(image)
        object_names = []
        confidence_scores = []
        object_counter = Counter()

        if "panoptic" in model_name:
            processed_sizes = torch.tensor([[inputs["pixel_values"].shape[2], inputs["pixel_values"].shape[3]]])
            results = processor.post_process_panoptic(outputs, target_sizes=target_sizes, processed_sizes=processed_sizes)[0]

            for segment in results["segments_info"]:
                label = segment["label_id"]
                label_name = model.config.id2label.get(label, "Unknown")
                score = segment.get("score", 1.0)

                if "masks" in results and segment["id"] < len(results["masks"]):
                    mask = results["masks"][segment["id"]].cpu().numpy()
                    if mask.shape[0] > 0 and mask.shape[1] > 0:
                        mask_image = Image.fromarray((mask * 255).astype("uint8"))
                        colored_mask = Image.new("RGBA", image.size, (0, 0, 0, 0))
                        mask_draw = ImageDraw.Draw(colored_mask)
                        r, g, b = (segment["id"] * 50) % 255, (segment["id"] * 100) % 255, (segment["id"] * 150) % 255
                        mask_draw.bitmap((0, 0), mask_image, fill=(r, g, b, 128))
                        image = Image.alpha_composite(image.convert("RGBA"), colored_mask).convert("RGB")
                        draw = ImageDraw.Draw(image)

                if score > CONFIDENCE_THRESHOLD:
                    object_names.append(label_name)
                    confidence_scores.append(float(score))
                    object_counter[label_name] = float(score)
        else:
            results = processor.post_process_object_detection(outputs, target_sizes=target_sizes)[0]

            for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
                if score > CONFIDENCE_THRESHOLD:
                    x, y, x2, y2 = box.tolist()
                    draw.rectangle([x, y, x2, y2], outline="#32CD32", width=2)
                    label_name = model.config.id2label.get(label.item(), "Unknown")
                    # Place text at top-right corner, outside the box, with smaller size
                    text = f"{label_name}: {score:.2f}"
                    text_bbox = draw.textbbox((0, 0), text)
                    text_width, text_height = text_bbox[2] - text_bbox[0], text_bbox[3] - text_bbox[1]
                    draw.text((x2 - text_width - 2, y - text_height - 2), text, fill="#32CD32")
                    object_names.append(label_name)
                    confidence_scores.append(float(score))
                    object_counter[label_name] = float(score)

        unique_objects = list(object_counter.keys())
        unique_confidences = [object_counter[obj] for obj in unique_objects]

        # Image properties
        file_size = "Unknown"
        if hasattr(image, "fp") and image.fp is not None:
            buffered = BytesIO()
            image.save(buffered, format="PNG")
            file_size = f"{len(buffered.getvalue()) / 1024:.2f} KB"

        # Color statistics
        try:
            stat = ImageStat.Stat(image)
            color_stats = {
                "mean": [f"{m:.2f}" for m in stat.mean],
                "stddev": [f"{s:.2f}" for s in stat.stddev]
            }
        except Exception as e:
            logger.error(f"Error calculating color statistics: {str(e)}")
            color_stats = {"mean": "Error", "stddev": "Error"}

        properties = {
            "Format": image.format if hasattr(image, "format") and image.format else "Unknown",
            "Size": f"{image.width}x{image.height}",
            "Width": f"{image.width} px",
            "Height": f"{image.height} px",
            "Mode": image.mode,
            "Aspect Ratio": f"{round(image.width / image.height, 2) if image.height != 0 else 'Undefined'}",
            "File Size": file_size,
            "Mean (R,G,B)": ", ".join(color_stats["mean"]) if isinstance(color_stats["mean"], list) else color_stats["mean"],
            "StdDev (R,G,B)": ", ".join(color_stats["stddev"]) if isinstance(color_stats["stddev"], list) else color_stats["stddev"]
        }

        return image, object_names, confidence_scores, unique_objects, unique_confidences, properties
    except Exception as e:
        logger.error(f"Error in process: {str(e)}\n{traceback.format_exc()}")
        raise

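# Example of calling `process` directly (a sketch for local experimentation, not part
# of the app's control flow; "sample.jpg" is a placeholder path):
#   from PIL import Image
#   img = Image.open("sample.jpg").convert("RGB")
#   annotated, names, scores, uniq, uniq_scores, props = process(img, "hustvl/yolos-tiny")
#   annotated.save("annotated.png")
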
# FastAPI Setup
app = FastAPI(title="Object Detection API")

@app.post("/detect")
async def detect_objects_endpoint(
    file: UploadFile = File(None),
    image_url: str = Form(None),
    model_name: str = Form(VALID_MODELS[0])
):
    """FastAPI endpoint to detect objects in an image from file or URL."""
    try:
        if (file is None and not image_url) or (file is not None and image_url):
            raise HTTPException(status_code=400, detail="Provide either an image file or an image URL, but not both.")

        if file:
            if not file.content_type.startswith("image/"):
                raise HTTPException(status_code=400, detail="File must be an image")
            contents = await file.read()
            image = Image.open(BytesIO(contents)).convert("RGB")
        else:
            response = requests.get(image_url, timeout=10)
            response.raise_for_status()
            image = Image.open(BytesIO(response.content)).convert("RGB")

        if model_name not in VALID_MODELS:
            raise HTTPException(status_code=400, detail=f"Invalid model. Choose from: {VALID_MODELS}")

        detected_image, detected_objects, detected_confidences, unique_objects, unique_confidences, _ = process(image, model_name)

        buffered = BytesIO()
        detected_image.save(buffered, format="PNG")
        img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
        img_url = f"data:image/png;base64,{img_base64}"

        return JSONResponse(content={
            "image_url": img_url,
            "detected_objects": detected_objects,
            "confidence_scores": detected_confidences,
            "unique_objects": unique_objects,
            "unique_confidence_scores": unique_confidences
        })
    except Exception as e:
        logger.error(f"Error in FastAPI endpoint: {str(e)}\n{traceback.format_exc()}")
        raise HTTPException(status_code=500, detail=f"Error processing image: {str(e)}")

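# Minimal smoke test for the endpoint (a sketch; assumes FastAPI's TestClient and a
# local test image, neither of which ships with this file):
#   from fastapi.testclient import TestClient
#   client = TestClient(app)
#   with open("test.jpg", "rb") as f:
#       r = client.post("/detect", files={"file": ("test.jpg", f, "image/jpeg")})
#   assert r.status_code == 200 and "detected_objects" in r.json()
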
# Gradio UI
def create_gradio_ui():
    with gr.Blocks(theme=gr.themes.Default(primary_hue="blue", secondary_hue="gray")) as demo:
        gr.Markdown(
            """
            # 🚀 Object Detection App
            Upload an image or provide a URL to detect objects using state-of-the-art transformer models (DETR, YOLOS).
            """
        )

        with gr.Tabs():
            with gr.Tab("📷 Image Upload"):
                with gr.Row():
                    with gr.Column(scale=1):
                        gr.Markdown("### Input")
                        model_choice = gr.Dropdown(
                            choices=VALID_MODELS,
                            value=VALID_MODELS[0],
                            label="🔎 Select Model",
                            info="Choose a model for object detection or panoptic segmentation."
                        )
                        model_info = gr.Markdown(
                            f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
                            visible=True
                        )
                        image_input = gr.Image(type="pil", label="📷 Upload Image")
                        image_url_input = gr.Textbox(
                            label="🔗 Image URL",
                            placeholder="https://example.com/image.jpg"
                        )
                        with gr.Row():
                            submit_btn = gr.Button("✨ Detect", variant="primary")
                            clear_btn = gr.Button("🗑️ Clear", variant="secondary")

                        model_choice.change(
                            fn=lambda model_name: f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}",
                            inputs=model_choice,
                            outputs=model_info
                        )

                    with gr.Column(scale=2):
                        gr.Markdown("### Results")
                        error_output = gr.Textbox(
                            label="⚠️ Errors",
                            visible=False,
                            lines=3,
                            max_lines=5
                        )
                        output_image = gr.Image(
                            type="pil",
                            label="🎯 Detected Image",
                            interactive=False
                        )
                        with gr.Row():
                            objects_output = gr.DataFrame(
                                label="📋 Detected Objects",
                                interactive=False,
                                value=None
                            )
                            unique_objects_output = gr.DataFrame(
                                label="🔍 Unique Objects",
                                interactive=False,
                                value=None
                            )
                        properties_output = gr.DataFrame(
                            label="📄 Image Properties",
                            interactive=False,
                            value=None
                        )

                def process_for_gradio(image, url, model_name):
                    try:
                        if image is None and not url:
                            return None, None, None, None, "Please provide an image or URL"
                        if image and url:
                            return None, None, None, None, "Please provide either an image or URL, not both"

                        if url:
                            response = requests.get(url, timeout=10)
                            response.raise_for_status()
                            image = Image.open(BytesIO(response.content)).convert("RGB")

                        detected_image, objects, scores, unique_objects, unique_scores, properties = process(image, model_name)
                        objects_df = pd.DataFrame({
                            "Object": objects,
                            "Confidence Score": [f"{score:.2f}" for score in scores]
                        }) if objects else pd.DataFrame(columns=["Object", "Confidence Score"])
                        unique_objects_df = pd.DataFrame({
                            "Unique Object": unique_objects,
                            "Confidence Score": [f"{score:.2f}" for score in unique_scores]
                        }) if unique_objects else pd.DataFrame(columns=["Unique Object", "Confidence Score"])
                        properties_df = pd.DataFrame([properties]) if properties else pd.DataFrame(columns=properties.keys())
                        return detected_image, objects_df, unique_objects_df, properties_df, ""
                    except Exception as e:
                        error_msg = f"Error processing image: {str(e)}"
                        logger.error(f"{error_msg}\n{traceback.format_exc()}")
                        return None, None, None, None, error_msg

                submit_btn.click(
                    fn=process_for_gradio,
                    inputs=[image_input, image_url_input, model_choice],
                    outputs=[output_image, objects_output, unique_objects_output, properties_output, error_output]
                )

                clear_btn.click(
                    # One value per output component; the final "" clears the error box
                    # (the lambda previously returned six values for seven outputs)
                    fn=lambda: [None, "", None, None, None, None, ""],
                    inputs=None,
                    outputs=[image_input, image_url_input, output_image, objects_output, unique_objects_output, properties_output, error_output]
                )

            with gr.Tab("🔗 URL Input"):
                gr.Markdown("### Process Image from URL")
                image_url_input = gr.Textbox(
                    label="🔗 Image URL",
                    placeholder="https://example.com/image.jpg"
                )
                url_model_choice = gr.Dropdown(
                    choices=VALID_MODELS,
                    value=VALID_MODELS[0],
                    label="🔎 Select Model"
                )
                url_model_info = gr.Markdown(
                    f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
                    visible=True
                )
                url_submit_btn = gr.Button("🔄 Process URL", variant="primary")
                url_output = gr.JSON(label="API Response")

                url_model_choice.change(
                    fn=lambda model_name: f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}",
                    inputs=url_model_choice,
                    outputs=url_model_info
                )

                def process_url_for_gradio(url, model_name):
                    try:
                        response = requests.get(url, timeout=10)
                        response.raise_for_status()
                        image = Image.open(BytesIO(response.content)).convert("RGB")
                        detected_image, objects, scores, unique_objects, unique_scores, _ = process(image, model_name)
                        buffered = BytesIO()
                        detected_image.save(buffered, format="PNG")
                        img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
                        return {
                            "image_url": f"data:image/png;base64,{img_base64}",
                            "detected_objects": objects,
                            "confidence_scores": scores,
                            "unique_objects": unique_objects,
                            "unique_confidence_scores": unique_scores
                        }
                    except Exception as e:
                        error_msg = f"Error processing URL: {str(e)}"
                        logger.error(f"{error_msg}\n{traceback.format_exc()}")
                        return {"error": error_msg}

                url_submit_btn.click(
                    fn=process_url_for_gradio,
                    inputs=[image_url_input, url_model_choice],
                    outputs=[url_output]
                )

            with gr.Tab("ℹ️ Help"):
                gr.Markdown(
                    """
                    ## How to Use
                    - **Image Upload**: Select a model, upload an image or provide a URL, and click "Detect" to see detected objects and image properties.
                    - **URL Input**: Enter an image URL, select a model, and click "Process URL" to get results in JSON format.
                    - **Models**: Choose from DETR (object detection or panoptic segmentation) or YOLOS (lightweight detection).
                    - **Clear**: Reset all inputs and outputs using the "Clear" button.
                    - **Errors**: Check the error box for any processing issues.

                    ## Tips
                    - Use high-quality images for better detection results.
                    - Panoptic models (e.g., DETR-ResNet-50-panoptic) provide segmentation masks for complex scenes.
                    - For faster processing, try YOLOS-Tiny on resource-constrained devices.
                    """
                )

    return demo

if __name__ == "__main__":
    demo = create_gradio_ui()
    demo.launch()
    # To run the FastAPI server instead: uvicorn app:app --host 0.0.0.0 --port 8000
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.gitattributes ADDED
@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/README.md ADDED
@@ -0,0 +1,12 @@
---
title: ObjectDetection
emoji: 🦀
colorFrom: green
colorTo: yellow
sdk: gradio
sdk_version: 5.29.0
app_file: app.py
pinned: false
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/requirements.txt ADDED
@@ -0,0 +1,8 @@
transformers
torch
tensorflow
gradio
pillow
timm
fastapi
uvicorn
pandas
requests