Commit d3a3e0d · Parent(s): e581bf6
Sync from GitHub

Files changed:
- README.md +12 -8
- app.py +235 -435
- hf_space/app.py +626 -268
- hf_space/hf_space/README.md +148 -61
- hf_space/hf_space/hf_space/README.md +185 -12
- hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.huggingface.yaml +7 -0
- hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.github/workflows/docker-build-push.yml +26 -0
- hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.github/workflows/hf-space-sync.yml +36 -0
- hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.gitignore +5 -0
- hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/Dockerfile +13 -0
- hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/LICENSE +21 -0
- hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/app.py +384 -0
- hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.gitattributes +35 -0
- hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/README.md +12 -0
- hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/requirements.txt +8 -0
- requirements.txt +4 -1
README.md
CHANGED
@@ -44,7 +44,7 @@ Follow these steps to set up the application locally:
 #### Clone the Repository
 
 ```bash
-git clone https://github.com/NeerajCodz/ObjectDetection
+git clone https://github.com/NeerajCodz/ObjectDetection.git
 cd ObjectDetection
 ```
 
@@ -136,11 +136,13 @@ The `app.py` script supports the following command-line arguments:
   - Example: `python app.py --enable-fastapi`
 - `--fastapi-port <port>`: Specify the port for the FastAPI server (default: 8000).
   - Example: `python app.py --enable-fastapi --fastapi-port 8001`
+- `--confidence-threshold <float>`: Confidence threshold for detection (range: 0 to 1) (default: 0.5).
+  - Example: `python app.py --confidence-threshold 0.75`
 
 You can combine arguments:
 
 ```bash
-python app.py --gradio-port 7870 --enable-fastapi --fastapi-port 8001
+python app.py --gradio-port 7870 --enable-fastapi --fastapi-port 8001 --confidence-threshold 0.75
 ```
 
 Alternatively, set the `GRADIO_SERVER_PORT` environment variable:
 
@@ -186,16 +188,16 @@ Access the Swagger UI at `http://localhost:8000/docs` for interactive testing.
 #### Using `curl` with an Image URL
 
 ```bash
-curl -X POST "http://localhost:8000/detect"
--H "Content-Type: application/json"
+curl -X POST "http://localhost:8000/detect" \
+  -H "Content-Type: application/json" \
   -d '{"image_url": "https://example.com/image.jpg", "model_name": "facebook/detr-resnet-50"}'
 ```
 
 #### Using `curl` with an Image File
 
 ```bash
-curl -X POST "http://localhost:8000/detect"
--F "file=@/path/to/image.jpg"
+curl -X POST "http://localhost:8000/detect" \
+  -F "file=@/path/to/image.jpg" \
   -F "model_name=facebook/detr-resnet-50"
 ```
 
@@ -226,7 +228,7 @@ To contribute or modify the application:
 1. Clone the repository:
 
 ```bash
-git clone https://github.com/NeerajCodz/ObjectDetection
+git clone https://github.com/NeerajCodz/ObjectDetection.git
 cd ObjectDetection
 ```
 
@@ -265,8 +267,10 @@ Please include tests and documentation for new features. Report issues via GitHub.
 ## Troubleshooting
 
 - **Port Conflicts**: If port 7860 is in use, specify a different port with `--gradio-port` or set `GRADIO_SERVER_PORT`.
-
+  - Example: `python app.py --gradio-port 7870`
+- **Colab Asyncio Error**: If you encounter `RuntimeError: asyncio.run() cannot be called from a running event loop` in Colab, the application now uses `nest_asyncio` to handle this. Ensure `nest_asyncio` is installed (`pip install nest_asyncio`).
 - **Panoptic Model Bugs**: Avoid `detr-resnet-*-panoptic` models until stability issues are resolved.
 - **API Instability**: Test with smaller images and object detection models first.
+- **FastAPI Not Starting**: Ensure `--enable-fastapi` is used, and check that the specified `--fastapi-port` (default: 8000) is available.
 
 For further assistance, open an issue on the [GitHub repository](https://github.com/NeerajCodz/ObjectDetection).
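A quick way to exercise the new `confidence_threshold` field end to end. This sketch is not from the commit: it assumes the server was started with `--enable-fastapi` on the default port, and note that `/detect` reads form fields, so the request is form-encoded rather than JSON:

```python
import requests

# Hypothetical local test against the /detect endpoint extended in this commit.
resp = requests.post(
    "http://localhost:8000/detect",
    data={
        "image_url": "https://example.com/image.jpg",  # placeholder URL
        "model_name": "facebook/detr-resnet-50",
        "confidence_threshold": 0.75,  # new field; must be in [0, 1]
    },
    timeout=30,
)
resp.raise_for_status()
payload = resp.json()
print(payload["unique_objects"], payload["unique_confidence_scores"])
```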
app.py
CHANGED
@@ -3,11 +3,10 @@ import base64
 import logging
 import os
 import sys
-import traceback
 import threading
 from collections import Counter
 from io import BytesIO
-from typing import Dict, List, Optional, Tuple
+from typing import Dict, List, Optional, Tuple, Union
 
 import gradio as gr
 import pandas as pd
@@ -30,15 +29,12 @@ import nest_asyncio
 # Configuration
 # ------------------------------
 
-# ...
-logging.basicConfig(
-    level=logging.INFO,
-    format="%(asctime)s - %(levelname)s - %(message)s",
-)
+# Configure logging for debugging and monitoring
+logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
 logger = logging.getLogger(__name__)
 
-# ...
-CONFIDENCE_THRESHOLD: float = 0.5
+# Define constants for model and server configuration
+CONFIDENCE_THRESHOLD: float = 0.5  # Default threshold for object detection confidence
 VALID_MODELS: List[str] = [
     "facebook/detr-resnet-50",
     "facebook/detr-resnet-101",
@@ -48,128 +44,109 @@ VALID_MODELS: List[str] = [
     "hustvl/yolos-base",
 ]
 MODEL_DESCRIPTIONS: Dict[str, str] = {
-    "facebook/detr-resnet-50":
-        ...
-    "facebook/detr-resnet-101":
-        ...
-    "facebook/detr-resnet-50-panoptic": (
-        "DETR with ResNet-50 for panoptic segmentation. Detects objects and segments scenes."
-    ),
-    "facebook/detr-resnet-101-panoptic": (
-        "DETR with ResNet-101 for panoptic segmentation. High accuracy for complex scenes."
-    ),
-    "hustvl/yolos-tiny": (
-        "YOLOS Tiny model. Lightweight and fast, ideal for resource-constrained environments."
-    ),
-    "hustvl/yolos-base": (
-        "YOLOS Base model. Balances speed and accuracy for object detection."
-    ),
+    "facebook/detr-resnet-50": "DETR with ResNet-50 for object detection. Fast and accurate.",
+    "facebook/detr-resnet-101": "DETR with ResNet-101 for object detection. More accurate, slower.",
+    "facebook/detr-resnet-50-panoptic": "DETR with ResNet-50 for panoptic segmentation.",
+    "facebook/detr-resnet-101-panoptic": "DETR with ResNet-101 for panoptic segmentation.",
+    "hustvl/yolos-tiny": "YOLOS Tiny. Lightweight and fast.",
+    "hustvl/yolos-base": "YOLOS Base. Balances speed and accuracy."
 }
-
-# ...
-...
-PORT_RANGE: range = range(7860, 7870)  # Try ports 7860-7869
-MAX_PORT_ATTEMPTS: int = 10
+DEFAULT_GRADIO_PORT: int = 7860  # Default port for Gradio UI
+DEFAULT_FASTAPI_PORT: int = 8000  # Default port for FastAPI server
+PORT_RANGE: range = range(7860, 7870)  # Range of ports to try for Gradio
+MAX_PORT_ATTEMPTS: int = 10  # Maximum attempts to find an available port
 
 # Thread-safe storage for lazy-loaded models and processors
 models: Dict[str, any] = {}
 processors: Dict[str, any] = {}
 model_lock = threading.Lock()
 
-# ------------------------------
-# Model Loading
-# ------------------------------
-
-def load_model_and_processor(model_name: str) -> Tuple[any, any]:
-    """
-    Load and cache the specified model and processor thread-safely.
-
-    Args:
-        model_name: Name of the model to load (must be in VALID_MODELS).
-
-    Returns:
-        Tuple containing the loaded model and processor.
-
-    Raises:
-        ValueError: If the model_name is invalid or loading fails.
-    """
-    with model_lock:
-        if model_name not in models:
-            logger.info(f"Loading model: {model_name}")
-            try:
-                if "yolos" in model_name:
-                    models[model_name] = YolosForObjectDetection.from_pretrained(model_name)
-                    processors[model_name] = YolosImageProcessor.from_pretrained(model_name)
-                elif "panoptic" in model_name:
-                    models[model_name] = DetrForSegmentation.from_pretrained(model_name)
-                    processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
-                else:
-                    models[model_name] = DetrForObjectDetection.from_pretrained(model_name)
-                    processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
-                logger.debug(f"Model {model_name} loaded successfully")
-            except Exception as e:
-                logger.error(f"Failed to load model {model_name}: {str(e)}")
-                raise ValueError(f"Failed to load model: {str(e)}")
-    return models[model_name], processors[model_name]
-
 # ------------------------------
 # Image Processing
 # ------------------------------
 
-def process(image: Image.Image, model_name: str) -> Tuple[Image.Image, List[str], ...
+def process_image(
+    image: Optional[Image.Image],
+    url: Optional[str],
+    model_name: str,
+    for_json: bool = False,
+    confidence_threshold: float = CONFIDENCE_THRESHOLD
+) -> Union[Dict, Tuple[Optional[Image.Image], Optional[pd.DataFrame], Optional[pd.DataFrame], Optional[pd.DataFrame], str]]:
     """
-    Process an image for object detection or panoptic segmentation.
+    Process an image for object detection or panoptic segmentation, handling Gradio and FastAPI inputs.
 
     Args:
-        image: PIL Image
+        image: PIL Image object from file upload (optional).
+        url: URL of the image to process (optional).
         model_name: Name of the model to use (must be in VALID_MODELS).
+        for_json: If True, return JSON dict for API/JSON tab; else, return tuple for Gradio Home tab.
+        confidence_threshold: Minimum confidence score for detection (default: 0.5).
 
     Returns:
-        - List of detected object names.
-        - List of confidence scores for detected objects.
-        - List of unique object names.
-        - List of confidence scores for unique objects.
-        - Dictionary of image properties (format, size, etc.).
-
-    Raises:
-        ValueError: If the model_name is invalid.
-        RuntimeError: If processing fails due to model or image issues.
+        For JSON: Dict with base64-encoded image, detected objects, and confidence scores.
+        For Gradio: Tuple of (annotated image, objects DataFrame, unique objects DataFrame, properties DataFrame, error message).
     """
-    if model_name not in VALID_MODELS:
-        raise ValueError(f"Invalid model: {model_name}. Choose from: {VALID_MODELS}")
-
     try:
-        # ...
+        # Validate input: ensure exactly one of image or URL is provided
+        if image is None and not url:
+            return {"error": "Please provide an image or URL"} if for_json else (None, None, None, None, "Please provide an image or URL")
+        if image and url:
+            return {"error": "Provide either an image or URL, not both"} if for_json else (None, None, None, None, "Provide either an image or URL, not both")
+        if model_name not in VALID_MODELS:
+            error_msg = f"Invalid model: {model_name}. Choose from: {VALID_MODELS}"
+            return {"error": error_msg} if for_json else (None, None, None, None, error_msg)
+
+        # Calculate margin threshold: (1 - confidence_threshold) / 2 + confidence_threshold
+        margin_threshold = (1 - confidence_threshold) / 2 + confidence_threshold
+
+        # Load image from URL if provided
+        if url:
+            response = requests.get(url, timeout=10)
+            response.raise_for_status()
+            image = Image.open(BytesIO(response.content)).convert("RGB")
+
+        # Load model and processor thread-safely
+        with model_lock:
+            if model_name not in models:
+                logger.info(f"Loading model: {model_name}")
+                try:
+                    # Select appropriate model and processor based on model name
+                    if "yolos" in model_name:
+                        models[model_name] = YolosForObjectDetection.from_pretrained(model_name)
+                        processors[model_name] = YolosImageProcessor.from_pretrained(model_name)
+                    elif "panoptic" in model_name:
+                        models[model_name] = DetrForSegmentation.from_pretrained(model_name)
+                        processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
+                    else:
+                        models[model_name] = DetrForObjectDetection.from_pretrained(model_name)
+                        processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
+                except Exception as e:
+                    error_msg = f"Failed to load model: {str(e)}"
+                    logger.error(error_msg)
+                    return {"error": error_msg} if for_json else (None, None, None, None, error_msg)
+        model, processor = models[model_name], processors[model_name]
+
+        # Prepare image for model processing
         inputs = processor(images=image, return_tensors="pt")
         with torch.no_grad():
            outputs = model(**inputs)
 
-        # Initialize drawing context
+        # Initialize drawing context for annotations
         draw = ImageDraw.Draw(image)
         object_names: List[str] = []
         confidence_scores: List[float] = []
         object_counter = Counter()
         target_sizes = torch.tensor([image.size[::-1]])
 
-        # Process panoptic
+        # Process results based on model type (panoptic or object detection)
         if "panoptic" in model_name:
+            # Handle panoptic segmentation
             processed_sizes = torch.tensor([[inputs["pixel_values"].shape[2], inputs["pixel_values"].shape[3]]])
             results = processor.post_process_panoptic(outputs, target_sizes=target_sizes, processed_sizes=processed_sizes)[0]
-
             for segment in results["segments_info"]:
                 label = segment["label_id"]
                 label_name = model.config.id2label.get(label, "Unknown")
                 score = segment.get("score", 1.0)
-
                 # Apply segmentation mask if available
                 if "masks" in results and segment["id"] < len(results["masks"]):
                     mask = results["masks"][segment["id"]].cpu().numpy()
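The lock-guarded lazy loading inlined above is a standard cache pattern; a minimal standalone sketch of the same idea, with a stub `load()` standing in for the `from_pretrained` calls:

```python
import threading
from typing import Any, Dict

_cache: Dict[str, Any] = {}
_lock = threading.Lock()

def load(name: str) -> Any:
    # Stand-in for the expensive from_pretrained() calls in the commit.
    return f"model:{name}"

def get_model(name: str) -> Any:
    # Hold the lock across the check and the load so two threads cannot
    # both miss the cache and load the same model twice.
    with _lock:
        if name not in _cache:
            _cache[name] = load(name)
        return _cache[name]

print(get_model("facebook/detr-resnet-50"))
```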
@@ -181,67 +158,92 @@ def process(image: Image.Image, model_name: str) -> Tuple[Image.Image, List[str], ...
                         mask_draw.bitmap((0, 0), mask_image, fill=(r, g, b, 128))
                         image = Image.alpha_composite(image.convert("RGBA"), colored_mask).convert("RGB")
                         draw = ImageDraw.Draw(image)
-
-                if score > CONFIDENCE_THRESHOLD:
+                if score > confidence_threshold:
                     object_names.append(label_name)
                     confidence_scores.append(float(score))
                     object_counter[label_name] = float(score)
         else:
+            # Handle object detection
             results = processor.post_process_object_detection(outputs, target_sizes=target_sizes)[0]
-
             for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
-                if score > CONFIDENCE_THRESHOLD:
+                if score > confidence_threshold:
                     x, y, x2, y2 = box.tolist()
-                    draw.rectangle([x, y, x2, y2], outline="#32CD32", width=2)
                     label_name = model.config.id2label.get(label.item(), "Unknown")
                     text = f"{label_name}: {score:.2f}"
                     text_bbox = draw.textbbox((0, 0), text)
                     text_width, text_height = text_bbox[2] - text_bbox[0], text_bbox[3] - text_bbox[1]
-                    ...
+                    # Use yellow for confidence_threshold <= score < margin_threshold, green for >= margin_threshold
+                    color = "#FFFF00" if score < margin_threshold else "#32CD32"
+                    draw.rectangle([x, y, x2, y2], outline=color, width=2)
+                    draw.text((x2 - text_width - 2, y - text_height - 2), text, fill=color)
                     object_names.append(label_name)
                     confidence_scores.append(float(score))
                     object_counter[label_name] = float(score)
 
-        # Compile unique objects and ...
+        # Compile unique objects and their highest confidence scores
         unique_objects = list(object_counter.keys())
         unique_confidences = [object_counter[obj] for obj in unique_objects]
 
-        # Calculate image properties
+        # Calculate image properties (metadata)
         properties: Dict[str, str] = {
             "Format": image.format if hasattr(image, "format") and image.format else "Unknown",
             "Size": f"{image.width}x{image.height}",
             "Width": f"{image.width} px",
             "Height": f"{image.height} px",
             "Mode": image.mode,
-            "Aspect Ratio": (
-                f"{round(image.width / image.height, 2)}" if image.height != 0 else "Undefined"
-            ),
+            "Aspect Ratio": f"{round(image.width / image.height, 2)}" if image.height != 0 else "Undefined",
             "File Size": "Unknown",
             "Mean (R,G,B)": "Unknown",
             "StdDev (R,G,B)": "Unknown",
         }
-
-        # Compute file size
         try:
+            # Compute file size
             buffered = BytesIO()
             image.save(buffered, format="PNG")
             properties["File Size"] = f"{len(buffered.getvalue()) / 1024:.2f} KB"
-        except Exception as e:
-            logger.error(f"Error calculating file size: {str(e)}")
-
-        # Compute color statistics
-        try:
+            # Compute color statistics
             stat = ImageStat.Stat(image)
             properties["Mean (R,G,B)"] = ", ".join(f"{m:.2f}" for m in stat.mean)
             properties["StdDev (R,G,B)"] = ", ".join(f"{s:.2f}" for s in stat.stddev)
         except Exception as e:
-            logger.error(f"Error calculating ...
+            logger.error(f"Error calculating image stats: {str(e)}")
 
-        ...
+        # Prepare output based on request type
+        if for_json:
+            # Return JSON with base64-encoded image
+            buffered = BytesIO()
+            image.save(buffered, format="PNG")
+            img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
+            return {
+                "image_url": f"data:image/png;base64,{img_base64}",
+                "detected_objects": object_names,
+                "confidence_scores": confidence_scores,
+                "unique_objects": unique_objects,
+                "unique_confidence_scores": unique_confidences,
+            }
+        else:
+            # Return tuple for Gradio Home tab with DataFrames
+            objects_df = (
+                pd.DataFrame({"Object": object_names, "Confidence Score": [f"{score:.2f}" for score in confidence_scores]})
+                if object_names else pd.DataFrame(columns=["Object", "Confidence Score"])
+            )
+            unique_objects_df = (
+                pd.DataFrame({"Unique Object": unique_objects, "Confidence Score": [f"{score:.2f}" for score in unique_confidences]})
+                if unique_objects else pd.DataFrame(columns=["Unique Object", "Confidence Score"])
+            )
+            properties_df = pd.DataFrame([properties]) if properties else pd.DataFrame(columns=properties.keys())
+            return image, objects_df, unique_objects_df, properties_df, ""
 
+    except requests.RequestException as e:
+        # Handle URL fetch errors
+        error_msg = f"Error fetching image from URL: {str(e)}"
+        logger.error(f"{error_msg}\n{traceback.format_exc()}")
+        return {"error": error_msg} if for_json else (None, None, None, None, error_msg)
     except Exception as e:
-        ...
+        # Handle general processing errors
+        error_msg = f"Error processing image: {str(e)}"
+        logger.error(f"{error_msg}\n{traceback.format_exc()}")
+        return {"error": error_msg} if for_json else (None, None, None, None, error_msg)
 
 # ------------------------------
 # FastAPI Setup
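Worked numbers for the margin formula above: with the default threshold of 0.5, `margin_threshold` is (1 - 0.5) / 2 + 0.5 = 0.75, so a 0.6 detection is outlined yellow and a 0.9 detection green. A self-contained check of that logic:

```python
def box_color(score: float, confidence_threshold: float = 0.5) -> str:
    # Midpoint between the threshold and 1.0, as in the commit:
    margin_threshold = (1 - confidence_threshold) / 2 + confidence_threshold
    return "#FFFF00" if score < margin_threshold else "#32CD32"

assert box_color(0.6) == "#FFFF00"  # above threshold, below margin -> yellow
assert box_color(0.9) == "#32CD32"  # at or above margin -> green
```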
@@ -254,6 +256,7 @@ async def detect_objects_endpoint(
     file: Optional[UploadFile] = File(None),
     image_url: Optional[str] = Form(None),
     model_name: str = Form(VALID_MODELS[0]),
+    confidence_threshold: float = Form(CONFIDENCE_THRESHOLD),
 ) -> JSONResponse:
     """
     FastAPI endpoint to detect objects in an image from file upload or URL.
@@ -262,62 +265,35 @@ async def detect_objects_endpoint(
         file: Uploaded image file (optional).
         image_url: URL of the image (optional).
         model_name: Model to use for detection (default: first VALID_MODELS entry).
+        confidence_threshold: Confidence threshold for detection (default: 0.5).
 
     Returns:
-        JSONResponse ...
+        JSONResponse with base64-encoded image, detected objects, and confidence scores.
 
     Raises:
-        HTTPException: ...
+        HTTPException: For invalid inputs or processing errors.
     """
     try:
-        # Validate input
+        # Validate input: ensure exactly one of file or URL
         if (file is None and not image_url) or (file is not None and image_url):
-            raise HTTPException(
-                ...
-            )
-
+            raise HTTPException(status_code=400, detail="Provide either an image file or an image URL, not both.")
+        # Validate confidence threshold
+        if not 0 <= confidence_threshold <= 1:
+            raise HTTPException(status_code=400, detail="Confidence threshold must be between 0 and 1.")
+        # Load image from file if provided
+        image = None
         if file:
             if not file.content_type.startswith("image/"):
                 raise HTTPException(status_code=400, detail="File must be an image")
             contents = await file.read()
             image = Image.open(BytesIO(contents)).convert("RGB")
-        ...
-                status_code=400,
-                detail=f"Invalid model. Choose from: {VALID_MODELS}",
-            )
-
-        # Process image
-        detected_image, detected_objects, detected_confidences, unique_objects, unique_confidences, _ = process(
-            image, model_name
-        )
-
-        # Encode image as base64
-        buffered = BytesIO()
-        detected_image.save(buffered, format="PNG")
-        img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
-        img_url = f"data:image/png;base64,{img_base64}"
-
-        return JSONResponse(
-            content={
-                "image_url": img_url,
-                "detected_objects": detected_objects,
-                "confidence_scores": detected_confidences,
-                "unique_objects": unique_objects,
-                "unique_confidence_scores": unique_confidences,
-            }
-        )
-
-    except requests.RequestException as e:
-        logger.error(f"Error fetching image from URL: {str(e)}")
-        raise HTTPException(status_code=400, detail=f"Failed to fetch image: {str(e)}")
+        # Process image with specified parameters
+        result = process_image(image, image_url, model_name, for_json=True, confidence_threshold=confidence_threshold)
+        if "error" in result:
+            raise HTTPException(status_code=400, detail=result["error"])
+        return JSONResponse(content=result)
+    except HTTPException:
+        raise
     except Exception as e:
         logger.error(f"Error in FastAPI endpoint: {str(e)}\n{traceback.format_exc()}")
         raise HTTPException(status_code=500, detail=f"Error processing image: {str(e)}")
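The bare `except HTTPException: raise` added above keeps deliberate 400s from being swallowed and rewrapped as 500s by the catch-all handler. A minimal standalone sketch of the pattern (hypothetical `/demo` route, not part of the app):

```python
from fastapi import FastAPI, HTTPException

app = FastAPI()

@app.get("/demo")
async def demo(flag: bool = False):
    try:
        if not flag:
            raise HTTPException(status_code=400, detail="flag must be true")
        return {"ok": True}
    except HTTPException:
        raise  # re-raise so FastAPI returns the intended status code
    except Exception as e:
        # only genuinely unexpected failures become 500s
        raise HTTPException(status_code=500, detail=str(e))
```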
@@ -328,7 +304,7 @@ async def detect_objects_endpoint(
 
 def create_gradio_ui() -> gr.Blocks:
     """
-    Create and configure the Gradio UI for object detection.
+    Create and configure the Gradio UI for object detection with Home, JSON, and Help tabs.
 
     Returns:
         Gradio Blocks object representing the UI.
@@ -337,257 +313,126 @@ def create_gradio_ui() -> gr.Blocks:
         RuntimeError: If UI creation fails.
     """
     try:
-        ...
+        # Initialize Gradio Blocks with a custom theme
+        with gr.Blocks(theme=gr.themes.Default(primary_hue="blue", secondary_hue="gray")) as demo:
+            # Display app header
             gr.Markdown(
                 f"""
                 # 🚀 Object Detection App
-                Upload an image or provide a URL to detect objects using ...
+                Upload an image or provide a URL to detect objects using transformer models (DETR, YOLOS).
                 Running on port: {os.getenv('GRADIO_SERVER_PORT', 'auto-selected')}
                 """
             )
 
+            # Create tabbed interface
             with gr.Tabs():
-                ...
+                # Home tab (formerly Image Upload)
+                with gr.Tab("🏠 Home"):
                     with gr.Row():
+                        # Left column for inputs
                         with gr.Column(scale=1):
                             gr.Markdown("### Input")
-                            ...
-                                info="Choose a model for object detection or panoptic segmentation.",
-                            )
-                            model_info = gr.Markdown(
-                                f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
-                                visible=True,
-                            )
+                            # Model selection dropdown
+                            model_choice = gr.Dropdown(choices=VALID_MODELS, value=VALID_MODELS[0], label="🔎 Select Model")
+                            model_info = gr.Markdown(f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}")
+                            # Image upload input
                             image_input = gr.Image(type="pil", label="📷 Upload Image")
-                            ...
-                            )
+                            # Image URL input
+                            image_url_input = gr.Textbox(label="🔗 Image URL", placeholder="https://example.com/image.jpg")
+                            # Buttons for submission and clearing
                             with gr.Row():
                                 submit_btn = gr.Button("✨ Detect", variant="primary")
                                 clear_btn = gr.Button("🗑️ Clear", variant="secondary")
 
+                            # Update model info when model changes
                             model_choice.change(
-                                fn=lambda model_name: (
-                                    f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}"
-                                ),
+                                fn=lambda model_name: f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}",
                                 inputs=model_choice,
                                 outputs=model_info,
                             )
 
+                        # Right column for results
                         with gr.Column(scale=2):
                             gr.Markdown("### Results")
-                            ...
-                            )
-                            output_image = gr.Image(
-                                type="pil",
-                                label="🎯 Detected Image",
-                                interactive=False,
-                            )
+                            # Error display (hidden by default)
+                            error_output = gr.Textbox(label="⚠️ Errors", visible=False, lines=3, max_lines=5)
+                            # Annotated image output
+                            output_image = gr.Image(type="pil", label="🎯 Detected Image", interactive=False)
+                            # Detected and unique objects tables
                             with gr.Row():
-                                objects_output = gr.DataFrame(
-                                    ...
-                                )
-                                unique_objects_output = gr.DataFrame(
-                                    label="🔍 Unique Objects",
-                                    interactive=False,
-                                    value=None,
-                                )
-                            properties_output = gr.DataFrame(
-                                label="📄 Image Properties",
-                                interactive=False,
-                                value=None,
-                            )
-
-                    def process_for_gradio(image: Optional[Image.Image], url: Optional[str], model_name: str) -> Tuple[
-                        Optional[Image.Image], Optional[pd.DataFrame], Optional[pd.DataFrame], Optional[pd.DataFrame], str
-                    ]:
-                        """
-                        Process image for Gradio UI and return results.
-
-                        Args:
-                            image: Uploaded PIL Image (optional).
-                            url: Image URL (optional).
-                            model_name: Model to use for detection.
-
-                        Returns:
-                            Tuple of detected image, objects DataFrame, unique objects DataFrame, properties DataFrame, and error message.
-                        """
-                        try:
-                            if image is None and not url:
-                                return None, None, None, None, "Please provide an image or URL"
-                            if image and url:
-                                return None, None, None, None, "Please provide either an image or URL, not both"
-
-                            if url:
-                                response = requests.get(url, timeout=10)
-                                response.raise_for_status()
-                                image = Image.open(BytesIO(response.content)).convert("RGB")
-
-                            detected_image, objects, scores, unique_objects, unique_scores, properties = process(
-                                image, model_name
-                            )
-                            objects_df = (
-                                pd.DataFrame(
-                                    {
-                                        "Object": objects,
-                                        "Confidence Score": [f"{score:.2f}" for score in scores],
-                                    }
-                                )
-                                if objects
-                                else pd.DataFrame(columns=["Object", "Confidence Score"])
-                            )
-                            unique_objects_df = (
-                                pd.DataFrame(
-                                    {
-                                        "Unique Object": unique_objects,
-                                        "Confidence Score": [f"{score:.2f}" for score in unique_scores],
-                                    }
-                                )
-                                if unique_objects
-                                else pd.DataFrame(columns=["Unique Object", "Confidence Score"])
-                            )
-                            properties_df = (
-                                pd.DataFrame([properties])
-                                if properties
-                                else pd.DataFrame(columns=properties.keys())
-                            )
-                            return detected_image, objects_df, unique_objects_df, properties_df, ""
-
-                        except requests.RequestException as e:
-                            error_msg = f"Error fetching image from URL: {str(e)}"
-                            logger.error(f"{error_msg}\n{traceback.format_exc()}")
-                            return None, None, None, None, error_msg
-                        except Exception as e:
-                            error_msg = f"Error processing image: {str(e)}"
-                            logger.error(f"{error_msg}\n{traceback.format_exc()}")
-                            return None, None, None, None, error_msg
+                                objects_output = gr.DataFrame(label="📋 Detected Objects", interactive=False)
+                                unique_objects_output = gr.DataFrame(label="🔍 Unique Objects", interactive=False)
+                            # Image properties table
+                            properties_output = gr.DataFrame(label="📄 Image Properties", interactive=False)
 
+                    # Process image when Detect button is clicked
                     submit_btn.click(
-                        fn=process_for_gradio,
+                        fn=process_image,
                         inputs=[image_input, image_url_input, model_choice],
                         outputs=[output_image, objects_output, unique_objects_output, properties_output, error_output],
                     )
 
+                    # Clear all inputs and outputs
                     clear_btn.click(
                         fn=lambda: [None, "", None, None, None, None],
                         inputs=None,
-                        outputs=[
-                            image_input,
-                            image_url_input,
-                            output_image,
-                            objects_output,
-                            unique_objects_output,
-                            properties_output,
-                            error_output,
-                        ],
+                        outputs=[image_input, image_url_input, output_image, objects_output, unique_objects_output, properties_output, error_output],
                     )
 
-                    ...
-                            f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}"
-                        ),
-                        inputs=url_model_choice,
-                        outputs=url_model_info,
-                    )
-
-                    def process_url_for_gradio(image: Optional[Image.Image], url: Optional[str], model_name: str) -> Dict:
-                        """
-                        Process image from file or URL for Gradio UI and return JSON response.
-
-                        Args:
-                            image: Uploaded PIL Image (optional).
-                            url: Image URL (optional).
-                            model_name: Model to use for detection.
-
-                        Returns:
-                            Dictionary with processed image (base64), detected objects, and confidences.
-                        """
-                        try:
-                            if image is None and not url:
-                                return {"error": "Please provide an image or URL"}
-                            if image and url:
-                                return {"error": "Please provide either an image or URL, not both"}
-
-                            if url:
-                                response = requests.get(url, timeout=10)
-                                response.raise_for_status()
-                                image = Image.open(BytesIO(response.content)).convert("RGB")
-
-                            detected_image, objects, scores, unique_objects, unique_scores, _ = process(
-                                image, model_name
-                            )
-                            buffered = BytesIO()
-                            detected_image.save(buffered, format="PNG")
-                            img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
-                            return {
-                                "image_url": f"data:image/png;base64,{img_base64}",
-                                "detected_objects": objects,
-                                "confidence_scores": scores,
-                                "unique_objects": unique_objects,
-                                "unique_confidence_scores": unique_scores,
-                            }
-                        except requests.RequestException as e:
-                            error_msg = f"Error fetching image from URL: {str(e)}"
-                            logger.error(f"{error_msg}\n{traceback.format_exc()}")
-                            return {"error": error_msg}
-                        except Exception as e:
-                            error_msg = f"Error processing image: {str(e)}"
-                            logger.error(f"{error_msg}\n{traceback.format_exc()}")
-                            return {"error": error_msg}
+                # JSON tab for API-like output
+                with gr.Tab("🔗 JSON"):
+                    with gr.Row():
+                        # Left column for inputs
+                        with gr.Column(scale=1):
+                            gr.Markdown("### Process Image for JSON")
+                            # Model selection dropdown
+                            url_model_choice = gr.Dropdown(choices=VALID_MODELS, value=VALID_MODELS[0], label="🔎 Select Model")
+                            url_model_info = gr.Markdown(f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}")
+                            # Image upload input
+                            image_input_json = gr.Image(type="pil", label="📷 Upload Image")
+                            # Image URL input
+                            image_url_input_json = gr.Textbox(label="🔗 Image URL", placeholder="https://example.com/image.jpg")
+                            # Process button
+                            url_submit_btn = gr.Button("🔄 Process", variant="primary")
+
+                            # Update model info when model changes
+                            url_model_choice.change(
+                                fn=lambda model_name: f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}",
+                                inputs=url_model_choice,
+                                outputs=url_model_info,
+                            )
+
+                        # Right column for JSON output
+                        with gr.Column(scale=1):
+                            # JSON output display
+                            url_output = gr.JSON(label="API Response")
 
+                    # Process image and return JSON when Process button is clicked
                     url_submit_btn.click(
-                        fn=process_url_for_gradio,
+                        fn=lambda img, url, model: process_image(img, url, model, for_json=True),
                         inputs=[image_input_json, image_url_input_json, url_model_choice],
                         outputs=[url_output],
                     )
 
+                # Help tab with usage instructions
                 with gr.Tab("ℹ️ Help"):
                     gr.Markdown(
                         """
                         ## How to Use
-                        - ** ...
-                        - **JSON ...
-                        - **Models**: Choose ...
-                        - **Clear**: Reset ...
-                        - **Errors**: Check ...
-
+                        - **Home**: Select a model, upload an image or provide a URL, click "Detect" to see results.
+                        - **JSON**: Select a model, upload an image or enter a URL, click "Process" for JSON output.
+                        - **Models**: Choose DETR (detection or panoptic) or YOLOS (lightweight detection).
+                        - **Clear**: Reset inputs/outputs in Home tab.
+                        - **Errors**: Check error box (Home) or JSON response (JSON) for issues.
+
                         ## Tips
-                        - Use high-quality images for better ...
-                        - Panoptic models ...
-                        - ...
+                        - Use high-quality images for better results.
+                        - Panoptic models provide segmentation masks for complex scenes.
+                        - YOLOS-Tiny is faster for resource-constrained devices.
                         """
                     )
 
-        return ...
+        return demo
 
     except Exception as e:
         logger.error(f"Error creating Gradio UI: {str(e)}\n{traceback.format_exc()}")
@@ -599,38 +444,25 @@ def create_gradio_ui() -> gr.Blocks:
 
 def parse_args() -> argparse.Namespace:
     """
-    Parse command-line arguments
+    Parse command-line arguments for configuring the application.
 
     Returns:
         Parsed arguments as a Namespace object.
-
-    Raises:
-        SystemExit: If argument parsing fails (handled by argparse).
     """
-    parser = argparse.ArgumentParser(
-        ...
-    )
-    parser.add_argument(
-        ...
-    )
-    parser.add_argument(
-        "--enable-fastapi",
-        action="store_true",
-        default=False,
-        help="Enable the FastAPI server (disabled by default).",
-    )
-    parser.add_argument(
-        "--fastapi-port",
-        type=int,
-        default=DEFAULT_FASTAPI_PORT,
-        help=f"Port for the FastAPI server if enabled (default: {DEFAULT_FASTAPI_PORT}).",
-    )
-
-    # Parse known arguments and ignore unrecognized ones (e.g., Jupyter kernel args)
+    parser = argparse.ArgumentParser(description="Object Detection App with Gradio and FastAPI.")
+    # Gradio port argument
+    parser.add_argument("--gradio-port", type=int, default=DEFAULT_GRADIO_PORT, help=f"Gradio port (default: {DEFAULT_GRADIO_PORT}).")
+    # FastAPI enable flag
+    parser.add_argument("--enable-fastapi", action="store_true", help="Enable FastAPI server.")
+    # FastAPI port argument
+    parser.add_argument("--fastapi-port", type=int, default=DEFAULT_FASTAPI_PORT, help=f"FastAPI port (default: {DEFAULT_FASTAPI_PORT}).")
+    # Confidence threshold argument
+    parser.add_argument("--confidence-threshold", type=float, default=CONFIDENCE_THRESHOLD, help="Confidence threshold for detection (default: 0.5).")
+    # Parse known arguments, ignoring unrecognized ones
     args, _ = parser.parse_known_args()
+    # Validate confidence threshold
+    if not 0 <= args.confidence_threshold <= 1:
+        parser.error("Confidence threshold must be between 0 and 1.")
     return args
 
 def find_available_port(start_port: int, port_range: range, max_attempts: int) -> Optional[int]:
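`parse_known_args()` is what lets the script tolerate the extra flags a Jupyter or Colab kernel injects into `sys.argv`. A small sketch with simulated argv input:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--confidence-threshold", type=float, default=0.5)

# Unknown flags (e.g. Jupyter's -f /path/kernel.json) are returned, not fatal:
args, unknown = parser.parse_known_args(["--confidence-threshold", "0.75", "-f", "kernel.json"])
print(args.confidence_threshold)  # 0.75
print(unknown)                    # ['-f', 'kernel.json']
```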
@@ -638,30 +470,21 @@ def find_available_port(start_port: int, port_range: range, max_attempts: int) -> Optional[int]:
     Find an available port within the specified range.
 
     Args:
-        start_port: Initial port to try
+        start_port: Initial port to try.
         port_range: Range of ports to attempt.
         max_attempts: Maximum number of ports to try.
 
     Returns:
         Available port number, or None if no port is found.
-
-    Raises:
-        OSError: If port binding fails for reasons other than port in use.
     """
     import socket
-
-    port = start_port
+    # Check environment variable for port override
+    port = int(os.getenv("GRADIO_SERVER_PORT", start_port))
     attempts = 0
-
-    # Check environment variable GRADIO_SERVER_PORT
-    env_port = os.getenv("GRADIO_SERVER_PORT")
-    if env_port and env_port.isdigit():
-        port = int(env_port)
-        logger.info(f"Using GRADIO_SERVER_PORT from environment: {port}")
-
     while attempts < max_attempts:
         with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
             try:
+                # Attempt to bind to the port
                 s.bind(("0.0.0.0", port))
                 logger.debug(f"Port {port} is available")
                 return port
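The availability check above works by attempting a bind and treating `OSError` as "port busy"; the same probe in isolation:

```python
import socket

def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    # Binding succeeds only if nothing else holds the port.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
            return True
        except OSError:
            return False

# First free port in the app's range, or None if all ten are taken.
print(next((p for p in range(7860, 7870) if port_is_free(p)), None))
```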
@@ -672,70 +495,47 @@ def find_available_port(start_port: int, port_range: range, max_attempts: int) -> Optional[int]:
                     attempts += 1
                 else:
                     raise
-
-                logger.error(f"Error checking port {port}: {str(e)}")
-                raise
-    logger.error(f"No available port found in range {min(port_range)}-{max(port_range)} after {max_attempts} attempts")
+    logger.error(f"No available port in range {min(port_range)}-{max(port_range)}")
     return None
 
-def run_fastapi_server(host: str, port: int) -> None:
-    """
-    Run the FastAPI server using Uvicorn.
-
-    Args:
-        host: Host address for the FastAPI server.
-        port: Port for the FastAPI server.
-    """
-    try:
-        uvicorn.run(app, host=host, port=port)
-    except Exception as e:
-        logger.error(f"Error running FastAPI server: {str(e)}\n{traceback.format_exc()}")
-        sys.exit(1)
-
 def main() -> None:
     """
-    ...
+    Launch the Gradio UI and optional FastAPI server.
 
     Raises:
-        SystemExit: ...
+        SystemExit: On interruption or critical errors.
     """
     try:
-        # Apply nest_asyncio
+        # Apply nest_asyncio for compatibility with Jupyter/Colab
         nest_asyncio.apply()
-
         # Parse command-line arguments
         args = parse_args()
         logger.info(f"Parsed arguments: {args}")
-
         # Find available port for Gradio
         gradio_port = find_available_port(args.gradio_port, PORT_RANGE, MAX_PORT_ATTEMPTS)
         if gradio_port is None:
             logger.error("Failed to find an available port for Gradio UI")
             sys.exit(1)
 
-        # ...
+        # Start FastAPI server in a thread if enabled
         if args.enable_fastapi:
-            logger.info(f"Starting FastAPI ...
+            logger.info(f"Starting FastAPI on port {args.fastapi_port}")
             fastapi_thread = threading.Thread(
-                target=run_fastapi_server,
-                args=("0.0.0.0", args.fastapi_port),
+                target=lambda: uvicorn.run(app, host="0.0.0.0", port=args.fastapi_port),
                 daemon=True
             )
             fastapi_thread.start()
 
         # Launch Gradio UI
         logger.info(f"Starting Gradio UI on port {gradio_port}")
-        ...
+        demo = create_gradio_ui()
+        demo.launch(server_port=gradio_port, server_name="0.0.0.0")
 
     except KeyboardInterrupt:
         logger.info("Application terminated by user.")
         sys.exit(0)
-    except OSError as e:
-        logger.error(f"Port binding error: {str(e)}")
-        sys.exit(1)
     except Exception as e:
-        logger.error(f"Error ...
+        logger.error(f"Error: {str(e)}\n{traceback.format_exc()}")
         sys.exit(1)
 
 if __name__ == "__main__":
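`nest_asyncio.apply()` in `main()` is what allows the blocking Gradio/uvicorn launches to run in an environment that already has an event loop (Jupyter, Colab). A minimal sketch of what the patch enables, assuming `nest_asyncio` is installed:

```python
import asyncio
import nest_asyncio

nest_asyncio.apply()  # patch the loop so asyncio.run() can nest

async def ping() -> str:
    return "pong"

# In a notebook, without the patch this line would raise
# "asyncio.run() cannot be called from a running event loop".
print(asyncio.run(ping()))
```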
hf_space/app.py
CHANGED
import argparse
import base64
import logging
import os
import sys
import traceback
import threading
from collections import Counter
from io import BytesIO
from typing import Dict, List, Optional, Tuple

import gradio as gr
import pandas as pd
import requests
import torch
import uvicorn
from fastapi import FastAPI, File, Form, HTTPException, UploadFile
from fastapi.responses import JSONResponse
from PIL import Image, ImageDraw, ImageStat
from transformers import (
    DetrForObjectDetection,
    DetrForSegmentation,
    DetrImageProcessor,
    YolosForObjectDetection,
    YolosImageProcessor,
)
import nest_asyncio

# ------------------------------
# Configuration
# ------------------------------

# Logging configuration
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
)
logger = logging.getLogger(__name__)

# Model and processing constants
CONFIDENCE_THRESHOLD: float = 0.5
VALID_MODELS: List[str] = [
    "facebook/detr-resnet-50",
    "facebook/detr-resnet-101",
    "facebook/detr-resnet-50-panoptic",
    "facebook/detr-resnet-101-panoptic",
    "hustvl/yolos-tiny",
    "hustvl/yolos-base",
]
MODEL_DESCRIPTIONS: Dict[str, str] = {
    "facebook/detr-resnet-50": (
        "DETR with ResNet-50 backbone for object detection. Fast and accurate for general use."
    ),
    "facebook/detr-resnet-101": (
        "DETR with ResNet-101 backbone for object detection. More accurate but slower than ResNet-50."
    ),
    "facebook/detr-resnet-50-panoptic": (
        "DETR with ResNet-50 for panoptic segmentation. Detects objects and segments scenes."
    ),
    "facebook/detr-resnet-101-panoptic": (
        "DETR with ResNet-101 for panoptic segmentation. High accuracy for complex scenes."
    ),
    "hustvl/yolos-tiny": (
        "YOLOS Tiny model. Lightweight and fast, ideal for resource-constrained environments."
    ),
    "hustvl/yolos-base": (
        "YOLOS Base model. Balances speed and accuracy for object detection."
    ),
}

# Port configuration
DEFAULT_GRADIO_PORT: int = 7860
DEFAULT_FASTAPI_PORT: int = 8000
PORT_RANGE: range = range(7860, 7870)  # Try ports 7860-7869
MAX_PORT_ATTEMPTS: int = 10

# Thread-safe storage for lazy-loaded models and processors
models: Dict[str, any] = {}
processors: Dict[str, any] = {}
model_lock = threading.Lock()

# ------------------------------
# Model Loading
# ------------------------------
def load_model_and_processor(model_name: str) -> Tuple[any, any]:
    """
    Load and cache the specified model and processor thread-safely.

    Args:
        model_name: Name of the model to load (must be in VALID_MODELS).

    Returns:
        Tuple containing the loaded model and processor.

    Raises:
        ValueError: If the model_name is invalid or loading fails.
    """
    with model_lock:
        if model_name not in models:
            logger.info(f"Loading model: {model_name}")
            try:
                if "yolos" in model_name:
                    models[model_name] = YolosForObjectDetection.from_pretrained(model_name)
                    processors[model_name] = YolosImageProcessor.from_pretrained(model_name)
                elif "panoptic" in model_name:
                    models[model_name] = DetrForSegmentation.from_pretrained(model_name)
                    processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
                else:
                    models[model_name] = DetrForObjectDetection.from_pretrained(model_name)
                    processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
                logger.debug(f"Model {model_name} loaded successfully")
            except Exception as e:
                logger.error(f"Failed to load model {model_name}: {str(e)}")
                raise ValueError(f"Failed to load model: {str(e)}")
        return models[model_name], processors[model_name]
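Because the loader caches models and processors in module-level dicts under a lock, a second request for the same name is a cheap dictionary lookup rather than a reload. A minimal sketch of that behavior (the model name is just an example):

```python
model_a, _ = load_model_and_processor("hustvl/yolos-tiny")  # first call downloads/loads weights
model_b, _ = load_model_and_processor("hustvl/yolos-tiny")  # second call hits the cache
assert model_a is model_b  # same cached object, no reload
```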
# ------------------------------
# Image Processing
# ------------------------------

def process(image: Image.Image, model_name: str) -> Tuple[Image.Image, List[str], List[float], List[str], List[float], Dict[str, str]]:
    """
    Process an image for object detection or panoptic segmentation.

    Args:
        image: PIL Image to process.
        model_name: Name of the model to use (must be in VALID_MODELS).

    Returns:
        Tuple containing:
        - Annotated image (PIL Image).
        - List of detected object names.
        - List of confidence scores for detected objects.
        - List of unique object names.
        - List of confidence scores for unique objects.
        - Dictionary of image properties (format, size, etc.).

    Raises:
        ValueError: If the model_name is invalid.
        RuntimeError: If processing fails due to model or image issues.
    """
    if model_name not in VALID_MODELS:
        raise ValueError(f"Invalid model: {model_name}. Choose from: {VALID_MODELS}")

    try:
        # Load model and processor
        model, processor = load_model_and_processor(model_name)
        logger.debug(f"Processing image with model: {model_name}")

        # Prepare image for processing
        inputs = processor(images=image, return_tensors="pt")
        with torch.no_grad():
            outputs = model(**inputs)

        # Initialize drawing context
        draw = ImageDraw.Draw(image)
        object_names: List[str] = []
        confidence_scores: List[float] = []
        object_counter = Counter()
        target_sizes = torch.tensor([image.size[::-1]])

        # Process panoptic segmentation or object detection
        if "panoptic" in model_name:
            processed_sizes = torch.tensor([[inputs["pixel_values"].shape[2], inputs["pixel_values"].shape[3]]])
            results = processor.post_process_panoptic(outputs, target_sizes=target_sizes, processed_sizes=processed_sizes)[0]
            # ...
            label_name = model.config.id2label.get(label, "Unknown")
            score = segment.get("score", 1.0)

            # Apply segmentation mask if available
            if "masks" in results and segment["id"] < len(results["masks"]):
                mask = results["masks"][segment["id"]].cpu().numpy()
                if mask.shape[0] > 0 and mask.shape[1] > 0:
                    # ...
        else:
            # ...
            x, y, x2, y2 = box.tolist()
            draw.rectangle([x, y, x2, y2], outline="#32CD32", width=2)
            label_name = model.config.id2label.get(label.item(), "Unknown")
            text = f"{label_name}: {score:.2f}"
            text_bbox = draw.textbbox((0, 0), text)
            text_width, text_height = text_bbox[2] - text_bbox[0], text_bbox[3] - text_bbox[1]
            # ...
            confidence_scores.append(float(score))
            object_counter[label_name] = float(score)

        # Compile unique objects and confidences
        unique_objects = list(object_counter.keys())
        unique_confidences = [object_counter[obj] for obj in unique_objects]

        # Calculate image properties
        properties: Dict[str, str] = {
            "Format": image.format if hasattr(image, "format") and image.format else "Unknown",
            "Size": f"{image.width}x{image.height}",
            "Width": f"{image.width} px",
            "Height": f"{image.height} px",
            "Mode": image.mode,
            "Aspect Ratio": (
                f"{round(image.width / image.height, 2)}" if image.height != 0 else "Undefined"
            ),
            "File Size": "Unknown",
            "Mean (R,G,B)": "Unknown",
            "StdDev (R,G,B)": "Unknown",
        }

        # Compute file size
        try:
            buffered = BytesIO()
            image.save(buffered, format="PNG")
            properties["File Size"] = f"{len(buffered.getvalue()) / 1024:.2f} KB"
        except Exception as e:
            logger.error(f"Error calculating file size: {str(e)}")

        # Compute color statistics
        try:
            stat = ImageStat.Stat(image)
            properties["Mean (R,G,B)"] = ", ".join(f"{m:.2f}" for m in stat.mean)
            properties["StdDev (R,G,B)"] = ", ".join(f"{s:.2f}" for s in stat.stddev)
        except Exception as e:
            logger.error(f"Error calculating color statistics: {str(e)}")

        return image, object_names, confidence_scores, unique_objects, unique_confidences, properties

    except Exception as e:
        logger.error(f"Error in process: {str(e)}\n{traceback.format_exc()}")
        raise RuntimeError(f"Failed to process image: {str(e)}")

# ------------------------------
# FastAPI Setup
# ------------------------------
app = FastAPI(title="Object Detection API")

@app.post("/detect")
async def detect_objects_endpoint(
    file: Optional[UploadFile] = File(None),
    image_url: Optional[str] = Form(None),
    model_name: str = Form(VALID_MODELS[0]),
) -> JSONResponse:
    """
    FastAPI endpoint to detect objects in an image from file upload or URL.

    Args:
        file: Uploaded image file (optional).
        image_url: URL of the image (optional).
        model_name: Model to use for detection (default: first VALID_MODELS entry).

    Returns:
        JSONResponse containing the processed image (base64), detected objects, and confidences.

    Raises:
        HTTPException: If input validation fails or processing errors occur.
    """
    try:
        # Validate input
        if (file is None and not image_url) or (file is not None and image_url):
            raise HTTPException(
                status_code=400,
                detail="Provide either an image file or an image URL, not both.",
            )

        # Load image
        if file:
            if not file.content_type.startswith("image/"):
                raise HTTPException(status_code=400, detail="File must be an image")
            # ...
            image = Image.open(BytesIO(response.content)).convert("RGB")

        if model_name not in VALID_MODELS:
            raise HTTPException(
                status_code=400,
                detail=f"Invalid model. Choose from: {VALID_MODELS}",
            )

        # Process image
        detected_image, detected_objects, detected_confidences, unique_objects, unique_confidences, _ = process(
            image, model_name
        )

        # Encode image as base64
        buffered = BytesIO()
        detected_image.save(buffered, format="PNG")
        img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
        img_url = f"data:image/png;base64,{img_base64}"

        return JSONResponse(
            content={
                "image_url": img_url,
                "detected_objects": detected_objects,
                "confidence_scores": detected_confidences,
                "unique_objects": unique_objects,
                "unique_confidence_scores": unique_confidences,
            }
        )

    except requests.RequestException as e:
        logger.error(f"Error fetching image from URL: {str(e)}")
        raise HTTPException(status_code=400, detail=f"Failed to fetch image: {str(e)}")
    except Exception as e:
        logger.error(f"Error in FastAPI endpoint: {str(e)}\n{traceback.format_exc()}")
        raise HTTPException(status_code=500, detail=f"Error processing image: {str(e)}")
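For programmatic use of the endpoint above, a small client can post a multipart form and read the JSON payload. A hedged sketch (the host, port, and image path are placeholders):

```python
import requests

with open("example.jpg", "rb") as fh:  # hypothetical local image
    resp = requests.post(
        "http://localhost:8000/detect",
        files={"file": ("example.jpg", fh, "image/jpeg")},
        data={"model_name": "hustvl/yolos-tiny"},
        timeout=60,
    )
resp.raise_for_status()
payload = resp.json()
print(payload["unique_objects"], payload["unique_confidence_scores"])
```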
# ------------------------------
# Gradio UI Setup
# ------------------------------
def create_gradio_ui() -> gr.Blocks:
    """
    Create and configure the Gradio UI for object detection.

    Returns:
        Gradio Blocks object representing the UI.

    Raises:
        RuntimeError: If UI creation fails.
    """
    try:
        with gr.Blocks(theme=gr.themes.Default(primary_hue="blue", secondary_hue="gray")) as app:
            gr.Markdown(
                f"""
                # 🚀 Object Detection App
                Upload an image or provide a URL to detect objects using state-of-the-art transformer models (DETR, YOLOS).
                Running on port: {os.getenv('GRADIO_SERVER_PORT', 'auto-selected')}
                """
            )

            with gr.Tabs():
                with gr.Tab("📷 Image Upload"):
                    with gr.Row():
                        with gr.Column(scale=1):
                            gr.Markdown("### Input")
                            model_choice = gr.Dropdown(
                                choices=VALID_MODELS,
                                value=VALID_MODELS[0],
                                label="🔎 Select Model",
                                info="Choose a model for object detection or panoptic segmentation.",
                            )
                            model_info = gr.Markdown(
                                f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
                                visible=True,
                            )
                            image_input = gr.Image(type="pil", label="📷 Upload Image")
                            image_url_input = gr.Textbox(
                                label="🔗 Image URL",
                                placeholder="https://example.com/image.jpg",
                            )
                            with gr.Row():
                                submit_btn = gr.Button("✨ Detect", variant="primary")
                                clear_btn = gr.Button("🗑️ Clear", variant="secondary")

                            model_choice.change(
                                fn=lambda model_name: (
                                    f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}"
                                ),
                                inputs=model_choice,
                                outputs=model_info,
                            )

                        with gr.Column(scale=2):
                            gr.Markdown("### Results")
                            error_output = gr.Textbox(
                                label="⚠️ Errors",
                                visible=False,
                                lines=3,
                                max_lines=5,
                            )
                            output_image = gr.Image(
                                type="pil",
                                label="🎯 Detected Image",
                                interactive=False,
                            )
                            with gr.Row():
                                objects_output = gr.DataFrame(
                                    label="📋 Detected Objects",
                                    interactive=False,
                                    value=None,
                                )
                                unique_objects_output = gr.DataFrame(
                                    label="🔍 Unique Objects",
                                    interactive=False,
                                    value=None,
                                )
                                properties_output = gr.DataFrame(
                                    label="📄 Image Properties",
                                    interactive=False,
                                    value=None,
                                )

                    def process_for_gradio(image: Optional[Image.Image], url: Optional[str], model_name: str) -> Tuple[
                        Optional[Image.Image], Optional[pd.DataFrame], Optional[pd.DataFrame], Optional[pd.DataFrame], str
                    ]:
                        """
                        Process image for Gradio UI and return results.

                        Args:
                            image: Uploaded PIL Image (optional).
                            url: Image URL (optional).
                            model_name: Model to use for detection.

                        Returns:
                            Tuple of detected image, objects DataFrame, unique objects DataFrame, properties DataFrame, and error message.
                        """
                        try:
                            if image is None and not url:
                                return None, None, None, None, "Please provide an image or URL"
                            if image and url:
                                return None, None, None, None, "Please provide either an image or URL, not both"

                            if url:
                                response = requests.get(url, timeout=10)
                                response.raise_for_status()
                                image = Image.open(BytesIO(response.content)).convert("RGB")

                            detected_image, objects, scores, unique_objects, unique_scores, properties = process(
                                image, model_name
                            )
                            objects_df = (
                                pd.DataFrame(
                                    {
                                        "Object": objects,
                                        "Confidence Score": [f"{score:.2f}" for score in scores],
                                    }
                                )
                                if objects
                                else pd.DataFrame(columns=["Object", "Confidence Score"])
                            )
                            unique_objects_df = (
                                pd.DataFrame(
                                    {
                                        "Unique Object": unique_objects,
                                        "Confidence Score": [f"{score:.2f}" for score in unique_scores],
                                    }
                                )
                                if unique_objects
                                else pd.DataFrame(columns=["Unique Object", "Confidence Score"])
                            )
                            properties_df = (
                                pd.DataFrame([properties])
                                if properties
                                else pd.DataFrame(columns=properties.keys())
                            )
                            return detected_image, objects_df, unique_objects_df, properties_df, ""

                        except requests.RequestException as e:
                            error_msg = f"Error fetching image from URL: {str(e)}"
                            logger.error(f"{error_msg}\n{traceback.format_exc()}")
                            return None, None, None, None, error_msg
                        except Exception as e:
                            error_msg = f"Error processing image: {str(e)}"
                            logger.error(f"{error_msg}\n{traceback.format_exc()}")
                            return None, None, None, None, error_msg

                    submit_btn.click(
                        fn=process_for_gradio,
                        inputs=[image_input, image_url_input, model_choice],
                        outputs=[output_image, objects_output, unique_objects_output, properties_output, error_output],
                    )

                    clear_btn.click(
                        # One value per output component below (seven in total).
                        fn=lambda: [None, "", None, None, None, None, ""],
                        inputs=None,
                        outputs=[
                            image_input,
                            image_url_input,
                            output_image,
                            objects_output,
                            unique_objects_output,
                            properties_output,
                            error_output,
                        ],
                    )

                with gr.Tab("🔗 JSON Output"):
                    gr.Markdown("### Process Image for JSON Output")
                    image_input_json = gr.Image(type="pil", label="📷 Upload Image")
                    image_url_input_json = gr.Textbox(
                        label="🔗 Image URL",
                        placeholder="https://example.com/image.jpg",
                    )
                    url_model_choice = gr.Dropdown(
                        choices=VALID_MODELS,
                        value=VALID_MODELS[0],
                        label="🔎 Select Model",
                    )
                    url_model_info = gr.Markdown(
                        f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
                        visible=True,
                    )
                    url_submit_btn = gr.Button("🔄 Process", variant="primary")
                    url_output = gr.JSON(label="API Response")

                    url_model_choice.change(
                        fn=lambda model_name: (
                            f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}"
                        ),
                        inputs=url_model_choice,
                        outputs=url_model_info,
                    )

                    def process_url_for_gradio(image: Optional[Image.Image], url: Optional[str], model_name: str) -> Dict:
                        """
                        Process image from file or URL for Gradio UI and return JSON response.

                        Args:
                            image: Uploaded PIL Image (optional).
                            url: Image URL (optional).
                            model_name: Model to use for detection.

                        Returns:
                            Dictionary with processed image (base64), detected objects, and confidences.
                        """
                        try:
                            if image is None and not url:
                                return {"error": "Please provide an image or URL"}
                            if image and url:
                                return {"error": "Please provide either an image or URL, not both"}

                            if url:
                                response = requests.get(url, timeout=10)
                                response.raise_for_status()
                                image = Image.open(BytesIO(response.content)).convert("RGB")

                            detected_image, objects, scores, unique_objects, unique_scores, _ = process(
                                image, model_name
                            )
                            buffered = BytesIO()
                            detected_image.save(buffered, format="PNG")
                            img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
                            return {
                                "image_url": f"data:image/png;base64,{img_base64}",
                                "detected_objects": objects,
                                "confidence_scores": scores,
                                "unique_objects": unique_objects,
                                "unique_confidence_scores": unique_scores,
                            }
                        except requests.RequestException as e:
                            error_msg = f"Error fetching image from URL: {str(e)}"
                            logger.error(f"{error_msg}\n{traceback.format_exc()}")
                            return {"error": error_msg}
                        except Exception as e:
                            error_msg = f"Error processing image: {str(e)}"
                            logger.error(f"{error_msg}\n{traceback.format_exc()}")
                            return {"error": error_msg}

                    url_submit_btn.click(
                        fn=process_url_for_gradio,
                        inputs=[image_input_json, image_url_input_json, url_model_choice],
                        outputs=[url_output],
                    )

                with gr.Tab("ℹ️ Help"):
                    gr.Markdown(
                        """
                        ## How to Use
                        - **Image Upload**: Select a model, upload an image or provide a URL, and click "Detect" to see detected objects and image properties.
                        - **JSON Output**: Upload an image or enter a URL, select a model, and click "Process" to get results in JSON format.
                        - **Models**: Choose from DETR (object detection or panoptic segmentation) or YOLOS (lightweight detection).
                        - **Clear**: Reset all inputs and outputs using the "Clear" button in the Image Upload tab.
                        - **Errors**: Check the error box (Image Upload) or JSON response (JSON Output) for issues.

                        ## Tips
                        - Use high-quality images for better detection results.
                        - Panoptic models (e.g., DETR-ResNet-50-panoptic) provide segmentation masks for complex scenes.
                        - For faster processing, try YOLOS-Tiny on resource-constrained devices.
                        """
                    )

        return app

    except Exception as e:
        logger.error(f"Error creating Gradio UI: {str(e)}\n{traceback.format_exc()}")
        raise RuntimeError(f"Failed to create Gradio UI: {str(e)}")

# ------------------------------
# Launcher
# ------------------------------
def parse_args() -> argparse.Namespace:
    """
    Parse command-line arguments with defaults and ignore unrecognized arguments.

    Returns:
        Parsed arguments as a Namespace object.

    Raises:
        SystemExit: If argument parsing fails (handled by argparse).
    """
    parser = argparse.ArgumentParser(
        description="Launcher for Object Detection App with Gradio UI and optional FastAPI server."
    )
    parser.add_argument(
        "--gradio-port",
        type=int,
        default=DEFAULT_GRADIO_PORT,
        help=f"Port for the Gradio UI (default: {DEFAULT_GRADIO_PORT}).",
    )
    parser.add_argument(
        "--enable-fastapi",
        action="store_true",
        default=False,
        help="Enable the FastAPI server (disabled by default).",
    )
    parser.add_argument(
        "--fastapi-port",
        type=int,
        default=DEFAULT_FASTAPI_PORT,
        help=f"Port for the FastAPI server if enabled (default: {DEFAULT_FASTAPI_PORT}).",
    )

    # Parse known arguments and ignore unrecognized ones (e.g., Jupyter kernel args)
    args, _ = parser.parse_known_args()
    return args
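Because `parse_known_args` silently drops flags it does not recognize, the launcher also works inside notebook kernels that inject their own `argv` entries. A small illustration (the injected `-f` flag is hypothetical, mimicking what Jupyter passes):

```python
import sys

sys.argv = ["app.py", "--gradio-port", "7870", "-f", "/tmp/kernel.json"]
args = parse_args()
print(args.gradio_port)     # 7870
print(args.enable_fastapi)  # False; the unknown -f flag was ignored
```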
def find_available_port(start_port: int, port_range: range, max_attempts: int) -> Optional[int]:
    """
    Find an available port within the specified range.

    Args:
        start_port: Initial port to try (e.g., from args or environment).
        port_range: Range of ports to attempt.
        max_attempts: Maximum number of ports to try.

    Returns:
        Available port number, or None if no port is found.

    Raises:
        OSError: If port binding fails for reasons other than port in use.
    """
    import socket

    port = start_port
    attempts = 0

    # Check environment variable GRADIO_SERVER_PORT
    env_port = os.getenv("GRADIO_SERVER_PORT")
    if env_port and env_port.isdigit():
        port = int(env_port)
        logger.info(f"Using GRADIO_SERVER_PORT from environment: {port}")

    while attempts < max_attempts:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            try:
                s.bind(("0.0.0.0", port))
                logger.debug(f"Port {port} is available")
                return port
            except OSError as e:
                if e.errno == 98:  # Port in use
                    logger.debug(f"Port {port} is in use")
                    port = port + 1 if port < max(port_range) else min(port_range)
                    attempts += 1
                else:
                    raise
            except Exception as e:
                logger.error(f"Error checking port {port}: {str(e)}")
                raise
    logger.error(f"No available port found in range {min(port_range)}-{max(port_range)} after {max_attempts} attempts")
    return None
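A quick way to see the fallback in action is to occupy the first port and ask for it again. A sketch assuming a Linux host (the errno 98 check above is Linux-specific) with ports 7860-7869 otherwise free and `GRADIO_SERVER_PORT` unset:

```python
import socket

blocker = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
blocker.bind(("0.0.0.0", 7860))  # simulate a busy default port
try:
    port = find_available_port(7860, range(7860, 7870), 10)
    print(port)  # expected: 7861, the next free port in the range
finally:
    blocker.close()
```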
def run_fastapi_server(host: str, port: int) -> None:
    """
    Run the FastAPI server using Uvicorn.

    Args:
        host: Host address for the FastAPI server.
        port: Port for the FastAPI server.
    """
    try:
        uvicorn.run(app, host=host, port=port)
    except Exception as e:
        logger.error(f"Error running FastAPI server: {str(e)}\n{traceback.format_exc()}")
        sys.exit(1)

def main() -> None:
    """
    Main function to launch Gradio UI and optional FastAPI server.

    Raises:
        SystemExit: If the application is interrupted or encounters an error.
    """
    try:
        # Apply nest_asyncio to allow nested event loops in Jupyter/Colab
        nest_asyncio.apply()

        # Parse command-line arguments
        args = parse_args()
        logger.info(f"Parsed arguments: {args}")

        # Find available port for Gradio
        gradio_port = find_available_port(args.gradio_port, PORT_RANGE, MAX_PORT_ATTEMPTS)
        if gradio_port is None:
            logger.error("Failed to find an available port for Gradio UI")
            sys.exit(1)

        # Launch FastAPI server in a separate thread if enabled
        if args.enable_fastapi:
            logger.info(f"Starting FastAPI server on port {args.fastapi_port}")
            fastapi_thread = threading.Thread(
                target=run_fastapi_server,
                args=("0.0.0.0", args.fastapi_port),
                daemon=True
            )
            fastapi_thread.start()

        # Launch Gradio UI (bound to a name that does not shadow the FastAPI `app`)
        logger.info(f"Starting Gradio UI on port {gradio_port}")
        ui = create_gradio_ui()
        ui.launch(server_port=gradio_port, server_name="0.0.0.0")

    except KeyboardInterrupt:
        logger.info("Application terminated by user.")
        sys.exit(0)
    except OSError as e:
        logger.error(f"Port binding error: {str(e)}")
        sys.exit(1)
    except Exception as e:
        logger.error(f"Error running application: {str(e)}\n{traceback.format_exc()}")
        sys.exit(1)

if __name__ == "__main__":
    main()
hf_space/hf_space/README.md
CHANGED
# 🚀 Object Detection with Transformer Models

This project provides a robust object detection system leveraging state-of-the-art transformer models, including **DETR (DEtection TRansformer)** and **YOLOS (You Only Look One-level Series)**. The system supports object detection and panoptic segmentation from uploaded images or image URLs. It features a user-friendly **Gradio** web interface for interactive use and a **FastAPI** endpoint for programmatic access.

Try the online demo on Hugging Face Spaces: [Object Detection Demo](https://huggingface.co/spaces/NeerajCodz/ObjectDetection).

## Models Supported

The application supports the following models, each tailored for specific detection or segmentation tasks:

- **DETR (DEtection TRansformer)**:
  - `facebook/detr-resnet-50`: Fast and accurate object detection with a ResNet-50 backbone.
  - `facebook/detr-resnet-101`: Higher-accuracy object detection with a ResNet-101 backbone, slower than ResNet-50.
  - `facebook/detr-resnet-50-panoptic`: Panoptic segmentation with ResNet-50 (note: may have stability issues).
  - `facebook/detr-resnet-101-panoptic`: Panoptic segmentation with ResNet-101 (note: may have stability issues).

- **YOLOS (You Only Look One-level Series)**:
  - `hustvl/yolos-tiny`: Lightweight and fast, ideal for resource-constrained environments.
  - `hustvl/yolos-base`: Balances speed and accuracy for object detection.

## Features

- **Image Upload**: Upload images via the Gradio interface for object detection.
- **URL Input**: Provide image URLs for detection through the Gradio interface or API.
- **Model Selection**: Choose between DETR and YOLOS models for detection or panoptic segmentation.
- **Object Detection**: Highlights detected objects with bounding boxes and confidence scores.
- **Panoptic Segmentation**: Supports scene segmentation with colored masks (DETR panoptic models).
- **Image Properties**: Displays metadata like format, size, aspect ratio, file size, and color statistics.
- **API Access**: Programmatically process images via the FastAPI `/detect` endpoint.
- **Flexible Deployment**: Run locally, in Docker, or in cloud environments like Google Colab.

## How to Use

### 1. **Local Setup (Git Clone)**

Follow these steps to set up the application locally:

#### Prerequisites

- Python 3.8 or higher
- `pip` for installing dependencies
- Git for cloning the repository

#### Clone the Repository

```bash
git clone https://github.com/NeerajCodz/ObjectDetection
cd ObjectDetection
```

#### Install Dependencies

Install required packages from `requirements.txt`:

```bash
pip install -r requirements.txt
```

#### Run the Application

Launch the Gradio interface:

```bash
python app.py
```

To enable the FastAPI server:

```bash
python app.py --enable-fastapi
```

#### Access the Application

- **Gradio**: Open the URL displayed in the console (typically `http://127.0.0.1:7860`).
- **FastAPI**: Navigate to `http://localhost:8000` for the API or Swagger UI (if enabled).

### 2. **Running with Docker**

Use Docker for a containerized setup.

#### Prerequisites

- Docker installed on your machine. Download from [Docker's official site](https://www.docker.com/get-started).

#### Pull the Docker Image

Pull the pre-built image from Docker Hub:

```bash
docker pull neerajcodz/objectdetection:latest
```

#### Run the Docker Container

Run the application on port 8080:

```bash
docker run -d -p 8080:80 neerajcodz/objectdetection:latest
```

Access the interface at `http://localhost:8080`.

#### Build and Run the Docker Image

To build the Docker image locally:

1. Ensure you have a `Dockerfile` in the repository root (example provided in the repository).
2. Build the image:

```bash
docker build -t objectdetection:local .
```

3. Run the container:

```bash
docker run -d -p 8080:80 objectdetection:local
```

Access the interface at `http://localhost:8080`.

### 3. **Demo**

Try the demo on Hugging Face Spaces:

[Object Detection Demo](https://huggingface.co/spaces/NeerajCodz/ObjectDetection)

## Command-Line Arguments

The `app.py` script supports the following command-line arguments:

- `--gradio-port <port>`: Specify the port for the Gradio UI (default: 7860).
  - Example: `python app.py --gradio-port 7870`
- `--enable-fastapi`: Enable the FastAPI server (disabled by default).
  - Example: `python app.py --enable-fastapi`
- `--fastapi-port <port>`: Specify the port for the FastAPI server (default: 8000).
  - Example: `python app.py --enable-fastapi --fastapi-port 8001`

You can combine arguments:

```bash
python app.py --gradio-port 7870 --enable-fastapi --fastapi-port 8001
```

Alternatively, set the `GRADIO_SERVER_PORT` environment variable:

```bash
export GRADIO_SERVER_PORT=7870
python app.py
```

## Using the API

**Note**: The FastAPI API is currently unstable and may require additional configuration for production use.

The `/detect` endpoint allows programmatic image processing.

### Running the FastAPI Server

Enable FastAPI when launching the script:

```bash
python app.py --enable-fastapi
```

Or run FastAPI separately with Uvicorn:

```bash
uvicorn objectdetection:app --host 0.0.0.0 --port 8000
```

Access the Swagger UI at `http://localhost:8000/docs` for interactive testing.

### Endpoint Details

- **Endpoint**: `POST /detect`
- **Parameters**:
  - `file`: (optional) Image file (must be `image/*` type).
  - `image_url`: (optional) URL of the image.
  - `model_name`: (optional) Model name (e.g., `facebook/detr-resnet-50`, `hustvl/yolos-tiny`).
- **Content-Type**: `multipart/form-data`. The endpoint declares `image_url` and `model_name` as form fields, so send them as form data rather than a JSON body.

### Example Requests

#### Using `curl` with an Image URL

```bash
curl -X POST "http://localhost:8000/detect" \
  -F "image_url=https://example.com/image.jpg" \
  -F "model_name=facebook/detr-resnet-50"
```

#### Using `curl` with an Image File

```bash
curl -X POST "http://localhost:8000/detect" \
  -F "file=@/path/to/image.jpg" \
  -F "model_name=facebook/detr-resnet-50"
```

### Response Format

The response includes a base64-encoded image with detections and detection details:

```json
{
  "image_url": "data:image/png;base64,...",
  "detected_objects": ["person", "car"],
  "confidence_scores": [0.95, 0.87],
  "unique_objects": ["person", "car"],
  "unique_confidence_scores": [0.95, 0.87]
}
```

### Notes

- Ensure only one of `file` or `image_url` is provided.
- The API may experience instability with panoptic models; use object detection models for reliability.
- Test the API using the Swagger UI for easier debugging.
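The annotated image comes back as a base64 data URL, so a client has to strip the `data:image/png;base64,` prefix before decoding. A minimal Python sketch (the endpoint URL and image URL are placeholders):

```python
import base64
import requests

resp = requests.post(
    "http://localhost:8000/detect",
    data={
        "image_url": "https://example.com/image.jpg",
        "model_name": "facebook/detr-resnet-50",
    },
    timeout=60,
)
resp.raise_for_status()
result = resp.json()

# Split off the data-URL header and write the raw PNG bytes to disk.
png_bytes = base64.b64decode(result["image_url"].split(",", 1)[1])
with open("detections.png", "wb") as f:
    f.write(png_bytes)
print(result["unique_objects"], result["unique_confidence_scores"])
```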
## Development Setup

To contribute or modify the application:

1. Clone the repository:

```bash
git clone https://github.com/NeerajCodz/ObjectDetection
cd ObjectDetection
```

2. Install dependencies:

```bash
pip install -r requirements.txt
```

3. Run the application:

```bash
python app.py
```

Or run FastAPI:

```bash
uvicorn objectdetection:app --host 0.0.0.0 --port 8000
```

4. Access at `http://localhost:7860` (Gradio) or `http://localhost:8000` (FastAPI).

## Contributing

Contributions are welcome! To contribute:

1. Fork the repository.
2. Create a feature or bugfix branch (`git checkout -b feature/your-feature`).
3. Commit changes (`git commit -m "Add your feature"`).
4. Push to the branch (`git push origin feature/your-feature`).
5. Open a pull request on the [GitHub repository](https://github.com/NeerajCodz/ObjectDetection).

Please include tests and documentation for new features. Report issues via GitHub Issues.

## Troubleshooting

- **Port Conflicts**: If port 7860 is in use, specify a different port with `--gradio-port` or set `GRADIO_SERVER_PORT`.
- **Colab Issues**: Use the `--gradio-port` argument or environment variable to avoid port conflicts in Google Colab.
- **Panoptic Model Bugs**: Avoid `detr-resnet-*-panoptic` models until stability issues are resolved.
- **API Instability**: Test with smaller images and object detection models first.

For further assistance, open an issue on the [GitHub repository](https://github.com/NeerajCodz/ObjectDetection).
hf_space/hf_space/hf_space/README.md
CHANGED
# 🚀 Object Detection with Transformer Models

This project provides an object detection system using state-of-the-art transformer models, such as **DETR (DEtection TRansformer)** and **YOLOS (You Only Look One-level Series)**. The system can detect objects from uploaded images or image URLs, and it supports different models for detection and segmentation tasks. It includes a Gradio-based web interface and a FastAPI-based API for programmatic access.

You can try the demo online on Hugging Face: [Demo Link](https://huggingface.co/spaces/NeerajCodz/ObjectDetection).

## Models Supported

The following models are supported, as defined in the application:

- **DETR (DEtection TRansformer)**:
  - `facebook/detr-resnet-50`: DETR with ResNet-50 backbone for object detection. Fast and accurate for general use.
  - `facebook/detr-resnet-101`: DETR with ResNet-101 backbone for object detection. More accurate but slower than ResNet-50.
  - `facebook/detr-resnet-50-panoptic` (currently has bugs): DETR with ResNet-50 for panoptic segmentation. Detects objects and segments scenes.
  - `facebook/detr-resnet-101-panoptic` (currently has bugs): DETR with ResNet-101 for panoptic segmentation. High accuracy for complex scenes.

- **YOLOS (You Only Look One-level Series)**:
  - `hustvl/yolos-tiny`: YOLOS Tiny model. Lightweight and fast, ideal for resource-constrained environments.
  - `hustvl/yolos-base`: YOLOS Base model. Balances speed and accuracy for object detection.

## Features

- **Image Upload**: Upload images from your device for object detection via the Gradio interface.
- **URL Input**: Input an image URL for detection through the Gradio interface or API.
- **Model Selection**: Choose between DETR and YOLOS models for detection or panoptic segmentation.
- **Object Detection**: Detects objects and highlights them with bounding boxes and confidence scores.
- **Panoptic Segmentation**: Some models (e.g., DETR panoptic variants) support detailed scene segmentation with colored masks.
- **Image Properties**: Displays image metadata such as format, size, aspect ratio, file size, and color statistics.
- **API Access**: Use the FastAPI endpoint `/detect` to programmatically process images and retrieve detection results.

## How to Use

### 1. **Normal Git Clone Method**

Follow these steps to set up the application locally:

#### Prerequisites

- Python 3.8 or higher
- Install dependencies using `pip`

#### Clone the Repository

```bash
git clone https://github.com/NeerajCodz/ObjectDetection.git
cd ObjectDetection
```

#### Install Dependencies

Install the required dependencies from `requirements.txt`:

```bash
pip install -r requirements.txt
```

#### Run the Application

Start the FastAPI server using uvicorn:

```bash
uvicorn objectdetection:app --reload
```

Alternatively, launch the Gradio interface by running the main script:

```bash
python app.py
```

#### Access the Application

- For FastAPI: Open your browser and navigate to `http://localhost:8000` to use the API or view the Swagger UI.
- For Gradio: The Gradio interface URL will be displayed in the console (typically `http://127.0.0.1:7860`).

### 2. **Running with Docker**

If you prefer to use Docker to set up and run the application, follow these steps:

#### Prerequisites

- Docker installed on your machine. If you don't have Docker, download and install it from [here](https://www.docker.com/get-started).

#### Build the Docker Image

First, clone the repository (if you haven't already):

```bash
git clone https://github.com/NeerajCodz/ObjectDetection.git
cd ObjectDetection
```

Now, build the Docker image:

```bash
docker build -t objectdetection:latest .
```

#### Run the Docker Container

Once the image is built, run the application using this command:

```bash
docker run -p 5000:5000 objectdetection:latest
```

This will start the application on port 5000. Open your browser and go to `http://localhost:5000` to access the FastAPI interface.

### 3. **Demo**

You can try the demo directly online through Hugging Face's Spaces:

[Object Detection Demo](https://huggingface.co/spaces/NeerajCodz/ObjectDetection)

## Using the API

You can interact with the application via the FastAPI `/detect` endpoint to send images and get detection results.

**Endpoint**: `/detect`

**POST**: `/detect`

**Parameters**:

- `file`: (optional) Image file (must be of type `image/*`).
- `image_url`: (optional) URL of the image.
- `model_name`: (optional) Choose from `facebook/detr-resnet-50`, `hustvl/yolos-tiny`, etc.

**Example Request Body**:

```json
{
  "image_url": "https://example.com/image.jpg",
  "model_name": "facebook/detr-resnet-50"
}
```

**Response**:

The response includes a base64-encoded image with detections, detected objects, confidence scores, and unique objects with their scores.

```json
{
  "image_url": "data:image/png;base64,...",
  "detected_objects": ["person", "car"],
  "confidence_scores": [0.95, 0.87],
  "unique_objects": ["person", "car"],
  "unique_confidence_scores": [0.95, 0.87]
}
```

## Development Setup

If you'd like to contribute or modify the application:

1. Clone the repository:

```bash
git clone https://github.com/NeerajCodz/ObjectDetection.git
cd ObjectDetection
```

2. Install dependencies:

```bash
pip install -r requirements.txt
```

3. Run the FastAPI server or Gradio interface:

```bash
uvicorn objectdetection:app --reload
```

or

```bash
python app.py
```

4. Open your browser and navigate to `http://localhost:8000` (FastAPI) or the Gradio URL (typically `http://127.0.0.1:7860`).

## Contributing

Contributions are welcome! Feel free to open issues or submit pull requests for bug fixes or new features on the [GitHub repository](https://github.com/NeerajCodz/ObjectDetection).
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.huggingface.yaml
ADDED
@@ -0,0 +1,7 @@
sdk: gradio
python_version: 3.10
app_file: app.py
title: Object Detection App
subtitle: Real-time object detection in images using Gradio
hardware: cpu-basic
license: mit

hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.github/workflows/docker-build-push.yml
ADDED
@@ -0,0 +1,26 @@
name: Build and Push Docker Image to Docker Hub

on:
  push:
    branches:
      - main

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Log in to Docker Hub
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PAT }}

      - name: Build and push Docker image
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ${{ secrets.DOCKER_USERNAME }}/objectdetection:latest

hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.github/workflows/hf-space-sync.yml
ADDED
@@ -0,0 +1,36 @@
name: Sync to Hugging Face Space

on:
  push:
    branches: [ main ]

jobs:
  deploy-to-hf-space:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout Repository
        uses: actions/checkout@v3

      - name: Install Git
        run: sudo apt-get install git

      - name: Push to Hugging Face Space
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
          HF_USERNAME: ${{ secrets.HF_USERNAME }}
          EMAIL: ${{ secrets.EMAIL }}
        run: |
          git config --global user.email $EMAIL
          git config --global user.name $HF_USERNAME

          git clone https://$HF_USERNAME:[email protected]/spaces/$HF_USERNAME/ObjectDetection hf_space
          # Exclude the freshly cloned hf_space checkout so the Space is not
          # recursively copied into itself on every sync
          rsync -av --exclude='.git' --exclude='hf_space' ./ hf_space/
          cd hf_space
          git add .
          if git diff --cached --quiet; then
            echo "✅ No changes to commit."
          else
            git commit -m "Sync from GitHub"
            git push
          fi

hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.gitignore
ADDED
@@ -0,0 +1,5 @@
__pycache__/
venv/
*.pyc
.DS_Store
.env

hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/Dockerfile
ADDED
@@ -0,0 +1,13 @@
FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .

RUN pip install --no-cache-dir -r requirements.txt

COPY app.py .

EXPOSE 5000

CMD ["python", "app.py"]

hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 Neeraj Sathish Kumar

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/app.py
ADDED
@@ -0,0 +1,384 @@
import gradio as gr
import torch
from transformers import DetrImageProcessor, DetrForObjectDetection
from transformers import YolosImageProcessor, YolosForObjectDetection
from transformers import DetrForSegmentation
from PIL import Image, ImageDraw, ImageStat
import requests
from io import BytesIO
import base64
from collections import Counter
import logging
from fastapi import FastAPI, File, UploadFile, HTTPException, Form
from fastapi.responses import JSONResponse
import uvicorn
import pandas as pd
import traceback
import os

# Set up logging
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
logger = logging.getLogger(__name__)

# Constants
CONFIDENCE_THRESHOLD = 0.5
VALID_MODELS = [
    "facebook/detr-resnet-50",
    "facebook/detr-resnet-101",
    "facebook/detr-resnet-50-panoptic",
    "facebook/detr-resnet-101-panoptic",
    "hustvl/yolos-tiny",
    "hustvl/yolos-base"
]
MODEL_DESCRIPTIONS = {
    "facebook/detr-resnet-50": "DETR with ResNet-50 backbone for object detection. Fast and accurate for general use.",
    "facebook/detr-resnet-101": "DETR with ResNet-101 backbone for object detection. More accurate but slower than ResNet-50.",
    "facebook/detr-resnet-50-panoptic": "DETR with ResNet-50 for panoptic segmentation. Detects objects and segments scenes.",
    "facebook/detr-resnet-101-panoptic": "DETR with ResNet-101 for panoptic segmentation. High accuracy for complex scenes.",
    "hustvl/yolos-tiny": "YOLOS Tiny model. Lightweight and fast, ideal for resource-constrained environments.",
    "hustvl/yolos-base": "YOLOS Base model. Balances speed and accuracy for object detection."
}

# Lazy model loading: models and processors are cached here on first use
models = {}
processors = {}
def process(image, model_name):
    """Process an image and return detected image, objects, confidences, unique objects, unique confidences, and properties."""
    try:
        if model_name not in VALID_MODELS:
            raise ValueError(f"Invalid model: {model_name}. Choose from: {VALID_MODELS}")

        # Load model and processor on first use, then cache them
        if model_name not in models:
            logger.info(f"Loading model: {model_name}")
            if "yolos" in model_name:
                models[model_name] = YolosForObjectDetection.from_pretrained(model_name)
                processors[model_name] = YolosImageProcessor.from_pretrained(model_name)
            elif "panoptic" in model_name:
                models[model_name] = DetrForSegmentation.from_pretrained(model_name)
                processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
            else:
                models[model_name] = DetrForObjectDetection.from_pretrained(model_name)
                processors[model_name] = DetrImageProcessor.from_pretrained(model_name)

        model, processor = models[model_name], processors[model_name]
        inputs = processor(images=image, return_tensors="pt")

        with torch.no_grad():
            outputs = model(**inputs)

        target_sizes = torch.tensor([image.size[::-1]])
        draw = ImageDraw.Draw(image)
        object_names = []
        confidence_scores = []
        object_counter = Counter()

        if "panoptic" in model_name:
            processed_sizes = torch.tensor([[inputs["pixel_values"].shape[2], inputs["pixel_values"].shape[3]]])
            results = processor.post_process_panoptic(outputs, target_sizes=target_sizes, processed_sizes=processed_sizes)[0]

            for segment in results["segments_info"]:
                label = segment["label_id"]
                label_name = model.config.id2label.get(label, "Unknown")
                score = segment.get("score", 1.0)

                if "masks" in results and segment["id"] < len(results["masks"]):
                    mask = results["masks"][segment["id"]].cpu().numpy()
                    if mask.shape[0] > 0 and mask.shape[1] > 0:
                        mask_image = Image.fromarray((mask * 255).astype("uint8"))
                        colored_mask = Image.new("RGBA", image.size, (0, 0, 0, 0))
                        mask_draw = ImageDraw.Draw(colored_mask)
                        r, g, b = (segment["id"] * 50) % 255, (segment["id"] * 100) % 255, (segment["id"] * 150) % 255
                        mask_draw.bitmap((0, 0), mask_image, fill=(r, g, b, 128))
                        image = Image.alpha_composite(image.convert("RGBA"), colored_mask).convert("RGB")
                        draw = ImageDraw.Draw(image)

                if score > CONFIDENCE_THRESHOLD:
                    object_names.append(label_name)
                    confidence_scores.append(float(score))
                    object_counter[label_name] = float(score)
        else:
            results = processor.post_process_object_detection(outputs, target_sizes=target_sizes)[0]

            for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
                if score > CONFIDENCE_THRESHOLD:
                    x, y, x2, y2 = box.tolist()
                    draw.rectangle([x, y, x2, y2], outline="#32CD32", width=2)
                    label_name = model.config.id2label.get(label.item(), "Unknown")
                    # Place text at top-right corner, outside the box, with smaller size
                    text = f"{label_name}: {score:.2f}"
                    text_bbox = draw.textbbox((0, 0), text)
                    text_width, text_height = text_bbox[2] - text_bbox[0], text_bbox[3] - text_bbox[1]
                    draw.text((x2 - text_width - 2, y - text_height - 2), text, fill="#32CD32")
                    object_names.append(label_name)
                    confidence_scores.append(float(score))
                    object_counter[label_name] = float(score)

        unique_objects = list(object_counter.keys())
        unique_confidences = [object_counter[obj] for obj in unique_objects]

        # Image properties
        file_size = "Unknown"
        if hasattr(image, "fp") and image.fp is not None:
            buffered = BytesIO()
            image.save(buffered, format="PNG")
            file_size = f"{len(buffered.getvalue()) / 1024:.2f} KB"

        # Color statistics
        try:
            stat = ImageStat.Stat(image)
            color_stats = {
                "mean": [f"{m:.2f}" for m in stat.mean],
                "stddev": [f"{s:.2f}" for s in stat.stddev]
            }
        except Exception as e:
            logger.error(f"Error calculating color statistics: {str(e)}")
            color_stats = {"mean": "Error", "stddev": "Error"}

        properties = {
            "Format": image.format if hasattr(image, "format") and image.format else "Unknown",
            "Size": f"{image.width}x{image.height}",
            "Width": f"{image.width} px",
            "Height": f"{image.height} px",
            "Mode": image.mode,
            "Aspect Ratio": f"{round(image.width / image.height, 2) if image.height != 0 else 'Undefined'}",
            "File Size": file_size,
            "Mean (R,G,B)": ", ".join(color_stats["mean"]) if isinstance(color_stats["mean"], list) else color_stats["mean"],
            "StdDev (R,G,B)": ", ".join(color_stats["stddev"]) if isinstance(color_stats["stddev"], list) else color_stats["stddev"]
        }

        return image, object_names, confidence_scores, unique_objects, unique_confidences, properties
    except Exception as e:
        logger.error(f"Error in process: {str(e)}\n{traceback.format_exc()}")
        raise
# FastAPI Setup
app = FastAPI(title="Object Detection API")

@app.post("/detect")
async def detect_objects_endpoint(
    file: UploadFile = File(None),
    image_url: str = Form(None),
    model_name: str = Form(VALID_MODELS[0])
):
    """FastAPI endpoint to detect objects in an image from file or URL."""
    try:
        if (file is None and not image_url) or (file is not None and image_url):
            raise HTTPException(status_code=400, detail="Provide either an image file or an image URL, but not both.")

        if file:
            if not file.content_type.startswith("image/"):
                raise HTTPException(status_code=400, detail="File must be an image")
            contents = await file.read()
            image = Image.open(BytesIO(contents)).convert("RGB")
        else:
            response = requests.get(image_url, timeout=10)
            response.raise_for_status()
            image = Image.open(BytesIO(response.content)).convert("RGB")

        if model_name not in VALID_MODELS:
            raise HTTPException(status_code=400, detail=f"Invalid model. Choose from: {VALID_MODELS}")

        detected_image, detected_objects, detected_confidences, unique_objects, unique_confidences, _ = process(image, model_name)

        buffered = BytesIO()
        detected_image.save(buffered, format="PNG")
        img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
        img_url = f"data:image/png;base64,{img_base64}"

        return JSONResponse(content={
            "image_url": img_url,
            "detected_objects": detected_objects,
            "confidence_scores": detected_confidences,
            "unique_objects": unique_objects,
            "unique_confidence_scores": unique_confidences
        })
    except HTTPException:
        # Re-raise intentional HTTP errors (400s) instead of rewrapping them as 500s
        raise
    except Exception as e:
        logger.error(f"Error in FastAPI endpoint: {str(e)}\n{traceback.format_exc()}")
        raise HTTPException(status_code=500, detail=f"Error processing image: {str(e)}")
# Gradio UI
def create_gradio_ui():
    with gr.Blocks(theme=gr.themes.Default(primary_hue="blue", secondary_hue="gray")) as demo:
        gr.Markdown(
            """
            # 🚀 Object Detection App
            Upload an image or provide a URL to detect objects using state-of-the-art transformer models (DETR, YOLOS).
            """
        )

        with gr.Tabs():
            with gr.Tab("📷 Image Upload"):
                with gr.Row():
                    with gr.Column(scale=1):
                        gr.Markdown("### Input")
                        model_choice = gr.Dropdown(
                            choices=VALID_MODELS,
                            value=VALID_MODELS[0],
                            label="🔎 Select Model",
                            info="Choose a model for object detection or panoptic segmentation."
                        )
                        model_info = gr.Markdown(
                            f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
                            visible=True
                        )
                        image_input = gr.Image(type="pil", label="📷 Upload Image")
                        image_url_input = gr.Textbox(
                            label="🔗 Image URL",
                            placeholder="https://example.com/image.jpg"
                        )
                        with gr.Row():
                            submit_btn = gr.Button("✨ Detect", variant="primary")
                            clear_btn = gr.Button("🗑️ Clear", variant="secondary")

                        model_choice.change(
                            fn=lambda model_name: f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}",
                            inputs=model_choice,
                            outputs=model_info
                        )

                    with gr.Column(scale=2):
                        gr.Markdown("### Results")
                        error_output = gr.Textbox(
                            label="⚠️ Errors",
                            visible=False,
                            lines=3,
                            max_lines=5
                        )
                        output_image = gr.Image(
                            type="pil",
                            label="🎯 Detected Image",
                            interactive=False
                        )
                        with gr.Row():
                            objects_output = gr.DataFrame(
                                label="📋 Detected Objects",
                                interactive=False,
                                value=None
                            )
                            unique_objects_output = gr.DataFrame(
                                label="🔍 Unique Objects",
                                interactive=False,
                                value=None
                            )
                        properties_output = gr.DataFrame(
                            label="📄 Image Properties",
                            interactive=False,
                            value=None
                        )

                def process_for_gradio(image, url, model_name):
                    try:
                        if image is None and not url:
                            return None, None, None, None, "Please provide an image or URL"
                        if image and url:
                            return None, None, None, None, "Please provide either an image or URL, not both"

                        if url:
                            response = requests.get(url, timeout=10)
                            response.raise_for_status()
                            image = Image.open(BytesIO(response.content)).convert("RGB")

                        detected_image, objects, scores, unique_objects, unique_scores, properties = process(image, model_name)
                        objects_df = pd.DataFrame({
                            "Object": objects,
                            "Confidence Score": [f"{score:.2f}" for score in scores]
                        }) if objects else pd.DataFrame(columns=["Object", "Confidence Score"])
                        unique_objects_df = pd.DataFrame({
                            "Unique Object": unique_objects,
                            "Confidence Score": [f"{score:.2f}" for score in unique_scores]
                        }) if unique_objects else pd.DataFrame(columns=["Unique Object", "Confidence Score"])
                        properties_df = pd.DataFrame([properties]) if properties else pd.DataFrame(columns=properties.keys())
                        return detected_image, objects_df, unique_objects_df, properties_df, ""
                    except Exception as e:
                        error_msg = f"Error processing image: {str(e)}"
                        logger.error(f"{error_msg}\n{traceback.format_exc()}")
                        return None, None, None, None, error_msg

                submit_btn.click(
                    fn=process_for_gradio,
                    inputs=[image_input, image_url_input, model_choice],
                    outputs=[output_image, objects_output, unique_objects_output, properties_output, error_output]
                )

                clear_btn.click(
                    # Return one value per output component (seven outputs, so seven values)
                    fn=lambda: [None, "", None, None, None, None, ""],
                    inputs=None,
                    outputs=[image_input, image_url_input, output_image, objects_output, unique_objects_output, properties_output, error_output]
                )
            with gr.Tab("🔗 URL Input"):
                gr.Markdown("### Process Image from URL")
                # Distinct name so this textbox does not shadow the upload tab's URL field
                url_image_input = gr.Textbox(
                    label="🔗 Image URL",
                    placeholder="https://example.com/image.jpg"
                )
                url_model_choice = gr.Dropdown(
                    choices=VALID_MODELS,
                    value=VALID_MODELS[0],
                    label="🔎 Select Model"
                )
                url_model_info = gr.Markdown(
                    f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
                    visible=True
                )
                url_submit_btn = gr.Button("🔄 Process URL", variant="primary")
                url_output = gr.JSON(label="API Response")

                url_model_choice.change(
                    fn=lambda model_name: f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}",
                    inputs=url_model_choice,
                    outputs=url_model_info
                )

                def process_url_for_gradio(url, model_name):
                    try:
                        response = requests.get(url, timeout=10)
                        response.raise_for_status()
                        image = Image.open(BytesIO(response.content)).convert("RGB")
                        detected_image, objects, scores, unique_objects, unique_scores, _ = process(image, model_name)
                        buffered = BytesIO()
                        detected_image.save(buffered, format="PNG")
                        img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
                        return {
                            "image_url": f"data:image/png;base64,{img_base64}",
                            "detected_objects": objects,
                            "confidence_scores": scores,
                            "unique_objects": unique_objects,
                            "unique_confidence_scores": unique_scores
                        }
                    except Exception as e:
                        error_msg = f"Error processing URL: {str(e)}"
                        logger.error(f"{error_msg}\n{traceback.format_exc()}")
                        return {"error": error_msg}

                url_submit_btn.click(
                    fn=process_url_for_gradio,
                    inputs=[url_image_input, url_model_choice],
                    outputs=[url_output]
                )

            with gr.Tab("ℹ️ Help"):
                gr.Markdown(
                    """
                    ## How to Use
                    - **Image Upload**: Select a model, upload an image or provide a URL, and click "Detect" to see detected objects and image properties.
                    - **URL Input**: Enter an image URL, select a model, and click "Process URL" to get results in JSON format.
                    - **Models**: Choose from DETR (object detection or panoptic segmentation) or YOLOS (lightweight detection).
                    - **Clear**: Reset all inputs and outputs using the "Clear" button.
                    - **Errors**: Check the error box for any processing issues.

                    ## Tips
                    - Use high-quality images for better detection results.
                    - Panoptic models (e.g., DETR-ResNet-50-panoptic) provide segmentation masks for complex scenes.
                    - For faster processing, try YOLOS-Tiny on resource-constrained devices.
                    """
                )

    return demo

if __name__ == "__main__":
    demo = create_gradio_ui()
    demo.launch()
    # To run the FastAPI server instead, use: uvicorn app:app --host 0.0.0.0 --port 8000
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.gitattributes
ADDED
@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text

hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/README.md
ADDED
@@ -0,0 +1,12 @@
---
title: ObjectDetection
emoji: 🦀
colorFrom: green
colorTo: yellow
sdk: gradio
sdk_version: 5.29.0
app_file: app.py
pinned: false
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/requirements.txt
ADDED
@@ -0,0 +1,8 @@
transformers
torch
tensorflow
gradio
pillow
timm
fastapi
requests

requirements.txt
CHANGED
@@ -5,4 +5,7 @@ gradio
 pillow
 timm
 fastapi
-requests
+requests
+uvicorn
+pandas
+nest_asyncio