NeerajCodz committed
Commit d3a3e0d · 1 Parent(s): e581bf6

Sync from GitHub

README.md CHANGED
@@ -44,7 +44,7 @@ Follow these steps to set up the application locally:
44
  #### Clone the Repository
45
 
46
  ```bash
47
- git clone https://github.com/NeerajCodz/ObjectDetection
48
  cd ObjectDetection
49
  ```
50
 
@@ -136,11 +136,13 @@ The `app.py` script supports the following command-line arguments:
136
  - Example: `python app.py --enable-fastapi`
137
  - `--fastapi-port <port>`: Specify the port for the FastAPI server (default: 8000).
138
  - Example: `python app.py --enable-fastapi --fastapi-port 8001`
 
 
139
 
140
  You can combine arguments:
141
 
142
  ```bash
143
- python app.py --gradio-port 7870 --enable-fastapi --fastapi-port 8001
144
  ```
145
 
146
  Alternatively, set the `GRADIO_SERVER_PORT` environment variable:
@@ -186,16 +188,16 @@ Access the Swagger UI at `http://localhost:8000/docs` for interactive testing.
186
  #### Using `curl` with an Image URL
187
 
188
  ```bash
189
- curl -X POST "http://localhost:8000/detect" \
190
- -H "Content-Type: application/json" \
191
  -d '{"image_url": "https://example.com/image.jpg", "model_name": "facebook/detr-resnet-50"}'
192
  ```
193
 
194
  #### Using `curl` with an Image File
195
 
196
  ```bash
197
- curl -X POST "http://localhost:8000/detect" \
198
- -F "file=@/path/to/image.jpg" \
199
  -F "model_name=facebook/detr-resnet-50"
200
  ```
201
 
@@ -226,7 +228,7 @@ To contribute or modify the application:
226
  1. Clone the repository:
227
 
228
  ```bash
229
- git clone https://github.com/NeerajCodz/ObjectDetection
230
  cd ObjectDetection
231
  ```
232
 
@@ -265,8 +267,10 @@ Please include tests and documentation for new features. Report issues via GitHub.
265
  ## Troubleshooting
266
 
267
  - **Port Conflicts**: If port 7860 is in use, specify a different port with `--gradio-port` or set `GRADIO_SERVER_PORT`.
268
- - **Colab Issues**: Use the `--gradio-port` argument or environment variable to avoid port conflicts in Google Colab.
 
269
  - **Panoptic Model Bugs**: Avoid `detr-resnet-*-panoptic` models until stability issues are resolved.
270
  - **API Instability**: Test with smaller images and object detection models first.
 
271
 
272
  For further assistance, open an issue on the [GitHub repository](https://github.com/NeerajCodz/ObjectDetection).
 
44
  #### Clone the Repository
45
 
46
  ```bash
47
+ git clone https://github.com/NeerajCodz/ObjectDetection.git
48
  cd ObjectDetection
49
  ```
50
 
 
136
  - Example: `python app.py --enable-fastapi`
137
  - `--fastapi-port <port>`: Specify the port for the FastAPI server (default: 8000).
138
  - Example: `python app.py --enable-fastapi --fastapi-port 8001`
139
+ - `--confidence-threshold <float>`: Set the confidence threshold for detections (range: 0 to 1; default: 0.5).
140
+ - Example: `python app.py --confidence-threshold 0.75`
141
 
142
  You can combine arguments:
143
 
144
  ```bash
145
+ python app.py --gradio-port 7870 --enable-fastapi --fastapi-port 8001 --confidence-threshold 0.75
146
  ```
147
 
148
  Alternatively, set the `GRADIO_SERVER_PORT` environment variable:
 
188
  #### Using `curl` with an Image URL
189
 
190
  ```bash
191
+ curl -X POST "http://localhost:8000/detect" \
192
+ -F "image_url=https://example.com/image.jpg" \
193
+ -F "model_name=facebook/detr-resnet-50"
194
  ```
195
 
196
  #### Using `curl` with an Image File
197
 
198
  ```bash
199
+ curl -X POST "http://localhost:8000/detect" \
200
+ -F "file=@/path/to/image.jpg" \
201
  -F "model_name=facebook/detr-resnet-50"
202
  ```
203
 
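For reference, here is a minimal Python client sketch (illustrative, not part of the repository; it assumes the `requests` package and uses the optional `confidence_threshold` form field added in this commit):

```python
# Hypothetical client for the /detect endpoint shown above.
import requests

with open("image.jpg", "rb") as f:
    resp = requests.post(
        "http://localhost:8000/detect",
        files={"file": f},
        data={"model_name": "facebook/detr-resnet-50", "confidence_threshold": "0.75"},
        timeout=60,
    )
resp.raise_for_status()
print(resp.json()["unique_objects"])  # distinct labels above the threshold
```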
 
228
  1. Clone the repository:
229
 
230
  ```bash
231
+ git clone https://github.com/NeerajCodz/ObjectDetection.git
232
  cd ObjectDetection
233
  ```
234
 
 
267
  ## Troubleshooting
268
 
269
  - **Port Conflicts**: If port 7860 is in use, specify a different port with `--gradio-port` or set `GRADIO_SERVER_PORT`.
270
+ - Example: `python app.py --gradio-port 7870`
271
+ - **Colab Asyncio Error**: If you encounter `RuntimeError: asyncio.run() cannot be called from a running event loop` in Colab, the application now uses `nest_asyncio` to handle this. Ensure `nest_asyncio` is installed (`pip install nest_asyncio`); see the sketch after this list.
272
  - **Panoptic Model Bugs**: Avoid `detr-resnet-*-panoptic` models until stability issues are resolved.
273
  - **API Instability**: Test with smaller images and object detection models first.
274
+ - **FastAPI Not Starting**: Ensure `--enable-fastapi` is used, and check that the specified `--fastapi-port` (default: 8000) is available.
275
 
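As a minimal sketch of the Colab workaround described above (assuming `nest_asyncio` is installed):

```python
# Patch the already-running Colab event loop so nested asyncio.run() calls work.
import nest_asyncio

nest_asyncio.apply()

# ...then launch the app as usual, e.g. by running app.py
```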
276
  For further assistance, open an issue on the [GitHub repository](https://github.com/NeerajCodz/ObjectDetection).
app.py CHANGED
@@ -3,11 +3,10 @@ import base64
3
  import logging
4
  import os
5
  import sys
6
  import traceback
7
  import threading
8
  from collections import Counter
9
  from io import BytesIO
10
- from typing import Dict, List, Optional, Tuple
11
 
12
  import gradio as gr
13
  import pandas as pd
@@ -30,15 +29,12 @@ import nest_asyncio
30
  # Configuration
31
  # ------------------------------
32
 
33
- # Logging configuration
34
- logging.basicConfig(
35
- level=logging.INFO,
36
- format="%(asctime)s - %(levelname)s - %(message)s",
37
- )
38
  logger = logging.getLogger(__name__)
39
 
40
- # Model and processing constants
41
- CONFIDENCE_THRESHOLD: float = 0.5
42
  VALID_MODELS: List[str] = [
43
  "facebook/detr-resnet-50",
44
  "facebook/detr-resnet-101",
@@ -48,128 +44,109 @@ VALID_MODELS: List[str] = [
48
  "hustvl/yolos-base",
49
  ]
50
  MODEL_DESCRIPTIONS: Dict[str, str] = {
51
- "facebook/detr-resnet-50": (
52
- "DETR with ResNet-50 backbone for object detection. Fast and accurate for general use."
53
- ),
54
- "facebook/detr-resnet-101": (
55
- "DETR with ResNet-101 backbone for object detection. More accurate but slower than ResNet-50."
56
- ),
57
- "facebook/detr-resnet-50-panoptic": (
58
- "DETR with ResNet-50 for panoptic segmentation. Detects objects and segments scenes."
59
- ),
60
- "facebook/detr-resnet-101-panoptic": (
61
- "DETR with ResNet-101 for panoptic segmentation. High accuracy for complex scenes."
62
- ),
63
- "hustvl/yolos-tiny": (
64
- "YOLOS Tiny model. Lightweight and fast, ideal for resource-constrained environments."
65
- ),
66
- "hustvl/yolos-base": (
67
- "YOLOS Base model. Balances speed and accuracy for object detection."
68
- ),
69
  }
70
-
71
- # Port configuration
72
- DEFAULT_GRADIO_PORT: int = 7860
73
- DEFAULT_FASTAPI_PORT: int = 8000
74
- PORT_RANGE: range = range(7860, 7870) # Try ports 7860-7869
75
- MAX_PORT_ATTEMPTS: int = 10
76
 
77
  # Thread-safe storage for lazy-loaded models and processors
78
  models: Dict[str, any] = {}
79
  processors: Dict[str, any] = {}
80
  model_lock = threading.Lock()
81
 
82
- # ------------------------------
83
- # Model Loading
84
- # ------------------------------
85
-
86
- def load_model_and_processor(model_name: str) -> Tuple[any, any]:
87
- """
88
- Load and cache the specified model and processor thread-safely.
89
-
90
- Args:
91
- model_name: Name of the model to load (must be in VALID_MODELS).
92
-
93
- Returns:
94
- Tuple containing the loaded model and processor.
95
-
96
- Raises:
97
- ValueError: If the model_name is invalid or loading fails.
98
- """
99
- with model_lock:
100
- if model_name not in models:
101
- logger.info(f"Loading model: {model_name}")
102
- try:
103
- if "yolos" in model_name:
104
- models[model_name] = YolosForObjectDetection.from_pretrained(model_name)
105
- processors[model_name] = YolosImageProcessor.from_pretrained(model_name)
106
- elif "panoptic" in model_name:
107
- models[model_name] = DetrForSegmentation.from_pretrained(model_name)
108
- processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
109
- else:
110
- models[model_name] = DetrForObjectDetection.from_pretrained(model_name)
111
- processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
112
- logger.debug(f"Model {model_name} loaded successfully")
113
- except Exception as e:
114
- logger.error(f"Failed to load model {model_name}: {str(e)}")
115
- raise ValueError(f"Failed to load model: {str(e)}")
116
- return models[model_name], processors[model_name]
117
-
118
  # ------------------------------
119
  # Image Processing
120
  # ------------------------------
121
 
122
- def process(image: Image.Image, model_name: str) -> Tuple[Image.Image, List[str], List[float], List[str], List[float], Dict[str, str]]:
123
  """
124
- Process an image for object detection or panoptic segmentation.
125
 
126
  Args:
127
- image: PIL Image to process.
 
128
  model_name: Name of the model to use (must be in VALID_MODELS).
 
 
129
 
130
  Returns:
131
- Tuple containing:
132
- - Annotated image (PIL Image).
133
- - List of detected object names.
134
- - List of confidence scores for detected objects.
135
- - List of unique object names.
136
- - List of confidence scores for unique objects.
137
- - Dictionary of image properties (format, size, etc.).
138
-
139
- Raises:
140
- ValueError: If the model_name is invalid.
141
- RuntimeError: If processing fails due to model or image issues.
142
  """
143
- if model_name not in VALID_MODELS:
144
- raise ValueError(f"Invalid model: {model_name}. Choose from: {VALID_MODELS}")
145
-
146
  try:
147
- # Load model and processor
148
- model, processor = load_model_and_processor(model_name)
149
- logger.debug(f"Processing image with model: {model_name}")
150
 
151
- # Prepare image for processing
152
  inputs = processor(images=image, return_tensors="pt")
153
  with torch.no_grad():
154
  outputs = model(**inputs)
155
 
156
- # Initialize drawing context
157
  draw = ImageDraw.Draw(image)
158
  object_names: List[str] = []
159
  confidence_scores: List[float] = []
160
  object_counter = Counter()
161
  target_sizes = torch.tensor([image.size[::-1]])
162
 
163
- # Process panoptic segmentation or object detection
164
  if "panoptic" in model_name:
 
165
  processed_sizes = torch.tensor([[inputs["pixel_values"].shape[2], inputs["pixel_values"].shape[3]]])
166
  results = processor.post_process_panoptic(outputs, target_sizes=target_sizes, processed_sizes=processed_sizes)[0]
167
-
168
  for segment in results["segments_info"]:
169
  label = segment["label_id"]
170
  label_name = model.config.id2label.get(label, "Unknown")
171
  score = segment.get("score", 1.0)
172
-
173
  # Apply segmentation mask if available
174
  if "masks" in results and segment["id"] < len(results["masks"]):
175
  mask = results["masks"][segment["id"]].cpu().numpy()
@@ -181,67 +158,92 @@ def process(image: Image.Image, model_name: str) -> Tuple[Image.Image, List[str]
181
  mask_draw.bitmap((0, 0), mask_image, fill=(r, g, b, 128))
182
  image = Image.alpha_composite(image.convert("RGBA"), colored_mask).convert("RGB")
183
  draw = ImageDraw.Draw(image)
184
-
185
- if score > CONFIDENCE_THRESHOLD:
186
  object_names.append(label_name)
187
  confidence_scores.append(float(score))
188
  object_counter[label_name] = float(score)
189
  else:
 
190
  results = processor.post_process_object_detection(outputs, target_sizes=target_sizes)[0]
191
-
192
  for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
193
- if score > CONFIDENCE_THRESHOLD:
194
  x, y, x2, y2 = box.tolist()
195
- draw.rectangle([x, y, x2, y2], outline="#32CD32", width=2)
196
  label_name = model.config.id2label.get(label.item(), "Unknown")
197
  text = f"{label_name}: {score:.2f}"
198
  text_bbox = draw.textbbox((0, 0), text)
199
  text_width, text_height = text_bbox[2] - text_bbox[0], text_bbox[3] - text_bbox[1]
200
- draw.text((x2 - text_width - 2, y - text_height - 2), text, fill="#32CD32")
 
 
 
201
  object_names.append(label_name)
202
  confidence_scores.append(float(score))
203
  object_counter[label_name] = float(score)
204
 
205
- # Compile unique objects and confidences
206
  unique_objects = list(object_counter.keys())
207
  unique_confidences = [object_counter[obj] for obj in unique_objects]
208
 
209
- # Calculate image properties
210
  properties: Dict[str, str] = {
211
  "Format": image.format if hasattr(image, "format") and image.format else "Unknown",
212
  "Size": f"{image.width}x{image.height}",
213
  "Width": f"{image.width} px",
214
  "Height": f"{image.height} px",
215
  "Mode": image.mode,
216
- "Aspect Ratio": (
217
- f"{round(image.width / image.height, 2)}" if image.height != 0 else "Undefined"
218
- ),
219
  "File Size": "Unknown",
220
  "Mean (R,G,B)": "Unknown",
221
  "StdDev (R,G,B)": "Unknown",
222
  }
223
-
224
- # Compute file size
225
  try:
 
226
  buffered = BytesIO()
227
  image.save(buffered, format="PNG")
228
  properties["File Size"] = f"{len(buffered.getvalue()) / 1024:.2f} KB"
229
- except Exception as e:
230
- logger.error(f"Error calculating file size: {str(e)}")
231
-
232
- # Compute color statistics
233
- try:
234
  stat = ImageStat.Stat(image)
235
  properties["Mean (R,G,B)"] = ", ".join(f"{m:.2f}" for m in stat.mean)
236
  properties["StdDev (R,G,B)"] = ", ".join(f"{s:.2f}" for s in stat.stddev)
237
  except Exception as e:
238
- logger.error(f"Error calculating color statistics: {str(e)}")
239
 
240
- return image, object_names, confidence_scores, unique_objects, unique_confidences, properties
241
 
242
  except Exception as e:
243
- logger.error(f"Error in process: {str(e)}\n{traceback.format_exc()}")
244
- raise RuntimeError(f"Failed to process image: {str(e)}")
 
 
245
 
246
  # ------------------------------
247
  # FastAPI Setup
@@ -254,6 +256,7 @@ async def detect_objects_endpoint(
254
  file: Optional[UploadFile] = File(None),
255
  image_url: Optional[str] = Form(None),
256
  model_name: str = Form(VALID_MODELS[0]),
 
257
  ) -> JSONResponse:
258
  """
259
  FastAPI endpoint to detect objects in an image from file upload or URL.
@@ -262,62 +265,35 @@ async def detect_objects_endpoint(
262
  file: Uploaded image file (optional).
263
  image_url: URL of the image (optional).
264
  model_name: Model to use for detection (default: first VALID_MODELS entry).
 
265
 
266
  Returns:
267
- JSONResponse containing the processed image (base64), detected objects, and confidences.
268
 
269
  Raises:
270
- HTTPException: If input validation fails or processing errors occur.
271
  """
272
  try:
273
- # Validate input
274
  if (file is None and not image_url) or (file is not None and image_url):
275
- raise HTTPException(
276
- status_code=400,
277
- detail="Provide either an image file or an image URL, not both.",
278
- )
279
-
280
- # Load image
281
  if file:
282
  if not file.content_type.startswith("image/"):
283
  raise HTTPException(status_code=400, detail="File must be an image")
284
  contents = await file.read()
285
  image = Image.open(BytesIO(contents)).convert("RGB")
286
- else:
287
- response = requests.get(image_url, timeout=10)
288
- response.raise_for_status()
289
- image = Image.open(BytesIO(response.content)).convert("RGB")
290
-
291
- if model_name not in VALID_MODELS:
292
- raise HTTPException(
293
- status_code=400,
294
- detail=f"Invalid model. Choose from: {VALID_MODELS}",
295
- )
296
-
297
- # Process image
298
- detected_image, detected_objects, detected_confidences, unique_objects, unique_confidences, _ = process(
299
- image, model_name
300
- )
301
-
302
- # Encode image as base64
303
- buffered = BytesIO()
304
- detected_image.save(buffered, format="PNG")
305
- img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
306
- img_url = f"data:image/png;base64,{img_base64}"
307
-
308
- return JSONResponse(
309
- content={
310
- "image_url": img_url,
311
- "detected_objects": detected_objects,
312
- "confidence_scores": detected_confidences,
313
- "unique_objects": unique_objects,
314
- "unique_confidence_scores": unique_confidences,
315
- }
316
- )
317
-
318
- except requests.RequestException as e:
319
- logger.error(f"Error fetching image from URL: {str(e)}")
320
- raise HTTPException(status_code=400, detail=f"Failed to fetch image: {str(e)}")
321
  except Exception as e:
322
  logger.error(f"Error in FastAPI endpoint: {str(e)}\n{traceback.format_exc()}")
323
  raise HTTPException(status_code=500, detail=f"Error processing image: {str(e)}")
@@ -328,7 +304,7 @@ async def detect_objects_endpoint(
328
 
329
  def create_gradio_ui() -> gr.Blocks:
330
  """
331
- Create and configure the Gradio UI for object detection.
332
 
333
  Returns:
334
  Gradio Blocks object representing the UI.
@@ -337,257 +313,126 @@ def create_gradio_ui() -> gr.Blocks:
337
  RuntimeError: If UI creation fails.
338
  """
339
  try:
340
- with gr.Blocks(theme=gr.themes.Default(primary_hue="blue", secondary_hue="gray")) as app:
 
 
341
  gr.Markdown(
342
  f"""
343
  # 🚀 Object Detection App
344
- Upload an image or provide a URL to detect objects using state-of-the-art transformer models (DETR, YOLOS).
345
  Running on port: {os.getenv('GRADIO_SERVER_PORT', 'auto-selected')}
346
  """
347
  )
348
 
 
349
  with gr.Tabs():
350
- with gr.Tab("📷 Image Upload"):
 
351
  with gr.Row():
 
352
  with gr.Column(scale=1):
353
  gr.Markdown("### Input")
354
- model_choice = gr.Dropdown(
355
- choices=VALID_MODELS,
356
- value=VALID_MODELS[0],
357
- label="🔎 Select Model",
358
- info="Choose a model for object detection or panoptic segmentation.",
359
- )
360
- model_info = gr.Markdown(
361
- f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
362
- visible=True,
363
- )
364
  image_input = gr.Image(type="pil", label="📷 Upload Image")
365
- image_url_input = gr.Textbox(
366
- label="🔗 Image URL",
367
- placeholder="https://example.com/image.jpg",
368
- )
369
  with gr.Row():
370
  submit_btn = gr.Button("✨ Detect", variant="primary")
371
  clear_btn = gr.Button("🗑️ Clear", variant="secondary")
372
 
 
373
  model_choice.change(
374
- fn=lambda model_name: (
375
- f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}"
376
- ),
377
  inputs=model_choice,
378
  outputs=model_info,
379
  )
380
 
 
381
  with gr.Column(scale=2):
382
  gr.Markdown("### Results")
383
- error_output = gr.Textbox(
384
- label="⚠️ Errors",
385
- visible=False,
386
- lines=3,
387
- max_lines=5,
388
- )
389
- output_image = gr.Image(
390
- type="pil",
391
- label="🎯 Detected Image",
392
- interactive=False,
393
- )
394
  with gr.Row():
395
- objects_output = gr.DataFrame(
396
- label="📋 Detected Objects",
397
- interactive=False,
398
- value=None,
399
- )
400
- unique_objects_output = gr.DataFrame(
401
- label="🔍 Unique Objects",
402
- interactive=False,
403
- value=None,
404
- )
405
- properties_output = gr.DataFrame(
406
- label="📄 Image Properties",
407
- interactive=False,
408
- value=None,
409
- )
410
-
411
- def process_for_gradio(image: Optional[Image.Image], url: Optional[str], model_name: str) -> Tuple[
412
- Optional[Image.Image], Optional[pd.DataFrame], Optional[pd.DataFrame], Optional[pd.DataFrame], str
413
- ]:
414
- """
415
- Process image for Gradio UI and return results.
416
-
417
- Args:
418
- image: Uploaded PIL Image (optional).
419
- url: Image URL (optional).
420
- model_name: Model to use for detection.
421
-
422
- Returns:
423
- Tuple of detected image, objects DataFrame, unique objects DataFrame, properties DataFrame, and error message.
424
- """
425
- try:
426
- if image is None and not url:
427
- return None, None, None, None, "Please provide an image or URL"
428
- if image and url:
429
- return None, None, None, None, "Please provide either an image or URL, not both"
430
-
431
- if url:
432
- response = requests.get(url, timeout=10)
433
- response.raise_for_status()
434
- image = Image.open(BytesIO(response.content)).convert("RGB")
435
-
436
- detected_image, objects, scores, unique_objects, unique_scores, properties = process(
437
- image, model_name
438
- )
439
- objects_df = (
440
- pd.DataFrame(
441
- {
442
- "Object": objects,
443
- "Confidence Score": [f"{score:.2f}" for score in scores],
444
- }
445
- )
446
- if objects
447
- else pd.DataFrame(columns=["Object", "Confidence Score"])
448
- )
449
- unique_objects_df = (
450
- pd.DataFrame(
451
- {
452
- "Unique Object": unique_objects,
453
- "Confidence Score": [f"{score:.2f}" for score in unique_scores],
454
- }
455
- )
456
- if unique_objects
457
- else pd.DataFrame(columns=["Unique Object", "Confidence Score"])
458
- )
459
- properties_df = (
460
- pd.DataFrame([properties])
461
- if properties
462
- else pd.DataFrame(columns=properties.keys())
463
- )
464
- return detected_image, objects_df, unique_objects_df, properties_df, ""
465
-
466
- except requests.RequestException as e:
467
- error_msg = f"Error fetching image from URL: {str(e)}"
468
- logger.error(f"{error_msg}\n{traceback.format_exc()}")
469
- return None, None, None, None, error_msg
470
- except Exception as e:
471
- error_msg = f"Error processing image: {str(e)}"
472
- logger.error(f"{error_msg}\n{traceback.format_exc()}")
473
- return None, None, None, None, error_msg
474
 
 
475
  submit_btn.click(
476
- fn=process_for_gradio,
477
  inputs=[image_input, image_url_input, model_choice],
478
  outputs=[output_image, objects_output, unique_objects_output, properties_output, error_output],
479
  )
480
 
 
481
  clear_btn.click(
482
  fn=lambda: [None, "", None, None, None, None, ""],
483
  inputs=None,
484
- outputs=[
485
- image_input,
486
- image_url_input,
487
- output_image,
488
- objects_output,
489
- unique_objects_output,
490
- properties_output,
491
- error_output,
492
- ],
493
  )
494
 
495
- with gr.Tab("🔗 JSON Output"):
496
- gr.Markdown("### Process Image for JSON Output")
497
- image_input_json = gr.Image(type="pil", label="📷 Upload Image")
498
- image_url_input_json = gr.Textbox(
499
- label="🔗 Image URL",
500
- placeholder="https://example.com/image.jpg",
501
- )
502
- url_model_choice = gr.Dropdown(
503
- choices=VALID_MODELS,
504
- value=VALID_MODELS[0],
505
- label="🔎 Select Model",
506
- )
507
- url_model_info = gr.Markdown(
508
- f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
509
- visible=True,
510
- )
511
- url_submit_btn = gr.Button("🔄 Process", variant="primary")
512
- url_output = gr.JSON(label="API Response")
513
-
514
- url_model_choice.change(
515
- fn=lambda model_name: (
516
- f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}"
517
- ),
518
- inputs=url_model_choice,
519
- outputs=url_model_info,
520
- )
521
-
522
- def process_url_for_gradio(image: Optional[Image.Image], url: Optional[str], model_name: str) -> Dict:
523
- """
524
- Process image from file or URL for Gradio UI and return JSON response.
525
-
526
- Args:
527
- image: Uploaded PIL Image (optional).
528
- url: Image URL (optional).
529
- model_name: Model to use for detection.
530
-
531
- Returns:
532
- Dictionary with processed image (base64), detected objects, and confidences.
533
- """
534
- try:
535
- if image is None and not url:
536
- return {"error": "Please provide an image or URL"}
537
- if image and url:
538
- return {"error": "Please provide either an image or URL, not both"}
539
-
540
- if url:
541
- response = requests.get(url, timeout=10)
542
- response.raise_for_status()
543
- image = Image.open(BytesIO(response.content)).convert("RGB")
544
-
545
- detected_image, objects, scores, unique_objects, unique_scores, _ = process(
546
- image, model_name
547
  )
548
- buffered = BytesIO()
549
- detected_image.save(buffered, format="PNG")
550
- img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
551
- return {
552
- "image_url": f"data:image/png;base64,{img_base64}",
553
- "detected_objects": objects,
554
- "confidence_scores": scores,
555
- "unique_objects": unique_objects,
556
- "unique_confidence_scores": unique_scores,
557
- }
558
- except requests.RequestException as e:
559
- error_msg = f"Error fetching image from URL: {str(e)}"
560
- logger.error(f"{error_msg}\n{traceback.format_exc()}")
561
- return {"error": error_msg}
562
- except Exception as e:
563
- error_msg = f"Error processing image: {str(e)}"
564
- logger.error(f"{error_msg}\n{traceback.format_exc()}")
565
- return {"error": error_msg}
566
 
 
567
  url_submit_btn.click(
568
- fn=process_url_for_gradio,
569
  inputs=[image_input_json, image_url_input_json, url_model_choice],
570
  outputs=[url_output],
571
  )
572
 
 
573
  with gr.Tab("ℹ️ Help"):
574
  gr.Markdown(
575
  """
576
  ## How to Use
577
- - **Image Upload**: Select a model, upload an image or provide a URL, and click "Detect" to see detected objects and image properties.
578
- - **JSON Output**: Upload an image or enter a URL, select a model, and click "Process" to get results in JSON format.
579
- - **Models**: Choose from DETR (object detection or panoptic segmentation) or YOLOS (lightweight detection).
580
- - **Clear**: Reset all inputs and outputs using the "Clear" button in the Image Upload tab.
581
- - **Errors**: Check the error box (Image Upload) or JSON response (JSON Output) for issues.
582
-
583
  ## Tips
584
- - Use high-quality images for better detection results.
585
- - Panoptic models (e.g., DETR-ResNet-50-panoptic) provide segmentation masks for complex scenes.
586
- - For faster processing, try YOLOS-Tiny on resource-constrained devices.
587
  """
588
  )
589
 
590
- return app
591
 
592
  except Exception as e:
593
  logger.error(f"Error creating Gradio UI: {str(e)}\n{traceback.format_exc()}")
@@ -599,38 +444,25 @@ def create_gradio_ui() -> gr.Blocks:
599
 
600
  def parse_args() -> argparse.Namespace:
601
  """
602
- Parse command-line arguments with defaults and ignore unrecognized arguments.
603
 
604
  Returns:
605
  Parsed arguments as a Namespace object.
606
-
607
- Raises:
608
- SystemExit: If argument parsing fails (handled by argparse).
609
  """
610
- parser = argparse.ArgumentParser(
611
- description="Launcher for Object Detection App with Gradio UI and optional FastAPI server."
612
- )
613
- parser.add_argument(
614
- "--gradio-port",
615
- type=int,
616
- default=DEFAULT_GRADIO_PORT,
617
- help=f"Port for the Gradio UI (default: {DEFAULT_GRADIO_PORT}).",
618
- )
619
- parser.add_argument(
620
- "--enable-fastapi",
621
- action="store_true",
622
- default=False,
623
- help="Enable the FastAPI server (disabled by default).",
624
- )
625
- parser.add_argument(
626
- "--fastapi-port",
627
- type=int,
628
- default=DEFAULT_FASTAPI_PORT,
629
- help=f"Port for the FastAPI server if enabled (default: {DEFAULT_FASTAPI_PORT}).",
630
- )
631
-
632
- # Parse known arguments and ignore unrecognized ones (e.g., Jupyter kernel args)
633
  args, _ = parser.parse_known_args()
 
 
 
634
  return args
635
 
636
  def find_available_port(start_port: int, port_range: range, max_attempts: int) -> Optional[int]:
@@ -638,30 +470,21 @@ def find_available_port(start_port: int, port_range: range, max_attempts: int) -> Optional[int]:
638
  Find an available port within the specified range.
639
 
640
  Args:
641
- start_port: Initial port to try (e.g., from args or environment).
642
  port_range: Range of ports to attempt.
643
  max_attempts: Maximum number of ports to try.
644
 
645
  Returns:
646
  Available port number, or None if no port is found.
647
-
648
- Raises:
649
- OSError: If port binding fails for reasons other than port in use.
650
  """
651
  import socket
652
-
653
- port = start_port
654
  attempts = 0
655
-
656
- # Check environment variable GRADIO_SERVER_PORT
657
- env_port = os.getenv("GRADIO_SERVER_PORT")
658
- if env_port and env_port.isdigit():
659
- port = int(env_port)
660
- logger.info(f"Using GRADIO_SERVER_PORT from environment: {port}")
661
-
662
  while attempts < max_attempts:
663
  with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
664
  try:
 
665
  s.bind(("0.0.0.0", port))
666
  logger.debug(f"Port {port} is available")
667
  return port
@@ -672,70 +495,47 @@ def find_available_port(start_port: int, port_range: range, max_attempts: int) -> Optional[int]:
672
  attempts += 1
673
  else:
674
  raise
675
- except Exception as e:
676
- logger.error(f"Error checking port {port}: {str(e)}")
677
- raise
678
- logger.error(f"No available port found in range {min(port_range)}-{max(port_range)} after {max_attempts} attempts")
679
  return None
680
 
681
- def run_fastapi_server(host: str, port: int) -> None:
682
- """
683
- Run the FastAPI server using Uvicorn.
684
-
685
- Args:
686
- host: Host address for the FastAPI server.
687
- port: Port for the FastAPI server.
688
- """
689
- try:
690
- uvicorn.run(app, host=host, port=port)
691
- except Exception as e:
692
- logger.error(f"Error running FastAPI server: {str(e)}\n{traceback.format_exc()}")
693
- sys.exit(1)
694
-
695
  def main() -> None:
696
  """
697
- Main function to launch Gradio UI and optional FastAPI server.
698
 
699
  Raises:
700
- SystemExit: If the application is interrupted or encounters an error.
701
  """
702
  try:
703
- # Apply nest_asyncio to allow nested event loops in Jupyter/Colab
704
  nest_asyncio.apply()
705
-
706
  # Parse command-line arguments
707
  args = parse_args()
708
  logger.info(f"Parsed arguments: {args}")
709
-
710
  # Find available port for Gradio
711
  gradio_port = find_available_port(args.gradio_port, PORT_RANGE, MAX_PORT_ATTEMPTS)
712
  if gradio_port is None:
713
  logger.error("Failed to find an available port for Gradio UI")
714
  sys.exit(1)
715
 
716
- # Launch FastAPI server in a separate thread if enabled
717
  if args.enable_fastapi:
718
- logger.info(f"Starting FastAPI server on port {args.fastapi_port}")
719
  fastapi_thread = threading.Thread(
720
- target=run_fastapi_server,
721
- args=("0.0.0.0", args.fastapi_port),
722
  daemon=True
723
  )
724
  fastapi_thread.start()
725
 
726
  # Launch Gradio UI
727
  logger.info(f"Starting Gradio UI on port {gradio_port}")
728
- app = create_gradio_ui()
729
- app.launch(server_port=gradio_port, server_name="0.0.0.0")
730
 
731
  except KeyboardInterrupt:
732
  logger.info("Application terminated by user.")
733
  sys.exit(0)
734
- except OSError as e:
735
- logger.error(f"Port binding error: {str(e)}")
736
- sys.exit(1)
737
  except Exception as e:
738
- logger.error(f"Error running application: {str(e)}\n{traceback.format_exc()}")
739
  sys.exit(1)
740
 
741
  if __name__ == "__main__":
 
3
  import logging
4
  import os
5
  import sys
  import traceback
6
  import threading
7
  from collections import Counter
8
  from io import BytesIO
9
+ from typing import Dict, List, Optional, Tuple, Union
10
 
11
  import gradio as gr
12
  import pandas as pd
 
29
  # Configuration
30
  # ------------------------------
31
 
32
+ # Configure logging for debugging and monitoring
33
+ logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
 
 
 
34
  logger = logging.getLogger(__name__)
35
 
36
+ # Define constants for model and server configuration
37
+ CONFIDENCE_THRESHOLD: float = 0.5 # Default threshold for object detection confidence
38
  VALID_MODELS: List[str] = [
39
  "facebook/detr-resnet-50",
40
  "facebook/detr-resnet-101",
 
44
  "hustvl/yolos-base",
45
  ]
46
  MODEL_DESCRIPTIONS: Dict[str, str] = {
47
+ "facebook/detr-resnet-50": "DETR with ResNet-50 for object detection. Fast and accurate.",
48
+ "facebook/detr-resnet-101": "DETR with ResNet-101 for object detection. More accurate, slower.",
49
+ "facebook/detr-resnet-50-panoptic": "DETR with ResNet-50 for panoptic segmentation.",
50
+ "facebook/detr-resnet-101-panoptic": "DETR with ResNet-101 for panoptic segmentation.",
51
+ "hustvl/yolos-tiny": "YOLOS Tiny. Lightweight and fast.",
52
+ "hustvl/yolos-base": "YOLOS Base. Balances speed and accuracy."
53
  }
54
+ DEFAULT_GRADIO_PORT: int = 7860 # Default port for Gradio UI
55
+ DEFAULT_FASTAPI_PORT: int = 8000 # Default port for FastAPI server
56
+ PORT_RANGE: range = range(7860, 7870) # Range of ports to try for Gradio
57
+ MAX_PORT_ATTEMPTS: int = 10 # Maximum attempts to find an available port
 
 
58
 
59
  # Thread-safe storage for lazy-loaded models and processors
60
  models: Dict[str, any] = {}
61
  processors: Dict[str, any] = {}
62
  model_lock = threading.Lock()
63
 
64
  # ------------------------------
65
  # Image Processing
66
  # ------------------------------
67
 
68
+ def process_image(
69
+ image: Optional[Image.Image],
70
+ url: Optional[str],
71
+ model_name: str,
72
+ for_json: bool = False,
73
+ confidence_threshold: float = CONFIDENCE_THRESHOLD
74
+ ) -> Union[Dict, Tuple[Optional[Image.Image], Optional[pd.DataFrame], Optional[pd.DataFrame], Optional[pd.DataFrame], str]]:
75
  """
76
+ Process an image for object detection or panoptic segmentation, handling Gradio and FastAPI inputs.
77
 
78
  Args:
79
+ image: PIL Image object from file upload (optional).
80
+ url: URL of the image to process (optional).
81
  model_name: Name of the model to use (must be in VALID_MODELS).
82
+ for_json: If True, return JSON dict for API/JSON tab; else, return tuple for Gradio Home tab.
83
+ confidence_threshold: Minimum confidence score for detection (default: 0.5).
84
 
85
  Returns:
86
+ For JSON: Dict with base64-encoded image, detected objects, and confidence scores.
87
+ For Gradio: Tuple of (annotated image, objects DataFrame, unique objects DataFrame, properties DataFrame, error message).
88
  """
 
 
 
89
  try:
90
+ # Validate input: ensure exactly one of image or URL is provided
91
+ if image is None and not url:
92
+ return {"error": "Please provide an image or URL"} if for_json else (None, None, None, None, "Please provide an image or URL")
93
+ if image and url:
94
+ return {"error": "Provide either an image or URL, not both"} if for_json else (None, None, None, None, "Provide either an image or URL, not both")
95
+ if model_name not in VALID_MODELS:
96
+ error_msg = f"Invalid model: {model_name}. Choose from: {VALID_MODELS}"
97
+ return {"error": error_msg} if for_json else (None, None, None, None, error_msg)
98
 
99
+ # Calculate margin threshold: (1 - confidence_threshold) / 2 + confidence_threshold
100
+ margin_threshold = (1 - confidence_threshold) / 2 + confidence_threshold
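+ # e.g. confidence_threshold=0.5 gives margin_threshold=0.75: detections scoring
+ # in [0.5, 0.75) are treated as lower confidence and drawn in yellow below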
101
+
102
+ # Load image from URL if provided
103
+ if url:
104
+ response = requests.get(url, timeout=10)
105
+ response.raise_for_status()
106
+ image = Image.open(BytesIO(response.content)).convert("RGB")
107
+
108
+ # Load model and processor thread-safely
109
+ with model_lock:
110
+ if model_name not in models:
111
+ logger.info(f"Loading model: {model_name}")
112
+ try:
113
+ # Select appropriate model and processor based on model name
114
+ if "yolos" in model_name:
115
+ models[model_name] = YolosForObjectDetection.from_pretrained(model_name)
116
+ processors[model_name] = YolosImageProcessor.from_pretrained(model_name)
117
+ elif "panoptic" in model_name:
118
+ models[model_name] = DetrForSegmentation.from_pretrained(model_name)
119
+ processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
120
+ else:
121
+ models[model_name] = DetrForObjectDetection.from_pretrained(model_name)
122
+ processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
123
+ except Exception as e:
124
+ error_msg = f"Failed to load model: {str(e)}"
125
+ logger.error(error_msg)
126
+ return {"error": error_msg} if for_json else (None, None, None, None, error_msg)
127
+ model, processor = models[model_name], processors[model_name]
128
+
129
+ # Prepare image for model processing
130
  inputs = processor(images=image, return_tensors="pt")
131
  with torch.no_grad():
132
  outputs = model(**inputs)
133
 
134
+ # Initialize drawing context for annotations
135
  draw = ImageDraw.Draw(image)
136
  object_names: List[str] = []
137
  confidence_scores: List[float] = []
138
  object_counter = Counter()
139
  target_sizes = torch.tensor([image.size[::-1]])
140
 
141
+ # Process results based on model type (panoptic or object detection)
142
  if "panoptic" in model_name:
143
+ # Handle panoptic segmentation
144
  processed_sizes = torch.tensor([[inputs["pixel_values"].shape[2], inputs["pixel_values"].shape[3]]])
145
  results = processor.post_process_panoptic(outputs, target_sizes=target_sizes, processed_sizes=processed_sizes)[0]
 
146
  for segment in results["segments_info"]:
147
  label = segment["label_id"]
148
  label_name = model.config.id2label.get(label, "Unknown")
149
  score = segment.get("score", 1.0)
 
150
  # Apply segmentation mask if available
151
  if "masks" in results and segment["id"] < len(results["masks"]):
152
  mask = results["masks"][segment["id"]].cpu().numpy()
 
158
  mask_draw.bitmap((0, 0), mask_image, fill=(r, g, b, 128))
159
  image = Image.alpha_composite(image.convert("RGBA"), colored_mask).convert("RGB")
160
  draw = ImageDraw.Draw(image)
161
+ if score > confidence_threshold:
 
162
  object_names.append(label_name)
163
  confidence_scores.append(float(score))
164
  object_counter[label_name] = float(score)
165
  else:
166
+ # Handle object detection
167
  results = processor.post_process_object_detection(outputs, target_sizes=target_sizes)[0]
 
168
  for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
169
+ if score > confidence_threshold:
170
  x, y, x2, y2 = box.tolist()
 
171
  label_name = model.config.id2label.get(label.item(), "Unknown")
172
  text = f"{label_name}: {score:.2f}"
173
  text_bbox = draw.textbbox((0, 0), text)
174
  text_width, text_height = text_bbox[2] - text_bbox[0], text_bbox[3] - text_bbox[1]
175
+ # Use yellow for confidence_threshold <= score < margin_threshold, green for >= margin_threshold
176
+ color = "#FFFF00" if score < margin_threshold else "#32CD32"
177
+ draw.rectangle([x, y, x2, y2], outline=color, width=2)
178
+ draw.text((x2 - text_width - 2, y - text_height - 2), text, fill=color)
179
  object_names.append(label_name)
180
  confidence_scores.append(float(score))
181
  object_counter[label_name] = float(score)
182
 
183
+ # Compile unique objects and their highest confidence scores
184
  unique_objects = list(object_counter.keys())
185
  unique_confidences = [object_counter[obj] for obj in unique_objects]
186
 
187
+ # Calculate image properties (metadata)
188
  properties: Dict[str, str] = {
189
  "Format": image.format if hasattr(image, "format") and image.format else "Unknown",
190
  "Size": f"{image.width}x{image.height}",
191
  "Width": f"{image.width} px",
192
  "Height": f"{image.height} px",
193
  "Mode": image.mode,
194
+ "Aspect Ratio": f"{round(image.width / image.height, 2)}" if image.height != 0 else "Undefined",
 
 
195
  "File Size": "Unknown",
196
  "Mean (R,G,B)": "Unknown",
197
  "StdDev (R,G,B)": "Unknown",
198
  }
 
 
199
  try:
200
+ # Compute file size
201
  buffered = BytesIO()
202
  image.save(buffered, format="PNG")
203
  properties["File Size"] = f"{len(buffered.getvalue()) / 1024:.2f} KB"
204
+ # Compute color statistics
 
 
 
 
205
  stat = ImageStat.Stat(image)
206
  properties["Mean (R,G,B)"] = ", ".join(f"{m:.2f}" for m in stat.mean)
207
  properties["StdDev (R,G,B)"] = ", ".join(f"{s:.2f}" for s in stat.stddev)
208
  except Exception as e:
209
+ logger.error(f"Error calculating image stats: {str(e)}")
210
 
211
+ # Prepare output based on request type
212
+ if for_json:
213
+ # Return JSON with base64-encoded image
214
+ buffered = BytesIO()
215
+ image.save(buffered, format="PNG")
216
+ img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
217
+ return {
218
+ "image_url": f"data:image/png;base64,{img_base64}",
219
+ "detected_objects": object_names,
220
+ "confidence_scores": confidence_scores,
221
+ "unique_objects": unique_objects,
222
+ "unique_confidence_scores": unique_confidences,
223
+ }
224
+ else:
225
+ # Return tuple for Gradio Home tab with DataFrames
226
+ objects_df = (
227
+ pd.DataFrame({"Object": object_names, "Confidence Score": [f"{score:.2f}" for score in confidence_scores]})
228
+ if object_names else pd.DataFrame(columns=["Object", "Confidence Score"])
229
+ )
230
+ unique_objects_df = (
231
+ pd.DataFrame({"Unique Object": unique_objects, "Confidence Score": [f"{score:.2f}" for score in unique_confidences]})
232
+ if unique_objects else pd.DataFrame(columns=["Unique Object", "Confidence Score"])
233
+ )
234
+ properties_df = pd.DataFrame([properties]) if properties else pd.DataFrame(columns=properties.keys())
235
+ return image, objects_df, unique_objects_df, properties_df, ""
236
 
237
+ except requests.RequestException as e:
238
+ # Handle URL fetch errors
239
+ error_msg = f"Error fetching image from URL: {str(e)}"
240
+ logger.error(f"{error_msg}\n{traceback.format_exc()}")
241
+ return {"error": error_msg} if for_json else (None, None, None, None, error_msg)
242
  except Exception as e:
243
+ # Handle general processing errors
244
+ error_msg = f"Error processing image: {str(e)}"
245
+ logger.error(f"{error_msg}\n{traceback.format_exc()}")
246
+ return {"error": error_msg} if for_json else (None, None, None, None, error_msg)
247
 
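+ # Example (hypothetical) of calling process_image directly, e.g. from a notebook:
+ #   img = Image.open("sample.jpg").convert("RGB")
+ #   result = process_image(img, None, VALID_MODELS[0], for_json=True)
+ #   result["unique_objects"]  # distinct labels above the threshold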
248
  # ------------------------------
249
  # FastAPI Setup
 
256
  file: Optional[UploadFile] = File(None),
257
  image_url: Optional[str] = Form(None),
258
  model_name: str = Form(VALID_MODELS[0]),
259
+ confidence_threshold: float = Form(CONFIDENCE_THRESHOLD),
260
  ) -> JSONResponse:
261
  """
262
  FastAPI endpoint to detect objects in an image from file upload or URL.
 
265
  file: Uploaded image file (optional).
266
  image_url: URL of the image (optional).
267
  model_name: Model to use for detection (default: first VALID_MODELS entry).
268
+ confidence_threshold: Confidence threshold for detection (default: 0.5).
269
 
270
  Returns:
271
+ JSONResponse with base64-encoded image, detected objects, and confidence scores.
272
 
273
  Raises:
274
+ HTTPException: For invalid inputs or processing errors.
275
  """
276
  try:
277
+ # Validate input: ensure exactly one of file or URL
278
  if (file is None and not image_url) or (file is not None and image_url):
279
+ raise HTTPException(status_code=400, detail="Provide either an image file or an image URL, not both.")
280
+ # Validate confidence threshold
281
+ if not 0 <= confidence_threshold <= 1:
282
+ raise HTTPException(status_code=400, detail="Confidence threshold must be between 0 and 1.")
283
+ # Load image from file if provided
284
+ image = None
285
  if file:
286
  if not file.content_type.startswith("image/"):
287
  raise HTTPException(status_code=400, detail="File must be an image")
288
  contents = await file.read()
289
  image = Image.open(BytesIO(contents)).convert("RGB")
290
+ # Process image with specified parameters
291
+ result = process_image(image, image_url, model_name, for_json=True, confidence_threshold=confidence_threshold)
292
+ if "error" in result:
293
+ raise HTTPException(status_code=400, detail=result["error"])
294
+ return JSONResponse(content=result)
295
+ except HTTPException:
296
+ raise
 
297
  except Exception as e:
298
  logger.error(f"Error in FastAPI endpoint: {str(e)}\n{traceback.format_exc()}")
299
  raise HTTPException(status_code=500, detail=f"Error processing image: {str(e)}")
 
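+ # Example request (hypothetical values) exercising the new form field:
+ #   curl -X POST "http://localhost:8000/detect" \
+ #     -F "file=@/path/to/image.jpg" \
+ #     -F "model_name=facebook/detr-resnet-50" \
+ #     -F "confidence_threshold=0.75"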
304
 
305
  def create_gradio_ui() -> gr.Blocks:
306
  """
307
+ Create and configure the Gradio UI for object detection with Home, JSON, and Help tabs.
308
 
309
  Returns:
310
  Gradio Blocks object representing the UI.
 
313
  RuntimeError: If UI creation fails.
314
  """
315
  try:
316
+ # Initialize Gradio Blocks with a custom theme
317
+ with gr.Blocks(theme=gr.themes.Default(primary_hue="blue", secondary_hue="gray")) as demo:
318
+ # Display app header
319
  gr.Markdown(
320
  f"""
321
  # 🚀 Object Detection App
322
+ Upload an image or provide a URL to detect objects using transformer models (DETR, YOLOS).
323
  Running on port: {os.getenv('GRADIO_SERVER_PORT', 'auto-selected')}
324
  """
325
  )
326
 
327
+ # Create tabbed interface
328
  with gr.Tabs():
329
+ # Home tab (formerly Image Upload)
330
+ with gr.Tab("🏠 Home"):
331
  with gr.Row():
332
+ # Left column for inputs
333
  with gr.Column(scale=1):
334
  gr.Markdown("### Input")
335
+ # Model selection dropdown
336
+ model_choice = gr.Dropdown(choices=VALID_MODELS, value=VALID_MODELS[0], label="🔎 Select Model")
337
+ model_info = gr.Markdown(f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}")
338
+ # Image upload input
339
  image_input = gr.Image(type="pil", label="📷 Upload Image")
340
+ # Image URL input
341
+ image_url_input = gr.Textbox(label="🔗 Image URL", placeholder="https://example.com/image.jpg")
342
+ # Buttons for submission and clearing
 
343
  with gr.Row():
344
  submit_btn = gr.Button("✨ Detect", variant="primary")
345
  clear_btn = gr.Button("🗑️ Clear", variant="secondary")
346
 
347
+ # Update model info when model changes
348
  model_choice.change(
349
+ fn=lambda model_name: f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}",
 
 
350
  inputs=model_choice,
351
  outputs=model_info,
352
  )
353
 
354
+ # Right column for results
355
  with gr.Column(scale=2):
356
  gr.Markdown("### Results")
357
+ # Error display (hidden by default)
358
+ error_output = gr.Textbox(label="⚠️ Errors", visible=False, lines=3, max_lines=5)
359
+ # Annotated image output
360
+ output_image = gr.Image(type="pil", label="🎯 Detected Image", interactive=False)
361
+ # Detected and unique objects tables
362
  with gr.Row():
363
+ objects_output = gr.DataFrame(label="📋 Detected Objects", interactive=False)
364
+ unique_objects_output = gr.DataFrame(label="🔍 Unique Objects", interactive=False)
365
+ # Image properties table
366
+ properties_output = gr.DataFrame(label="📄 Image Properties", interactive=False)
367
 
368
+ # Process image when Detect button is clicked
369
  submit_btn.click(
370
+ fn=process_image,
371
  inputs=[image_input, image_url_input, model_choice],
372
  outputs=[output_image, objects_output, unique_objects_output, properties_output, error_output],
373
  )
374
 
375
+ # Clear all inputs and outputs
376
  clear_btn.click(
377
  fn=lambda: [None, "", None, None, None, None, ""],
378
  inputs=None,
379
+ outputs=[image_input, image_url_input, output_image, objects_output, unique_objects_output, properties_output, error_output],
380
  )
381
 
382
+ # JSON tab for API-like output
383
+ with gr.Tab("🔗 JSON"):
384
+ with gr.Row():
385
+ # Left column for inputs
386
+ with gr.Column(scale=1):
387
+ gr.Markdown("### Process Image for JSON")
388
+ # Model selection dropdown
389
+ url_model_choice = gr.Dropdown(choices=VALID_MODELS, value=VALID_MODELS[0], label="🔎 Select Model")
390
+ url_model_info = gr.Markdown(f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}")
391
+ # Image upload input
392
+ image_input_json = gr.Image(type="pil", label="📷 Upload Image")
393
+ # Image URL input
394
+ image_url_input_json = gr.Textbox(label="🔗 Image URL", placeholder="https://example.com/image.jpg")
395
+ # Process button
396
+ url_submit_btn = gr.Button("🔄 Process", variant="primary")
397
+
398
+ # Update model info when model changes
399
+ url_model_choice.change(
400
+ fn=lambda model_name: f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}",
401
+ inputs=url_model_choice,
402
+ outputs=url_model_info,
403
  )
404
 
405
+ # Right column for JSON output
406
+ with gr.Column(scale=1):
407
+ # JSON output display
408
+ url_output = gr.JSON(label="API Response")
409
+
410
+ # Process image and return JSON when Process button is clicked
411
  url_submit_btn.click(
412
+ fn=lambda img, url, model: process_image(img, url, model, for_json=True),
413
  inputs=[image_input_json, image_url_input_json, url_model_choice],
414
  outputs=[url_output],
415
  )
416
 
417
+ # Help tab with usage instructions
418
  with gr.Tab("ℹ️ Help"):
419
  gr.Markdown(
420
  """
421
  ## How to Use
422
+ - **Home**: Select a model, upload an image or provide a URL, click "Detect" to see results.
423
+ - **JSON**: Select a model, upload an image or enter a URL, click "Process" for JSON output.
424
+ - **Models**: Choose DETR (detection or panoptic) or YOLOS (lightweight detection).
425
+ - **Clear**: Reset inputs/outputs in Home tab.
426
+ - **Errors**: Check error box (Home) or JSON response (JSON) for issues.
427
+
428
  ## Tips
429
+ - Use high-quality images for better results.
430
+ - Panoptic models provide segmentation masks for complex scenes.
431
+ - YOLOS-Tiny is faster for resource-constrained devices.
432
  """
433
  )
434
 
435
+ return demo
436
 
437
  except Exception as e:
438
  logger.error(f"Error creating Gradio UI: {str(e)}\n{traceback.format_exc()}")
 
444
 
445
  def parse_args() -> argparse.Namespace:
446
  """
447
+ Parse command-line arguments for configuring the application.
448
 
449
  Returns:
450
  Parsed arguments as a Namespace object.
 
 
 
451
  """
452
+ parser = argparse.ArgumentParser(description="Object Detection App with Gradio and FastAPI.")
453
+ # Gradio port argument
454
+ parser.add_argument("--gradio-port", type=int, default=DEFAULT_GRADIO_PORT, help=f"Gradio port (default: {DEFAULT_GRADIO_PORT}).")
455
+ # FastAPI enable flag
456
+ parser.add_argument("--enable-fastapi", action="store_true", help="Enable FastAPI server.")
457
+ # FastAPI port argument
458
+ parser.add_argument("--fastapi-port", type=int, default=DEFAULT_FASTAPI_PORT, help=f"FastAPI port (default: {DEFAULT_FASTAPI_PORT}).")
459
+ # Confidence threshold argument
460
+ parser.add_argument("--confidence-threshold", type=float, default=CONFIDENCE_THRESHOLD, help="Confidence threshold for detection (default: 0.5).")
461
+ # Parse known arguments, ignoring unrecognized ones
 
462
  args, _ = parser.parse_known_args()
463
+ # Validate confidence threshold
464
+ if not 0 <= args.confidence_threshold <= 1:
465
+ parser.error("Confidence threshold must be between 0 and 1.")
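+ # e.g. `python app.py --confidence-threshold 1.5` exits here with:
+ # "error: Confidence threshold must be between 0 and 1."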
466
  return args
467
 
468
  def find_available_port(start_port: int, port_range: range, max_attempts: int) -> Optional[int]:
 
470
  Find an available port within the specified range.
471
 
472
  Args:
473
+ start_port: Initial port to try.
474
  port_range: Range of ports to attempt.
475
  max_attempts: Maximum number of ports to try.
476
 
477
  Returns:
478
  Available port number, or None if no port is found.
 
 
 
479
  """
480
  import socket
481
+ # Check environment variable for port override
482
+ env_port = os.getenv("GRADIO_SERVER_PORT", "")
+ port = int(env_port) if env_port.isdigit() else start_port
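+ # e.g. `GRADIO_SERVER_PORT=7865 python app.py` makes 7865 the first port tried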
483
  attempts = 0
484
  while attempts < max_attempts:
485
  with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
486
  try:
487
+ # Attempt to bind to the port
488
  s.bind(("0.0.0.0", port))
489
  logger.debug(f"Port {port} is available")
490
  return port
 
495
  attempts += 1
496
  else:
497
  raise
498
+ logger.error(f"No available port in range {min(port_range)}-{max(port_range)}")
 
 
 
499
  return None
500
 
501
  def main() -> None:
502
  """
503
+ Launch the Gradio UI and optional FastAPI server.
504
 
505
  Raises:
506
+ SystemExit: On interruption or critical errors.
507
  """
508
  try:
509
+ # Apply nest_asyncio for compatibility with Jupyter/Colab
510
  nest_asyncio.apply()
 
511
  # Parse command-line arguments
512
  args = parse_args()
513
  logger.info(f"Parsed arguments: {args}")
 
514
  # Find available port for Gradio
515
  gradio_port = find_available_port(args.gradio_port, PORT_RANGE, MAX_PORT_ATTEMPTS)
516
  if gradio_port is None:
517
  logger.error("Failed to find an available port for Gradio UI")
518
  sys.exit(1)
519
 
520
+ # Start FastAPI server in a thread if enabled
521
  if args.enable_fastapi:
522
+ logger.info(f"Starting FastAPI on port {args.fastapi_port}")
523
  fastapi_thread = threading.Thread(
524
+ target=lambda: uvicorn.run(app, host="0.0.0.0", port=args.fastapi_port),
 
525
  daemon=True
526
  )
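+ # daemon=True ties the FastAPI thread's lifetime to the main (Gradio) process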
527
  fastapi_thread.start()
528
 
529
  # Launch Gradio UI
530
  logger.info(f"Starting Gradio UI on port {gradio_port}")
531
+ demo = create_gradio_ui()
532
+ demo.launch(server_port=gradio_port, server_name="0.0.0.0")
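+ # launch() keeps the main thread serving the UI until interrupted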
533
 
534
  except KeyboardInterrupt:
535
  logger.info("Application terminated by user.")
536
  sys.exit(0)
 
 
 
537
  except Exception as e:
538
+ logger.error(f"Error: {str(e)}\n{traceback.format_exc()}")
539
  sys.exit(1)
540
 
541
  if __name__ == "__main__":
hf_space/app.py CHANGED
@@ -1,79 +1,166 @@
1
- import gradio as gr
2
- import torch
3
- from transformers import DetrImageProcessor, DetrForObjectDetection
4
- from transformers import YolosImageProcessor, YolosForObjectDetection
5
- from transformers import DetrForSegmentation
6
- from PIL import Image, ImageDraw, ImageStat
7
- import requests
8
- from io import BytesIO
9
  import base64
10
- from collections import Counter
11
  import logging
12
- from fastapi import FastAPI, File, UploadFile, HTTPException, Form
13
- from fastapi.responses import JSONResponse
14
- import uvicorn
15
- import pandas as pd
16
- import traceback
17
  import os
18
 
19
- # Set up logging
20
- logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
 
 
 
21
  logger = logging.getLogger(__name__)
22
 
23
- # Constants
24
- CONFIDENCE_THRESHOLD = 0.5
25
- VALID_MODELS = [
26
  "facebook/detr-resnet-50",
27
  "facebook/detr-resnet-101",
28
  "facebook/detr-resnet-50-panoptic",
29
  "facebook/detr-resnet-101-panoptic",
30
  "hustvl/yolos-tiny",
31
- "hustvl/yolos-base"
32
  ]
33
- MODEL_DESCRIPTIONS = {
34
- "facebook/detr-resnet-50": "DETR with ResNet-50 backbone for object detection. Fast and accurate for general use.",
35
- "facebook/detr-resnet-101": "DETR with ResNet-101 backbone for object detection. More accurate but slower than ResNet-50.",
36
- "facebook/detr-resnet-50-panoptic": "DETR with ResNet-50 for panoptic segmentation. Detects objects and segments scenes.",
37
- "facebook/detr-resnet-101-panoptic": "DETR with ResNet-101 for panoptic segmentation. High accuracy for complex scenes.",
38
- "hustvl/yolos-tiny": "YOLOS Tiny model. Lightweight and fast, ideal for resource-constrained environments.",
39
- "hustvl/yolos-base": "YOLOS Base model. Balances speed and accuracy for object detection."
40
  }
41
 
42
- # Lazy model loading
43
- models = {}
44
- processors = {}
 
 
45
 
46
- def process(image, model_name):
47
- """Process an image and return detected image, objects, confidences, unique objects, unique confidences, and properties."""
48
- try:
49
- if model_name not in VALID_MODELS:
50
- raise ValueError(f"Invalid model: {model_name}. Choose from: {VALID_MODELS}")
51
 
52
- # Load model and processor
53
  if model_name not in models:
54
  logger.info(f"Loading model: {model_name}")
55
- if "yolos" in model_name:
56
- models[model_name] = YolosForObjectDetection.from_pretrained(model_name)
57
- processors[model_name] = YolosImageProcessor.from_pretrained(model_name)
58
- elif "panoptic" in model_name:
59
- models[model_name] = DetrForSegmentation.from_pretrained(model_name)
60
- processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
61
- else:
62
- models[model_name] = DetrForObjectDetection.from_pretrained(model_name)
63
- processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
64
-
65
- model, processor = models[model_name], processors[model_name]
66
- inputs = processor(images=image, return_tensors="pt")
67
 
68
  with torch.no_grad():
69
  outputs = model(**inputs)
70
 
71
- target_sizes = torch.tensor([image.size[::-1]])
72
  draw = ImageDraw.Draw(image)
73
- object_names = []
74
- confidence_scores = []
75
  object_counter = Counter()
 
76
 
 
77
  if "panoptic" in model_name:
78
  processed_sizes = torch.tensor([[inputs["pixel_values"].shape[2], inputs["pixel_values"].shape[3]]])
79
  results = processor.post_process_panoptic(outputs, target_sizes=target_sizes, processed_sizes=processed_sizes)[0]
@@ -83,6 +170,7 @@ def process(image, model_name):
83
  label_name = model.config.id2label.get(label, "Unknown")
84
  score = segment.get("score", 1.0)
85
 
 
86
  if "masks" in results and segment["id"] < len(results["masks"]):
87
  mask = results["masks"][segment["id"]].cpu().numpy()
88
  if mask.shape[0] > 0 and mask.shape[1] > 0:
@@ -106,7 +194,6 @@ def process(image, model_name):
106
  x, y, x2, y2 = box.tolist()
107
  draw.rectangle([x, y, x2, y2], outline="#32CD32", width=2)
108
  label_name = model.config.id2label.get(label.item(), "Unknown")
109
- # Place text at top-right corner, outside the box, with smaller size
110
  text = f"{label_name}: {score:.2f}"
111
  text_bbox = draw.textbbox((0, 0), text)
112
  text_width, text_height = text_bbox[2] - text_bbox[0], text_bbox[3] - text_bbox[1]
@@ -115,58 +202,82 @@ def process(image, model_name):
115
  confidence_scores.append(float(score))
116
  object_counter[label_name] = float(score)
117
 
118
  unique_objects = list(object_counter.keys())
119
  unique_confidences = [object_counter[obj] for obj in unique_objects]
120
 
121
- # Image properties
122
- file_size = "Unknown"
123
- if hasattr(image, "fp") and image.fp is not None:
124
- buffered = BytesIO()
125
- image.save(buffered, format="PNG")
126
- file_size = f"{len(buffered.getvalue()) / 1024:.2f} KB"
127
-
128
- # Color statistics
129
- try:
130
- stat = ImageStat.Stat(image)
131
- color_stats = {
132
- "mean": [f"{m:.2f}" for m in stat.mean],
133
- "stddev": [f"{s:.2f}" for s in stat.stddev]
134
- }
135
- except Exception as e:
136
- logger.error(f"Error calculating color statistics: {str(e)}")
137
- color_stats = {"mean": "Error", "stddev": "Error"}
138
-
139
- properties = {
140
  "Format": image.format if hasattr(image, "format") and image.format else "Unknown",
141
  "Size": f"{image.width}x{image.height}",
142
  "Width": f"{image.width} px",
143
  "Height": f"{image.height} px",
144
  "Mode": image.mode,
145
- "Aspect Ratio": f"{round(image.width / image.height, 2) if image.height != 0 else 'Undefined'}",
146
- "File Size": file_size,
147
- "Mean (R,G,B)": ", ".join(color_stats["mean"]) if isinstance(color_stats["mean"], list) else color_stats["mean"],
148
- "StdDev (R,G,B)": ", ".join(color_stats["stddev"]) if isinstance(color_stats["stddev"], list) else color_stats["stddev"]
149
  }
150
 
151
  return image, object_names, confidence_scores, unique_objects, unique_confidences, properties
152
  except Exception as e:
153
  logger.error(f"Error in process: {str(e)}\n{traceback.format_exc()}")
154
- raise
155
 
156
  # FastAPI Setup
157
  app = FastAPI(title="Object Detection API")
158
 
159
  @app.post("/detect")
160
  async def detect_objects_endpoint(
161
- file: UploadFile = File(None),
162
- image_url: str = Form(None),
163
- model_name: str = Form(VALID_MODELS[0])
164
- ):
165
- """FastAPI endpoint to detect objects in an image from file or URL."""
166
  try:
167
  if (file is None and not image_url) or (file is not None and image_url):
168
- raise HTTPException(status_code=400, detail="Provide either an image file or an image URL, but not both.")
169
 
170
  if file:
171
  if not file.content_type.startswith("image/"):
172
  raise HTTPException(status_code=400, detail="File must be an image")
@@ -178,207 +289,454 @@ async def detect_objects_endpoint(
178
  image = Image.open(BytesIO(response.content)).convert("RGB")
179
 
180
  if model_name not in VALID_MODELS:
181
- raise HTTPException(status_code=400, detail=f"Invalid model. Choose from: {VALID_MODELS}")
182
 
183
- detected_image, detected_objects, detected_confidences, unique_objects, unique_confidences, _ = process(image, model_name)
184
 
185
  buffered = BytesIO()
186
  detected_image.save(buffered, format="PNG")
187
  img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
188
  img_url = f"data:image/png;base64,{img_base64}"
189
 
190
- return JSONResponse(content={
191
- "image_url": img_url,
192
- "detected_objects": detected_objects,
193
- "confidence_scores": detected_confidences,
194
- "unique_objects": unique_objects,
195
- "unique_confidence_scores": unique_confidences
196
- })
197
  except Exception as e:
198
  logger.error(f"Error in FastAPI endpoint: {str(e)}\n{traceback.format_exc()}")
199
  raise HTTPException(status_code=500, detail=f"Error processing image: {str(e)}")
200
 
201
- # Gradio UI
202
- def create_gradio_ui():
203
- with gr.Blocks(theme=gr.themes.Default(primary_hue="blue", secondary_hue="gray")) as demo:
204
- gr.Markdown(
205
- """
206
- # 🚀 Object Detection App
207
- Upload an image or provide a URL to detect objects using state-of-the-art transformer models (DETR, YOLOS).
208
- """
209
- )
210
-
211
- with gr.Tabs():
212
- with gr.Tab("📷 Image Upload"):
213
- with gr.Row():
214
- with gr.Column(scale=1):
215
- gr.Markdown("### Input")
216
- model_choice = gr.Dropdown(
217
- choices=VALID_MODELS,
218
- value=VALID_MODELS[0],
219
- label="🔎 Select Model",
220
- info="Choose a model for object detection or panoptic segmentation."
221
- )
222
- model_info = gr.Markdown(
223
- f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
224
- visible=True
225
- )
226
- image_input = gr.Image(type="pil", label="📷 Upload Image")
227
- image_url_input = gr.Textbox(
228
- label="🔗 Image URL",
229
- placeholder="https://example.com/image.jpg"
230
- )
231
- with gr.Row():
232
- submit_btn = gr.Button("✨ Detect", variant="primary")
233
- clear_btn = gr.Button("🗑️ Clear", variant="secondary")
234
-
235
- model_choice.change(
236
- fn=lambda model_name: f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}",
237
- inputs=model_choice,
238
- outputs=model_info
239
- )
240
-
241
- with gr.Column(scale=2):
242
- gr.Markdown("### Results")
243
- error_output = gr.Textbox(
244
- label="⚠️ Errors",
245
- visible=False,
246
- lines=3,
247
- max_lines=5
248
- )
249
- output_image = gr.Image(
250
- type="pil",
251
- label="🎯 Detected Image",
252
- interactive=False
253
- )
254
- with gr.Row():
255
- objects_output = gr.DataFrame(
256
- label="📋 Detected Objects",
257
  interactive=False,
258
- value=None
259
  )
260
- unique_objects_output = gr.DataFrame(
261
- label="🔍 Unique Objects",
262
  interactive=False,
263
- value=None
264
  )
265
- properties_output = gr.DataFrame(
266
- label="📄 Image Properties",
267
- interactive=False,
268
- value=None
269
- )
270
-
271
- def process_for_gradio(image, url, model_name):
272
- try:
273
- if image is None and not url:
274
- return None, None, None, None, "Please provide an image or URL"
275
- if image and url:
276
- return None, None, None, None, "Please provide either an image or URL, not both"
277
-
278
- if url:
279
- response = requests.get(url, timeout=10)
280
- response.raise_for_status()
281
- image = Image.open(BytesIO(response.content)).convert("RGB")
282
 
283
- detected_image, objects, scores, unique_objects, unique_scores, properties = process(image, model_name)
284
- objects_df = pd.DataFrame({
285
- "Object": objects,
286
- "Confidence Score": [f"{score:.2f}" for score in scores]
287
- }) if objects else pd.DataFrame(columns=["Object", "Confidence Score"])
288
- unique_objects_df = pd.DataFrame({
289
- "Unique Object": unique_objects,
290
- "Confidence Score": [f"{score:.2f}" for score in unique_scores]
291
- }) if unique_objects else pd.DataFrame(columns=["Unique Object", "Confidence Score"])
292
- properties_df = pd.DataFrame([properties]) if properties else pd.DataFrame(columns=properties.keys())
293
- return detected_image, objects_df, unique_objects_df, properties_df, ""
294
- except Exception as e:
295
- error_msg = f"Error processing image: {str(e)}"
296
- logger.error(f"{error_msg}\n{traceback.format_exc()}")
297
- return None, None, None, None, error_msg
298
-
299
- submit_btn.click(
300
- fn=process_for_gradio,
301
- inputs=[image_input, image_url_input, model_choice],
302
- outputs=[output_image, objects_output, unique_objects_output, properties_output, error_output]
303
- )
304
-
305
- clear_btn.click(
306
- fn=lambda: [None, "", None, None, None, None],
307
- inputs=None,
308
- outputs=[image_input, image_url_input, output_image, objects_output, unique_objects_output, properties_output, error_output]
309
- )
310
-
311
- with gr.Tab("🔗 URL Input"):
312
- gr.Markdown("### Process Image from URL")
313
- image_url_input = gr.Textbox(
314
- label="🔗 Image URL",
315
- placeholder="https://example.com/image.jpg"
316
- )
317
- url_model_choice = gr.Dropdown(
318
- choices=VALID_MODELS,
319
- value=VALID_MODELS[0],
320
- label="🔎 Select Model"
321
- )
322
- url_model_info = gr.Markdown(
323
- f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
324
- visible=True
325
- )
326
- url_submit_btn = gr.Button("🔄 Process URL", variant="primary")
327
- url_output = gr.JSON(label="API Response")
328
-
329
- url_model_choice.change(
330
- fn=lambda model_name: f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}",
331
- inputs=url_model_choice,
332
- outputs=url_model_info
333
- )
334
-
335
- def process_url_for_gradio(url, model_name):
336
- try:
337
- response = requests.get(url, timeout=10)
338
- response.raise_for_status()
339
- image = Image.open(BytesIO(response.content)).convert("RGB")
340
- detected_image, objects, scores, unique_objects, unique_scores, _ = process(image, model_name)
341
- buffered = BytesIO()
342
- detected_image.save(buffered, format="PNG")
343
- img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
344
- return {
345
- "image_url": f"data:image/png;base64,{img_base64}",
346
- "detected_objects": objects,
347
- "confidence_scores": scores,
348
- "unique_objects": unique_objects,
349
- "unique_confidence_scores": unique_scores
350
- }
351
- except Exception as e:
352
- error_msg = f"Error processing URL: {str(e)}"
353
- logger.error(f"{error_msg}\n{traceback.format_exc()}")
354
- return {"error": error_msg}
355
-
356
- url_submit_btn.click(
357
- fn=process_url_for_gradio,
358
- inputs=[image_url_input, url_model_choice],
359
- outputs=[url_output]
360
- )
361
-
362
- with gr.Tab("ℹ️ Help"):
363
- gr.Markdown(
364
- """
365
- ## How to Use
366
- - **Image Upload**: Select a model, upload an image or provide a URL, and click "Detect" to see detected objects and image properties.
367
- - **URL Input**: Enter an image URL, select a model, and click "Process URL" to get results in JSON format.
368
- - **Models**: Choose from DETR (object detection or panoptic segmentation) or YOLOS (lightweight detection).
369
- - **Clear**: Reset all inputs and outputs using the "Clear" button.
370
- - **Errors**: Check the error box for any processing issues.
371
-
372
- ## Tips
373
- - Use high-quality images for better detection results.
374
- - Panoptic models (e.g., DETR-ResNet-50-panoptic) provide segmentation masks for complex scenes.
375
- - For faster processing, try YOLOS-Tiny on resource-constrained devices.
376
- """
377
- )
378
-
379
- return demo
380
 
381
  if __name__ == "__main__":
382
- demo = create_gradio_ui()
383
- demo.launch()
384
- # To run FastAPI, use: uvicorn object_detection:app --host 0.0.0.0 --port 8000
1
+ import argparse
2
  import base64
3
  import logging
4
  import os
5
+ import sys
6
+ import traceback
7
+ import threading
8
+ from collections import Counter
9
+ from io import BytesIO
10
+ from typing import Any, Dict, List, Optional, Tuple
11
+
12
+ import gradio as gr
13
+ import pandas as pd
14
+ import requests
15
+ import torch
16
+ import uvicorn
17
+ from fastapi import FastAPI, File, Form, HTTPException, UploadFile
18
+ from fastapi.responses import JSONResponse
19
+ from PIL import Image, ImageDraw, ImageStat
20
+ from transformers import (
21
+ DetrForObjectDetection,
22
+ DetrForSegmentation,
23
+ DetrImageProcessor,
24
+ YolosForObjectDetection,
25
+ YolosImageProcessor,
26
+ )
27
+ import nest_asyncio
28
+
29
+ # ------------------------------
30
+ # Configuration
31
+ # ------------------------------
32
 
33
+ # Logging configuration
34
+ logging.basicConfig(
35
+ level=logging.INFO,
36
+ format="%(asctime)s - %(levelname)s - %(message)s",
37
+ )
38
  logger = logging.getLogger(__name__)
39
 
40
+ # Model and processing constants
41
+ CONFIDENCE_THRESHOLD: float = 0.5
42
+ VALID_MODELS: List[str] = [
43
  "facebook/detr-resnet-50",
44
  "facebook/detr-resnet-101",
45
  "facebook/detr-resnet-50-panoptic",
46
  "facebook/detr-resnet-101-panoptic",
47
  "hustvl/yolos-tiny",
48
+ "hustvl/yolos-base",
49
  ]
50
+ MODEL_DESCRIPTIONS: Dict[str, str] = {
51
+ "facebook/detr-resnet-50": (
52
+ "DETR with ResNet-50 backbone for object detection. Fast and accurate for general use."
53
+ ),
54
+ "facebook/detr-resnet-101": (
55
+ "DETR with ResNet-101 backbone for object detection. More accurate but slower than ResNet-50."
56
+ ),
57
+ "facebook/detr-resnet-50-panoptic": (
58
+ "DETR with ResNet-50 for panoptic segmentation. Detects objects and segments scenes."
59
+ ),
60
+ "facebook/detr-resnet-101-panoptic": (
61
+ "DETR with ResNet-101 for panoptic segmentation. High accuracy for complex scenes."
62
+ ),
63
+ "hustvl/yolos-tiny": (
64
+ "YOLOS Tiny model. Lightweight and fast, ideal for resource-constrained environments."
65
+ ),
66
+ "hustvl/yolos-base": (
67
+ "YOLOS Base model. Balances speed and accuracy for object detection."
68
+ ),
69
  }
70
 
71
+ # Port configuration
72
+ DEFAULT_GRADIO_PORT: int = 7860
73
+ DEFAULT_FASTAPI_PORT: int = 8000
74
+ PORT_RANGE: range = range(7860, 7870) # Try ports 7860-7869
75
+ MAX_PORT_ATTEMPTS: int = 10
76
 
77
+ # Thread-safe storage for lazy-loaded models and processors
78
+ models: Dict[str, Any] = {}
79
+ processors: Dict[str, Any] = {}
80
+ model_lock = threading.Lock()
81
 
82
+ # ------------------------------
83
+ # Model Loading
84
+ # ------------------------------
85
+
86
+ def load_model_and_processor(model_name: str) -> Tuple[Any, Any]:
87
+ """
88
+ Load and cache the specified model and processor thread-safely.
89
+
90
+ Args:
91
+ model_name: Name of the model to load (must be in VALID_MODELS).
92
+
93
+ Returns:
94
+ Tuple containing the loaded model and processor.
95
+
96
+ Raises:
97
+ ValueError: If the model_name is invalid or loading fails.
98
+ """
99
+ with model_lock:
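+ # NOTE: the lock is held for the entire load, so concurrent first requests
+ # for the same model wait for a single download instead of duplicating it.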
100
  if model_name not in models:
101
  logger.info(f"Loading model: {model_name}")
102
+ try:
103
+ if "yolos" in model_name:
104
+ models[model_name] = YolosForObjectDetection.from_pretrained(model_name)
105
+ processors[model_name] = YolosImageProcessor.from_pretrained(model_name)
106
+ elif "panoptic" in model_name:
107
+ models[model_name] = DetrForSegmentation.from_pretrained(model_name)
108
+ processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
109
+ else:
110
+ models[model_name] = DetrForObjectDetection.from_pretrained(model_name)
111
+ processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
112
+ logger.debug(f"Model {model_name} loaded successfully")
113
+ except Exception as e:
114
+ logger.error(f"Failed to load model {model_name}: {str(e)}")
115
+ raise ValueError(f"Failed to load model: {str(e)}")
116
+ return models[model_name], processors[model_name]
117
+
118
+ # ------------------------------
119
+ # Image Processing
120
+ # ------------------------------
121
+
122
+ def process(image: Image.Image, model_name: str) -> Tuple[Image.Image, List[str], List[float], List[str], List[float], Dict[str, str]]:
123
+ """
124
+ Process an image for object detection or panoptic segmentation.
125
+
126
+ Args:
127
+ image: PIL Image to process.
128
+ model_name: Name of the model to use (must be in VALID_MODELS).
129
+
130
+ Returns:
131
+ Tuple containing:
132
+ - Annotated image (PIL Image).
133
+ - List of detected object names.
134
+ - List of confidence scores for detected objects.
135
+ - List of unique object names.
136
+ - List of confidence scores for unique objects.
137
+ - Dictionary of image properties (format, size, etc.).
138
 
139
+ Raises:
140
+ ValueError: If the model_name is invalid.
141
+ RuntimeError: If processing fails due to model or image issues.
142
+ """
143
+ if model_name not in VALID_MODELS:
144
+ raise ValueError(f"Invalid model: {model_name}. Choose from: {VALID_MODELS}")
145
+
146
+ try:
147
+ # Load model and processor
148
+ model, processor = load_model_and_processor(model_name)
149
+ logger.debug(f"Processing image with model: {model_name}")
150
+
151
+ # Prepare image for processing
152
+ inputs = processor(images=image, return_tensors="pt")
153
  with torch.no_grad():
154
  outputs = model(**inputs)
155
 
156
+ # Initialize drawing context
157
  draw = ImageDraw.Draw(image)
158
+ object_names: List[str] = []
159
+ confidence_scores: List[float] = []
160
  object_counter = Counter()
161
+ target_sizes = torch.tensor([image.size[::-1]])
162
 
163
+ # Process panoptic segmentation or object detection
164
  if "panoptic" in model_name:
165
  processed_sizes = torch.tensor([[inputs["pixel_values"].shape[2], inputs["pixel_values"].shape[3]]])
166
  results = processor.post_process_panoptic(outputs, target_sizes=target_sizes, processed_sizes=processed_sizes)[0]
170
  label_name = model.config.id2label.get(label, "Unknown")
171
  score = segment.get("score", 1.0)
172
 
173
+ # Apply segmentation mask if available
174
  if "masks" in results and segment["id"] < len(results["masks"]):
175
  mask = results["masks"][segment["id"]].cpu().numpy()
176
  if mask.shape[0] > 0 and mask.shape[1] > 0:
194
  x, y, x2, y2 = box.tolist()
195
  draw.rectangle([x, y, x2, y2], outline="#32CD32", width=2)
196
  label_name = model.config.id2label.get(label.item(), "Unknown")
197
  text = f"{label_name}: {score:.2f}"
198
  text_bbox = draw.textbbox((0, 0), text)
199
  text_width, text_height = text_bbox[2] - text_bbox[0], text_bbox[3] - text_bbox[1]
202
  confidence_scores.append(float(score))
203
  object_counter[label_name] = float(score)
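  # Counter is used as a plain dict here: each label keeps the score of its
  # most recent detection, and its keys feed the "unique objects" outputs.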
204
 
205
+ # Compile unique objects and confidences
206
  unique_objects = list(object_counter.keys())
207
  unique_confidences = [object_counter[obj] for obj in unique_objects]
208
 
209
+ # Calculate image properties
210
+ properties: Dict[str, str] = {
 
211
  "Format": image.format if hasattr(image, "format") and image.format else "Unknown",
212
  "Size": f"{image.width}x{image.height}",
213
  "Width": f"{image.width} px",
214
  "Height": f"{image.height} px",
215
  "Mode": image.mode,
216
+ "Aspect Ratio": (
217
+ f"{round(image.width / image.height, 2)}" if image.height != 0 else "Undefined"
218
+ ),
219
+ "File Size": "Unknown",
220
+ "Mean (R,G,B)": "Unknown",
221
+ "StdDev (R,G,B)": "Unknown",
222
  }
223
 
224
+ # Compute file size
225
+ try:
226
+ buffered = BytesIO()
227
+ image.save(buffered, format="PNG")
228
+ properties["File Size"] = f"{len(buffered.getvalue()) / 1024:.2f} KB"
229
+ except Exception as e:
230
+ logger.error(f"Error calculating file size: {str(e)}")
231
+
232
+ # Compute color statistics
233
+ try:
234
+ stat = ImageStat.Stat(image)
235
+ properties["Mean (R,G,B)"] = ", ".join(f"{m:.2f}" for m in stat.mean)
236
+ properties["StdDev (R,G,B)"] = ", ".join(f"{s:.2f}" for s in stat.stddev)
237
+ except Exception as e:
238
+ logger.error(f"Error calculating color statistics: {str(e)}")
239
+
240
  return image, object_names, confidence_scores, unique_objects, unique_confidences, properties
241
+
242
  except Exception as e:
243
  logger.error(f"Error in process: {str(e)}\n{traceback.format_exc()}")
244
+ raise RuntimeError(f"Failed to process image: {str(e)}")
245
 
246
+ # ------------------------------
247
  # FastAPI Setup
248
+ # ------------------------------
249
+
250
  app = FastAPI(title="Object Detection API")
251
 
252
  @app.post("/detect")
253
  async def detect_objects_endpoint(
254
+ file: Optional[UploadFile] = File(None),
255
+ image_url: Optional[str] = Form(None),
256
+ model_name: str = Form(VALID_MODELS[0]),
257
+ ) -> JSONResponse:
258
+ """
259
+ FastAPI endpoint to detect objects in an image from file upload or URL.
260
+
261
+ Args:
262
+ file: Uploaded image file (optional).
263
+ image_url: URL of the image (optional).
264
+ model_name: Model to use for detection (default: first VALID_MODELS entry).
265
+
266
+ Returns:
267
+ JSONResponse containing the processed image (base64), detected objects, and confidences.
268
+
269
+ Raises:
270
+ HTTPException: If input validation fails or processing errors occur.
271
+ """
272
  try:
273
+ # Validate input
274
  if (file is None and not image_url) or (file is not None and image_url):
275
+ raise HTTPException(
276
+ status_code=400,
277
+ detail="Provide either an image file or an image URL, not both.",
278
+ )
279
 
280
+ # Load image
281
  if file:
282
  if not file.content_type.startswith("image/"):
283
  raise HTTPException(status_code=400, detail="File must be an image")
289
  image = Image.open(BytesIO(response.content)).convert("RGB")
290
 
291
  if model_name not in VALID_MODELS:
292
+ raise HTTPException(
293
+ status_code=400,
294
+ detail=f"Invalid model. Choose from: {VALID_MODELS}",
295
+ )
296
 
297
+ # Process image
298
+ detected_image, detected_objects, detected_confidences, unique_objects, unique_confidences, _ = process(
299
+ image, model_name
300
+ )
301
 
302
+ # Encode image as base64
303
  buffered = BytesIO()
304
  detected_image.save(buffered, format="PNG")
305
  img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
306
  img_url = f"data:image/png;base64,{img_base64}"
307
 
308
+ return JSONResponse(
309
+ content={
310
+ "image_url": img_url,
311
+ "detected_objects": detected_objects,
312
+ "confidence_scores": detected_confidences,
313
+ "unique_objects": unique_objects,
314
+ "unique_confidence_scores": unique_confidences,
315
+ }
316
+ )
317
+
318
+ except requests.RequestException as e:
319
+ logger.error(f"Error fetching image from URL: {str(e)}")
320
+ raise HTTPException(status_code=400, detail=f"Failed to fetch image: {str(e)}")
321
  except Exception as e:
322
  logger.error(f"Error in FastAPI endpoint: {str(e)}\n{traceback.format_exc()}")
323
  raise HTTPException(status_code=500, detail=f"Error processing image: {str(e)}")
324
 
325
+ # ------------------------------
326
+ # Gradio UI Setup
327
+ # ------------------------------
328
+
329
+ def create_gradio_ui() -> gr.Blocks:
330
+ """
331
+ Create and configure the Gradio UI for object detection.
332
+
333
+ Returns:
334
+ Gradio Blocks object representing the UI.
335
+
336
+ Raises:
337
+ RuntimeError: If UI creation fails.
338
+ """
339
+ try:
340
+ with gr.Blocks(theme=gr.themes.Default(primary_hue="blue", secondary_hue="gray")) as demo:
341
+ gr.Markdown(
342
+ f"""
343
+ # 🚀 Object Detection App
344
+ Upload an image or provide a URL to detect objects using state-of-the-art transformer models (DETR, YOLOS).
345
+ Running on port: {os.getenv('GRADIO_SERVER_PORT', 'auto-selected')}
346
+ """
347
+ )
348
+
349
+ with gr.Tabs():
350
+ with gr.Tab("📷 Image Upload"):
351
+ with gr.Row():
352
+ with gr.Column(scale=1):
353
+ gr.Markdown("### Input")
354
+ model_choice = gr.Dropdown(
355
+ choices=VALID_MODELS,
356
+ value=VALID_MODELS[0],
357
+ label="🔎 Select Model",
358
+ info="Choose a model for object detection or panoptic segmentation.",
359
+ )
360
+ model_info = gr.Markdown(
361
+ f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
362
+ visible=True,
363
+ )
364
+ image_input = gr.Image(type="pil", label="📷 Upload Image")
365
+ image_url_input = gr.Textbox(
366
+ label="🔗 Image URL",
367
+ placeholder="https://example.com/image.jpg",
368
+ )
369
+ with gr.Row():
370
+ submit_btn = gr.Button("✨ Detect", variant="primary")
371
+ clear_btn = gr.Button("🗑️ Clear", variant="secondary")
372
+
373
+ model_choice.change(
374
+ fn=lambda model_name: (
375
+ f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}"
376
+ ),
377
+ inputs=model_choice,
378
+ outputs=model_info,
379
+ )
380
+
381
+ with gr.Column(scale=2):
382
+ gr.Markdown("### Results")
383
+ error_output = gr.Textbox(
384
+ label="⚠️ Errors",
385
+ visible=False,
386
+ lines=3,
387
+ max_lines=5,
388
+ )
389
+ output_image = gr.Image(
390
+ type="pil",
391
+ label="🎯 Detected Image",
392
  interactive=False,
393
  )
394
+ with gr.Row():
395
+ objects_output = gr.DataFrame(
396
+ label="📋 Detected Objects",
397
+ interactive=False,
398
+ value=None,
399
+ )
400
+ unique_objects_output = gr.DataFrame(
401
+ label="🔍 Unique Objects",
402
+ interactive=False,
403
+ value=None,
404
+ )
405
+ properties_output = gr.DataFrame(
406
+ label="📄 Image Properties",
407
  interactive=False,
408
+ value=None,
409
  )
410
+
411
+ def process_for_gradio(image: Optional[Image.Image], url: Optional[str], model_name: str) -> Tuple[
412
+ Optional[Image.Image], Optional[pd.DataFrame], Optional[pd.DataFrame], Optional[pd.DataFrame], str
413
+ ]:
414
+ """
415
+ Process image for Gradio UI and return results.
416
+
417
+ Args:
418
+ image: Uploaded PIL Image (optional).
419
+ url: Image URL (optional).
420
+ model_name: Model to use for detection.
421
+
422
+ Returns:
423
+ Tuple of detected image, objects DataFrame, unique objects DataFrame, properties DataFrame, and error message.
424
+ """
425
+ try:
426
+ if image is None and not url:
427
+ return None, None, None, None, "Please provide an image or URL"
428
+ if image and url:
429
+ return None, None, None, None, "Please provide either an image or URL, not both"
430
+
431
+ if url:
432
+ response = requests.get(url, timeout=10)
433
+ response.raise_for_status()
434
+ image = Image.open(BytesIO(response.content)).convert("RGB")
435
+
436
+ detected_image, objects, scores, unique_objects, unique_scores, properties = process(
437
+ image, model_name
438
+ )
439
+ objects_df = (
440
+ pd.DataFrame(
441
+ {
442
+ "Object": objects,
443
+ "Confidence Score": [f"{score:.2f}" for score in scores],
444
+ }
445
+ )
446
+ if objects
447
+ else pd.DataFrame(columns=["Object", "Confidence Score"])
448
+ )
449
+ unique_objects_df = (
450
+ pd.DataFrame(
451
+ {
452
+ "Unique Object": unique_objects,
453
+ "Confidence Score": [f"{score:.2f}" for score in unique_scores],
454
+ }
455
+ )
456
+ if unique_objects
457
+ else pd.DataFrame(columns=["Unique Object", "Confidence Score"])
458
+ )
459
+ properties_df = (
460
+ pd.DataFrame([properties])
461
+ if properties
462
+ else pd.DataFrame(columns=properties.keys())
463
+ )
464
+ return detected_image, objects_df, unique_objects_df, properties_df, ""
465
+
466
+ except requests.RequestException as e:
467
+ error_msg = f"Error fetching image from URL: {str(e)}"
468
+ logger.error(f"{error_msg}\n{traceback.format_exc()}")
469
+ return None, None, None, None, error_msg
470
+ except Exception as e:
471
+ error_msg = f"Error processing image: {str(e)}"
472
+ logger.error(f"{error_msg}\n{traceback.format_exc()}")
473
+ return None, None, None, None, error_msg
474
+
475
+ submit_btn.click(
476
+ fn=process_for_gradio,
477
+ inputs=[image_input, image_url_input, model_choice],
478
+ outputs=[output_image, objects_output, unique_objects_output, properties_output, error_output],
479
+ )
480
+
481
+ clear_btn.click(
482
+ fn=lambda: [None, "", None, None, None, None, ""],
483
+ inputs=None,
484
+ outputs=[
485
+ image_input,
486
+ image_url_input,
487
+ output_image,
488
+ objects_output,
489
+ unique_objects_output,
490
+ properties_output,
491
+ error_output,
492
+ ],
493
+ )
494
+
495
+ with gr.Tab("🔗 JSON Output"):
496
+ gr.Markdown("### Process Image for JSON Output")
497
+ image_input_json = gr.Image(type="pil", label="📷 Upload Image")
498
+ image_url_input_json = gr.Textbox(
499
+ label="🔗 Image URL",
500
+ placeholder="https://example.com/image.jpg",
501
+ )
502
+ url_model_choice = gr.Dropdown(
503
+ choices=VALID_MODELS,
504
+ value=VALID_MODELS[0],
505
+ label="🔎 Select Model",
506
+ )
507
+ url_model_info = gr.Markdown(
508
+ f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
509
+ visible=True,
510
+ )
511
+ url_submit_btn = gr.Button("🔄 Process", variant="primary")
512
+ url_output = gr.JSON(label="API Response")
513
+
514
+ url_model_choice.change(
515
+ fn=lambda model_name: (
516
+ f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}"
517
+ ),
518
+ inputs=url_model_choice,
519
+ outputs=url_model_info,
520
+ )
521
+
522
+ def process_url_for_gradio(image: Optional[Image.Image], url: Optional[str], model_name: str) -> Dict:
523
+ """
524
+ Process image from file or URL for Gradio UI and return JSON response.
525
+
526
+ Args:
527
+ image: Uploaded PIL Image (optional).
528
+ url: Image URL (optional).
529
+ model_name: Model to use for detection.
530
+
531
+ Returns:
532
+ Dictionary with processed image (base64), detected objects, and confidences.
533
+ """
534
+ try:
535
+ if image is None and not url:
536
+ return {"error": "Please provide an image or URL"}
537
+ if image and url:
538
+ return {"error": "Please provide either an image or URL, not both"}
539
+
540
+ if url:
541
+ response = requests.get(url, timeout=10)
542
+ response.raise_for_status()
543
+ image = Image.open(BytesIO(response.content)).convert("RGB")
544
+
545
+ detected_image, objects, scores, unique_objects, unique_scores, _ = process(
546
+ image, model_name
547
+ )
548
+ buffered = BytesIO()
549
+ detected_image.save(buffered, format="PNG")
550
+ img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
551
+ return {
552
+ "image_url": f"data:image/png;base64,{img_base64}",
553
+ "detected_objects": objects,
554
+ "confidence_scores": scores,
555
+ "unique_objects": unique_objects,
556
+ "unique_confidence_scores": unique_scores,
557
+ }
558
+ except requests.RequestException as e:
559
+ error_msg = f"Error fetching image from URL: {str(e)}"
560
+ logger.error(f"{error_msg}\n{traceback.format_exc()}")
561
+ return {"error": error_msg}
562
+ except Exception as e:
563
+ error_msg = f"Error processing image: {str(e)}"
564
+ logger.error(f"{error_msg}\n{traceback.format_exc()}")
565
+ return {"error": error_msg}
566
+
567
+ url_submit_btn.click(
568
+ fn=process_url_for_gradio,
569
+ inputs=[image_input_json, image_url_input_json, url_model_choice],
570
+ outputs=[url_output],
571
+ )
572
+
573
+ with gr.Tab("ℹ️ Help"):
574
+ gr.Markdown(
575
+ """
576
+ ## How to Use
577
+ - **Image Upload**: Select a model, upload an image or provide a URL, and click "Detect" to see detected objects and image properties.
578
+ - **JSON Output**: Upload an image or enter a URL, select a model, and click "Process" to get results in JSON format.
579
+ - **Models**: Choose from DETR (object detection or panoptic segmentation) or YOLOS (lightweight detection).
580
+ - **Clear**: Reset all inputs and outputs using the "Clear" button in the Image Upload tab.
581
+ - **Errors**: Check the error box (Image Upload) or JSON response (JSON Output) for issues.
582
 
583
+ ## Tips
584
+ - Use high-quality images for better detection results.
585
+ - Panoptic models (e.g., DETR-ResNet-50-panoptic) provide segmentation masks for complex scenes.
586
+ - For faster processing, try YOLOS-Tiny on resource-constrained devices.
587
+ """
588
+ )
589
+
590
+ return demo
591
+
592
+ except Exception as e:
593
+ logger.error(f"Error creating Gradio UI: {str(e)}\n{traceback.format_exc()}")
594
+ raise RuntimeError(f"Failed to create Gradio UI: {str(e)}")
595
+
596
+ # ------------------------------
597
+ # Launcher
598
+ # ------------------------------
599
+
600
+ def parse_args() -> argparse.Namespace:
601
+ """
602
+ Parse command-line arguments with defaults and ignore unrecognized arguments.
603
+
604
+ Returns:
605
+ Parsed arguments as a Namespace object.
606
+
607
+ Raises:
608
+ SystemExit: If argument parsing fails (handled by argparse).
609
+ """
610
+ parser = argparse.ArgumentParser(
611
+ description="Launcher for Object Detection App with Gradio UI and optional FastAPI server."
612
+ )
613
+ parser.add_argument(
614
+ "--gradio-port",
615
+ type=int,
616
+ default=DEFAULT_GRADIO_PORT,
617
+ help=f"Port for the Gradio UI (default: {DEFAULT_GRADIO_PORT}).",
618
+ )
619
+ parser.add_argument(
620
+ "--enable-fastapi",
621
+ action="store_true",
622
+ default=False,
623
+ help="Enable the FastAPI server (disabled by default).",
624
+ )
625
+ parser.add_argument(
626
+ "--fastapi-port",
627
+ type=int,
628
+ default=DEFAULT_FASTAPI_PORT,
629
+ help=f"Port for the FastAPI server if enabled (default: {DEFAULT_FASTAPI_PORT}).",
630
+ )
631
+
632
+ # Parse known arguments and ignore unrecognized ones (e.g., Jupyter kernel args)
633
+ args, _ = parser.parse_known_args()
634
+ return args
635
+
636
+ def find_available_port(start_port: int, port_range: range, max_attempts: int) -> Optional[int]:
637
+ """
638
+ Find an available port within the specified range.
639
+
640
+ Args:
641
+ start_port: Initial port to try (e.g., from args or environment).
642
+ port_range: Range of ports to attempt.
643
+ max_attempts: Maximum number of ports to try.
644
+
645
+ Returns:
646
+ Available port number, or None if no port is found.
647
+
648
+ Raises:
649
+ OSError: If port binding fails for reasons other than port in use.
650
+ """
651
+ import errno
+ import socket
652
+
653
+ port = start_port
654
+ attempts = 0
655
+
656
+ # Check environment variable GRADIO_SERVER_PORT
657
+ env_port = os.getenv("GRADIO_SERVER_PORT")
658
+ if env_port and env_port.isdigit():
659
+ port = int(env_port)
660
+ logger.info(f"Using GRADIO_SERVER_PORT from environment: {port}")
661
+
662
+ while attempts < max_attempts:
663
+ with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
664
+ try:
665
+ s.bind(("0.0.0.0", port))
666
+ logger.debug(f"Port {port} is available")
667
+ return port
668
+ except OSError as e:
669
+ if e.errno == errno.EADDRINUSE:  # Address already in use (cross-platform)
670
+ logger.debug(f"Port {port} is in use")
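+ # Try the next candidate, wrapping back to the bottom of PORT_RANGE
+ # once the top of the range has been passed.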
671
+ port = port + 1 if port < max(port_range) else min(port_range)
672
+ attempts += 1
673
+ else:
674
+ raise
675
+ except Exception as e:
676
+ logger.error(f"Error checking port {port}: {str(e)}")
677
+ raise
678
+ logger.error(f"No available port found in range {min(port_range)}-{max(port_range)} after {max_attempts} attempts")
679
+ return None
680
+
681
+ def run_fastapi_server(host: str, port: int) -> None:
682
+ """
683
+ Run the FastAPI server using Uvicorn.
684
+
685
+ Args:
686
+ host: Host address for the FastAPI server.
687
+ port: Port for the FastAPI server.
688
+ """
689
+ try:
690
+ uvicorn.run(app, host=host, port=port)
691
+ except Exception as e:
692
+ logger.error(f"Error running FastAPI server: {str(e)}\n{traceback.format_exc()}")
693
+ sys.exit(1)
694
+
695
+ def main() -> None:
696
+ """
697
+ Main function to launch Gradio UI and optional FastAPI server.
698
+
699
+ Raises:
700
+ SystemExit: If the application is interrupted or encounters an error.
701
+ """
702
+ try:
703
+ # Apply nest_asyncio to allow nested event loops in Jupyter/Colab
704
+ nest_asyncio.apply()
705
+
706
+ # Parse command-line arguments
707
+ args = parse_args()
708
+ logger.info(f"Parsed arguments: {args}")
709
+
710
+ # Find available port for Gradio
711
+ gradio_port = find_available_port(args.gradio_port, PORT_RANGE, MAX_PORT_ATTEMPTS)
712
+ if gradio_port is None:
713
+ logger.error("Failed to find an available port for Gradio UI")
714
+ sys.exit(1)
715
+
716
+ # Launch FastAPI server in a separate thread if enabled
717
+ if args.enable_fastapi:
718
+ logger.info(f"Starting FastAPI server on port {args.fastapi_port}")
719
+ fastapi_thread = threading.Thread(
720
+ target=run_fastapi_server,
721
+ args=("0.0.0.0", args.fastapi_port),
722
+ daemon=True
723
+ )
724
+ fastapi_thread.start()
725
+
726
+ # Launch Gradio UI
727
+ logger.info(f"Starting Gradio UI on port {gradio_port}")
728
+ demo = create_gradio_ui()
729
+ demo.launch(server_port=gradio_port, server_name="0.0.0.0")
730
+
731
+ except KeyboardInterrupt:
732
+ logger.info("Application terminated by user.")
733
+ sys.exit(0)
734
+ except OSError as e:
735
+ logger.error(f"Port binding error: {str(e)}")
736
+ sys.exit(1)
737
+ except Exception as e:
738
+ logger.error(f"Error running application: {str(e)}\n{traceback.format_exc()}")
739
+ sys.exit(1)
740
 
741
  if __name__ == "__main__":
742
+ main()
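
Because `parse_args` relies on `parse_known_args` and `main()` applies `nest_asyncio`, the rewritten launcher can also be driven from a notebook. A minimal sketch, assuming the file above is saved as `app.py` on the notebook's import path:

```python
# Hypothetical notebook usage: unrecognized kernel arguments are ignored
import app

app.main()  # serves the Gradio UI on the first free port in 7860-7869
```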
hf_space/hf_space/README.md CHANGED
@@ -1,54 +1,56 @@
1
  # 🚀 Object Detection with Transformer Models
2
 
3
- This project provides an object detection system using state-of-the-art transformer models, such as **DETR (DEtection TRansformer)** and **YOLOS (You Only Look One-level Series)**. The system can detect objects from uploaded images or image URLs, and it supports different models for detection and segmentation tasks. It includes a Gradio-based web interface and a FastAPI-based API for programmatic access.
4
 
5
- You can try the demo online on Hugging Face: [Demo Link](https://huggingface.co/spaces/NeerajCodz/ObjectDetection).
6
 
7
  ## Models Supported
8
 
9
- The following models are supported, as defined in the application:
10
 
11
  - **DETR (DEtection TRansformer)**:
12
- - `facebook/detr-resnet-50`: DETR with ResNet-50 backbone for object detection. Fast and accurate for general use.
13
- - `facebook/detr-resnet-101`: DETR with ResNet-101 backbone for object detection. More accurate but slower than ResNet-50.
14
- - `facebook/detr-resnet-50-panoptic`(currently has bugs): DETR with ResNet-50 for panoptic segmentation. Detects objects and segments scenes.
15
- - `facebook/detr-resnet-101-panoptic`(currently has bugs): DETR with ResNet-101 for panoptic segmentation. High accuracy for complex scenes.
16
 
17
  - **YOLOS (You Only Look One-level Series)**:
18
- - `hustvl/yolos-tiny`: YOLOS Tiny model. Lightweight and fast, ideal for resource-constrained environments.
19
- - `hustvl/yolos-base`: YOLOS Base model. Balances speed and accuracy for object detection.
20
 
21
  ## Features
22
 
23
- - **Image Upload**: Upload images from your device for object detection via the Gradio interface.
24
- - **URL Input**: Input an image URL for detection through the Gradio interface or API.
25
  - **Model Selection**: Choose between DETR and YOLOS models for detection or panoptic segmentation.
26
- - **Object Detection**: Detects objects and highlights them with bounding boxes and confidence scores.
27
- - **Panoptic Segmentation**: Some models (e.g., DETR panoptic variants) support detailed scene segmentation with colored masks.
28
- - **Image Properties**: Displays image metadata such as format, size, aspect ratio, file size, and color statistics.
29
- - **API Access**: Use the FastAPI endpoint `/detect` to programmatically process images and retrieve detection results.
30
 
31
  ## How to Use
32
 
33
- ### 1. **Normal Git Clone Method**
34
 
35
  Follow these steps to set up the application locally:
36
 
37
  #### Prerequisites
38
 
39
  - Python 3.8 or higher
40
- - Install dependencies using `pip`
41
 
42
  #### Clone the Repository
43
 
44
  ```bash
45
- git clone https://github.com/NeerajCodz/ObjectDetection.git
46
  cd ObjectDetection
47
  ```
48
 
49
  #### Install Dependencies
50
 
51
- Install the required dependencies from `requirements.txt`:
52
 
53
  ```bash
54
  pip install -r requirements.txt
@@ -56,88 +58,150 @@ pip install -r requirements.txt
56
 
57
  #### Run the Application
58
 
59
- Start the FastAPI server using uvicorn:
60
 
61
  ```bash
62
- uvicorn objectdetection:app --reload
63
  ```
64
 
65
- Alternatively, launch the Gradio interface by running the main script:
66
 
67
  ```bash
68
- python app.py
69
  ```
70
 
71
  #### Access the Application
72
 
73
- - For FastAPI: Open your browser and navigate to `http://localhost:8000` to use the API or view the Swagger UI.
74
- - For Gradio: The Gradio interface URL will be displayed in the console (typically `http://127.0.0.1:7860`).
75
 
76
  ### 2. **Running with Docker**
77
 
78
- If you prefer to use Docker to set up and run the application, follow these steps:
79
 
80
  #### Prerequisites
81
 
82
- - Docker installed on your machine. If you don’t have Docker, download and install it from [here](https://www.docker.com/get-started).
83
 
84
- #### Build the Docker Image
85
 
86
- First, clone the repository (if you haven't already):
87
 
88
  ```bash
89
- git clone https://github.com/NeerajCodz/ObjectDetection.git
90
- cd ObjectDetection
91
  ```
92
 
93
- Now, build the Docker image:
94
 
95
  ```bash
96
- docker build -t objectdetection:latest .
97
  ```
98
 
99
- #### Run the Docker Container
100
 
101
- Once the image is built, run the application using this command:
 
103
  ```bash
104
- docker run -p 5000:5000 objectdetection:latest
105
  ```
106
 
107
- This will start the application on port 5000. Open your browser and go to `http://localhost:5000` to access the FastAPI interface.
108
 
109
  ### 3. **Demo**
110
 
111
- You can try the demo directly online through Hugging Face's Spaces:
112
 
113
  [Object Detection Demo](https://huggingface.co/spaces/NeerajCodz/ObjectDetection)
114
 
115
  ## Using the API
116
 
117
- You can interact with the application via the FastAPI `/detect` endpoint to send images and get detection results.
118
 
119
- **Endpoint**: `/detect`
120
 
121
- **POST**: `/detect`
122
 
123
- **Parameters**:
124
 
125
- - `file`: (optional) Image file (must be of type `image/*`).
126
- - `image_url`: (optional) URL of the image.
127
- - `model_name`: (optional) Choose from `facebook/detr-resnet-50`, `hustvl/yolos-tiny`, etc.
128
 
129
- **Example Request Body**:
130
 
131
- ```json
132
- {
133
- "image_url": "https://example.com/image.jpg",
134
- "model_name": "facebook/detr-resnet-50"
135
- }
136
  ```
137
 
138
- **Response**:
139
 
140
- The response includes a base64-encoded image with detections, detected objects, confidence scores, and unique objects with their scores.
141
 
142
  ```json
143
  {
@@ -149,14 +213,20 @@ The response includes a base64-encoded image with detections, detected objects,
149
  }
150
  ```
151
 
152
  ## Development Setup
153
 
154
- If you'd like to contribute or modify the application:
155
 
156
  1. Clone the repository:
157
 
158
  ```bash
159
- git clone https://github.com/NeerajCodz/ObjectDetection.git
160
  cd ObjectDetection
161
  ```
162
 
@@ -166,20 +236,37 @@ cd ObjectDetection
166
  pip install -r requirements.txt
167
  ```
168
 
169
- 3. Run the FastAPI server or Gradio interface:
170
 
171
  ```bash
172
- uvicorn objectdetection:app --reload
173
  ```
174
 
175
- or
176
 
177
  ```bash
178
- python app.py
179
  ```
180
 
181
- 4. Open your browser and navigate to `http://localhost:8000` (FastAPI) or the Gradio URL (typically `http://127.0.0.1:7860`).
182
 
183
  ## Contributing
184
 
185
- Contributions are welcome! Feel free to open issues or submit pull requests for bug fixes or new features on the [GitHub repository](https://github.com/NeerajCodz/ObjectDetection).
1
  # 🚀 Object Detection with Transformer Models
2
 
3
+ This project provides a robust object detection system leveraging state-of-the-art transformer models, including **DETR (DEtection TRansformer)** and **YOLOS (You Only Look One-level Series)**. The system supports object detection and panoptic segmentation from uploaded images or image URLs. It features a user-friendly **Gradio** web interface for interactive use and a **FastAPI** endpoint for programmatic access.
4
 
5
+ Try the online demo on Hugging Face Spaces: [Object Detection Demo](https://huggingface.co/spaces/NeerajCodz/ObjectDetection).
6
 
7
  ## Models Supported
8
 
9
+ The application supports the following models, each tailored for specific detection or segmentation tasks:
10
 
11
  - **DETR (DEtection TRansformer)**:
12
+ - `facebook/detr-resnet-50`: Fast and accurate object detection with a ResNet-50 backbone.
13
+ - `facebook/detr-resnet-101`: Higher accuracy object detection with a ResNet-101 backbone, slower than ResNet-50.
14
+ - `facebook/detr-resnet-50-panoptic`: Panoptic segmentation with ResNet-50 (note: may have stability issues).
15
+ - `facebook/detr-resnet-101-panoptic`: Panoptic segmentation with ResNet-101 (note: may have stability issues).
16
 
17
  - **YOLOS (You Only Look One-level Series)**:
18
+ - `hustvl/yolos-tiny`: Lightweight and fast, ideal for resource-constrained environments.
19
+ - `hustvl/yolos-base`: Balances speed and accuracy for object detection.
20
 
21
  ## Features
22
 
23
+ - **Image Upload**: Upload images via the Gradio interface for object detection.
24
+ - **URL Input**: Provide image URLs for detection through the Gradio interface or API.
25
  - **Model Selection**: Choose between DETR and YOLOS models for detection or panoptic segmentation.
26
+ - **Object Detection**: Highlights detected objects with bounding boxes and confidence scores.
27
+ - **Panoptic Segmentation**: Supports scene segmentation with colored masks (DETR panoptic models).
28
+ - **Image Properties**: Displays metadata like format, size, aspect ratio, file size, and color statistics.
29
+ - **API Access**: Programmatically process images via the FastAPI `/detect` endpoint.
30
+ - **Flexible Deployment**: Run locally, in Docker, or in cloud environments like Google Colab.
31
 
32
  ## How to Use
33
 
34
+ ### 1. **Local Setup (Git Clone)**
35
 
36
  Follow these steps to set up the application locally:
37
 
38
  #### Prerequisites
39
 
40
  - Python 3.8 or higher
41
+ - `pip` for installing dependencies
42
+ - Git for cloning the repository
43
 
44
  #### Clone the Repository
45
 
46
  ```bash
47
+ git clone https://github.com/NeerajCodz/ObjectDetection
48
  cd ObjectDetection
49
  ```
50
 
51
  #### Install Dependencies
52
 
53
+ Install required packages from `requirements.txt`:
54
 
55
  ```bash
56
  pip install -r requirements.txt
58
 
59
  #### Run the Application
60
 
61
+ Launch the Gradio interface:
62
 
63
  ```bash
64
+ python app.py
65
  ```
66
 
67
+ To enable the FastAPI server:
68
 
69
  ```bash
70
+ python app.py --enable-fastapi
71
  ```
72
 
73
  #### Access the Application
74
 
75
+ - **Gradio**: Open the URL displayed in the console (typically `http://127.0.0.1:7860`).
76
+ - **FastAPI**: Navigate to `http://localhost:8000` for the API or Swagger UI (if enabled).
77
 
78
  ### 2. **Running with Docker**
79
 
80
+ Use Docker for a containerized setup.
81
 
82
  #### Prerequisites
83
 
84
+ - Docker installed on your machine. Download from [Docker's official site](https://www.docker.com/get-started).
85
 
86
+ #### Pull the Docker Image
87
 
88
+ Pull the pre-built image from Docker Hub:
89
 
90
  ```bash
91
+ docker pull neerajcodz/objectdetection:latest
92
  ```
93
 
94
+ #### Run the Docker Container
95
+
96
+ Run the application on port 8080:
97
 
98
  ```bash
99
+ docker run -d -p 8080:80 neerajcodz/objectdetection:latest
100
  ```
101
 
102
+ Access the interface at `http://localhost:8080`.
103
+
104
+ #### Build and Run the Docker Image
105
 
106
+ To build the Docker image locally:
107
+
108
+ 1. Ensure you have a `Dockerfile` in the repository root (example provided in the repository).
109
+ 2. Build the image:
110
+
111
+ ```bash
112
+ docker build -t objectdetection:local .
113
+ ```
114
+
115
+ 3. Run the container:
116
 
117
  ```bash
118
+ docker run -d -p 8080:80 objectdetection:local
119
  ```
120
 
121
+ Access the interface at `http://localhost:8080`.
122
 
123
  ### 3. **Demo**
124
 
125
+ Try the demo on Hugging Face Spaces:
126
 
127
  [Object Detection Demo](https://huggingface.co/spaces/NeerajCodz/ObjectDetection)
128
 
129
+ ## Command-Line Arguments
130
+
131
+ The `app.py` script supports the following command-line arguments:
132
+
133
+ - `--gradio-port <port>`: Specify the port for the Gradio UI (default: 7860).
134
+ - Example: `python app.py --gradio-port 7870`
135
+ - `--enable-fastapi`: Enable the FastAPI server (disabled by default).
136
+ - Example: `python app.py --enable-fastapi`
137
+ - `--fastapi-port <port>`: Specify the port for the FastAPI server (default: 8000).
138
+ - Example: `python app.py --enable-fastapi --fastapi-port 8001`
139
+
140
+ You can combine arguments:
141
+
142
+ ```bash
143
+ python app.py --gradio-port 7870 --enable-fastapi --fastapi-port 8001
144
+ ```
145
+
146
+ Alternatively, set the `GRADIO_SERVER_PORT` environment variable:
147
+
148
+ ```bash
149
+ export GRADIO_SERVER_PORT=7870
150
+ python app.py
151
+ ```
152
+
153
  ## Using the API
154
 
155
+ **Note**: The FastAPI API is currently unstable and may require additional configuration for production use.
156
 
157
+ The `/detect` endpoint allows programmatic image processing.
158
 
159
+ ### Running the FastAPI Server
160
 
161
+ Enable FastAPI when launching the script:
162
 
163
+ ```bash
164
+ python app.py --enable-fastapi
165
+ ```
166
 
167
+ Or run FastAPI separately with Uvicorn:
168
 
169
+ ```bash
170
+ uvicorn objectdetection:app --host 0.0.0.0 --port 8000
171
  ```
172
 
173
+ Access the Swagger UI at `http://localhost:8000/docs` for interactive testing.
174
+
175
+ ### Endpoint Details
176
+
177
+ - **Endpoint**: `POST /detect`
178
+ - **Parameters**:
179
+ - `file`: (optional) Image file (must be `image/*` type).
180
+ - `image_url`: (optional) URL of the image.
181
+ - `model_name`: (optional) Model name (e.g., `facebook/detr-resnet-50`, `hustvl/yolos-tiny`).
182
+ - **Content-Type**: `multipart/form-data`; `file`, `image_url`, and `model_name` are all read as form fields by the endpoint.
183
+
184
+ ### Example Requests
185
 
186
+ #### Using `curl` with an Image URL
187
+
188
+ ```bash
189
+ curl -X POST "http://localhost:8000/detect" \
190
+ -F "image_url=https://example.com/image.jpg" \
191
+ -F "model_name=facebook/detr-resnet-50"
192
+ ```
193
+
194
+ #### Using `curl` with an Image File
195
+
196
+ ```bash
197
+ curl -X POST "http://localhost:8000/detect" \
198
+ -F "file=@/path/to/image.jpg" \
199
+ -F "model_name=facebook/detr-resnet-50"
200
+ ```
201
+
202
+ ### Response Format
203
+
204
+ The response includes a base64-encoded image with detections and detection details:
205
 
206
  ```json
207
  {
213
  }
214
  ```
215
 
216
+ ### Notes
217
+
218
+ - Ensure only one of `file` or `image_url` is provided.
219
+ - The API may experience instability with panoptic models; use object detection models for reliability.
220
+ - Test the API using the Swagger UI for easier debugging.
221
+
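+ For use outside the shell, the endpoint can be called with any HTTP client. A minimal Python sketch (not part of the repository; the output filename is arbitrary):
+
+ ```python
+ import base64
+ import requests
+
+ resp = requests.post(
+     "http://localhost:8000/detect",
+     # Sent as form fields, matching the endpoint's Form(...) parameters
+     data={
+         "image_url": "https://example.com/image.jpg",
+         "model_name": "facebook/detr-resnet-50",
+     },
+     timeout=30,
+ )
+ resp.raise_for_status()
+ payload = resp.json()
+
+ # "image_url" in the response is a base64 data URL; decode the part after the comma
+ png_bytes = base64.b64decode(payload["image_url"].split(",", 1)[1])
+ with open("detected.png", "wb") as f:
+     f.write(png_bytes)
+ print(payload["unique_objects"], payload["unique_confidence_scores"])
+ ```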
222
  ## Development Setup
223
 
224
+ To contribute or modify the application:
225
 
226
  1. Clone the repository:
227
 
228
  ```bash
229
+ git clone https://github.com/NeerajCodz/ObjectDetection
230
  cd ObjectDetection
231
  ```
232
 
236
  pip install -r requirements.txt
237
  ```
238
 
239
+ 3. Run the application:
240
 
241
  ```bash
242
+ python app.py
243
  ```
244
 
245
+ Or run FastAPI:
246
 
247
  ```bash
248
+ uvicorn objectdetection:app --host 0.0.0.0 --port 8000
249
  ```
250
 
251
+ 4. Access at `http://localhost:7860` (Gradio) or `http://localhost:8000` (FastAPI).
252
 
253
  ## Contributing
254
 
255
+ Contributions are welcome! To contribute:
256
+
257
+ 1. Fork the repository.
258
+ 2. Create a feature or bugfix branch (`git checkout -b feature/your-feature`).
259
+ 3. Commit changes (`git commit -m "Add your feature"`).
260
+ 4. Push to the branch (`git push origin feature/your-feature`).
261
+ 5. Open a pull request on the [GitHub repository](https://github.com/NeerajCodz/ObjectDetection).
262
+
263
+ Please include tests and documentation for new features. Report issues via GitHub Issues.
264
+
265
+ ## Troubleshooting
266
+
267
+ - **Port Conflicts**: If port 7860 is in use, specify a different port with `--gradio-port` or set `GRADIO_SERVER_PORT`.
268
+ - **Colab Issues**: Use the `--gradio-port` argument or environment variable to avoid port conflicts in Google Colab.
269
+ - **Panoptic Model Bugs**: Avoid `detr-resnet-*-panoptic` models until stability issues are resolved.
270
+ - **API Instability**: Test with smaller images and object detection models first.
271
+
272
+ For further assistance, open an issue on the [GitHub repository](https://github.com/NeerajCodz/ObjectDetection).
hf_space/hf_space/hf_space/README.md CHANGED
@@ -1,12 +1,185 @@
1
- ---
2
- title: ObjectDetection
3
- emoji: 🦀
4
- colorFrom: green
5
- colorTo: yellow
6
- sdk: gradio
7
- sdk_version: 5.29.0
8
- app_file: app.py
9
- pinned: false
10
- ---
11
-
12
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
+ # 🚀 Object Detection with Transformer Models
2
+
3
+ This project provides an object detection system using state-of-the-art transformer models, such as **DETR (DEtection TRansformer)** and **YOLOS (You Only Look One-level Series)**. The system can detect objects from uploaded images or image URLs, and it supports different models for detection and segmentation tasks. It includes a Gradio-based web interface and a FastAPI-based API for programmatic access.
4
+
5
+ You can try the demo online on Hugging Face: [Demo Link](https://huggingface.co/spaces/NeerajCodz/ObjectDetection).
6
+
7
+ ## Models Supported
8
+
9
+ The following models are supported, as defined in the application:
10
+
11
+ - **DETR (DEtection TRansformer)**:
12
+ - `facebook/detr-resnet-50`: DETR with ResNet-50 backbone for object detection. Fast and accurate for general use.
13
+ - `facebook/detr-resnet-101`: DETR with ResNet-101 backbone for object detection. More accurate but slower than ResNet-50.
14
+ - `facebook/detr-resnet-50-panoptic` (currently has bugs): DETR with ResNet-50 for panoptic segmentation. Detects objects and segments scenes.
15
+ - `facebook/detr-resnet-101-panoptic` (currently has bugs): DETR with ResNet-101 for panoptic segmentation. High accuracy for complex scenes.
16
+
17
+ - **YOLOS (You Only Look One-level Series)**:
18
+ - `hustvl/yolos-tiny`: YOLOS Tiny model. Lightweight and fast, ideal for resource-constrained environments.
19
+ - `hustvl/yolos-base`: YOLOS Base model. Balances speed and accuracy for object detection.
20
+
21
+ ## Features
22
+
23
+ - **Image Upload**: Upload images from your device for object detection via the Gradio interface.
24
+ - **URL Input**: Input an image URL for detection through the Gradio interface or API.
25
+ - **Model Selection**: Choose between DETR and YOLOS models for detection or panoptic segmentation.
26
+ - **Object Detection**: Detects objects and highlights them with bounding boxes and confidence scores.
27
+ - **Panoptic Segmentation**: Some models (e.g., DETR panoptic variants) support detailed scene segmentation with colored masks.
28
+ - **Image Properties**: Displays image metadata such as format, size, aspect ratio, file size, and color statistics.
29
+ - **API Access**: Use the FastAPI endpoint `/detect` to programmatically process images and retrieve detection results.
30
+
31
+ ## How to Use
32
+
33
+ ### 1. **Normal Git Clone Method**
34
+
35
+ Follow these steps to set up the application locally:
36
+
37
+ #### Prerequisites
38
+
39
+ - Python 3.8 or higher
40
+ - Install dependencies using `pip`
41
+
42
+ #### Clone the Repository
43
+
44
+ ```bash
45
+ git clone https://github.com/NeerajCodz/ObjectDetection.git
46
+ cd ObjectDetection
47
+ ```
48
+
49
+ #### Install Dependencies
50
+
51
+ Install the required dependencies from `requirements.txt`:
52
+
53
+ ```bash
54
+ pip install -r requirements.txt
55
+ ```
56
+
57
+ #### Run the Application
58
+
59
+ Start the FastAPI server using uvicorn:
60
+
61
+ ```bash
62
+ uvicorn objectdetection:app --reload
63
+ ```
64
+
65
+ Alternatively, launch the Gradio interface by running the main script:
66
+
67
+ ```bash
68
+ python app.py
69
+ ```
70
+
71
+ #### Access the Application
72
+
73
+ - For FastAPI: Open your browser and navigate to `http://localhost:8000` to use the API or view the Swagger UI.
74
+ - For Gradio: The Gradio interface URL will be displayed in the console (typically `http://127.0.0.1:7860`).
75
+
76
+ ### 2. **Running with Docker**
77
+
78
+ If you prefer to use Docker to set up and run the application, follow these steps:
79
+
80
+ #### Prerequisites
81
+
82
+ - Docker installed on your machine. If you don’t have Docker, download and install it from [here](https://www.docker.com/get-started).
83
+
84
+ #### Build the Docker Image
85
+
86
+ First, clone the repository (if you haven't already):
87
+
88
+ ```bash
89
+ git clone https://github.com/NeerajCodz/ObjectDetection.git
90
+ cd ObjectDetection
91
+ ```
92
+
93
+ Now, build the Docker image:
94
+
95
+ ```bash
96
+ docker build -t objectdetection:latest .
97
+ ```
98
+
99
+ #### Run the Docker Container
100
+
101
+ Once the image is built, run the application using this command:
102
+
103
+ ```bash
104
+ docker run -p 5000:5000 -e GRADIO_SERVER_NAME=0.0.0.0 -e GRADIO_SERVER_PORT=5000 objectdetection:latest
105
+ ```
106
+
107
+ This starts the Gradio app on port 5000 (the environment variables make Gradio listen on the container's exposed port). Open your browser and go to `http://localhost:5000` to access the interface.
108
+
109
+ ### 3. **Demo**
110
+
111
+ You can try the demo directly online through Hugging Face's Spaces:
112
+
113
+ [Object Detection Demo](https://huggingface.co/spaces/NeerajCodz/ObjectDetection)
114
+
115
+ ## Using the API
116
+
117
+ You can interact with the application via the FastAPI `/detect` endpoint to send images and get detection results.
118
+
119
+ **Endpoint**: `POST /detect`
122
+
123
+ **Parameters**:
124
+
125
+ - `file`: (optional) Image file (must be of type `image/*`).
126
+ - `image_url`: (optional) URL of the image.
127
+ - `model_name`: (optional) Model to use, e.g. `facebook/detr-resnet-50` or `hustvl/yolos-tiny` (default: `facebook/detr-resnet-50`).
128
+
129
+ **Example Request Body** (the endpoint accepts these as form fields; provide either `file` or `image_url`, not both):
130
+
131
+ ```json
132
+ {
133
+ "image_url": "https://example.com/image.jpg",
134
+ "model_name": "facebook/detr-resnet-50"
135
+ }
136
+ ```
137
+
138
+ **Response**:
139
+
140
+ The response contains the annotated image as a base64 data URL, the detected object labels with their confidence scores, and a deduplicated list of unique objects with their scores.
141
+
142
+ ```json
143
+ {
144
+ "image_url": "data:image/png;base64,...",
145
+ "detected_objects": ["person", "car"],
146
+ "confidence_scores": [0.95, 0.87],
147
+ "unique_objects": ["person", "car"],
148
+ "unique_confidence_scores": [0.95, 0.87]
149
+ }
150
+ ```
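+
+ For programmatic use, here is a minimal Python client sketch (it assumes the server is running locally on port 8000; the endpoint reads form fields, so the request sends form data rather than a JSON body):
+
+ ```python
+ import base64
+ import requests
+
+ # Detect objects in an image referenced by URL
+ resp = requests.post(
+     "http://localhost:8000/detect",
+     data={
+         "image_url": "https://example.com/image.jpg",
+         "model_name": "facebook/detr-resnet-50",
+     },
+     timeout=60,
+ )
+ resp.raise_for_status()
+ result = resp.json()
+ print(result["detected_objects"], result["confidence_scores"])
+
+ # The annotated image comes back as a base64 data URL; decode and save it
+ _, b64_data = result["image_url"].split(",", 1)
+ with open("detected.png", "wb") as f:
+     f.write(base64.b64decode(b64_data))
+
+ # To send a local file instead of a URL, use multipart/form-data:
+ # requests.post("http://localhost:8000/detect",
+ #               files={"file": open("image.jpg", "rb")},
+ #               data={"model_name": "facebook/detr-resnet-50"})
+ ```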
151
+
152
+ ## Development Setup
153
+
154
+ If you'd like to contribute or modify the application:
155
+
156
+ 1. Clone the repository:
157
+
158
+ ```bash
159
+ git clone https://github.com/NeerajCodz/ObjectDetection.git
160
+ cd ObjectDetection
161
+ ```
162
+
163
+ 2. Install dependencies:
164
+
165
+ ```bash
166
+ pip install -r requirements.txt
167
+ ```
168
+
169
+ 3. Run the FastAPI server or Gradio interface:
170
+
171
+ ```bash
172
+ uvicorn app:app --reload
173
+ ```
174
+
175
+ or
176
+
177
+ ```bash
178
+ python app.py
179
+ ```
180
+
181
+ 4. Open your browser and navigate to `http://localhost:8000` (FastAPI) or the Gradio URL (typically `http://127.0.0.1:7860`).
182
+
183
+ ## Contributing
184
+
185
+ Contributions are welcome! Feel free to open issues or submit pull requests for bug fixes or new features on the [GitHub repository](https://github.com/NeerajCodz/ObjectDetection).
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.huggingface.yaml ADDED
@@ -0,0 +1,7 @@
1
+ sdk: gradio
2
+ python_version: "3.10"
3
+ app_file: app.py
4
+ title: Object Detection App
5
+ subtitle: Real-time object detection in images using Gradio
6
+ hardware: cpu-basic
7
+ license: mit
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.github/workflows/docker-build-push.yml ADDED
@@ -0,0 +1,26 @@
1
+ name: Build and Push Docker Image to Docker Hub
2
+
3
+ on:
4
+ push:
5
+ branches:
6
+ - main
7
+
8
+ jobs:
9
+ build-and-push:
10
+ runs-on: ubuntu-latest
11
+ steps:
12
+ - name: Checkout code
13
+ uses: actions/checkout@v4
14
+
15
+ - name: Log in to Docker Hub
16
+ uses: docker/login-action@v3
17
+ with:
18
+ username: ${{ secrets.DOCKER_USERNAME }}
19
+ password: ${{ secrets.DOCKER_PAT }}
20
+
21
+ - name: Build and push Docker image
22
+ uses: docker/build-push-action@v6
23
+ with:
24
+ context: .
25
+ push: true
26
+ tags: ${{ secrets.DOCKER_USERNAME }}/objectdetection:latest
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.github/workflows/hf-space-sync.yml ADDED
@@ -0,0 +1,36 @@
1
+ name: Sync to Hugging Face Space
2
+
3
+ on:
4
+ push:
5
+ branches: [ main ]
6
+
7
+ jobs:
8
+ deploy-to-hf-space:
9
+ runs-on: ubuntu-latest
10
+
11
+ steps:
12
+ - name: Checkout Repository
13
+ uses: actions/checkout@v3
14
+
15
+ - name: Install Git
16
+ run: sudo apt-get install git
17
+
18
+ - name: Push to Hugging Face Space
19
+ env:
20
+ HF_TOKEN: ${{ secrets.HF_TOKEN }}
21
+ HF_USERNAME: ${{ secrets.HF_USERNAME }}
22
+ EMAIL: ${{ secrets.EMAIL }}
23
+ run: |
24
+ git config --global user.email $EMAIL
25
+ git config --global user.name $HF_USERNAME
26
+
27
+ git clone https://$HF_USERNAME:[email protected]/spaces/$HF_USERNAME/ObjectDetection hf_space
28
+ rsync -av --exclude='.git' ./ hf_space/
29
+ cd hf_space
30
+ git add .
31
+ if git diff --cached --quiet; then
32
+ echo "✅ No changes to commit."
33
+ else
34
+ git commit -m "Sync from GitHub"
35
+ git push
36
+ fi
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.gitignore ADDED
@@ -0,0 +1,5 @@
1
+ __pycache__/
2
+ venv/
3
+ *.pyc
4
+ .DS_Store
5
+ .env
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/Dockerfile ADDED
@@ -0,0 +1,13 @@
1
+ FROM python:3.11-slim
2
+
3
+ WORKDIR /app
4
+
5
+ COPY requirements.txt .
6
+
7
+ RUN pip install --no-cache-dir -r requirements.txt
8
+
9
+ COPY app.py .
10
+
11
+ # app.py launches the Gradio UI by default; make it listen on all interfaces
+ # at the exposed port so the container is reachable from the host.
+ ENV GRADIO_SERVER_NAME=0.0.0.0 GRADIO_SERVER_PORT=5000
+
+ EXPOSE 5000
12
+
13
+ CMD ["python", "app.py"]
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Neeraj Sathish Kumar
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/app.py ADDED
@@ -0,0 +1,384 @@
1
+ import gradio as gr
2
+ import torch
3
+ from transformers import DetrImageProcessor, DetrForObjectDetection
4
+ from transformers import YolosImageProcessor, YolosForObjectDetection
5
+ from transformers import DetrForSegmentation
6
+ from PIL import Image, ImageDraw, ImageStat
7
+ import requests
8
+ from io import BytesIO
9
+ import base64
10
+ from collections import Counter
11
+ import logging
12
+ from fastapi import FastAPI, File, UploadFile, HTTPException, Form
13
+ from fastapi.responses import JSONResponse
14
+ import uvicorn
15
+ import pandas as pd
16
+ import traceback
17
+ import os
18
+
19
+ # Set up logging
20
+ logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
21
+ logger = logging.getLogger(__name__)
22
+
23
+ # Constants
24
+ CONFIDENCE_THRESHOLD = 0.5
25
+ VALID_MODELS = [
26
+ "facebook/detr-resnet-50",
27
+ "facebook/detr-resnet-101",
28
+ "facebook/detr-resnet-50-panoptic",
29
+ "facebook/detr-resnet-101-panoptic",
30
+ "hustvl/yolos-tiny",
31
+ "hustvl/yolos-base"
32
+ ]
33
+ MODEL_DESCRIPTIONS = {
34
+ "facebook/detr-resnet-50": "DETR with ResNet-50 backbone for object detection. Fast and accurate for general use.",
35
+ "facebook/detr-resnet-101": "DETR with ResNet-101 backbone for object detection. More accurate but slower than ResNet-50.",
36
+ "facebook/detr-resnet-50-panoptic": "DETR with ResNet-50 for panoptic segmentation. Detects objects and segments scenes.",
37
+ "facebook/detr-resnet-101-panoptic": "DETR with ResNet-101 for panoptic segmentation. High accuracy for complex scenes.",
38
+ "hustvl/yolos-tiny": "YOLOS Tiny model. Lightweight and fast, ideal for resource-constrained environments.",
39
+ "hustvl/yolos-base": "YOLOS Base model. Balances speed and accuracy for object detection."
40
+ }
41
+
42
+ # Lazy model loading
43
+ models = {}
44
+ processors = {}
45
+
46
+ def process(image, model_name):
47
+ """Process an image and return detected image, objects, confidences, unique objects, unique confidences, and properties."""
48
+ try:
49
+ if model_name not in VALID_MODELS:
50
+ raise ValueError(f"Invalid model: {model_name}. Choose from: {VALID_MODELS}")
51
+
52
+ # Load model and processor
53
+ if model_name not in models:
54
+ logger.info(f"Loading model: {model_name}")
55
+ if "yolos" in model_name:
56
+ models[model_name] = YolosForObjectDetection.from_pretrained(model_name)
57
+ processors[model_name] = YolosImageProcessor.from_pretrained(model_name)
58
+ elif "panoptic" in model_name:
59
+ models[model_name] = DetrForSegmentation.from_pretrained(model_name)
60
+ processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
61
+ else:
62
+ models[model_name] = DetrForObjectDetection.from_pretrained(model_name)
63
+ processors[model_name] = DetrImageProcessor.from_pretrained(model_name)
64
+
65
+ model, processor = models[model_name], processors[model_name]
66
+ inputs = processor(images=image, return_tensors="pt")
67
+
68
+ with torch.no_grad():
69
+ outputs = model(**inputs)
70
+
71
+ target_sizes = torch.tensor([image.size[::-1]])
72
+ draw = ImageDraw.Draw(image)
73
+ object_names = []
74
+ confidence_scores = []
75
+ object_counter = Counter()
76
+
77
+ if "panoptic" in model_name:
78
+ processed_sizes = torch.tensor([[inputs["pixel_values"].shape[2], inputs["pixel_values"].shape[3]]])
79
+ results = processor.post_process_panoptic(outputs, target_sizes=target_sizes, processed_sizes=processed_sizes)[0]
80
+
81
+ for segment in results["segments_info"]:
82
+ label = segment["label_id"]
83
+ label_name = model.config.id2label.get(label, "Unknown")
84
+ score = segment.get("score", 1.0)
85
+
86
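+ # NOTE: post_process_panoptic typically encodes masks in a "png_string" rather than
+ # exposing a "masks" list, so this branch may be skipped silently; this is likely
+ # one source of the known panoptic-model bugs.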
+ if "masks" in results and segment["id"] < len(results["masks"]):
87
+ mask = results["masks"][segment["id"]].cpu().numpy()
88
+ if mask.shape[0] > 0 and mask.shape[1] > 0:
89
+ mask_image = Image.fromarray((mask * 255).astype("uint8"))
90
+ colored_mask = Image.new("RGBA", image.size, (0, 0, 0, 0))
91
+ mask_draw = ImageDraw.Draw(colored_mask)
92
+ r, g, b = (segment["id"] * 50) % 255, (segment["id"] * 100) % 255, (segment["id"] * 150) % 255
93
+ mask_draw.bitmap((0, 0), mask_image, fill=(r, g, b, 128))
94
+ image = Image.alpha_composite(image.convert("RGBA"), colored_mask).convert("RGB")
95
+ draw = ImageDraw.Draw(image)
96
+
97
+ if score > CONFIDENCE_THRESHOLD:
98
+ object_names.append(label_name)
99
+ confidence_scores.append(float(score))
100
+ object_counter[label_name] = max(object_counter[label_name], float(score))  # keep the highest score per label
101
+ else:
102
+ results = processor.post_process_object_detection(outputs, target_sizes=target_sizes)[0]
103
+
104
+ for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
105
+ if score > CONFIDENCE_THRESHOLD:
106
+ x, y, x2, y2 = box.tolist()
107
+ draw.rectangle([x, y, x2, y2], outline="#32CD32", width=2)
108
+ label_name = model.config.id2label.get(label.item(), "Unknown")
109
+ # Place text at top-right corner, outside the box, with smaller size
110
+ text = f"{label_name}: {score:.2f}"
111
+ text_bbox = draw.textbbox((0, 0), text)
112
+ text_width, text_height = text_bbox[2] - text_bbox[0], text_bbox[3] - text_bbox[1]
113
+ draw.text((x2 - text_width - 2, y - text_height - 2), text, fill="#32CD32")
114
+ object_names.append(label_name)
115
+ confidence_scores.append(float(score))
116
+ object_counter[label_name] = max(object_counter[label_name], float(score))  # keep the highest score per label
117
+
118
+ unique_objects = list(object_counter.keys())
119
+ unique_confidences = [object_counter[obj] for obj in unique_objects]
120
+
121
+ # Image properties
122
+ file_size = "Unknown"
123
+ if hasattr(image, "fp") and image.fp is not None:
124
+ buffered = BytesIO()
125
+ image.save(buffered, format="PNG")
126
+ file_size = f"{len(buffered.getvalue()) / 1024:.2f} KB"
127
+
128
+ # Color statistics
129
+ try:
130
+ stat = ImageStat.Stat(image)
131
+ color_stats = {
132
+ "mean": [f"{m:.2f}" for m in stat.mean],
133
+ "stddev": [f"{s:.2f}" for s in stat.stddev]
134
+ }
135
+ except Exception as e:
136
+ logger.error(f"Error calculating color statistics: {str(e)}")
137
+ color_stats = {"mean": "Error", "stddev": "Error"}
138
+
139
+ properties = {
140
+ "Format": image.format if hasattr(image, "format") and image.format else "Unknown",
141
+ "Size": f"{image.width}x{image.height}",
142
+ "Width": f"{image.width} px",
143
+ "Height": f"{image.height} px",
144
+ "Mode": image.mode,
145
+ "Aspect Ratio": f"{round(image.width / image.height, 2) if image.height != 0 else 'Undefined'}",
146
+ "File Size": file_size,
147
+ "Mean (R,G,B)": ", ".join(color_stats["mean"]) if isinstance(color_stats["mean"], list) else color_stats["mean"],
148
+ "StdDev (R,G,B)": ", ".join(color_stats["stddev"]) if isinstance(color_stats["stddev"], list) else color_stats["stddev"]
149
+ }
150
+
151
+ return image, object_names, confidence_scores, unique_objects, unique_confidences, properties
152
+ except Exception as e:
153
+ logger.error(f"Error in process: {str(e)}\n{traceback.format_exc()}")
154
+ raise
155
+
156
+ # FastAPI Setup
157
+ app = FastAPI(title="Object Detection API")
158
+
159
+ @app.post("/detect")
160
+ async def detect_objects_endpoint(
161
+ file: UploadFile = File(None),
162
+ image_url: str = Form(None),
163
+ model_name: str = Form(VALID_MODELS[0])
164
+ ):
165
+ """FastAPI endpoint to detect objects in an image from file or URL."""
166
+ try:
167
+ if (file is None and not image_url) or (file is not None and image_url):
168
+ raise HTTPException(status_code=400, detail="Provide either an image file or an image URL, but not both.")
169
+
170
+ if file:
171
+ if not file.content_type.startswith("image/"):
172
+ raise HTTPException(status_code=400, detail="File must be an image")
173
+ contents = await file.read()
174
+ image = Image.open(BytesIO(contents)).convert("RGB")
175
+ else:
176
+ response = requests.get(image_url, timeout=10)
177
+ response.raise_for_status()
178
+ image = Image.open(BytesIO(response.content)).convert("RGB")
179
+
180
+ if model_name not in VALID_MODELS:
181
+ raise HTTPException(status_code=400, detail=f"Invalid model. Choose from: {VALID_MODELS}")
182
+
183
+ detected_image, detected_objects, detected_confidences, unique_objects, unique_confidences, _ = process(image, model_name)
184
+
185
+ buffered = BytesIO()
186
+ detected_image.save(buffered, format="PNG")
187
+ img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
188
+ img_url = f"data:image/png;base64,{img_base64}"
189
+
190
+ return JSONResponse(content={
191
+ "image_url": img_url,
192
+ "detected_objects": detected_objects,
193
+ "confidence_scores": detected_confidences,
194
+ "unique_objects": unique_objects,
195
+ "unique_confidence_scores": unique_confidences
196
+ })
197
+ except Exception as e:
198
+ logger.error(f"Error in FastAPI endpoint: {str(e)}\n{traceback.format_exc()}")
199
+ raise HTTPException(status_code=500, detail=f"Error processing image: {str(e)}")
200
+
201
+ # Gradio UI
202
+ def create_gradio_ui():
203
+ with gr.Blocks(theme=gr.themes.Default(primary_hue="blue", secondary_hue="gray")) as demo:
204
+ gr.Markdown(
205
+ """
206
+ # 🚀 Object Detection App
207
+ Upload an image or provide a URL to detect objects using state-of-the-art transformer models (DETR, YOLOS).
208
+ """
209
+ )
210
+
211
+ with gr.Tabs():
212
+ with gr.Tab("📷 Image Upload"):
213
+ with gr.Row():
214
+ with gr.Column(scale=1):
215
+ gr.Markdown("### Input")
216
+ model_choice = gr.Dropdown(
217
+ choices=VALID_MODELS,
218
+ value=VALID_MODELS[0],
219
+ label="🔎 Select Model",
220
+ info="Choose a model for object detection or panoptic segmentation."
221
+ )
222
+ model_info = gr.Markdown(
223
+ f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
224
+ visible=True
225
+ )
226
+ image_input = gr.Image(type="pil", label="📷 Upload Image")
227
+ image_url_input = gr.Textbox(
228
+ label="🔗 Image URL",
229
+ placeholder="https://example.com/image.jpg"
230
+ )
231
+ with gr.Row():
232
+ submit_btn = gr.Button("✨ Detect", variant="primary")
233
+ clear_btn = gr.Button("🗑️ Clear", variant="secondary")
234
+
235
+ model_choice.change(
236
+ fn=lambda model_name: f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}",
237
+ inputs=model_choice,
238
+ outputs=model_info
239
+ )
240
+
241
+ with gr.Column(scale=2):
242
+ gr.Markdown("### Results")
243
+ error_output = gr.Textbox(
244
+ label="⚠️ Errors",
245
+ visible=False,
246
+ lines=3,
247
+ max_lines=5
248
+ )
249
+ output_image = gr.Image(
250
+ type="pil",
251
+ label="🎯 Detected Image",
252
+ interactive=False
253
+ )
254
+ with gr.Row():
255
+ objects_output = gr.DataFrame(
256
+ label="📋 Detected Objects",
257
+ interactive=False,
258
+ value=None
259
+ )
260
+ unique_objects_output = gr.DataFrame(
261
+ label="🔍 Unique Objects",
262
+ interactive=False,
263
+ value=None
264
+ )
265
+ properties_output = gr.DataFrame(
266
+ label="📄 Image Properties",
267
+ interactive=False,
268
+ value=None
269
+ )
270
+
271
+ def process_for_gradio(image, url, model_name):
272
+ try:
273
+ if image is None and not url:
274
+ return None, None, None, None, "Please provide an image or URL"
275
+ if image and url:
276
+ return None, None, None, None, "Please provide either an image or URL, not both"
277
+
278
+ if url:
279
+ response = requests.get(url, timeout=10)
280
+ response.raise_for_status()
281
+ image = Image.open(BytesIO(response.content)).convert("RGB")
282
+
283
+ detected_image, objects, scores, unique_objects, unique_scores, properties = process(image, model_name)
284
+ objects_df = pd.DataFrame({
285
+ "Object": objects,
286
+ "Confidence Score": [f"{score:.2f}" for score in scores]
287
+ }) if objects else pd.DataFrame(columns=["Object", "Confidence Score"])
288
+ unique_objects_df = pd.DataFrame({
289
+ "Unique Object": unique_objects,
290
+ "Confidence Score": [f"{score:.2f}" for score in unique_scores]
291
+ }) if unique_objects else pd.DataFrame(columns=["Unique Object", "Confidence Score"])
292
+ properties_df = pd.DataFrame([properties]) if properties else pd.DataFrame(columns=properties.keys())
293
+ return detected_image, objects_df, unique_objects_df, properties_df, ""
294
+ except Exception as e:
295
+ error_msg = f"Error processing image: {str(e)}"
296
+ logger.error(f"{error_msg}\n{traceback.format_exc()}")
297
+ return None, None, None, None, error_msg
298
+
299
+ submit_btn.click(
300
+ fn=process_for_gradio,
301
+ inputs=[image_input, image_url_input, model_choice],
302
+ outputs=[output_image, objects_output, unique_objects_output, properties_output, error_output]
303
+ )
304
+
305
+ clear_btn.click(
306
+ fn=lambda: [None, "", None, None, None, None, ""],  # one value per output component, including the error box
307
+ inputs=None,
308
+ outputs=[image_input, image_url_input, output_image, objects_output, unique_objects_output, properties_output, error_output]
309
+ )
310
+
311
+ with gr.Tab("🔗 URL Input"):
312
+ gr.Markdown("### Process Image from URL")
313
+ image_url_input = gr.Textbox(
314
+ label="🔗 Image URL",
315
+ placeholder="https://example.com/image.jpg"
316
+ )
317
+ url_model_choice = gr.Dropdown(
318
+ choices=VALID_MODELS,
319
+ value=VALID_MODELS[0],
320
+ label="🔎 Select Model"
321
+ )
322
+ url_model_info = gr.Markdown(
323
+ f"**Model Info**: {MODEL_DESCRIPTIONS[VALID_MODELS[0]]}",
324
+ visible=True
325
+ )
326
+ url_submit_btn = gr.Button("🔄 Process URL", variant="primary")
327
+ url_output = gr.JSON(label="API Response")
328
+
329
+ url_model_choice.change(
330
+ fn=lambda model_name: f"**Model Info**: {MODEL_DESCRIPTIONS.get(model_name, 'No description available.')}",
331
+ inputs=url_model_choice,
332
+ outputs=url_model_info
333
+ )
334
+
335
+ def process_url_for_gradio(url, model_name):
336
+ try:
337
+ response = requests.get(url, timeout=10)
338
+ response.raise_for_status()
339
+ image = Image.open(BytesIO(response.content)).convert("RGB")
340
+ detected_image, objects, scores, unique_objects, unique_scores, _ = process(image, model_name)
341
+ buffered = BytesIO()
342
+ detected_image.save(buffered, format="PNG")
343
+ img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
344
+ return {
345
+ "image_url": f"data:image/png;base64,{img_base64}",
346
+ "detected_objects": objects,
347
+ "confidence_scores": scores,
348
+ "unique_objects": unique_objects,
349
+ "unique_confidence_scores": unique_scores
350
+ }
351
+ except Exception as e:
352
+ error_msg = f"Error processing URL: {str(e)}"
353
+ logger.error(f"{error_msg}\n{traceback.format_exc()}")
354
+ return {"error": error_msg}
355
+
356
+ url_submit_btn.click(
357
+ fn=process_url_for_gradio,
358
+ inputs=[image_url_input, url_model_choice],
359
+ outputs=[url_output]
360
+ )
361
+
362
+ with gr.Tab("ℹ️ Help"):
363
+ gr.Markdown(
364
+ """
365
+ ## How to Use
366
+ - **Image Upload**: Select a model, upload an image or provide a URL, and click "Detect" to see detected objects and image properties.
367
+ - **URL Input**: Enter an image URL, select a model, and click "Process URL" to get results in JSON format.
368
+ - **Models**: Choose from DETR (object detection or panoptic segmentation) or YOLOS (lightweight detection).
369
+ - **Clear**: Reset all inputs and outputs using the "Clear" button.
370
+ - **Errors**: Check the error box for any processing issues.
371
+
372
+ ## Tips
373
+ - Use high-quality images for better detection results.
374
+ - Panoptic models (e.g., DETR-ResNet-50-panoptic) provide segmentation masks for complex scenes.
375
+ - For faster processing, try YOLOS-Tiny on resource-constrained devices.
376
+ """
377
+ )
378
+
379
+ return demo
380
+
381
+ if __name__ == "__main__":
382
+ demo = create_gradio_ui()
383
+ demo.launch()
384
+ # To run the FastAPI server instead, use: uvicorn app:app --host 0.0.0.0 --port 8000
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.gitattributes ADDED
@@ -0,0 +1,35 @@
1
+ *.7z filter=lfs diff=lfs merge=lfs -text
2
+ *.arrow filter=lfs diff=lfs merge=lfs -text
3
+ *.bin filter=lfs diff=lfs merge=lfs -text
4
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
5
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
6
+ *.ftz filter=lfs diff=lfs merge=lfs -text
7
+ *.gz filter=lfs diff=lfs merge=lfs -text
8
+ *.h5 filter=lfs diff=lfs merge=lfs -text
9
+ *.joblib filter=lfs diff=lfs merge=lfs -text
10
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
11
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
12
+ *.model filter=lfs diff=lfs merge=lfs -text
13
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
14
+ *.npy filter=lfs diff=lfs merge=lfs -text
15
+ *.npz filter=lfs diff=lfs merge=lfs -text
16
+ *.onnx filter=lfs diff=lfs merge=lfs -text
17
+ *.ot filter=lfs diff=lfs merge=lfs -text
18
+ *.parquet filter=lfs diff=lfs merge=lfs -text
19
+ *.pb filter=lfs diff=lfs merge=lfs -text
20
+ *.pickle filter=lfs diff=lfs merge=lfs -text
21
+ *.pkl filter=lfs diff=lfs merge=lfs -text
22
+ *.pt filter=lfs diff=lfs merge=lfs -text
23
+ *.pth filter=lfs diff=lfs merge=lfs -text
24
+ *.rar filter=lfs diff=lfs merge=lfs -text
25
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
26
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
27
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
28
+ *.tar filter=lfs diff=lfs merge=lfs -text
29
+ *.tflite filter=lfs diff=lfs merge=lfs -text
30
+ *.tgz filter=lfs diff=lfs merge=lfs -text
31
+ *.wasm filter=lfs diff=lfs merge=lfs -text
32
+ *.xz filter=lfs diff=lfs merge=lfs -text
33
+ *.zip filter=lfs diff=lfs merge=lfs -text
34
+ *.zst filter=lfs diff=lfs merge=lfs -text
35
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/README.md ADDED
@@ -0,0 +1,12 @@
1
+ ---
2
+ title: ObjectDetection
3
+ emoji: 🦀
4
+ colorFrom: green
5
+ colorTo: yellow
6
+ sdk: gradio
7
+ sdk_version: 5.29.0
8
+ app_file: app.py
9
+ pinned: false
10
+ ---
11
+
12
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/requirements.txt ADDED
@@ -0,0 +1,8 @@
1
+ transformers
2
+ torch
3
+ tensorflow
4
+ gradio
5
+ pillow
6
+ timm
7
+ fastapi
8
+ requests
requirements.txt CHANGED
@@ -5,4 +5,7 @@ gradio
5
  pillow
6
  timm
7
  fastapi
8
- requests
5
  pillow
6
  timm
7
  fastapi
8
+ requests
9
+ uvicorn
10
+ pandas
11
+ nest_asyncio