🚀 Object Detection with Transformer Models

This project provides a robust object detection system leveraging state-of-the-art transformer models, including DETR (DEtection TRansformer) and YOLOS (You Only Look One-level Series). The system supports object detection and panoptic segmentation from uploaded images or image URLs. It features a user-friendly Gradio web interface for interactive use and a FastAPI endpoint for programmatic access.

Try the online demo on Hugging Face Spaces: Object Detection Demo.

Models Supported

The application supports the following models, each tailored for specific detection or segmentation tasks:

  • DETR (DEtection TRansformer):

    • facebook/detr-resnet-50: Fast and accurate object detection with a ResNet-50 backbone.
    • facebook/detr-resnet-101: Higher accuracy object detection with a ResNet-101 backbone, slower than ResNet-50.
    • facebook/detr-resnet-50-panoptic: Panoptic segmentation with ResNet-50 (note: may have stability issues).
    • facebook/detr-resnet-101-panoptic: Panoptic segmentation with ResNet-101 (note: may have stability issues).
  • YOLOS (You Only Look One-level Series):

    • hustvl/yolos-tiny: Lightweight and fast, ideal for resource-constrained environments.
    • hustvl/yolos-base: Balances speed and accuracy for object detection.

Features

  • Image Upload: Upload images via the Gradio interface for object detection.
  • URL Input: Provide image URLs for detection through the Gradio interface or API.
  • Model Selection: Choose between DETR and YOLOS models for detection or panoptic segmentation.
  • Object Detection: Highlights detected objects with bounding boxes and confidence scores.
  • Panoptic Segmentation: Supports scene segmentation with colored masks (DETR panoptic models).
  • Image Properties: Displays metadata like format, size, aspect ratio, file size, and color statistics.
  • API Access: Programmatically process images via the FastAPI /detect endpoint.
  • Flexible Deployment: Run locally, in Docker, or in cloud environments like Google Colab.

How to Use

1. Local Setup (Git Clone)

Follow these steps to set up the application locally:

Prerequisites

  • Python 3.8 or higher
  • pip for installing dependencies
  • Git for cloning the repository

Clone the Repository

git clone https://github.com/NeerajCodz/ObjectDetection
cd ObjectDetection

Install Dependencies

Install required packages from requirements.txt:

pip install -r requirements.txt

Run the Application

Launch the Gradio interface:

python app.py

To enable the FastAPI server:

python app.py --enable-fastapi

Access the Application

  • Gradio: Open the URL displayed in the console (typically http://127.0.0.1:7860).
  • FastAPI: Navigate to http://localhost:8000 for the API, or to http://localhost:8000/docs for the Swagger UI (if FastAPI is enabled).

2. Running with Docker

Use Docker for a containerized setup.

Prerequisites

  • Docker installed and running

Pull the Docker Image

Pull the pre-built image from Docker Hub:

docker pull neerajcodz/objectdetection:latest

Run the Docker Container

Run the application on port 8080:

docker run -d -p 8080:80 neerajcodz/objectdetection:latest

Access the interface at http://localhost:8080.

Build and Run the Docker Image

To build the Docker image locally:

  1. Ensure you have a Dockerfile in the repository root (example provided in the repository).
  2. Build the image:
docker build -t objectdetection:local .
  3. Run the container:
docker run -d -p 8080:80 objectdetection:local

Access the interface at http://localhost:8080.

3. Demo

Try the demo on Hugging Face Spaces:

Object Detection Demo

Command-Line Arguments

The app.py script supports the following command-line arguments:

  • --gradio-port <port>: Specify the port for the Gradio UI (default: 7860).
    • Example: python app.py --gradio-port 7870
  • --enable-fastapi: Enable the FastAPI server (disabled by default).
    • Example: python app.py --enable-fastapi
  • --fastapi-port <port>: Specify the port for the FastAPI server (default: 8000).
    • Example: python app.py --enable-fastapi --fastapi-port 8001
  • --confidence-threshold <float-value>: Confidence threshold for detection (range: 0 to 1; default: 0.5).
    • Example: python app.py --confidence-threshold 0.75

You can combine arguments:

python app.py --gradio-port 7870 --enable-fastapi --fastapi-port 8001 --confidence-threshold 0.75

Alternatively, set the GRADIO_SERVER_PORT environment variable:

export GRADIO_SERVER_PORT=7870
python app.py
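
The flags above could be parsed along the following lines. This is a minimal sketch using the standard library's argparse, not the parser that app.py actually ships; the helper names `threshold` and `build_parser`, and the choice to let an explicit --gradio-port override GRADIO_SERVER_PORT, are assumptions.

```python
import argparse
import os


def threshold(value: str) -> float:
    """Parse a confidence threshold and reject values outside the range 0 to 1."""
    number = float(value)
    if not 0.0 <= number <= 1.0:
        raise argparse.ArgumentTypeError("confidence threshold must be between 0 and 1")
    return number


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Object detection app (illustrative sketch)")
    # Assumption: GRADIO_SERVER_PORT supplies the default when --gradio-port is omitted.
    parser.add_argument("--gradio-port", type=int,
                        default=int(os.environ.get("GRADIO_SERVER_PORT", 7860)))
    parser.add_argument("--enable-fastapi", action="store_true")
    parser.add_argument("--fastapi-port", type=int, default=8000)
    parser.add_argument("--confidence-threshold", type=threshold, default=0.5)
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Validating the threshold at parse time means an out-of-range value fails immediately with a usage message instead of surfacing later during inference.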

Using the API

Note: The FastAPI API is currently unstable and may require additional configuration for production use.

The /detect endpoint allows programmatic image processing.

Running the FastAPI Server

Enable FastAPI when launching the script:

python app.py --enable-fastapi

Or run FastAPI separately with Uvicorn:

uvicorn objectdetection:app --host 0.0.0.0 --port 8000

Access the Swagger UI at http://localhost:8000/docs for interactive testing.

Endpoint Details

  • Endpoint: POST /detect
  • Parameters:
    • file: (optional) Image file (must be image/* type).
    • image_url: (optional) URL of the image.
    • model_name: (optional) Model name (e.g., facebook/detr-resnet-50, hustvl/yolos-tiny).
  • Content-Type: multipart/form-data for file uploads, application/json for URL inputs.

Example Requests

Using curl with an Image URL

curl -X POST "http://localhost:8000/detect" \
  -H "Content-Type: application/json" \
  -d '{"image_url": "https://example.com/image.jpg", "model_name": "facebook/detr-resnet-50"}'

Using curl with an Image File

curl -X POST "http://localhost:8000/detect" \
  -F "file=@/path/to/image.jpg" \
  -F "model_name=facebook/detr-resnet-50"
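
The URL-based request can also be issued from Python. The sketch below uses only the standard library and mirrors the JSON curl example; the function names `build_detect_request` and `detect` are hypothetical, and the code assumes the server is running locally on the default port 8000.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/detect"  # default FastAPI port from this README


def build_detect_request(image_url: str,
                         model_name: str = "facebook/detr-resnet-50") -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the /detect endpoint."""
    payload = json.dumps({"image_url": image_url, "model_name": model_name}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def detect(image_url: str,
           model_name: str = "facebook/detr-resnet-50",
           timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_detect_request(image_url, model_name)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `detect("https://example.com/image.jpg")` requires the FastAPI server to be running. Separating request construction from sending keeps the payload easy to inspect when debugging.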

Response Format

The response includes a base64-encoded image with detections and detection details:

{
  "image_url": "data:image/png;base64,...",
  "detected_objects": ["person", "car"],
  "confidence_scores": [0.95, 0.87],
  "unique_objects": ["person", "car"],
  "unique_confidence_scores": [0.95, 0.87]
}
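
A response of this shape might be unpacked as follows. This is a sketch under two assumptions: that image_url is always a base64 data URL, and that detected_objects and confidence_scores are index-aligned. The helper names `decode_data_url` and `pair_detections` are illustrative.

```python
import base64


def decode_data_url(data_url: str) -> bytes:
    """Return the raw image bytes from a base64 data URL such as 'data:image/png;base64,...'."""
    header, _, encoded = data_url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    return base64.b64decode(encoded)


def pair_detections(response: dict) -> list:
    """Pair each detected label with its score (assumes the two lists are index-aligned)."""
    return list(zip(response["detected_objects"], response["confidence_scores"]))
```

The bytes returned by `decode_data_url` can be written directly to a .png file or passed to an image library for display.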

Notes

  • Ensure only one of file or image_url is provided.
  • The API may experience instability with panoptic models; use object detection models for reliability.
  • Test the API using the Swagger UI for easier debugging.
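
The first note can be enforced on the client before a request is sent. The check below is a hypothetical helper, not part of the project's API, and it assumes that supplying neither input is also an error.

```python
from typing import Optional


def check_image_source(file: Optional[bytes] = None,
                       image_url: Optional[str] = None) -> str:
    """Return which input to send ("file" or "image_url"); raise if both or neither are given."""
    if file is not None and image_url:
        raise ValueError("Provide either file or image_url, not both.")
    if file is None and not image_url:
        raise ValueError("Provide one of file or image_url.")
    return "file" if file is not None else "image_url"
```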

Development Setup

To contribute or modify the application:

  1. Clone the repository:
git clone https://github.com/NeerajCodz/ObjectDetection
cd ObjectDetection
  2. Install dependencies:
pip install -r requirements.txt
  3. Run the application:
python app.py

Or run FastAPI:

uvicorn objectdetection:app --host 0.0.0.0 --port 8000
  4. Access at http://localhost:7860 (Gradio) or http://localhost:8000 (FastAPI).

Contributing

Contributions are welcome! To contribute:

  1. Fork the repository.
  2. Create a feature or bugfix branch (git checkout -b feature/your-feature).
  3. Commit changes (git commit -m "Add your feature").
  4. Push to the branch (git push origin feature/your-feature).
  5. Open a pull request on the GitHub repository.

Please include tests and documentation for new features. Report issues via GitHub Issues.

Troubleshooting

  • Port Conflicts: If port 7860 is in use, specify a different port with --gradio-port or set GRADIO_SERVER_PORT.
    • Example: python app.py --gradio-port 7870
  • Colab Asyncio Error: The application uses nest_asyncio to avoid RuntimeError: asyncio.run() cannot be called from a running event loop in Colab. If the error still appears, ensure nest_asyncio is installed (pip install nest_asyncio).
  • Panoptic Model Bugs: Avoid detr-resnet-*-panoptic models until stability issues are resolved.
  • API Instability: Test with smaller images and object detection models first.
  • FastAPI Not Starting: Ensure --enable-fastapi is used, and check that the specified --fastapi-port (default: 8000) is available.
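
To diagnose the port-related items above, a quick check like the following can tell whether a port is already taken before launching the app. It is a standalone sketch; `port_is_free` is a hypothetical helper and is not part of app.py.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": another process holds the port.
            return False
        return True


if __name__ == "__main__":
    for name, port in (("Gradio", 7860), ("FastAPI", 8000)):
        status = "free" if port_is_free(port) else "in use"
        print(f"{name} port {port}: {status}")
```

If a default port is reported as in use, pass a different one with --gradio-port or --fastapi-port.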

For further assistance, open an issue on the GitHub repository.