title: Objectlocalization
emoji: π
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.18.0
app_file: src/app.py
pinned: false
short_description: Using Faster R-CNN (ResNet-50, RPN) to detect LEGO pieces
# LEGO Object Detection using Faster R-CNN

[Faster R-CNN](https://pytorch.org/vision/main/models/generated/torchvision.models.detection.fasterrcnn_resnet50_fpn.html)

This project trains a **Faster R-CNN** model with a **ResNet-50 backbone** to detect LEGO objects using a custom dataset.

---

## Project Overview

This project implements an object detection system for LEGO pieces, built from the three components of the Faster R-CNN architecture:

1. **ResNet-50 Backbone**:
   - Serves as the feature extractor
   - Pre-trained on ImageNet for robust feature learning
   - Uses residual connections, which make deep networks easier to train
2. **Region Proposal Network (RPN)**:
   - Scans the image and proposes potential object regions
   - Generates anchor boxes of various scales and aspect ratios
   - Outputs "objectness" scores and bounding box refinements
3. **Fast R-CNN Detection Head**:
   - Performs final classification and bounding box regression
   - Takes features pooled from the proposed regions
   - Outputs class probabilities and precise box coordinates
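
The anchor generation step of the RPN can be sketched in plain Python. This is only an illustration: the scales and aspect ratios are assumed values rather than settings read from this project, and `make_anchors` is a hypothetical helper. It follows one common convention in which every anchor keeps an area of `scale**2` while its height-to-width ratio varies.

```python
import math


def make_anchors(cx, cy, scales, ratios):
    """Return (x1, y1, x2, y2) anchors centred on the point (cx, cy).

    Each anchor has area scale**2 and a height/width ratio equal to `ratio`.
    """
    anchors = []
    for scale in scales:
        for ratio in ratios:
            w = scale / math.sqrt(ratio)
            h = scale * math.sqrt(ratio)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


# Assumed example values: two scales and three aspect ratios give six anchors.
for box in make_anchors(100, 100, scales=(32, 64), ratios=(0.5, 1.0, 2.0)):
    print(tuple(round(v, 1) for v in box))
```

In a real RPN this set of anchors is repeated at every position of the feature map, and the network predicts an objectness score and a box refinement for each one.
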

### Key Features

- **End-to-End Training**: The entire network is trained jointly for optimal performance
- **Multi-Scale Detection**: Capable of detecting LEGO pieces of varying sizes
- **Real-Time Processing**: Efficient architecture allows for quick inference
- **High Accuracy**: Achieves strong mean Average Precision (mAP) on LEGO detection

## **Project Structure**

```text
lego_detection/
├── models/                      # Trained models
│   ├── lego_fasterrcnn.pth      # Saved model
│   └── faster_rcnn_custom.pth   # Latest model
│
├── datasets/                    # Dataset folder
│   ├── images/                  # Training images
│   ├── annotations/             # Corresponding XML annotations
│   ├── test_images/             # Images for testing the model
│   └── annotations.json         # All annotations merged into a single file
│
├── src/                         # Source code
│   ├── transformdata.py         # Formats the data to COCO.json
│   ├── new_trainer.py           # Trains the model based on the new assumptions
│   ├── app.py                   # Lets users interact with the model
│   ├── Attempt1                 # First implementation
│   ├── dataset.py               # Dataset class (LegoDataset)
│   ├── train.py                 # Training script
│   ├── evaluate.py              # mAP calculation
│   └── utils.py                 # IoU, AP calculation functions
│
├── config.yaml                  # Hyperparameters & settings
└── README.md                    # Project documentation
```
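
The structure above pairs per-image XML annotations with a script that formats them as a single COCO JSON file. A minimal sketch of such a conversion follows, assuming Pascal VOC-style XML (`filename`, `size`, and `object/bndbox` elements) and a single `lego` category. The real schema and the contents of `transformdata.py` are not shown in this README, so `voc_to_coco` and `write_coco` are hypothetical names.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path


def voc_to_coco(xml_dir, category_name="lego"):
    """Merge Pascal VOC-style XML files into one COCO-style dictionary."""
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": 1, "name": category_name}],
    }
    ann_id = 1
    xml_files = sorted(Path(xml_dir).glob("*.xml"))
    for image_id, xml_path in enumerate(xml_files, start=1):
        root = ET.parse(xml_path).getroot()
        size = root.find("size")
        coco["images"].append({
            "id": image_id,
            "file_name": root.findtext("filename"),
            "width": int(size.findtext("width")),
            "height": int(size.findtext("height")),
        })
        for obj in root.iter("object"):
            box = obj.find("bndbox")
            xmin, ymin, xmax, ymax = (
                float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
            )
            width, height = xmax - xmin, ymax - ymin
            coco["annotations"].append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": 1,
                "bbox": [xmin, ymin, width, height],  # COCO stores [x, y, w, h]
                "area": width * height,
                "iscrowd": 0,
            })
            ann_id += 1
    return coco


def write_coco(xml_dir, out_path):
    """Convert a folder of XML files and save the result as JSON."""
    with open(out_path, "w") as f:
        json.dump(voc_to_coco(xml_dir), f, indent=2)
```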
---

## ⚡ **Setup Instructions**

### **1️⃣ Install Dependencies**
```bash
pip install -r requirements.txt
```

### **2️⃣ Update Configuration**
Modify **`config.yaml`** to adjust **hyperparameters, dataset paths, and model settings**.

---

## **Visualize Using Gradio**

1) If the trained model is not in `models/`, add it from the submitted file. I tried to add the model to the repository, but it is too large for GitHub's file size limits.
2) Run the following command:
```bash
python src/app.py
```
3) Evaluate and give me 100. I know, I'm awesome.

---

## **Training the Model**

Run the following command to start training:
```bash
python src/train.py
```
This script will:
- Train Faster R-CNN with **LegoDataset**
- Log training **loss & mAP**
- Save the trained model in `models/lego_fasterrcnn.pth`

---

## **Monitoring Training Progress**

Use the Jupyter Notebook to **visualize loss & mAP over epochs**:
```bash
jupyter notebook notebooks/training_visualization.ipynb
```

---

## 🛠️ **Hyperparameters (`config.yaml`)**

Modify the **`config.yaml`** file to fine-tune the model:
```yaml
model:
  backbone: resnet50
  num_classes: 2
  pretrained: true
  learning_rate: 0.0001
  epochs: 5
  batch_size: 8
  optimizer: adam
dataset:
  image_dir: datasets/images
  annotation_dir: datasets/annotations
  train_split: 0.8
  val_split: 0.2
evaluation:
  iou_threshold: 0.5
```
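
The `train_split` and `val_split` values describe an 80/20 partition of the dataset. The sketch below shows one way such a split could be applied to a list of image files, assuming a seeded shuffle for reproducibility. The helper `split_dataset`, the seed, and the file names are hypothetical and not taken from the project's code.

```python
import random


def split_dataset(items, train_split=0.8, seed=42):
    """Shuffle items reproducibly and split them into train and validation lists."""
    items = sorted(items)               # fixed starting order, independent of input order
    random.Random(seed).shuffle(items)  # seeded shuffle, so the split is repeatable
    n_train = int(len(items) * train_split)
    return items[:n_train], items[n_train:]


# Assumed example: ten placeholder file names.
files = [f"img_{i:03d}.jpg" for i in range(10)]
train_files, val_files = split_dataset(files, train_split=0.8)
print(len(train_files), len(val_files))  # 8 2
```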
---

## **Training Strategies for Faster R-CNN with ResNet-50 Backbone**

| Trainable Backbone Layers | Epochs | Batch Size | Recommended Learning Rate | Optimizer | Scheduler |
|---------------------------|--------|------------|---------------------------|-----------|-----------|
| 0 | 10 | 4 | 0.0100 | SGD | StepLR(step_size=3, gamma=0.1) |
| 3 | 10 | 8 | 0.0050 | SGD | StepLR(step_size=3, gamma=0.1) |
| 5 | 10 | 16 | 0.0001 | AdamW | CosineAnnealing |
| 3 | 20 | 8 | 0.0050 | SGD | StepLR(step_size=5, gamma=0.1) |
| 5 | 20 | 16 | 0.0001 | AdamW | CosineAnnealing |
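
To show what the scheduler column means in practice, the sketch below computes the learning rate per epoch from the closed-form rules behind step decay and cosine annealing. It reads `StepLR(3, 0.1)` as a step size of 3 epochs with a decay factor of 0.1, and it assumes the cosine schedule decays to zero over the full run. These are plain Python functions for illustration, not PyTorch scheduler objects.

```python
import math


def step_lr(base_lr, epoch, step_size, gamma):
    """Step decay: multiply the rate by gamma once every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)


def cosine_lr(base_lr, epoch, total_epochs, min_lr=0.0):
    """Cosine annealing from base_lr down to min_lr over total_epochs."""
    cosine = (1 + math.cos(math.pi * epoch / total_epochs)) / 2
    return min_lr + (base_lr - min_lr) * cosine


# Compare the SGD row (lr 0.005, step 3) with the AdamW row (lr 0.0001) over 10 epochs.
for epoch in range(10):
    sgd_rate = step_lr(0.005, epoch, step_size=3, gamma=0.1)
    adamw_rate = cosine_lr(0.0001, epoch, total_epochs=10)
    print(f"epoch {epoch}: StepLR={sgd_rate:.6f}  Cosine={adamw_rate:.6f}")
```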
---

## **Evaluating the Model**

Once training is complete, evaluate performance using:
```bash
python src/evaluate.py
```
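
For context, mAP rests on Intersection over Union (IoU): a prediction counts as a true positive when its IoU with a ground-truth box reaches the threshold (`iou_threshold: 0.5` in `config.yaml`). The sketch below shows IoU and a single-class average precision, assuming `(x1, y1, x2, y2)` boxes and all-point interpolation. The project's `evaluate.py` and `utils.py` are not shown in this README and may compute these differently.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, num_ground_truth):
    """AP from (score, is_true_positive) pairs using all-point interpolation."""
    if num_ground_truth == 0:
        return 0.0
    detections = sorted(detections, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    tp = fp = 0
    for _, is_tp in detections:
        tp += bool(is_tp)
        fp += not is_tp
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)
    # Replace each precision with the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the rectangle areas wherever recall increases.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Two boxes overlapping by half their width: IoU is 1/3, below a 0.5 threshold.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)) >= 0.5)  # False
```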
---

## **Troubleshooting & Tips**

### **Training Takes Too Long?**
- Reduce `epochs` in `config.yaml`
- Use a **smaller dataset** for testing

### **mAP Is Too Low?**
- Increase `epochs`
- Check dataset annotations
- Tune the learning rate
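
As one way to act on the "check dataset annotations" tip, the sketch below flags boxes that are degenerate or fall outside the image. It assumes COCO-style `[x, y, width, height]` boxes. The helper `find_bad_boxes` and the sample data are hypothetical and not part of the project's code.

```python
def find_bad_boxes(annotations, image_sizes):
    """Return (annotation id, reason) pairs for suspicious COCO-style boxes.

    annotations: dicts with "id", "image_id", and "bbox" = [x, y, w, h].
    image_sizes: maps image_id to (width, height).
    """
    problems = []
    for ann in annotations:
        x, y, w, h = ann["bbox"]
        img_w, img_h = image_sizes[ann["image_id"]]
        if w <= 0 or h <= 0:
            problems.append((ann["id"], "non-positive width or height"))
        elif x < 0 or y < 0 or x + w > img_w or y + h > img_h:
            problems.append((ann["id"], "box extends outside the image"))
    return problems


# Assumed sample data: one valid box, one empty box, one box past the right edge.
sample = [
    {"id": 1, "image_id": 1, "bbox": [10, 10, 50, 40]},
    {"id": 2, "image_id": 1, "bbox": [20, 20, 0, 30]},
    {"id": 3, "image_id": 1, "bbox": [90, 10, 20, 20]},
]
for ann_id, reason in find_bad_boxes(sample, {1: (100, 100)}):
    print(ann_id, reason)
```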
---

## **Contributors**
- **Alex** - Machine Learning Engineer

**Happy Training!**