A real-time object detector much faster and more accurate than YOLO, with an Apache 2.0 license, just landed in Hugging Face transformers
D-FINE is the state-of-the-art real-time object detector, and it runs on a T4 (free Colab)
> Collection with all checkpoints and demo ustc-community/d-fine-68109b427cbe6ee36b4e7352
Notebooks:
> Tracking https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_tracking.ipynb
> Inference https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_inference.ipynb
> Fine-tuning https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_finetune_on_a_custom_dataset.ipynb
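For a quick taste before opening the notebooks, here is a minimal inference sketch using the transformers `object-detection` pipeline. The checkpoint id below is an assumption — substitute any checkpoint from the ustc-community D-FINE collection linked above.

```python
from transformers import pipeline

# Checkpoint id is an assumption; pick any D-FINE checkpoint
# from the ustc-community collection on the Hub.
detector = pipeline("object-detection", model="ustc-community/dfine-small-coco")

# The pipeline accepts a URL, local path, or PIL image.
results = detector("http://images.cocodataset.org/val2017/000000039769.jpg")
for r in results:
    print(r["label"], round(r["score"], 3), r["box"])
```

Each result is a dict with `label`, `score`, and a `box` of pixel coordinates, so it drops straight into existing detection post-processing.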
h/t @vladislavbro @qubvel-hf @ariG23498 and the authors of the paper
Regular object detectors predict bounding boxes as pixel-perfect (x, y, w, h) coordinates, which is rigid and hard to optimize
D-FINE instead formulates box regression as predicting probability distributions over the bounding box coordinates and refines them iteratively, which is more accurate
Another core idea behind this model is Global Optimal Localization Self-Distillation: the final decoder layer's distribution output acts as a teacher, and its knowledge is distilled into the earlier layers to make them more performant.
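A minimal sketch of what such self-distillation looks like, assuming a KL-divergence loss that pulls an earlier layer's bin distribution toward the final layer's — the function names are mine, and this simplifies the paper's actual objective.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distillation_kl(student_logits, teacher_logits):
    """KL(teacher || student) over per-edge bin distributions.
    Gradients on the student pull it toward the teacher; the teacher
    (final layer) is treated as fixed."""
    t = softmax(teacher_logits)
    s = softmax(student_logits)
    return float(np.sum(t * (np.log(t) - np.log(s))))

# Final decoder layer (teacher) vs. an earlier layer (student), toy logits.
teacher_logits = np.array([0.1, 0.2, 3.0, 0.2])
early_logits = np.array([0.5, 0.5, 0.5, 0.5])
loss = distillation_kl(early_logits, teacher_logits)
```

Minimizing this loss during training gives the shallow layers localization quality closer to the deep ones, at no extra inference cost.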