
TODO: Prepare model zoo and some model introduction.

References below are really meant for reference when writing the doc. Please remove the references once ready.

References:

Motion-DeepLab

TODO: Add model introduction and maybe a figure. Motion-DeepLab is xxxxx

Prerequisite

  1. Make sure the software is properly installed.

  2. Make sure the target dataset is correctly prepared (e.g., KITTI-STEP).

  3. Download the Cityscapes pretrained checkpoints listed below, and update the initial_checkpoint path in the config files.
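Step 3 can be scripted. The following is a minimal sketch, assuming the config is a text file that stores the path on a single line of the form `initial_checkpoint: "..."`. The helper name `set_initial_checkpoint` and the example paths are hypothetical and not part of DeepLab2.

```python
import re


def set_initial_checkpoint(config_text: str, checkpoint_path: str) -> str:
    """Return config_text with every initial_checkpoint value replaced.

    Assumption (not confirmed by this document): the config stores the
    path as `initial_checkpoint: "<path>"` on a single line.
    """
    pattern = re.compile(r'(initial_checkpoint\s*:\s*)"[^"]*"')
    # A function is used as the replacement so that backslashes in the
    # checkpoint path are inserted literally.
    new_text, count = pattern.subn(
        lambda m: f'{m.group(1)}"{checkpoint_path}"', config_text)
    if count == 0:
        raise ValueError("no initial_checkpoint field found in config")
    return new_text


if __name__ == "__main__":
    # Hypothetical config fragment and checkpoint location.
    example = 'model_options {\n  initial_checkpoint: "path/to/old"\n}\n'
    print(set_initial_checkpoint(example, "/tmp/cityscapes_ckpt/ckpt"))
```

The function works on the config text rather than on a file, so the result can be inspected before anything is written to disk.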

Model Zoo

KITTI-STEP Video Panoptic Segmentation

Initial checkpoint: We provide several Cityscapes pretrained checkpoints for KITTI-STEP experiments. Please download them and update the initial_checkpoint path in the config files.

| Model | Download | Note |
| --- | --- | --- |
| Panoptic-DeepLab | initial_checkpoint | The initial checkpoint for the single-frame baseline. |
| Motion-DeepLab | initial_checkpoint | The initial checkpoint for the two-frame baseline. |

We also provide checkpoints pretrained on KITTI-STEP below. To train these models yourself, use the corresponding config files under configs/kitti/panoptic_deeplab (single-frame baseline) or configs/kitti/motion_deeplab (two-frame baseline).

Panoptic-DeepLab (single-frame-baseline):

| Backbone | Output stride | Dataset split | PQ† | AP<sup>Mask</sup> | mIoU |
| --- | --- | --- | --- | --- | --- |
| ResNet-50 (config, ckpt) | 32 | KITTI-STEP train set | 48.31 | 42.22 | 71.16 |
| ResNet-50 (config, ckpt) | 32 | KITTI-STEP trainval set | - | - | - |

†: See Q4 in FAQ.

This single-frame baseline can be combined with a state-of-the-art optical flow method (e.g., RAFT [1]) to propagate mask predictions from one frame to the next, as shown in our STEP paper.
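To make the propagation idea concrete, here is a minimal sketch of forward-warping a label mask with a dense flow field. It is not the procedure from the STEP paper: the function name `propagate_mask`, the `(dx, dy)` flow layout, the nearest-pixel rounding, and the `void_label` fill value are all assumptions made for illustration. In practice the flow would come from a model such as RAFT.

```python
from typing import List, Tuple

Mask = List[List[int]]
Flow = List[List[Tuple[float, float]]]


def propagate_mask(mask: Mask, flow: Flow, void_label: int = 0) -> Mask:
    """Forward-warp per-pixel labels from frame t to frame t+1.

    flow[y][x] = (dx, dy) is the assumed displacement of pixel (x, y)
    from frame t to frame t+1. Targets are rounded to the nearest pixel.
    Pixels that land outside the image are dropped, and pixels that
    receive no label keep void_label.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    warped = [[void_label] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            label = mask[y][x]
            if label == void_label:
                continue
            dx, dy = flow[y][x]
            tx, ty = int(round(x + dx)), int(round(y + dy))
            if 0 <= tx < width and 0 <= ty < height:
                warped[ty][tx] = label  # later writes win on collisions
    return warped


if __name__ == "__main__":
    # Toy 3x3 frame: every pixel moves one column to the right.
    mask = [[0, 0, 0], [7, 0, 0], [0, 0, 5]]
    flow = [[(1.0, 0.0)] * 3 for _ in range(3)]
    for row in propagate_mask(mask, flow):
        print(row)
```

Forward warping leaves holes where no source pixel lands and resolves collisions arbitrarily; a real pipeline would need an explicit policy for both cases.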

Motion-DeepLab (two-frame-baseline):

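| Backbone | Output stride | Dataset split | PQ† | AP<sup>Mask</sup> | mIoU | STQ |
| --- | --- | --- | --- | --- | --- | --- |
| ResNet-50 (config, ckpt) | 32 | KITTI-STEP train set | 42.08 | 37.52 | 63.15 | 57.7 |
| ResNet-50 (config, ckpt) | 32 | KITTI-STEP trainval set | - | - | - | - |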

†: See Q4 in FAQ.

MOTChallenge-STEP Video Panoptic Segmentation

Initial checkpoint: We provide several Cityscapes pretrained checkpoints for MOTChallenge-STEP experiments. Please download them and update the initial_checkpoint path in the config files.

| Model | Download | Note |
| --- | --- | --- |
| Panoptic-DeepLab | initial_checkpoint | The initial checkpoint for the single-frame baseline. |
| Motion-DeepLab | initial_checkpoint | The initial checkpoint for the two-frame baseline. |

We also provide checkpoints pretrained on MOTChallenge-STEP below. To train these models yourself, use the corresponding config files under configs/motchallenge/panoptic_deeplab (single-frame baseline) or configs/motchallenge/motion_deeplab (two-frame baseline).

Panoptic-DeepLab (single-frame-baseline):

TODO: Add pretrained checkpoint.

| Backbone | Output stride | Dataset split | PQ† | AP<sup>Mask</sup> | mIoU |
| --- | --- | --- | --- | --- | --- |
| ResNet-50 (config) | 32 | MOTChallenge-STEP train set | ? | ? | ? |
| ResNet-50 | 32 | MOTChallenge-STEP trainval set | - | - | - |

†: See Q4 in FAQ.

This single-frame baseline can be combined with a state-of-the-art optical flow method (e.g., RAFT [1]) to propagate mask predictions from one frame to the next, as shown in our STEP paper.

Motion-DeepLab (two-frame-baseline):

TODO: Add pretrained checkpoint.

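| Backbone | Output stride | Dataset split | PQ† | AP<sup>Mask</sup> | mIoU | STQ |
| --- | --- | --- | --- | --- | --- | --- |
| ResNet-50 (config) | 32 | MOTChallenge-STEP train set | ? | ? | ? | ? |
| ResNet-50 | 32 | MOTChallenge-STEP trainval set | - | - | - | - |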

†: See Q4 in FAQ.

Citing Motion-DeepLab

If you find this code helpful in your research or wish to refer to the baseline results, please use the following BibTeX entry.

  • STEP (Motion-DeepLab):
@article{step_2021,
  author={Mark Weber and Jun Xie and Maxwell Collins and Yukun Zhu and Paul Voigtlaender and Hartwig Adam and Bradley Green and Andreas Geiger and Bastian Leibe and Daniel Cremers and Aljosa Osep and Laura Leal-Taixe and Liang-Chieh Chen},
  title={{STEP}: Segmenting and Tracking Every Pixel},
  journal={arXiv:2102.11859},
  year={2021}
}

References

  1. Zachary Teed and Jia Deng. RAFT: Recurrent all-pairs field transforms for optical flow. In ECCV, 2020.