Motion-DeepLab

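Motion-DeepLab is the two-frame baseline for video panoptic segmentation from the STEP paper (see the citation below). It is provided here alongside the single-frame Panoptic-DeepLab baseline.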

Prerequisite

Make sure the software is properly installed.
Make sure the target dataset is correctly prepared (e.g., KITTI-STEP).
Download the Cityscapes pretrained checkpoints listed below, and update the initial_checkpoint path in the config files.
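
The last step can be scripted. The sketch below is only an illustration: it assumes the config is a text file containing a field written as `initial_checkpoint: "..."` (an assumption about the file layout, so check your own config files first), and `set_initial_checkpoint` is a hypothetical helper, not part of this codebase.

```python
import re

# Matches a field such as:  initial_checkpoint: "some/path"
# (assumed layout; adjust the pattern if your config files differ).
_PATTERN = re.compile(r'(initial_checkpoint\s*:\s*)"[^"]*"')


def set_initial_checkpoint(config_text: str, checkpoint_path: str) -> str:
    """Return config_text with every initial_checkpoint value replaced."""
    # A function is used as the replacement so that backslashes in the
    # checkpoint path are inserted literally.
    new_text, count = _PATTERN.subn(
        lambda m: f'{m.group(1)}"{checkpoint_path}"', config_text)
    if count == 0:
        raise ValueError("no initial_checkpoint field found in config text")
    return new_text


if __name__ == "__main__":
    # Hypothetical config fragment, for illustration only.
    example = 'model_options {\n  initial_checkpoint: "placeholder"\n}\n'
    print(set_initial_checkpoint(example, "/path/to/downloaded/checkpoint"))
```

To apply it to a real file, read the config text, pass it through the helper, and write the result back. The helper raises an error rather than silently leaving the file unchanged when the field is missing.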

Model Zoo

KITTI-STEP Video Panoptic Segmentation

Initial checkpoint: We provide several Cityscapes pretrained checkpoints for KITTI-STEP experiments. Please download them and update the initial_checkpoint path in the config files.

Model	Download	Note
Panoptic-DeepLab	initial_checkpoint	The initial checkpoint for single-frame baseline.
Motion-DeepLab	initial_checkpoint	The initial checkpoint for two-frame baseline.

We also provide checkpoints pretrained on KITTI-STEP below. To train these models yourself, use the config files under configs/kitti/panoptic_deeplab (single-frame-baseline) or configs/kitti/motion_deeplab (two-frame-baseline).

Panoptic-DeepLab (single-frame-baseline):

Backbone	Output stride	Dataset split	PQ†	AP^Mask†	mIoU
ResNet-50 (config, ckpt)	32	KITTI-STEP train set	48.31	42.22	71.16
ResNet-50 (config, ckpt)	32	KITTI-STEP trainval set	-	-	-

†: See Q4 in FAQ.

This single-frame baseline can be combined with state-of-the-art optical flow methods (e.g., RAFT [1]) to propagate mask predictions from one frame to the next, as shown in our STEP paper.
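
To make the idea of flow-based propagation concrete, here is a minimal sketch of warping a label map with a dense flow field. It is a simplified illustration, not the procedure used in the STEP paper: it assumes a backward flow (for each pixel of the current frame, the offset to its location in the previous frame), nearest-neighbor sampling, and a hypothetical `void_label` for pixels whose source falls outside the image. The function name and the plain-list data layout are also assumptions made for the example.

```python
from typing import List, Tuple

LabelMap = List[List[int]]
# One (dx, dy) offset per pixel of the current frame (assumed convention).
FlowField = List[List[Tuple[float, float]]]


def warp_labels(prev_labels: LabelMap, backward_flow: FlowField,
                void_label: int = 0) -> LabelMap:
    """Propagate labels from the previous frame into the current frame."""
    height, width = len(prev_labels), len(prev_labels[0])
    warped = [[void_label] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = backward_flow[y][x]
            # Nearest-neighbor lookup of the source pixel in the previous frame.
            src_x, src_y = int(round(x + dx)), int(round(y + dy))
            if 0 <= src_x < width and 0 <= src_y < height:
                warped[y][x] = prev_labels[src_y][src_x]
    return warped


if __name__ == "__main__":
    prev = [[1, 1, 0],
            [0, 0, 0]]
    # Every current pixel looks one pixel to its left in the previous frame.
    flow = [[(-1.0, 0.0)] * 3 for _ in range(2)]
    for row in warp_labels(prev, flow):
        print(row)
```

In practice the flow would come from an optical flow network and the label maps would be arrays rather than nested lists. The warped labels would then still need to be matched against the current frame's predictions.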

Motion-DeepLab (two-frame-baseline):

Backbone	Output stride	Dataset split	PQ†	AP^Mask†	mIoU	STQ
ResNet-50 (config, ckpt)	32	KITTI-STEP train set	42.08	37.52	63.15	57.7
ResNet-50 (config, ckpt)	32	KITTI-STEP trainval set	-	-	-	-

†: See Q4 in FAQ.
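
For reference, the STQ column reports Segmentation and Tracking Quality, which the STEP paper defines as the geometric mean of an association term (AQ) and a segmentation term (SQ). The definitions of the two terms are given in the paper and are not reproduced here.

```latex
\mathrm{STQ} = \sqrt{\mathrm{AQ} \times \mathrm{SQ}}
```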

MOTChallenge-STEP Video Panoptic Segmentation

Initial checkpoint: We provide several Cityscapes pretrained checkpoints for MOTChallenge-STEP experiments. Please download them and update the initial_checkpoint path in the config files.

Model	Download	Note
Panoptic-DeepLab	initial_checkpoint	The initial checkpoint for single-frame baseline.
Motion-DeepLab	initial_checkpoint	The initial checkpoint for two-frame baseline.

We also provide checkpoints pretrained on MOTChallenge-STEP below. To train these models yourself, use the config files under configs/motchallenge/panoptic_deeplab (single-frame-baseline) or configs/motchallenge/motion_deeplab (two-frame-baseline).

Panoptic-DeepLab (single-frame-baseline):

TODO: Add pretrained checkpoint.

Backbone	Output stride	Dataset split	PQ†	AP^Mask†	mIoU
ResNet-50 (config)	32	MOTChallenge-STEP train set	?	?	?
ResNet-50	32	MOTChallenge-STEP trainval set	-	-	-

†: See Q4 in FAQ.

Motion-DeepLab (two-frame-baseline):

TODO: Add pretrained checkpoint.

Backbone	Output stride	Dataset split	PQ†	AP^Mask†	mIoU	STQ
ResNet-50 (config)	32	MOTChallenge-STEP train set	?	?	?	?
ResNet-50	32	MOTChallenge-STEP trainval set	-	-	-	-

†: See Q4 in FAQ.

Citing Motion-DeepLab

If you find this code helpful in your research or wish to refer to the baseline results, please use the following BibTeX entry.

STEP (Motion-DeepLab):

@article{step_2021,
  author={Mark Weber and Jun Xie and Maxwell Collins and Yukun Zhu and Paul Voigtlaender and Hartwig Adam and Bradley Green and Andreas Geiger and Bastian Leibe and Daniel Cremers and Aljosa Osep and Laura Leal-Taixe and Liang-Chieh Chen},
  title={{STEP}: Segmenting and Tracking Every Pixel},
  journal={arXiv:2102.11859},
  year={2021}
}

References

[1] Zachary Teed and Jia Deng. RAFT: Recurrent All-Pairs Field Transforms for Optical Flow. In ECCV, 2020.