# Run DeepLab2 on KITTI-STEP dataset
## KITTI-STEP dataset
KITTI-STEP extends the existing
[KITTI-MOTS](http://www.cvlibs.net/datasets/kitti/eval_mots.php) dataset with
spatially and temporally dense annotations. The KITTI-STEP dataset provides a
test-bed for studying long-term pixel-precise segmentation and tracking under
real-world conditions.
### Annotation
KITTI-STEP's annotation is collected in a semi-automatic manner. In the first
stage, pseudo semantic labels generated by the state-of-the-art
[Panoptic-DeepLab](../projects/panoptic_deeplab.md) are refined by human
annotators for at least one round. This new semantic segmentation annotation is
then merged with the existing tracking instance ground-truth from
[KITTI-MOTS](http://www.cvlibs.net/datasets/kitti/eval_mots.php). The following
figure provides an overview.
<p align="center">
<img src="../img/step/kitti_step_annotation.png" width=500>
</p>
### Label Map
KITTI-STEP adopts the same 19 classes as defined in
[Cityscapes](https://www.cityscapes-dataset.com/dataset-overview/#class-definitions)
with `pedestrians` and `cars` carefully annotated with track IDs. More
specifically, KITTI-STEP has the following label to index mapping:
Label Name | Label ID
-------------- | --------
road | 0
sidewalk | 1
building | 2
wall | 3
fence | 4
pole | 5
traffic light | 6
traffic sign | 7
vegetation | 8
terrain | 9
sky | 10
person&dagger; | 11
rider | 12
car&dagger; | 13
truck | 14
bus | 15
train | 16
motorcycle | 17
bicycle | 18
void | 255
&dagger;: Single instance annotations are available.
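For programmatic use, the mapping above can be expressed as a plain Python
dictionary (a convenience sketch mirroring the table; the constant name is
ours, not part of the DeepLab2 API):

```python
# Label name -> label ID, as in the table above.
KITTI_STEP_LABEL_MAP = {
    'road': 0, 'sidewalk': 1, 'building': 2, 'wall': 3, 'fence': 4,
    'pole': 5, 'traffic light': 6, 'traffic sign': 7, 'vegetation': 8,
    'terrain': 9, 'sky': 10, 'person': 11, 'rider': 12, 'car': 13,
    'truck': 14, 'bus': 15, 'train': 16, 'motorcycle': 17, 'bicycle': 18,
    'void': 255,
}
# 'person' (11) and 'car' (13) additionally carry per-instance track IDs.
```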
### Prepare KITTI-STEP for Training and Evaluation
#### Download data
KITTI-STEP has the same train and test sequences as
[KITTI-MOTS](http://www.cvlibs.net/datasets/kitti/eval_mots.php) (with 21 and 29
sequences for training and testing, respectively). The training sequences are
further split into a training set (12 sequences) and a validation set (9
sequences).
In the following, we provide a step-by-step walkthrough to prepare the data.
1. Create the KITTI-STEP directory:

```bash
mkdir -p ${KITTI_STEP_ROOT}/images
cd ${KITTI_STEP_ROOT}/images
```
2. Download KITTI images from their
[website](http://www.cvlibs.net/datasets/kitti/index.php) and unzip.

```bash
wget ${KITTI_LINK}
unzip ${KITTI_IMAGES}.zip
```
3. To prepare the dataset for our scripts, we need to move and rename some
directories:
```bash
mv testing/image_02/ test/
rm -r testing/
# Move all validation sequences:
mkdir val
mv training/image_02/0002 val/
mv training/image_02/0006 val/
mv training/image_02/0007 val/
mv training/image_02/0008 val/
mv training/image_02/0010 val/
mv training/image_02/0013 val/
mv training/image_02/0014 val/
mv training/image_02/0016 val/
mv training/image_02/0018 val/
# Move training sequences
mv training/image_02/ train/
rm -r training
```
4. Download groundtruth KITTI-STEP panoptic maps from
[here](http://storage.googleapis.com/gresearch/tf-deeplab/data/kitti-step.tar.gz).
```bash
# Go to ${KITTI_STEP_ROOT}
cd ..
wget http://storage.googleapis.com/gresearch/tf-deeplab/data/kitti-step.tar.gz
tar -xvf kitti-step.tar.gz
mv kitti-step/panoptic_maps panoptic_maps
rm -r kitti-step
```
The groundtruth panoptic maps are encoded as PNG files with the following
channel layout:
```
R = semantic_id
G = instance_id // 256
B = instance_id % 256
```
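To make the encoding concrete, a map can be decoded back into per-pixel
semantic and instance IDs with a few lines of NumPy (a minimal sketch; the file
path below is a placeholder):

```python
import numpy as np
from PIL import Image

# Load one ground-truth panoptic map (the path is a placeholder).
panoptic = np.asarray(
    Image.open('panoptic_maps/train/0000/000000.png'), dtype=np.int32)

# Invert the RGB encoding described above.
semantic_id = panoptic[..., 0]                           # R channel
instance_id = panoptic[..., 1] * 256 + panoptic[..., 2]  # G * 256 + B
```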
Following the above guide, your data structure should look like this:
```
.(KITTI_STEP_ROOT)
+-- images
| |
| +-- train
| | |
| | +-- sequence_id (%04d)
| | |
| | +-- frame_id.png (%06d.png)
| | ...
| +-- val
| +-- test
|
+-- panoptic_maps
|
+-- train
| |
| +-- sequence_id (%04d)
| |
| +-- frame_id.png (%06d.png)
| ...
+-- val
```
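Before building tfrecords, you may want to verify the layout. The sketch below
(an illustrative helper, not part of the DeepLab2 scripts) checks that every
annotated sequence has one panoptic map per image:

```python
import os

# Root directory prepared above (read from the environment for convenience).
KITTI_STEP_ROOT = os.environ['KITTI_STEP_ROOT']

for split in ('train', 'val'):
    image_dir = os.path.join(KITTI_STEP_ROOT, 'images', split)
    map_dir = os.path.join(KITTI_STEP_ROOT, 'panoptic_maps', split)
    for sequence_id in sorted(os.listdir(image_dir)):
        num_images = len(os.listdir(os.path.join(image_dir, sequence_id)))
        num_maps = len(os.listdir(os.path.join(map_dir, sequence_id)))
        status = 'OK' if num_images == num_maps else 'MISMATCH'
        print(f'{split}/{sequence_id}: {num_images} images, '
              f'{num_maps} panoptic maps [{status}]')
```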
#### Create tfrecord files
To create the dataset for training and evaluation, run the following command:
```bash
python deeplab2/data/build_step_data.py \
--step_root=${KITTI_STEP_ROOT} \
--output_dir=${OUTPUT_DIR}
```
This script outputs three sets of sharded tfrecord files:
`{train|val|test}@10.tfrecord`. The `train` and `val` tfrecords contain the RGB
image pixels as well as their panoptic maps, while the `test` tfrecords contain
RGB images only. These files will be used as input for model training and
evaluation.
Optionally, you can pass `--use_two_frames` to encode two consecutive frames
into the tfrecord files.
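As a quick sanity check, the generated shards can be counted with a short
TensorFlow snippet (a sketch; the glob pattern is an assumption about the shard
naming, and parsing individual features is left to the DeepLab2 input
pipeline):

```python
import os
import tensorflow as tf

# Directory passed as --output_dir above (read from the environment here).
OUTPUT_DIR = os.environ['OUTPUT_DIR']

for split in ('train', 'val', 'test'):
    # Shard file names are assumed to match this glob pattern.
    shards = tf.io.gfile.glob(os.path.join(OUTPUT_DIR, f'{split}*.tfrecord'))
    num_records = sum(1 for _ in tf.data.TFRecordDataset(shards))
    print(f'{split}: {len(shards)} shards, {num_records} records')
```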
## Citing KITTI-STEP
If you find this dataset helpful in your research, please use the following
BibTeX entry.
```
@article{step_2021,
author={Mark Weber and Jun Xie and Maxwell Collins and Yukun Zhu and Paul Voigtlaender and Hartwig Adam and Bradley Green and Andreas Geiger and Bastian Leibe and Daniel Cremers and Aljosa Osep and Laura Leal-Taixe and Liang-Chieh Chen},
title={{STEP}: Segmenting and Tracking Every Pixel},
journal={arXiv:2102.11859},
year={2021}
}
```