Spaces:

akhaliq
/

deeplab2

Runtime error

App Files Files Community

deeplab2 / g3doc /setup /your_own_dataset.md

akhaliq3

spaces demo

506da10 almost 4 years ago

preview code

raw

history blame contribute delete

2.78 kB

	# Convert your own dataset for DeepLab2 framework

	You may want to train DeepLab2 on your own dataset. Here, we provide some
	guidances and hopefully that will facillitate the preparation process.

	1. Prepare your own dataset.
	* Images should be stored either in `jpg` or `png` format.
	* Annotations should be stored either in `png` or `raw` format. The
	DeepLab2 framework assumes the panoptic label format (i.e.,
	`panoptic_label = semantic_label * label_divisor + instance_id`, where the
	`label_divisor` should be larger than the maximum number of instances per
	image). The `raw` format refers to the case where we could save the panoptic
	annotations in the int32 array (e.g., int32_numpy_array.tostring()).
	2. Convert the dataset to TFRecord.
	* Update our provided example code (e.g.,
	[build_step_data.py](../../data/build_step_data.py))to convert your dataset
	to TFRecord.
	3. Modify the `dataset.py` (path: `${DEEPLAB2}/data/dataset.py`) to provide
	your dataset information.
	* Set the `panoptic_label_divisor` (i.e., the `label_divisor` above)
	correctly. Its value should be larger than the maximum number of instances
	that could appear per image in your dataset.
	* Set the `ignore_label` properly. Pixels annotated with `ignore_label`
	are not used during both training and evaluation. If your dataset does not
	contain the `ignore_label` annotations, you could simply set it to be a
	large value (e.g., 255 as for
	[Cityscapes](https://www.cityscapes-dataset.com/)).
	* Set the `class_has_instance_list` properly. The variable specifies
	which class belongs to the `thing` class (i.e., countable objects such as
	people, cars).
	* Set the colormap (for visualization) properly. You may also need to
	define your own colormap (see `${DEEPLAB2}/trainer/vis_utils.py`).
	4. Prepare the experiment config.
	* Update our provided example configs (path:
	`${DEEPLAB2}/configs/${DATASET}/${MODEL}/${BACKBONE}`) for your use
	case. A few things that may worth your attention:
	* Set the `crop_size` correctly for both training and evaluation. See
	Q2 in [FAQ](../faq.md) for more details.
	* Tune the config flags for your dataset (e.g., `base_learning_rate`,
	`training_number_of_step`, and so on).

	Finally, if your dataset only contains semantic segmentation annotations,
	you could still use DeepLab2 framework with some minor changes:

	1. Set `panoptic_label_divisor=None` in dataset.py (we also provide one
	example in dataset.py, where we only train with semantic segmentation on
	Cityscapes).
	2. Have a config similar to
	`${DEEPLAB2}/configs/cityscapes/panoptic_deeplab/
	resnet50_os32_semseg.textproto`, where the instance branch is not
	initiated.

	At this point, you are good to go! Enjoy training DeepLab2!