# FAQ

________________________________________________________________________________
**Q1: What should I do if I encounter OOM (out-of-memory) while training the
models?**

**A1**: To avoid OOM, you could try:

1. reducing the training crop size (i.e., the flag `crop_size` in
   `train_dataset_options`; see Q2 for more details), which reduces the
   input size during training,
2. using a larger output stride (e.g., 32) in the backbone (i.e., the flag
   `output_stride` in `model_options`; see Q3 for more details), which
   reduces the usage of atrous convolution,
3. using a smaller backbone, such as ResNet-50.
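
For concreteness, the sketch below shows how these three knobs might be set
programmatically. It assumes the compiled DeepLab2 protos are importable as
`deeplab2.config_pb2`; the field paths and the backbone name `'resnet50'`
follow the flags named in this FAQ, so double-check them against the
`config.proto` in your checkout.

```python
# A minimal sketch (not a drop-in recipe) of the three OOM-reducing knobs.
# Assumption: the compiled deeplab2 protos are available as config_pb2, and
# the field paths below match the flags named in this FAQ.
from deeplab2 import config_pb2

config = config_pb2.ExperimentOptions()
# 1. Reduce the training crop size (must satisfy output_stride * k + 1; see Q2).
config.train_dataset_options.crop_size[:] = [513, 513]
# 2. Use a larger output stride to reduce atrous convolution usage (see Q3).
config.model_options.backbone.output_stride = 32
# 3. Use a smaller backbone ('resnet50' is the assumed registry name).
config.model_options.backbone.name = 'resnet50'
```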
________________________________________________________________________________

**Q2: What `crop_size` do I need to set?**

**A2**: The DeepLab framework always uses a `crop_size` equal to
`output_stride` * k + 1, where k is an integer.

* During inference/evaluation, since the DeepLab framework uses whole-image
  inference, we need to set k so that the resulting `crop_size` (in
  `eval_dataset_options`) is slightly larger than the largest image dimension
  in the dataset. For example, we set the evaluation `crop_size` to 1025x2049
  for Cityscapes, whose images are all of dimension 1024x2048.
* During training, we could set k to be any integer as long as the resulting
  crop fits in your device memory. However, we observe better performance when
  the same `crop_size` is used during training and evaluation (i.e., the
  whole-image crop size is also used during training).
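
As a sanity check, the rule `crop_size = output_stride * k + 1` can be
computed directly; the helper below is our own illustration, not part of the
DeepLab2 API.

```python
# Smallest crop_size = output_stride * k + 1 that exceeds a given image
# dimension, per the rule in A2. (Illustrative helper, not DeepLab2 API.)
def smallest_valid_crop_size(image_dim: int, output_stride: int) -> int:
  k = -(-image_dim // output_stride)  # ceil(image_dim / output_stride)
  return output_stride * k + 1

# Cityscapes images are 1024x2048, so with output_stride = 16:
print(smallest_valid_crop_size(1024, 16))  # 1025
print(smallest_valid_crop_size(2048, 16))  # 2049
```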
________________________________________________________________________________

**Q3: What output stride should I use in the backbone?**

**A3**: Using a different output stride leads to a different accuracy-and-memory
trade-off. For example, DeepLabv1 uses output stride = 8, but it requires a lot
of device memory. In the DeepLabv3+ paper, we found that output stride = 16
strikes the best accuracy-and-memory trade-off, which is therefore our default
setting. If you wish to further reduce the memory usage, you could set the
output stride to 32. Additionally, we suggest adjusting the `atrous_rates` in
the ASPP module as follows.

* If `backbone.output_stride` = 32, use `atrous_rates` = [3, 6, 9].
* If `backbone.output_stride` = 16, use `atrous_rates` = [6, 12, 18].
* If `backbone.output_stride` = 8, use `atrous_rates` = [12, 24, 36].

Note that these settings may not be optimal. You may need to adjust them to
better fit your dataset.
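
The suggested pairings can be read as keeping the effective dilation on the
input roughly constant: rate * output_stride = 96 in every row. As a quick
reference (our own summary in code form, not part of the codebase):

```python
# Suggested atrous_rates keyed by backbone.output_stride (from A3).
# Note that rate * output_stride = 96 for every entry, i.e., the effective
# field of view on the input image stays roughly constant.
SUGGESTED_ATROUS_RATES = {
    32: [3, 6, 9],
    16: [6, 12, 18],
    8: [12, 24, 36],
}
```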
________________________________________________________________________________

**Q4: Why are the results reported by the provided evaluation code slightly
different from those of the official evaluation code (e.g.,
[Cityscapes](https://github.com/mcordts/cityscapesScripts))?**

**A4**: In order to run everything end-to-end in the TensorFlow system (e.g.,
the on-line evaluation during training), we re-implemented the evaluation code
in TensorFlow. Additionally, our whole system, including the training and
evaluation pipelines, uses the panoptic label format (i.e., `panoptic_label =
semantic_label * label_divisor + instance_id`, where the `label_divisor` should
be larger than the maximum number of instances per image), instead of the JSON
[COCO formats](https://cocodataset.org/#format-data). These two changes, along
with rounding and similar issues, result in some minor differences. Therefore,
our re-implemented evaluation code is mainly used for TensorFlow integration
(e.g., the support of on-line evaluation in TensorBoard). Users should run the
corresponding official evaluation code in order to compare with other published
papers. Note that all the numbers reported in our papers are evaluated with the
official evaluation code.
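
For clarity, the panoptic label format above can be encoded and decoded as
follows (a minimal sketch; `label_divisor = 1000` is only an example value,
and any value larger than the maximum number of instances per image works):

```python
# Encoding/decoding of the panoptic label format described in A4.
def encode_panoptic(semantic_label: int, instance_id: int,
                    label_divisor: int) -> int:
  return semantic_label * label_divisor + instance_id

def decode_panoptic(panoptic_label: int, label_divisor: int):
  return panoptic_label // label_divisor, panoptic_label % label_divisor

# Example with label_divisor = 1000 (must exceed the maximum number of
# instances per image in your dataset):
panoptic = encode_panoptic(semantic_label=11, instance_id=23,
                           label_divisor=1000)
assert panoptic == 11023
assert decode_panoptic(panoptic, label_divisor=1000) == (11, 23)
```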
To facilitate the conversion between prediction formats, we also provide
instructions for running the official evaluation code on
[Cityscapes](setup/cityscapes_test_server_evaluation.md) and
[COCO](setup/coco_test_server_evaluation.md).

________________________________________________________________________________
**Q5: What should I do if I cannot compile TensorFlow with the provided
efficient merging operation `merge_semantic_and_instance_maps`?**

**A5**: In this case, we provide a fallback solution that implements the
merging operation with pure tf functions. This fallback solution does not
require any TensorFlow compilation. However, note that compared to our provided
TensorFlow merging operation `merge_semantic_and_instance_maps`, its inference
speed is slower and the resulting segmentation performance may also be slightly
lower.

To use the pure-tf-function version of `merge_semantic_and_instance_maps`, set
`merge_semantic_and_instance_with_tf_op` to `false` in your config's
`evaluator_options`.
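
For example, if your experiment config is stored as protobuf text, the flag
could be flipped programmatically along these lines (a sketch under the
assumption that the compiled protos are importable as `deeplab2.config_pb2`;
the file name `experiment.textproto` is illustrative):

```python
# Switch to the pure-tf merging fallback by editing a textproto config.
# Assumptions: deeplab2's compiled protos are importable as config_pb2, and
# 'experiment.textproto' is a placeholder for your config file.
from google.protobuf import text_format
from deeplab2 import config_pb2

with open('experiment.textproto') as f:
  config = text_format.Parse(f.read(), config_pb2.ExperimentOptions())

# Prefer the pure-tf-function merging over the compiled custom op.
config.evaluator_options.merge_semantic_and_instance_with_tf_op = False

with open('experiment.textproto', 'w') as f:
  f.write(text_format.MessageToString(config))
```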