The crop size parameter in DeepLabv3+

The input arguments to train.py in DeepLabv3+ are as follows:

python -u "${WORK_DIR}"/train.py \
  --logtostderr \
  --num_clones=8 \
  --train_split="train" \
  --model_variant="xception_65" \
  --atrous_rates=6 \
  --atrous_rates=12 \
  --atrous_rates=18 \
  --output_stride=16 \
  --decoder_output_stride=4 \
  --train_crop_size=513 \
  --train_crop_size=513 \
  --train_batch_size=32 \
  --dataset="fly_1024" \
  --initialize_last_layer=False \
  --last_layers_contain_logits_only=True \
  --training_number_of_steps="${NUM_ITERATIONS}" \
  --fine_tune_batch_norm=true \
  --tf_initial_checkpoint="${INIT_FOLDER}/deeplabv3_cityscapes_train/model.ckpt" \
  --train_logdir="${TRAIN_LOGDIR}" \
  --dataset_dir="${PASCAL_DATASET}"

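Of these, train_crop_size was the one I could not make sense of at first. I later looked it up in the source code, which documents the two values it carries as follows: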

crop_height: The height value used to crop the image and label.
crop_width: The width value used to crop the image and label.
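
In the command above, --train_crop_size is passed twice. As far as I can tell, the flag collects repeated integers into a list, with the first value used as crop_height and the second as crop_width. The snippet below is a minimal sketch of that behaviour using the standard-library argparse as a stand-in; it is not the DeepLab code itself, and parse_crop_size and its fallback default of 513 x 513 are hypothetical names and values chosen for illustration.

```python
import argparse

# Stand-in parser: a repeated integer flag, collected in the order given.
parser = argparse.ArgumentParser()
parser.add_argument("--train_crop_size", type=int, action="append")


def parse_crop_size(argv):
    """Return (crop_height, crop_width) from repeated --train_crop_size flags."""
    args = parser.parse_args(argv)
    # Assumed default for this sketch when the flag is not given at all.
    values = args.train_crop_size or [513, 513]
    if len(values) != 2:
        raise ValueError("expected exactly two values: height, then width")
    crop_height, crop_width = values
    return crop_height, crop_width


# First occurrence is the height, second is the width.
print(parse_crop_size(["--train_crop_size=513", "--train_crop_size=769"]))
```

So writing the flag twice with 513 requests a square 513 x 513 crop, and a non-square crop would use two different values.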

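My understanding

In semantic segmentation the input images are often very large, especially with remote sensing imagery, and feeding a whole image straight into the network can exhaust GPU memory. The DeepLabv3+ authors therefore apply a random crop to each image: during training, a window of crop_height x crop_width pixels is cut from the image and its label at a random position, and only that window is passed to the network. Because the crop is taken at the original resolution, this also avoids the loss of image content that fitting the whole image into memory would otherwise cause.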
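
To make the idea concrete, here is a minimal NumPy sketch of a paired random crop. It is an illustration under my own assumptions, not DeepLab's TensorFlow preprocessing: the function name random_crop_pair, the padding values (0 for the image, 255 as an "ignore" label) and the choice to pad inputs that are smaller than the crop window are all hypothetical here.

```python
import numpy as np


def random_crop_pair(image, label, crop_height, crop_width, rng,
                     image_pad_value=0, ignore_label=255):
    """Cut the same random window out of an image and its label map.

    image: array of shape (H, W, C); label: array of shape (H, W).
    """
    height, width = label.shape
    # Pad on the bottom/right if the input is smaller than the crop window.
    pad_h = max(crop_height - height, 0)
    pad_w = max(crop_width - width, 0)
    if pad_h or pad_w:
        image = np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)),
                       constant_values=image_pad_value)
        label = np.pad(label, ((0, pad_h), (0, pad_w)),
                       constant_values=ignore_label)
        height, width = label.shape
    # Choose the top-left corner so the window stays inside the image.
    top = int(rng.integers(0, height - crop_height + 1))
    left = int(rng.integers(0, width - crop_width + 1))
    window = (slice(top, top + crop_height), slice(left, left + crop_width))
    # The same window is applied to both arrays so pixels and labels stay aligned.
    return image[window], label[window]


rng = np.random.default_rng(0)
image = np.zeros((1024, 1024, 3), dtype=np.uint8)  # e.g. one large tile
label = np.zeros((1024, 1024), dtype=np.uint8)
crop_image, crop_label = random_crop_pair(image, label, 513, 513, rng)
print(crop_image.shape, crop_label.shape)  # (513, 513, 3) (513, 513)
```

The network only ever sees 513 x 513 windows, so memory use is set by the crop size rather than by the size of the source image, and over many training steps different regions of each large image get sampled.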

Original link: https://blog.csdn.net/weixin_39610043/article/details/86646457