Knowledge Distillation for Segmentation: Notes

Contents

Inter-Region Affinity Distillation for Road Marking Segmentation (2020.04)

Learning Lightweight Lane Detection CNNs by Self Attention Distillation (2019.08)

Knowledge Adaptation for Efficient Semantic Segmentation(CVPR 2019)

Inter-Region Affinity Distillation for Road Marking Segmentation (2020.04)

Yuenan Hou1, Zheng Ma2, Chunxiao Liu2, Tak-Wai Hui1, and Chen Change Loy3
1The Chinese University of Hong Kong  2SenseTime Group Limited  3Nanyang Technological University
 


Describes structural relationships with an inter-region affinity graph.
Each node represents the areas of interest (AOI) of one class (one AOI per class, or one per instance?); each edge represents the affinity between two regions.

Generation of AOI: smooth the ground-truth label map with an average kernel φ, then binarize the smoothed map to obtain the AOI map (every pixel with a non-zero smoothed value belongs to that class's AOI).
AOI-grounded moment pooling: pool the feature map inside each AOI into its mean, variance, and skewness (first, second, and third moments), giving one vector per AOI per moment order.
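A minimal PyTorch sketch of the two steps above, assuming a one-hot label map and taking the raw second/third central moments as "variance" and "skewness"; the kernel size and normalization details are illustrative, not the paper's exact values:

```python
import torch
import torch.nn.functional as F

def make_aoi(label_onehot, kernel_size=9):
    """Smooth each class's binary label map with an average kernel phi,
    then binarize: any pixel reached by the smoothing belongs to the AOI."""
    smoothed = F.avg_pool2d(label_onehot, kernel_size, stride=1,
                            padding=kernel_size // 2)        # phi(.)
    return (smoothed > 0).float()                            # (B, C, H, W)

def moment_pool(feat, aoi, eps=1e-6):
    """AOI-grounded moment pooling: one mean / variance / skewness vector per AOI."""
    B, D, H, W = feat.shape
    C = aoi.shape[1]
    f = feat.view(B, 1, D, H * W)            # (B, 1, D, HW)
    m = aoi.view(B, C, 1, H * W)             # (B, C, 1, HW)
    n = m.sum(-1).clamp(min=eps)             # pixels per AOI, (B, C, 1)
    mean = (f * m).sum(-1) / n               # 1st moment, (B, C, D)
    centered = (f - mean.unsqueeze(-1)) * m
    var = (centered ** 2).sum(-1) / n        # 2nd central moment
    skew = (centered ** 3).sum(-1) / n       # 3rd central moment
    return mean, var, skew
```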

Inter-region affinity: the AOIs form the nodes of a graph, and each edge weight measures the similarity between the pooled moment vectors of two AOIs.
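Assuming the edge weight between two AOI nodes is the cosine similarity of their pooled moment vectors (my reading; the paper may normalize the graph differently), a sketch that reuses moment_pool from above:

```python
def affinity_graph(node_feats, eps=1e-6):
    """node_feats: (B, C, D), one pooled moment vector per class AOI.
    Returns the (B, C, C) matrix of pairwise cosine similarities."""
    normed = node_feats / node_feats.norm(dim=-1, keepdim=True).clamp(min=eps)
    return normed @ normed.transpose(1, 2)
```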

Distillation: the student is trained so that its inter-region affinity graph matches the teacher's, with one matching term per moment order, added on top of the ordinary segmentation loss.
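Combining the helpers above, the distillation term can be sketched as an MSE match between the student's and teacher's affinity graphs, one term per moment order; the exact weighting against the segmentation loss is an assumption here:

```python
def ira_distill_loss(feat_s, feat_t, label_onehot):
    """Student mimics the teacher's inter-region affinity graphs."""
    aoi = make_aoi(label_onehot)
    loss = 0.0
    for ms, mt in zip(moment_pool(feat_s, aoi), moment_pool(feat_t, aoi)):
        loss = loss + F.mse_loss(affinity_graph(ms),
                                 affinity_graph(mt).detach())
    return loss

# total loss (sketch): seg_loss + lambda_ira * ira_distill_loss(f_s, f_t, y_onehot)
```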

Experiments


Learning Lightweight Lane Detection CNNs by Self Attention Distillation (2019.08)

Yuenan Hou1, Zheng Ma2, Chunxiao Liu2, and Chen Change Loy3
1The Chinese University of Hong Kong  2SenseTime Group Limited  3Nanyang Technological University

Self Attention Distillation (SAD) for lane detection.
Backbones: ENet, ResNet-18/34.
The feature map output by each block is converted into an attention map, and the attention map of a later block supervises (guides) the attention map of an earlier block.
Generating the attention map:
channel-wise mapping G_sum^p(A_m) = Σ_c |A_mc|^p (p = 2) -> bilinear upsampling B(·) -> spatial softmax operation Φ(·)
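A sketch of this attention generator, assuming the channel-wise sum of absolute values raised to p (p = 2) as the mapping function before B(·) and Φ(·):

```python
import torch
import torch.nn.functional as F

def attention_map(feat, out_size, p=2):
    """(B, C, H, W) feature map -> (B, H_out, W_out) attention map."""
    g = feat.abs().pow(p).sum(dim=1, keepdim=True)          # G_sum^p over channels
    g = F.interpolate(g, size=out_size, mode='bilinear',
                      align_corners=False)                   # B(.)
    b, _, h, w = g.shape
    return F.softmax(g.view(b, -1), dim=1).view(b, h, w)     # spatial softmax Phi(.)
```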


Loss
The distillation loss sums, over successive blocks, an L2 loss L_d between the attention map of block m and that of block m+1 (M is the number of blocks, L_d is the L2 loss); it is added to the usual lane segmentation losses.
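A sketch of the distillation term, reusing attention_map above and assuming each block's map is pulled toward a detached copy of the next (deeper) block's map with an MSE loss:

```python
def sad_loss(block_feats, out_size):
    """block_feats: list of per-block feature maps, ordered shallow to deep."""
    maps = [attention_map(f, out_size) for f in block_feats]
    loss = 0.0
    for m in range(len(maps) - 1):
        # the deeper block acts as a fixed target for the shallower one
        loss = loss + F.mse_loss(maps[m], maps[m + 1].detach())
    return loss

# total loss (sketch): seg_loss + gamma * sad_loss(feats, out_size)  # plus the paper's other lane losses
```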


Ablation Study
Distillation paths of SAD: adding SAD to block 1 hurts performance, possibly because low-level detail information is lost?
Backward distillation: using a later block as the student and an earlier block as the teacher does not work.
SAD vs. Deep Supervision: SAD provides soft targets and a feedback connection.
When to add SAD: adding SAD at a later training stage is beneficial.

Knowledge Adaptation for Efficient Semantic Segmentation(CVPR 2019)

Tong He1 Chunhua Shen1 Zhi Tian1 Dong Gong1 Changming Sun2 Youliang Yan3
1The University of Adelaide 2Data61, CSIRO 3Noah’s Ark Lab, Huawei Technologies

Motivation: structural differences between the teacher and the student give them different abilities to capture context and long-range dependencies, which makes direct distillation difficult. The teacher's knowledge should therefore have its redundancy and noise removed before it is used for distillation.


Knowledge Translation
Compress the teacher's feature with an auto-encoder: the auto-encoder is trained with a reconstruction loss, and its compact code serves as the translated knowledge for the student to mimic.
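A minimal sketch of the translation step, assuming a small convolutional auto-encoder trained with an MSE reconstruction loss on the frozen teacher feature; the channel widths are placeholders:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureAutoEncoder(nn.Module):
    """Compresses the teacher feature into a compact code and reconstructs it."""
    def __init__(self, in_ch=2048, code_ch=256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, code_ch, 1),
            nn.BatchNorm2d(code_ch),
            nn.ReLU(inplace=True))
        self.decoder = nn.Conv2d(code_ch, in_ch, 1)

    def forward(self, teacher_feat):
        code = self.encoder(teacher_feat)      # translated (compressed) knowledge
        recon = self.decoder(code)
        return code, recon

# training the auto-encoder (sketch):
# code, recon = ae(teacher_feat)
# recon_loss = F.mse_loss(recon, teacher_feat.detach())
```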

Feature Adaptation (reminiscent of FitNets)
Solves the feature-mismatch problem and reduces the effect of the inherent difference between the two networks: the student feature is passed through an adapter Cf before being matched to the translated teacher feature.
Cf uses a 3 × 3 kernel with a stride of 1, padding of 1, a BN layer, and ReLU.
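The adapter Cf as described (3 × 3 conv, stride 1, padding 1, BN, ReLU); the channel counts below are placeholders for the student width and the code width:

```python
import torch.nn as nn

def make_adapter(student_ch, code_ch):
    """Cf: maps the student feature into the space of the translated teacher code."""
    return nn.Sequential(
        nn.Conv2d(student_ch, code_ch, kernel_size=3, stride=1, padding=1),
        nn.BatchNorm2d(code_ch),
        nn.ReLU(inplace=True))

# usage (sketch): adapt_loss = F.mse_loss(cf(student_feat), teacher_code.detach())
```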

Affinity Distillation
Cosine similarity measures the affinity between the feature vectors at any two spatial positions; the student's pairwise affinity matrix is trained to match the teacher's.
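A sketch of the affinity term, assuming affinity = cosine similarity between the feature vectors of every pair of spatial positions, that the student and teacher feature maps share the same spatial size, and that the matching uses MSE:

```python
import torch
import torch.nn.functional as F

def pairwise_affinity(feat, eps=1e-6):
    """feat: (B, C, H, W) -> (B, HW, HW) matrix of cosine similarities."""
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    f = f / f.norm(dim=1, keepdim=True).clamp(min=eps)
    return f.transpose(1, 2) @ f

def affinity_distill_loss(student_feat, teacher_feat):
    return F.mse_loss(pairwise_affinity(student_feat),
                      pairwise_affinity(teacher_feat).detach())
```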


Backbones: teacher ResNet-50, student MobileNetV2.