Distilling Object Detectors with Fine-grained Feature Imitation

Motivation

Detectors care more about local regions near objects.
The feature variation around an object carries more of the important information, and this is what the student network needs to learn from the teacher network.

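Notes:
Unlike in classification, imitating the full feature map in detection gives the student network only a limited improvement (this is open to doubt: the paper does not state clearly which feature layers were used for full-feature imitation).
This may be because noise from the many uninformative background anchors drowns out the supervision coming from the teacher net. The paper argues that a detector attends to the object region and its surroundings, and that the differences among the positive anchors on an object region reflect how the teacher net generalizes to the detection target.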

Framework

Imitation region estimation

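For each GT box, compute its IoU with all W×H×K anchors on the feature layer to obtain an IoU map m.
Find the maximum M = max(m) and multiply it by the threshold factor ψ to get the threshold used to filter anchors: F = ψ ∗ M.
Combine the anchors whose IoU is larger than F with an OR operation to get a W×H feature map mask.
Loop over all GT boxes and combine their masks to obtain the final overall mask.
Add a feature adaptation layer after the student net feature map to be imitated so that it has the same size as the teacher net feature map.
Apply the mask to obtain the deviation between the student net and the teacher net features at these anchor locations; this is the imitation loss, which is added to the distillation training loss.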
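
The mask estimation steps above can be sketched in plain Python. This is a minimal illustration, not the paper's code: the helper names `iou`, `make_anchors` and `imitation_mask`, the `(x1, y1, x2, y2)` box format, the square anchors, and the default `psi=0.5` are all assumptions made for the example.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def make_anchors(H, W, stride, sizes):
    """Hypothetical anchor layout: K = len(sizes) square anchors per cell."""
    anchors = []
    for i in range(H):
        row = []
        for j in range(W):
            cx, cy = (j + 0.5) * stride, (i + 0.5) * stride
            row.append([(cx - s / 2, cy - s / 2, cx + s / 2, cy + s / 2)
                        for s in sizes])
        anchors.append(row)
    return anchors


def imitation_mask(anchors, gt_boxes, psi=0.5):
    """anchors[i][j] holds the K anchor boxes at cell (i, j).
    Returns an H x W mask of 0/1 values."""
    H, W = len(anchors), len(anchors[0])
    mask = [[0] * W for _ in range(H)]
    for gt in gt_boxes:
        # IoU map m between this GT box and all W x H x K anchors.
        m = [[[iou(a, gt) for a in anchors[i][j]] for j in range(W)]
             for i in range(H)]
        # Per-box threshold F = psi * M, with M the largest IoU.
        M = max(v for row in m for cell in row for v in cell)
        F = psi * M
        # OR over the K anchors of a cell, and OR over all GT boxes.
        for i in range(H):
            for j in range(W):
                if any(v > F for v in m[i][j]):
                    mask[i][j] = 1
    return mask


anchors = make_anchors(4, 4, stride=16, sizes=[16, 32])
mask = imitation_mask(anchors, [(16, 16, 32, 32)], psi=0.5)
```

Because the threshold is relative to each box's own best IoU, a lower ψ widens the region around the object, while a ψ close to 1 keeps only the best-matching locations.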

Fine-grained feature imitation

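The student's feature map may differ from the teacher's, for example in the number of channels, so a feature adaptation layer can be added after the student's feature map to align the two.
Even when the student's and teacher's feature maps already match, the paper finds that adding the feature adaptation layer works better than directly pulling the student's output toward the teacher's.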
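
The loss formula itself did not survive in these notes. A reconstruction consistent with the steps listed earlier is given below as a sketch: s is the student feature, t the teacher feature, f_adap the adaptation layer, and the normalization by 2N_p and the weight λ are written as I recall them from the paper, so treat them as assumptions.

```latex
L_{imitation} = \frac{1}{2 N_p} \sum_{i=1}^{W} \sum_{j=1}^{H} \sum_{c=1}^{C}
    I_{ij} \left( f_{adap}(s)_{ijc} - t_{ijc} \right)^2,
\qquad N_p = \sum_{i=1}^{W} \sum_{j=1}^{H} I_{ij}

L = L_{gt} + \lambda \, L_{imitation}
```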

Here I is the imitation mask
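
As a rough illustration of how the mask enters the loss, the sketch below sums the squared student/teacher difference over all channels at masked locations only. The function name `imitation_loss`, the nested-list H×W×C layout, and the division by twice the number of masked locations are assumptions for this example; the student features are assumed to have already passed through the adaptation layer.

```python
def imitation_loss(student, teacher, mask):
    """Masked squared difference between adapted student and teacher features.

    student, teacher: H x W x C nested lists of floats with the same shape.
    mask: H x W nested list of 0/1 values (the imitation mask I).
    """
    n_pos = sum(sum(row) for row in mask)
    if n_pos == 0:
        # No location is near an object, so nothing is imitated.
        return 0.0
    total = 0.0
    for i, row in enumerate(mask):
        for j, keep in enumerate(row):
            if keep:
                total += sum((s - t) ** 2
                             for s, t in zip(student[i][j], teacher[i][j]))
    # Assumed normalization: 1 / (2 * number of masked locations).
    return total / (2 * n_pos)


student = [[[1.0, 2.0], [0.0, 0.0]]]   # H=1, W=2, C=2
teacher = [[[0.0, 0.0], [5.0, 5.0]]]
loss = imitation_loss(student, teacher, [[1, 0]])
```

Only the first location contributes here; the large difference at the second location is ignored because its mask value is 0, which is how background features are kept out of the supervision.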

Experiment

Quantitative experiments
Qualitative experiments
