CVPR 2019 -- Distilled Person Re-identification: Towards a More Scalable System (Reading Notes)

Paper link: https://www.researchgate.net/publication/332267080_Distilled_Person_Re-Identification_Towards_a_More_Scalable_System

1. Introduction

The paper considers making person re-identification (Re-ID) more scalable from three aspects:

  1. Reducing label cost – use fewer labels
    Existing supervised Re-ID methods require a large number of labeled identities, whereas a scalable Re-ID system should be able to learn from unlabeled data together with only a limited amount of labeled data.

  2. Reducing extension cost – reuse existing knowledge
    When extending to a new scenario, existing Re-ID methods apply transfer learning, which requires auxiliary source-domain data for pre-training or joint learning. A pre-trained model may also fail to fit different user-specified requirements.

  3. Reducing testing computation cost – use lightweight models
    Existing Re-ID methods are built on large networks such as ResNet-50.
    The paper proposes a Multi-teacher Adaptive Similarity Distillation Framework: given only a few labeled identities in the target domain, it transfers knowledge from multiple teacher models to a lightweight student model without accessing any source-domain data. It further proposes a Log-Euclidean Similarity Distillation Loss and integrates an Adaptive Knowledge Aggregator that selects effective teacher models to transfer target-adaptive knowledge.

2. Contributions

(1) A Log-Euclidean Similarity Distillation Loss for knowledge distillation in Re-ID.
(2) An Adaptive Knowledge Aggregator that aggregates effective knowledge from multiple teacher models into a lightweight student model.
(3) The two are combined into the Multi-teacher Adaptive Similarity Distillation Framework, which simultaneously reduces label cost, extension cost, and testing computation cost.

3. Similarity Knowledge Distillation

Many knowledge distillation methods align soft labels (class probabilities), but this is unsuitable for Re-ID: Re-ID is an open-set recognition problem, i.e., the identities in the training and test sets do not overlap, so class probabilities do not transfer. The knowledge distilled here is instead the pairwise similarity between samples.

3.1. Construction of Similarity Matrices


Properties of Student Similarity Matrix.

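The key property needed for Section 3.2 is that the similarity matrix is symmetric positive definite, so that its matrix logarithm is well-defined. A minimal sketch of one such construction, assuming a Gaussian (RBF) kernel on L2-normalized features plus a small ridge (the kernel choice, `sigma`, and `eps` are assumptions, not necessarily the paper's exact formulation):

```python
import torch
import torch.nn.functional as F

def similarity_matrix(features, sigma=1.0, eps=1e-5):
    """Pairwise similarity matrix for a batch of n feature vectors (n, d)."""
    f = F.normalize(features, dim=1)             # L2-normalize the features
    dist2 = torch.cdist(f, f).pow(2)             # pairwise squared Euclidean distances
    A = torch.exp(-dist2 / (2 * sigma ** 2))     # Gaussian kernel: symmetric, PSD
    # a small ridge keeps A strictly positive definite, so log(A) exists
    return A + eps * torch.eye(f.size(0), device=f.device)
```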

3.2. Log-Euclidean Similarity Distillation

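The distillation loss is the Log-Euclidean distance between the teacher and student similarity matrices, roughly L = (1/n²)·‖log A_t − log A_s‖_F². A minimal PyTorch sketch continuing the code above, computing the matrix logarithm via symmetric eigendecomposition (the exact normalization is an assumption):

```python
def matrix_log(A):
    """Matrix logarithm of a symmetric positive-definite matrix."""
    w, V = torch.linalg.eigh(A)                        # eigendecomposition of A
    log_w = torch.log(w.clamp_min(1e-12))              # guard against tiny eigenvalues
    return V @ torch.diag_embed(log_w) @ V.transpose(-2, -1)

def log_euclidean_loss(A_teacher, A_student):
    """Log-Euclidean distance between teacher and student similarity matrices."""
    n = A_student.size(-1)
    diff = matrix_log(A_teacher) - matrix_log(A_student)
    return diff.pow(2).sum() / (n * n)                 # averaged squared Frobenius norm
```

Because the matrix logarithm maps SPD matrices into a flat Euclidean space, similarity structures can then be compared with an ordinary Frobenius norm.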

4. Learning to Learn from Multiple Teachers

4.1. Multi-teacher Adaptive Aggregated Distillation

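With m teachers, the aggregated objective is presumably a convex combination of per-teacher distillation losses, L = Σᵢ αᵢ·L_LE(A_tᵢ, A_s), with the weights αᵢ lying on the probability simplex. A minimal sketch reusing `log_euclidean_loss` from above:

```python
def aggregated_distillation_loss(teacher_sims, student_sim, alpha):
    """Weighted sum of per-teacher Log-Euclidean losses; alpha lies on the simplex."""
    losses = torch.stack([log_euclidean_loss(A_t, student_sim) for A_t in teacher_sims])
    return (alpha * losses).sum()
```

Parametrizing alpha as `torch.softmax(theta, dim=0)` is one plausible way to keep the weights positive and summing to 1.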

4.2. Adaptive Knowledge Aggregation

Validation Empirical Risk

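The few labeled target-domain identities serve as a validation set on which the student is scored. A minimal sketch, assuming a simple pairwise verification risk (the paper's exact risk formulation may differ): positive pairs should have high similarity, negative pairs low similarity.

```python
def validation_empirical_risk(student_sim, labels):
    """Pairwise verification risk on the labeled validation batch.

    student_sim: (n, n) student similarity matrix on the validation batch.
    labels:      (n,) identity labels; each identity appears at least twice.
    """
    same = labels.unsqueeze(0) == labels.unsqueeze(1)                  # positive-pair mask
    eye = torch.eye(len(labels), dtype=torch.bool, device=labels.device)
    pos = student_sim[same & ~eye]                                     # same identity, i != j
    neg = student_sim[~same]                                           # different identities
    # positives should be close to 1, negatives close to 0
    return (1 - pos).mean() + neg.mean()
```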
Adaptive Knowledge Aggregator
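The aggregator updates the teacher weights α so that the validation empirical risk decreases. A sketch of one plausible update, assuming a first-order (DARTS-style) hypergradient approximation with α parametrized as softmax(θ); the paper derives its own update rule, so treat everything below as an assumption:

```python
def aggregator_step(theta, student, per_teacher_losses, val_risk, eta=1e-3, lr=0.1):
    """One update of the teacher-weight logits theta (alpha = softmax(theta)).

    First-order approximation: after a student step
        w' = w - eta * sum_i alpha_i * grad L_i(w),
    the hypergradient of the validation risk is roughly
        dR_val/dalpha_i ~= -eta * <grad R_val(w), grad L_i(w)>.
    """
    params = [p for p in student.parameters() if p.requires_grad]
    g_val = torch.autograd.grad(val_risk, params, retain_graph=True)
    hyper = []
    for L_i in per_teacher_losses:
        g_i = torch.autograd.grad(L_i, params, retain_graph=True)
        hyper.append(-eta * sum((gv * gi).sum() for gv, gi in zip(g_val, g_i)))
    hyper = torch.stack(hyper)
    alpha = torch.softmax(theta, dim=0)
    # chain the hypergradient through the softmax onto theta
    grad_theta = alpha * (hyper - (alpha * hyper).sum())
    with torch.no_grad():
        theta -= lr * grad_theta
    return torch.softmax(theta, dim=0)                 # updated teacher weights
```

Intuitively, a teacher whose distillation gradient points in the same direction as the validation-risk gradient gets its weight increased, which matches the paper's goal of selecting target-adaptive teachers.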

5. Experiments

Market-1501 and DukeMTMC serve as the target-domain datasets. Five teacher models T_1–T_5 are trained on the labeled datasets MSMT17, CUHK03, VIPeR, DukeMTMC, and Market-1501.
The teachers use the PCB model. The student uses the lightweight MobileNetV2 with an additional convolutional layer that reduces the final feature map to 256 channels; it is pre-trained on ImageNet and not trained on any Re-ID dataset. Input images are resized to 384×128, and the feature map from the last convolutional layer is used as the feature vector. In each batch, to compute the validation empirical risk, two images are sampled per identity from the labeled data so that positive pairs are guaranteed, as in the sketch below.
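A minimal sketch of such an identity-balanced sampler (all names are hypothetical):

```python
import random

def sample_validation_batch(id_to_images, num_ids, k=2):
    """Sample `k` images for each of `num_ids` identities, so the batch is
    guaranteed to contain positive pairs for the validation risk."""
    ids = random.sample(list(id_to_images), num_ids)
    batch, labels = [], []
    for label, pid in enumerate(ids):
        batch.extend(random.sample(id_to_images[pid], k))
        labels.extend([label] * k)
    return batch, labels
```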
Ablation Studies
Evaluation of Knowledge Distillation.

Hinton et al.'s method scores lower than all the others because it distills soft labels designed for closed-set classification, which does not suit the open-set Re-ID problem. PKT distills knowledge via probability distributions and is also less effective for Re-ID than the proposed method.
Effect of the Learned Teacher Weights α_i
Individual Teacher.

For Market-1501, teachers T1 (MSMT17), T2 (CUHK03), and T4 (DukeMTMC) are all effective.
For DukeMTMC, only teacher T1 (MSMT17) is effective, because DukeMTMC is more challenging and has more camera views than Market-1501.
On both datasets, T3 (VIPeR) is the weakest teacher, since its training set is the smallest and therefore provides the least information.

w/ and w/o Learning α_i
The teacher weights learned by the Adaptive Knowledge Aggregator reflect each teacher's effectiveness; for the weakest teacher T3, the learned weight is close to 0. On Market-1501, learning the weights brings only a limited improvement for Ours (semi) and Ours (unsupervised), while on DukeMTMC the improvement is more pronounced: distance fusion with uniform weights beats the individual teachers on Market-1501 but is less effective on DukeMTMC. Distance fusion also increases the testing computation cost, so it does not scale as well as the proposed method.

Comparison with Ensemble and Task Weighting.
The proposed method outperforms ensembling, joint training, and task weighting: its testing computation cost is lower than that of the ensemble, and its training time is shorter than that of joint training.

The Number of Validation IDs.
The number of validation identities is varied from 0 to 50; performance is good with 5–50 IDs. Going from 1 ID down to 0 IDs causes a clear performance drop on DukeMTMC, which demonstrates the importance of the validation empirical risk.

Different Student Model Architectures.

Finetuning with Our Method as Initialization.
When more labeled data becomes available, the MobileNetV2 student model can serve as the initialization for fine-tuning. Fine-tuning from the proposed method with only 20% of the IDs achieves results comparable to fine-tuning from ImageNet pre-training with 100% of the IDs.