CVPR2017: Learning Deep Context-aware Features over Body and Latent Parts for

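The authors are Dangwei Li et al. from the Chinese Academy of Sciences. The work addresses multi-class person identification tasks and makes three main contributions: (1) it uses dilated convolutions (dilated conv) for multi-scale feature extraction, which reduces the information lost when a conventional CNN extracts features; (2) it uses Spatial Transformer Networks (STN, on which the authors impose three parameter constraints) to extract flexible body parts, which reduces the influence of the background compared with rigid division; (3) it fuses the full-body features with the part features and learns the network parameters under the guidance of identification classification.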

Yu, Fisher, and Vladlen Koltun. “Multi-scale context aggregation by dilated convolutions.” arXiv preprint arXiv:1511.07122 (2015).

Part 4 Experiments

4.1 Datasets and protocols

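Datasets: the first paragraph gives an overview of which datasets the experiments are run on; the remaining paragraphs introduce each dataset separately.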

4.2 Implementation Details

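Split into three parts: Model, Optimization, and Data preprocessing.
Model: notes that, besides the full network, two sub-models are extracted to analyze the full body and the body parts separately.
Optimization: implemented in Caffe, with gradients computed by back-propagation (BP), plus the learning rate and other settings.
Data preprocessing: 160x60 images, a 1.0/256 scale factor, and horizontal image reflection.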
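
The preprocessing items above can be sketched in a few lines of NumPy. This is only an illustration, not the authors' Caffe pipeline: it assumes 160x60 means 160 pixels high by 60 wide, that the image is already resized, and that 1.0/256 is a multiplicative scale on pixel values. The function name `preprocess` is hypothetical.

```python
import numpy as np

def preprocess(image, scale=1.0 / 256, flip=False):
    """Scale pixel values and optionally mirror the image left-right.

    `image` is assumed to be an (H, W, C) uint8 array that has already
    been resized to 160x60 (height x width is an assumption).
    """
    x = image.astype(np.float32) * scale
    if flip:
        x = x[:, ::-1, :]  # reverse the width axis (horizontal reflection)
    return x

# Synthetic image standing in for a real pedestrian crop.
img = (np.arange(160 * 60 * 3) % 256).astype(np.uint8).reshape(160, 60, 3)
out = preprocess(img, flip=True)
```

During training the flip would typically be applied at random as data augmentation; the notes do not say how the authors schedule it.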

4.3 Comparison with state-of-the-art methods


  • For the CUHK03 dataset, we compare our method with many existing
    approaches, including XX, XX…..
  • Compared with XXXX, such as XXX, the proposed XXXX improves the Rank-1 identification rate by 11.66% and 13.29% on the labeled and detected datasets respectively.
  • Compared with XXXXX, our XXXX improves the Rank-1 identification rate by 2.93% and mAP by 4.22%.
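
For reference, the two metrics quoted in these sentences (Rank-1 identification rate and mAP) can be computed from a query-by-gallery distance matrix roughly as follows. This is a simplified sketch with hypothetical function names and toy data; real benchmark protocols add details it leaves out, such as excluding gallery images from the same camera as the query.

```python
import numpy as np

def rank1_rate(dist, query_ids, gallery_ids):
    """Fraction of queries whose nearest gallery item has the same identity."""
    nearest = np.argmin(dist, axis=1)
    return float(np.mean(gallery_ids[nearest] == query_ids))

def mean_average_precision(dist, query_ids, gallery_ids):
    """Mean over queries of the average precision of the ranked gallery."""
    aps = []
    for i in range(dist.shape[0]):
        order = np.argsort(dist[i])                  # closest gallery item first
        matches = gallery_ids[order] == query_ids[i]
        if not matches.any():
            continue                                 # no true match for this query
        hits = np.cumsum(matches)                    # number of matches seen so far
        ranks = np.flatnonzero(matches) + 1          # 1-based rank of each match
        aps.append(np.mean(hits[matches] / ranks))
    return float(np.mean(aps))

# Toy example: 2 queries, 3 gallery images.
dist = np.array([[0.1, 0.5, 0.9],
                 [0.8, 0.2, 0.3]])
query_ids = np.array([1, 1])
gallery_ids = np.array([1, 2, 1])
r1 = rank1_rate(dist, query_ids, gallery_ids)
m_ap = mean_average_precision(dist, query_ids, gallery_ids)
```

In the toy data the first query retrieves a correct match at rank 1 and the second does not, so Rank-1 is 0.5 while mAP still credits the second query for the matches found at ranks 2 and 3.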

4.4 Effectiveness of MSCAN (the multi-scale network)

To determine the effectiveness of …., we explore four variants of … to learn IDE feature based on the whole body image.
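A single dataset, Market1501, is chosen for the evaluation, with the dilation ratio set to 1, 2, 3, and 4 in turn. The authors conclude that 3 is a suitable choice (4 improves only slightly over 3).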
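
To make the effect of the dilation ratio concrete: a kernel with k taps and dilation d covers k + (k - 1)(d - 1) input positions, so a larger ratio sees more context with the same number of weights. The sketch below assumes a kernel size of 3 and uses a 1D signal for brevity; the function names are hypothetical and the code is not taken from the paper.

```python
import numpy as np

def effective_kernel_size(kernel_size, dilation):
    """Number of input positions spanned by a dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

def dilated_conv1d(signal, kernel, dilation):
    """Valid 1D cross-correlation with kernel taps spaced `dilation` apart."""
    span = effective_kernel_size(len(kernel), dilation)
    out_len = len(signal) - span + 1
    out = np.zeros(out_len)
    for i in range(out_len):
        taps = signal[i : i + span : dilation]  # every `dilation`-th sample
        out[i] = np.dot(taps, kernel)
    return out

# Span of a 3-tap kernel for the four dilation ratios tried in the notes.
sizes = [effective_kernel_size(3, d) for d in (1, 2, 3, 4)]

# A difference kernel applied with dilation 3 compares samples 6 apart.
y = dilated_conv1d(np.arange(10.0), np.array([1.0, 0.0, -1.0]), dilation=3)
```

With a 3-tap kernel the spans are 3, 5, 7, and 9 for ratios 1 to 4, which is one way to read the diminishing gain from 3 to 4 on small inputs, although the notes do not give the authors' own explanation.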

4.5 Effectiveness of Latent part location

(1) Learned parts vs. rigid parts: Market1501 is used to compare how learned parts and rigidly specified parts affect the results.
(3) Effectiveness of location loss: evaluates the effect of the constraints.
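
The contrast between rigid and learned parts can be illustrated with a small NumPy sketch. It rests on several assumptions that are not in the notes: the learned window is parameterized by a scale and a translation per axis (sx, sy, tx, ty) in normalized coordinates in [-1, 1], nearest-neighbour sampling replaces the bilinear sampling an STN would use, and `outside_penalty` is a made-up hinge term that only illustrates what a location constraint might look like. It is not the paper's loss.

```python
import numpy as np

def rigid_parts(feature_map, num_parts=3):
    """Split an (H, W) map into fixed horizontal stripes."""
    return np.array_split(feature_map, num_parts, axis=0)

def affine_crop(feature_map, sx, sy, tx, ty, out_h, out_w):
    """Sample a scaled, translated window (normalized coords in [-1, 1])."""
    h, w = feature_map.shape
    ys = np.linspace(-1.0, 1.0, out_h) * sy + ty   # source rows, normalized
    xs = np.linspace(-1.0, 1.0, out_w) * sx + tx   # source cols, normalized
    rows = np.clip(np.rint((ys + 1.0) * (h - 1) / 2.0), 0, h - 1).astype(int)
    cols = np.clip(np.rint((xs + 1.0) * (w - 1) / 2.0), 0, w - 1).astype(int)
    return feature_map[np.ix_(rows, cols)]

def outside_penalty(sx, sy, tx, ty):
    """Hypothetical hinge penalty: zero while the window stays inside the image."""
    return max(0.0, abs(tx) + abs(sx) - 1.0) + max(0.0, abs(ty) + abs(sy) - 1.0)

fmap = np.arange(12 * 4, dtype=float).reshape(12, 4)
stripes = rigid_parts(fmap)                             # three fixed 4-row stripes
identity = affine_crop(fmap, 1.0, 1.0, 0.0, 0.0, 12, 4)  # whole map
upper = affine_crop(fmap, 1.0, 0.5, 0.0, -0.5, 6, 4)     # window over the top half
```

A rigid split always returns the same stripes, whereas the crop parameters can move and resize the window per image; a penalty like the one above is one way to keep such a window from drifting off the image.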

4.6 Cross-dataset Evaluation


On dilated conv, excerpted from Zhihu:


