Crowd Density Estimation: Crowd Counting Via Scale-adaptive Convolutional Neural Network
Crowd Counting Via Scale-adaptive Convolutional Neural Network
https://arxiv.org/abs/1711.04433v2
Code: https://github.com/miao0913/SaCNN-CrowdCounting-Tencent_Youtu
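To address the scale and perspective problems in crowd density estimation, earlier work proposed multi-column (multi-scale) convolutional networks to handle scale variation: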
Multiple columns have different receptive fields corresponding to pedestrians (heads) of different scales
This paper instead proposes a scale-adaptive CNN that uses only 3 × 3 filters and combines features from different layers of the network:
a scale-adaptive CNN (SaCNN) architecture with a backbone of fixed small receptive fields.
We use all 3 ∗ 3 filters in the network
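To see how a backbone built only from 3 × 3 filters still covers larger regions of the input, the receptive field of a layer stack can be computed with the standard recurrence. This is a minimal sketch; the layer lists below are hypothetical and are not the actual SaCNN configuration.

```python
def receptive_field(layers):
    """Return the receptive field (in input pixels) of a stack of layers.

    `layers` is a list of (kernel_size, stride) pairs, applied in order.
    """
    rf = 1    # receptive field of a single input pixel
    jump = 1  # distance in input pixels between adjacent output units
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


# Hypothetical layer stacks, for illustration only.
two_convs = [(3, 1), (3, 1)]               # two stacked 3x3 convolutions
conv_pool_conv = [(3, 1), (2, 2), (3, 1)]  # 3x3 conv, 2x2 pool, 3x3 conv

print(receptive_field(two_convs))       # 5
print(receptive_field(conv_pool_conv))  # 8
```

Deeper layers see larger regions, which is why features taken from different depths can respond to heads of different sizes.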
Illustration of input and output
3 Scale-adaptive CNN
3.1. Ground truth density maps
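Each head is represented by a delta function. The ground truth density map D(x) is obtained by convolving the delta functions with a Gaussian kernel: D(x) = Σ_{i=1}^{N} δ(x − x_i) ∗ G_σ(x),
where N is the total number of heads in the image and x_i is the position of the i-th head.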
The sum of the density map is equivalent to the total number of pedestrians in a crowd
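The construction above can be sketched in plain Python. The kernel size, sigma, and head coordinates are assumptions chosen for illustration, and renormalizing the kernel at image borders is a design choice of this sketch, not something stated in the paper.

```python
import math


def gaussian_kernel(size, sigma):
    """Square Gaussian kernel normalized so its entries sum to one."""
    half = size // 2
    kernel = [[math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2 * sigma ** 2))
               for x in range(size)] for y in range(size)]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def density_map(height, width, heads, size=15, sigma=4.0):
    """Add one unit-mass Gaussian per annotated head, given as (row, col)."""
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    density = [[0.0] * width for _ in range(height)]
    for r, c in heads:
        # Keep only the kernel cells that fall inside the image, then
        # renormalize so a head near the border still contributes exactly 1.
        cells = []
        for dy in range(size):
            for dx in range(size):
                y, x = r + dy - half, c + dx - half
                if 0 <= y < height and 0 <= x < width:
                    cells.append((y, x, kernel[dy][dx]))
        mass = sum(v for _, _, v in cells)
        for y, x, v in cells:
            density[y][x] += v / mass
    return density


heads = [(10, 12), (30, 40), (2, 1)]  # hypothetical head annotations
d = density_map(48, 64, heads)
count = sum(sum(row) for row in d)
print(round(count, 6))  # matches len(heads) up to floating-point error
```

Because every Gaussian integrates to one, summing the map recovers the head count.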
3.2. Network architecture
The final density map therefore has a spatial resolution of 1/8 times of the input image.
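Since the predicted map is 1/8 of the input resolution, the ground truth has to be brought to the same size for training. One way to do this without changing the head count is to sum each 8 × 8 block. This is an assumption of this sketch; the source does not say how the authors resize the ground truth.

```python
def sum_pool(density, factor=8):
    """Downsample a density map by summing each factor x factor block.

    Rows and columns that do not fill a complete block are dropped.
    """
    height, width = len(density), len(density[0])
    out_h, out_w = height // factor, width // factor
    pooled = [[0.0] * out_w for _ in range(out_h)]
    for y in range(out_h * factor):
        for x in range(out_w * factor):
            pooled[y // factor][x // factor] += density[y][x]
    return pooled


# Toy 16 x 24 map with two unit "heads" (illustrative values only).
toy = [[0.0] * 24 for _ in range(16)]
toy[3][5] = 1.0
toy[12][20] = 1.0

small = sum_pool(toy)
print(len(small), len(small[0]))       # 2 3
print(sum(sum(row) for row in small))  # 2.0
```

Plain interpolation would also shrink the map, but it would scale the total by roughly 1/64 unless the values are rescaled afterwards.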
3.3. Network loss
Euclidean loss to measure the distance between the estimated density map and the ground truth
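A minimal sketch of the Euclidean loss for a single image follows. The toy values are made up, and any averaging over the batch or constant scaling factor is omitted here because the source does not state it.

```python
def euclidean_loss(estimated, ground_truth):
    """Sum of squared per-pixel differences between two density maps."""
    return sum((e - g) ** 2
               for est_row, gt_row in zip(estimated, ground_truth)
               for e, g in zip(est_row, gt_row))


# Toy 2 x 2 density maps (illustrative values only).
est = [[0.1, 0.2], [0.0, 0.5]]
gt = [[0.0, 0.2], [0.0, 1.0]]

loss = euclidean_loss(est, gt)
print(loss)  # approximately 0.26
```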
A new loss function is introduced, aimed at the poor estimation quality on images that contain only a few people:
introduce another loss function regarding the head count
We notice that most representative approaches perform poorly on crowd scenes with few pedestrians.
Why the original loss function cannot solve this problem: because the absolute pedestrian number is usually not very large in sparse crowds compared to that in dense crowds
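One way to make sparse scenes matter is to penalize the count error relative to the true count. The exact form below (dividing by the true count plus one) and the weight parameter are assumptions of this sketch, not details taken from these notes.

```python
def relative_count_loss(estimated_count, true_count):
    """Squared count error scaled by the true count.

    The +1 in the denominator avoids division by zero on empty scenes.
    """
    return ((estimated_count - true_count) / (true_count + 1)) ** 2


def combined_loss(density_loss, count_loss, weight=1.0):
    """Density-map loss plus a weighted head-count loss (weight is hypothetical)."""
    return density_loss + weight * count_loss


# The same absolute error of 5 people in a sparse and in a dense scene.
sparse = relative_count_loss(10, 5)      # (5 / 6) ** 2
dense = relative_count_loss(1005, 1000)  # (5 / 1001) ** 2

print(sparse > dense)  # True
```

With an absolute count error both scenes would be penalized equally, so training would be dominated by dense images where large errors are common. The relative form makes the same miss far more costly in the sparse scene.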
4 Experiments
The authors' new dataset: its distinguishing feature is that scenes contain few people
ShanghaiTech dataset
WorldExpo'10 dataset & UCF_CC_50 dataset
SmartCity dataset
Below is a comparison with YOLO9000; each method has its own strengths