Finding Tiny Faces in the Wild with Generative Adversarial Network
Yancheng Bai, Yongqiang Zhang, Mingli Ding, Bernard Ghanem
Abstract
task: detecting small faces in unconstrained conditions
challenges: small faces lack detailed information and are blurry
solution: directly generate a clear high-resolution face from a blurry small one by adopting a generative adversarial network (GAN).
traditional method: super-resolve and refine sequentially
solution: design a novel network that super-resolves and refines jointly
new training losses guide the generator network to recover fine details and push the discriminator network to distinguish real vs. fake and face vs. non-face simultaneously
Introduction
detection of large and medium faces: good
detection of small faces: far from satisfactory
difficulty: small faces lack the detailed information needed to distinguish them from similar-looking background; modern CNN-based face detectors represent faces with down-sampled convolutional (conv) feature maps of stride 8, 16 or 32, which lose most spatial information and are too coarse to describe small faces
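A quick arithmetic sketch of why strided feature maps are too coarse for small faces. The face sizes used here (10, 20 and 100 pixels) are illustrative assumptions, not numbers from the paper; only the strides 8, 16 and 32 come from the notes above.

```python
def cells_covered(face_px: int, stride: int) -> float:
    """Approximate side length of a face, measured in feature-map cells."""
    return face_px / stride


# Hypothetical face sizes in pixels, chosen only for illustration.
for face_px in (10, 20, 100):
    per_stride = {stride: cells_covered(face_px, stride) for stride in (8, 16, 32)}
    print(f"{face_px:>3}px face -> cells per side: {per_stride}")
```

At stride 32 a 10-pixel face spans less than one cell per side, so the feature map has almost no spatial layout left to describe it.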
traditional solutions: (a) directly up-sample images with bilinear interpolation and exhaustively search for faces on the up-sampled images, which increases both computation cost and inference time; (b) use intermediate conv feature maps to represent faces at specific scales, but the shallow, fine-grained intermediate feature maps lack discriminative power and cause many false positives. Neither approach addresses the other challenges, such as blur.
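For reference, a minimal pure-Python sketch of bilinear up-sampling, the baseline the notes call too expensive. The function name `bilinear_upsample` and the pixel-centre coordinate mapping are my own assumptions; real detectors would use a library routine instead.

```python
def bilinear_upsample(img, factor):
    """Up-sample a 2-D list of floats by an integer factor (bilinear interpolation)."""
    h, w = len(img), len(img[0])
    out = []
    for oy in range(h * factor):
        # Map the output pixel centre back into input coordinates, clamped to the image.
        sy = min(max((oy + 0.5) / factor - 0.5, 0.0), h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for ox in range(w * factor):
            sx = min(max((ox + 0.5) / factor - 0.5, 0.0), w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            # Interpolate horizontally on the two neighbouring rows, then vertically.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


small = [[0.0, 1.0], [1.0, 2.0]]
large = bilinear_upsample(small, 4)
# The pixel count, and hence the detector's work, grows with factor squared.
print(len(small) * len(small[0]), "->", len(large) * len(large[0]))
```

The output contains no information that was not already in the input: the values are only interpolated, which is why up-sampled small faces stay blurry.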
our solution: use a GAN whose generator = SRN + RN. The super-resolution network (SRN) up-samples small faces to a fine scale, reducing artifacts and improving the quality of up-sampled images at large upscaling factors. The refinement network (RN) recovers missing details in the up-sampled images and generates sharp high-resolution images for classification. The discriminator sub-network uses a new loss function that forces it to decide simultaneously whether an input is a real or a generated high-resolution image and whether it is a face or a non-face.
contribution:
(1) GAN: generator = SRN + RN, discriminator multi-task
(2) new loss: promote the discriminator network to distinguish the real/fake image and face/non-face simultaneously
(3) state-of-the-art performance
Related Work
Face Detection
hand-crafted-feature-based methods: work at a single scale, which restricts detector performance
CNN-based methods + up-sampling by resizing input images to different scales during training and testing: inevitably increases memory and computation costs, and produces images with large structural distortions
our method: exploits the super-resolution and refinement network to generate clear and fine faces with high resolution
The effect seems a bit too strong... and in some places regions that are not faces are also classified as faces.
Super-resolution and Refinement Network
the first work trying to jointly super-resolve and refine the small blurry faces in the wild
Generative Adversarial Networks
super-resolution (SRGAN): outputs are blurry and lack fine details, especially for low-resolution faces
extend the discriminator network to classify fake vs. real and face vs. non-face simultaneously
Proposed Method
GAN
$I^{LR}$: low-resolution face candidates
$I^{HR}$: high-resolution face candidates
$y$: label, 1 for face, 0 for non-face
generator: maps a low-resolution face candidate to a clear high-resolution one
discriminator: jointly distinguishes generated vs. true high-resolution images and faces vs. non-faces
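To make the multi-task discriminator concrete, here is a toy sketch with two sigmoid heads on a shared feature vector: one for real vs. generated, one for face vs. non-face. The linear heads, the feature vector and all names (`discriminator_heads`, `w_real`, `w_face`) are hypothetical; the actual network is a CNN whose details are not recorded in these notes.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def discriminator_heads(features, w_real, b_real, w_face, b_face):
    """Return (P(real image), P(face)) from one shared feature vector.

    Both heads are plain linear layers here, a stand-in for the real network.
    """
    z_real = sum(f * w for f, w in zip(features, w_real)) + b_real
    z_face = sum(f * w for f, w in zip(features, w_face)) + b_face
    return sigmoid(z_real), sigmoid(z_face)


# Hypothetical shared features and head weights, for illustration only.
p_real, p_face = discriminator_heads(
    [0.2, -1.0, 0.5], [1.0, 0.0, 2.0], 0.0, [-1.0, 1.0, 0.0], 0.1
)
print(p_real, p_face)
```

The point of the shared features is that a single forward pass yields both decisions, so the generator gets gradient signal from "looks real" and "looks like a face" at the same time.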
Network Architecture
SRN: takes low-resolution images as input and outputs super-resolved images, which are usually blurry
RN: refines the super-resolved images
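The two-stage data flow of the generator can be sketched as below. Both stages are crude stand-ins for learned networks: `toy_srn` is nearest-neighbour up-sampling and `toy_rn` is a simple unsharp mask, and the 4x factor is an assumption of this sketch. Only the composition (the RN runs on the SRN output) reflects the notes.

```python
def toy_srn(img, factor=4):
    """Stand-in for the SRN: nearest-neighbour up-sampling of a 2-D list."""
    h, w = len(img), len(img[0])
    return [[img[y // factor][x // factor] for x in range(w * factor)]
            for y in range(h * factor)]


def toy_rn(img, amount=0.5):
    """Stand-in for the RN: sharpen by adding back the difference to a 3x3 box blur."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # 3x3 neighbourhood with edge clamping.
            neigh = [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            blur = sum(neigh) / 9.0
            row.append(img[y][x] + amount * (img[y][x] - blur))
        out.append(row)
    return out


def generator(lr_img):
    """Return both intermediate (SRN) and final (RN) outputs."""
    sr = toy_srn(lr_img)
    return sr, toy_rn(sr)


sr, hr = generator([[0.0, 1.0], [1.0, 0.0]])
print(len(sr), len(sr[0]), len(hr), len(hr[0]))
```

Returning the intermediate SRN output alongside the refined one is deliberate here, since a loss can then supervise both stages.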
Loss Function
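pixel-wise loss (generator): similar to an autoencoder's reconstruction loss, with terms for both the SRN and the RN
adversarial loss (discriminator):
classification loss: (does not use a softmax loss?)
The final loss is a weighted sum of these three losses.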
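A rough sketch of the weighted sum from the generator's point of view. The exact formulas are not preserved in these notes, so this uses common stand-ins: mean squared error for the pixel-wise term (applied to both the SRN and RN outputs) and binary cross-entropy for the adversarial and classification terms. The weights `alpha` and `beta` and their default values are hypothetical placeholders.

```python
import math


def mse(a, b):
    """Mean squared error between two equally sized 2-D lists."""
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            count += 1
    return total / count


def bce(p, target, eps=1e-7):
    """Binary cross-entropy for one probability (a stand-in, not the paper's form)."""
    p = min(max(p, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def total_loss(sr, hr_pred, hr_true, p_real, p_face, y, alpha=0.001, beta=0.01):
    """Weighted sum of pixel-wise, adversarial and classification terms.

    p_real and p_face are the discriminator's outputs on the generated image;
    alpha and beta are hypothetical weights.
    """
    pixel = mse(sr, hr_true) + mse(hr_pred, hr_true)  # supervise SRN and RN outputs
    adv = bce(p_real, 1.0)   # the generator wants its output judged "real"
    cls = bce(p_face, y)     # the generated image should keep its face/non-face label
    return pixel + alpha * adv + beta * cls


target = [[0.0, 1.0], [1.0, 0.0]]
blurry = [[0.5, 0.5], [0.5, 0.5]]
print(total_loss(blurry, target, target, 0.5, 0.5, 1))
```

With small `alpha` and `beta` the pixel-wise term dominates, and the other two terms act as regularisers that push toward sharper, more face-like outputs.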
I recently did similar work on MNIST myself, though not for a super-resolution task. Quite a coincidence!