PSGAN: Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup Transfer(CVPR20)
3. PSGAN
3.1. Formulation
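Let $X$ be the source image domain and $Y$ the reference image domain.
A source image is $x \in X$, and a reference image is $y \in Y$.
$P_X$ is the distribution over domain $X$, and $P_Y$ is the distribution over domain $Y$.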
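The learning objective is a transfer function $G: \{x, y\} \rightarrow \tilde{x}$, where the transferred image $\tilde{x}$ carries the makeup style of $y$ and the identity of $x$.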
3.2. Framework
Overall
The PSGAN framework is shown in Fig. 2.
- Makeup distill network (MDNet): extracts the makeup style from the reference image $y$. The style has two components, $\gamma$ and $\beta$, called makeup matrices.
- Attentive makeup morphing module (AMM module): because the source image $x$ and the reference image $y$ can differ greatly in expression and pose, the AMM module is proposed to morph the two makeup matrices $\gamma, \beta$ into two new matrices $\gamma', \beta'$, which are adapted to the source image by considering the similarities between pixels of the source and the reference.
- Makeup apply network (MANet): applies $\gamma', \beta'$ to the bottleneck feature map of MANet.
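The last step above can be sketched in a few lines. This is only an illustration under an assumption that this section does not spell out: that "applying" the morphed matrices $\gamma', \beta'$ means an element-wise scale and shift, broadcast over all channels of the bottleneck feature map. The function name `apply_makeup` and all numbers are made up for the example.

```python
def apply_makeup(feature_map, gamma, beta):
    """Scale and shift every channel of a C x H x W feature map.

    gamma, beta: H x W maps, reused (broadcast) for each of the C channels.
    Assumption: the operation is gamma * V + beta, element-wise.
    """
    return [
        [
            [g * v + b for v, g, b in zip(row_v, row_g, row_b)]
            for row_v, row_g, row_b in zip(channel, gamma, beta)
        ]
        for channel in feature_map
    ]


# Toy source feature map with C=2, H=2, W=2 (hypothetical values).
V_x = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
gamma_morphed = [[2.0, 2.0], [1.0, 1.0]]
beta_morphed = [[0.0, 1.0], [0.0, 1.0]]

V_x_out = apply_makeup(V_x, gamma_morphed, beta_morphed)
```

Because the same $H \times W$ maps are reused for every channel, the makeup is controlled per spatial position rather than per channel.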
Makeup distill network(MDNet)
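MDNet uses the encoder-bottleneck part of StarGAN (here "bottleneck" means the residual blocks). It extracts the makeup-related features (e.g. lip gloss, eye shadow), and these features are represented as two makeup matrices $\gamma$ and $\beta$.
As shown in Fig. 2(B), the output of MDNet is a feature map $V_y \in \mathbb{R}^{C \times H \times W}$, followed by two parallel 1x1 conv layers that produce $\gamma \in \mathbb{R}^{1 \times H \times W}$ and $\beta \in \mathbb{R}^{1 \times H \times W}$.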
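To make the two parallel 1x1 conv heads concrete, here is a minimal pure-Python sketch. A 1x1 convolution with one output channel is just a weighted sum over the input channels at each spatial position, so a $C \times H \times W$ feature map collapses to a single $H \times W$ map. The helper `conv1x1`, the weights, and the bias values are hypothetical, not taken from the paper or its code.

```python
def conv1x1(feature_map, weights, bias=0.0):
    """Apply a single-output-channel 1x1 convolution.

    feature_map: nested list of shape C x H x W
    weights:     list of C per-channel weights (made-up values here)
    Returns an H x W map, i.e. one scalar per spatial position.
    """
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    return [
        [
            sum(weights[c] * feature_map[c][h][w] for c in range(channels)) + bias
            for w in range(width)
        ]
        for h in range(height)
    ]


# Toy reference feature map with C=2, H=2, W=2.
V_y = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]

# The two heads share the same input but have separate weights.
gamma = conv1x1(V_y, weights=[1.0, 2.0])
beta = conv1x1(V_y, weights=[0.0, 1.0], bias=1.0)
```

Both outputs keep the spatial layout of the feature map, which is what lets the later morphing step work pixel by pixel.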
Attentive makeup morphing module(AMM module)
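Because the source image $x$ and the reference image $y$ can differ greatly in expression and pose, $\gamma, \beta$ cannot be applied directly to the source image $x$.
Q: Can we assume that $\gamma, \beta$ still contain information such as the expression and pose of the reference image $y$?
The AMM module computes an attentive matrix $A \in \mathbb{R}^{HW \times HW}$ to specify how a pixel in the source image $x$ is morphed from the pixels in the reference image $y$, where $A_{i,j}$ indicates the attentive value between the $i$-th pixel $x_i$ in image $x$ and the $j$-th pixel $y_j$ in image $y$.
Interpretation: suppose position $i$ in $x$ is the corner of the eye and position $j$ in $y$ is also the corner of the eye. Then $A_{i,j}$ should be large, meaning that the pixel at position $i$ in $x$ should refer to the pixel at position $j$ in $y$ in order to transfer the eye shadow well.
(One drawback: since $H$ and $W$ are multiplied together, some spatial information is lost.)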
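A small sketch may help show what the attentive matrix does once it is available. It assumes, which this section does not state explicitly, that morphing is a weighted sum: the morphed value at source pixel $i$ is $\sum_j A_{i,j}$ times the reference value at pixel $j$, with each row of $A$ summing to 1. The function `morph` and the toy values are hypothetical.

```python
def morph(attention, makeup):
    """Morph a flattened makeup matrix with an attentive matrix.

    attention: HW x HW nested list; row i holds the weights that source
               pixel i assigns to every reference pixel j.
    makeup:    list of HW values (gamma or beta, flattened row by row).
    Assumption: morphing is the matrix-vector product attention @ makeup.
    """
    return [
        sum(a_ij * m_j for a_ij, m_j in zip(row, makeup))
        for row in attention
    ]


# Toy case with HW = 3 pixels.
gamma = [10.0, 20.0, 30.0]
A = [
    [0.0, 0.0, 1.0],  # source pixel 0 copies reference pixel 2
    [0.5, 0.5, 0.0],  # source pixel 1 averages reference pixels 0 and 1
    [0.0, 1.0, 0.0],  # source pixel 2 copies reference pixel 1
]

gamma_morphed = morph(A, gamma)
```

In the eye-corner example, the row of the source eye corner would put most of its weight on the column of the reference eye corner.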
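68 facial landmarks are introduced as anchor points.
Take the nose-tip landmark as an example: for every position $x_i$ in $x$, compute the signed offset from $x_i$ to the nose tip along each axis, which gives a 2-dimensional vector. All 68 landmarks together therefore give a 136-dimensional vector $p_i \in \mathbb{R}^{136}$, called the relative position features:
$$p_i = [f(x_i) - f(l_1), f(x_i) - f(l_2), \dots, f(x_i) - f(l_{68}), g(x_i) - g(l_1), g(x_i) - g(l_2), \dots, g(x_i) - g(l_{68})]$$
where $f(\cdot)$ and $g(\cdot)$ indicate the coordinates on the $x$ and $y$ axes, and $l_i$ indicates the $i$-th facial landmark.
Thought: taken over all positions, the dimension of $p$ should be $HW \times 136$, right?
Since these are landmarks, face size will inevitably differ between images, so $p$ is normalized to unit length, i.e. $\frac{p}{\|p\|}$.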
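The relative position feature and its normalization can be sketched as follows. The example uses only two landmarks instead of 68 to keep it readable; the function name `relative_position` and the coordinates are made up. It also checks the point of the normalization: scaling the whole face by a constant factor leaves the feature unchanged.

```python
import math


def relative_position(pixel, landmarks):
    """Return the unit-normalized relative position feature of one pixel.

    pixel:     (x, y) coordinates of the pixel
    landmarks: list of (x, y) landmark coordinates (68 in the paper)
    The result has 2 * len(landmarks) entries: all x offsets, then all y offsets.
    """
    px, py = pixel
    offsets = [px - lx for lx, _ in landmarks] + [py - ly for _, ly in landmarks]
    norm = math.sqrt(sum(o * o for o in offsets))
    if norm == 0.0:
        # Degenerate case: the pixel coincides with every landmark.
        return offsets
    return [o / norm for o in offsets]


# Two toy landmarks and one pixel.
landmarks = [(0.0, 0.0), (4.0, 0.0)]
p = relative_position((0.0, 3.0), landmarks)

# The same face at twice the size: every coordinate is doubled.
landmarks_big = [(0.0, 0.0), (8.0, 0.0)]
p_big = relative_position((0.0, 6.0), landmarks_big)
```

With 68 landmarks the same function returns the 136-dimensional vector described above, one per pixel.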
Moreover, to avoid unreasonable sampling pixels with similar relative positions but different semantics, we also consider the visual similarities between pixels
Fig. 2(C) gives an example.