Deep Identity-aware Transfer of Facial Attributes

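The model has two parts. The first is the face transform network, which produces the generated image. It is trained together with a discriminator network, which judges whether an input image is real or generated, and a VGG-Face network, which checks that the identity of the input is preserved (the identity loss).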

Images produced by the face transform network are fairly blurry, so the generated image is passed through an enhancement network to obtain an enhanced image.

The network structure is as follows:

face transform network

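Parameters:

The face transform network consists of 3 convolutional layers, 5 residual blocks (each containing 2 convolutional layers), followed by 2 deconvolution layers and 1 final convolutional layer. The detailed parameters are listed below.

[Table image: face transform network parameters]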
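The layer ordering described for the face transform network can be written down as a small layer plan. This is only a sketch: the text does not give channel widths, strides, or kernel sizes, so the plan records layer types and order only, and the function names `build_layer_plan` and `count_layers` are hypothetical.

```python
# Layer plan for the face transform network as described in the text.
# Channel widths, strides and kernel sizes are NOT specified there,
# so only layer types and their order are recorded (an assumption-free skeleton).

def build_layer_plan(num_res_blocks=5):
    plan = ["conv"] * 3                       # 3 leading convolutional layers
    for _ in range(num_res_blocks):           # each residual block holds 2 conv layers
        plan.append(("res_block", ["conv", "conv"]))
    plan += ["deconv", "deconv"]              # 2 deconvolution (upsampling) layers
    plan.append("conv")                       # final convolution produces the image
    return plan

def count_layers(plan):
    """Count convolution and deconvolution layers, looking inside residual blocks."""
    counts = {"conv": 0, "deconv": 0}
    for item in plan:
        if isinstance(item, tuple):
            for inner in item[1]:
                counts[inner] += 1
        else:
            counts[item] += 1
    return counts

plan = build_layer_plan()
print(count_layers(plan))  # {'conv': 14, 'deconv': 2}
```

Counting this way gives 3 + 5×2 + 1 = 14 convolutional layers and 2 deconvolution layers in total.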

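The identity loss keeps the generated image and the input image the same identity, i.e. the same person. Because the generated image cannot match the input pixel for pixel, a VGG network extracts features from both the generated image and the input image, and the squared error between their features is computed:

[Equation image: per-layer squared feature error]

Here l denotes the output of the l-th VGG layer. The per-layer squared errors are combined with weights w_l to give the identity loss:

[Equation image: identity loss]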
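A minimal NumPy sketch of a weighted per-layer feature loss of this kind is given below. It assumes the VGG features have already been extracted and are passed in as arrays; the normalization (a mean over feature elements), the function name `identity_loss`, and the example weights are assumptions, not values taken from the paper.

```python
import numpy as np

def identity_loss(feats_input, feats_generated, weights):
    """Weighted sum over layers of the mean squared feature difference.

    feats_input / feats_generated: lists of arrays, one per chosen VGG layer.
    weights: one scalar w_l per layer (hypothetical values in this sketch).
    """
    total = 0.0
    for f_x, f_t, w in zip(feats_input, feats_generated, weights):
        total += w * np.mean((f_t - f_x) ** 2)
    return float(total)

# Random arrays stand in for real VGG feature maps of two layers.
rng = np.random.default_rng(0)
feats_x = [rng.standard_normal((8, 4, 4)), rng.standard_normal((16, 2, 2))]
feats_same = [f.copy() for f in feats_x]
feats_shifted = [f + 1.0 for f in feats_x]

print(identity_loss(feats_x, feats_same, [1.0, 1.0]))     # identical features give 0.0
print(identity_loss(feats_x, feats_shifted, [0.5, 0.5]))  # close to 1.0
```

Because the loss is computed on features rather than pixels, the generated image is free to change appearance as long as the face-recognition features stay close.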

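The attribute loss pushes the generated image to resemble the target images, i.e. to follow the same distribution. The generated image is fed to the discriminator network, and the attribute loss is:

[Equation image: attribute loss]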
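For intuition, the sketch below uses the standard GAN objective, which is an assumption here since the exact formula is not reproduced in the text. `d_real` and `d_fake` are hypothetical arrays of discriminator outputs in (0, 1) for target-attribute images and generated images respectively.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Standard GAN discriminator loss: real images scored high, generated low."""
    return float(-np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps)))

def attribute_loss(d_fake, eps=1e-12):
    """Generator-side term log(1 - D(T(x))), minimized by the transform network."""
    return float(np.mean(np.log(1.0 - d_fake + eps)))

# The loss drops as the discriminator becomes more convinced by generated images.
print(attribute_loss(np.array([0.1, 0.2])))
print(attribute_loss(np.array([0.8, 0.9])))
```

The second call prints a lower value than the first, which is the direction the transform network is trained in.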

The discriminator network parameters are:

[Table image: discriminator network parameters]

The perceptual regularization term is introduced to remove noise from the generated image, keeping the image as smooth as possible while preserving edges.

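For a noisy image g(n) = x + n, where n is the noise, a denoising network is trained to separate the noise n from g and recover the clean image x. The denoising network has 2 convolutional layers with 3×3 kernels, and its loss function is:

[Equation image: denoising network loss]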

Once the denoising network is trained, the perceptual regularization loss can be constructed:

[Equation image: perceptual regularization loss]
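The idea of a denoiser-based regularizer can be illustrated with the sketch below. Both loss forms are assumptions: the denoiser is trained to map x + n back to x with a squared error, and the regularizer penalizes whatever the denoiser would remove from the generated image. A fixed 3×3 mean filter (`box_denoiser`, hypothetical) stands in for the trained 2-layer network.

```python
import numpy as np

def box_denoiser(img):
    """Stand-in for the trained denoising network: a 3x3 mean filter."""
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img, dtype=float)
    h, w = img.shape
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w]
    return out / 9.0

def denoiser_training_loss(denoise, clean, noise):
    """Assumed training loss: squared error between denoise(x + n) and x."""
    return float(np.mean((denoise(clean + noise) - clean) ** 2))

def perceptual_regularization(denoise, generated):
    """Assumed regularizer: penalize the part of the image the denoiser removes."""
    return float(np.mean((generated - denoise(generated)) ** 2))

rng = np.random.default_rng(0)
smooth = np.full((16, 16), 0.5)
noisy = smooth + 0.1 * rng.standard_normal((16, 16))

print(perceptual_regularization(box_denoiser, smooth))  # flat image: essentially zero
print(perceptual_regularization(box_denoiser, noisy))   # noisy image: clearly larger
```

A mean filter also blurs edges, which a learned denoiser is meant to avoid; it is used here only to keep the example self-contained.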

The overall objective function is:

[Equation image: overall objective]

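The identity and attribute losses are both defined on high-level feature representations, which makes GAN training hard to converge and can prevent the model from generating high-quality images. The paper therefore introduces an enhancement network that post-processes the generated image to obtain a sharper result.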

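Given an attribute mask m, the enhanced image should stay as close as possible to the input image in regions that were not changed, and as close as possible to the generated image in regions that were changed. This gives the loss function:

[Equation image: enhancement loss for local attributes]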
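A pixel-space sketch of such a masked loss is shown below. It assumes m is 1 in edited regions and 0 elsewhere and that both terms are plain squared errors; the paper's actual terms may be defined differently (for example in feature space), and the name `enhancement_loss` is hypothetical.

```python
import numpy as np

def enhancement_loss(enhanced, input_img, generated, mask):
    """Masked squared error: follow the input outside the mask, the generated image inside."""
    keep = np.sum((1.0 - mask) * (enhanced - input_img) ** 2)  # unchanged regions
    edit = np.sum(mask * (enhanced - generated) ** 2)          # attribute regions
    return float(keep + edit)

rng = np.random.default_rng(0)
x = rng.random((8, 8))           # input image
t = rng.random((8, 8))           # generated image
m = np.zeros((8, 8))
m[2:5, 2:5] = 1.0                # hypothetical attribute region

composite = (1.0 - m) * x + m * t
print(enhancement_loss(composite, x, t, m))  # the ideal composite gives zero loss
print(enhancement_loss(t, x, t, m))          # copying the generated image everywhere is penalized
```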

For a global attribute, a Gaussian filter is first applied to obtain a blurred image, and the blurred image is then enhanced:

[Equation images: enhancement loss for global attributes]
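The Gaussian filtering step can be sketched with a separable kernel in NumPy. The kernel size and sigma below are assumptions, as the text does not state them, and the sketch covers only the blurring step, not the loss built on top of it.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """1-D Gaussian kernel normalized to sum to 1 (size assumed odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    k = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(img, size=5, sigma=1.0):
    """Separable Gaussian blur of a 2-D image with edge padding."""
    k = gaussian_kernel(size, sigma)
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    # Filter along rows, then along columns; "valid" restores the original size.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

rng = np.random.default_rng(0)
img = rng.random((16, 16))
blurred = gaussian_blur(img)
print(img.shape, blurred.shape)
print(img.var(), blurred.var())  # blurring reduces the pixel variance
```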

Adding the adaptive perceptual identity loss to DAN gives the loss function:

[Equation image: loss with adaptive perceptual identity loss]