3. Our Method
3.1. Review of CycleGAN
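Given unpaired training samples $x \in X$ and $y \in Y$ from two domains, a mapping $G_{X \to Y}$ from $X$ to $Y$, and its discriminator $D_Y$, the adversarial loss is defined as
$$\mathcal{L}_{GAN}(G_{X \to Y}, D_Y) = \mathbb{E}_{y}[\log D_Y(y)] + \mathbb{E}_{x}[\log(1 - D_Y(G_{X \to Y}(x)))] \tag{1}$$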
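As a rough illustration of Eq. (1), the sketch below estimates the two expectations by averaging over a batch of discriminator outputs. The function name `gan_value` and the sample probabilities are assumptions made for this example; in practice the outputs would come from a trained network.

```python
import math

def gan_value(d_real, d_fake):
    """Batch estimate of Eq. (1).

    d_real: discriminator outputs D_Y(y) on real samples, each in (0, 1)
    d_fake: discriminator outputs D_Y(G(x)) on generated samples, each in (0, 1)
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A discriminator that cannot tell real from generated outputs 0.5 everywhere.
confused = gan_value([0.5, 0.5], [0.5, 0.5])
# A discriminator that separates the two well scores closer to the maximum of 0.
confident = gan_value([0.9, 0.9], [0.1, 0.1])
```

The discriminator tries to maximize this value while the generator tries to minimize it, which is why `confident` is larger than `confused`.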
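CycleGAN learns both the forward and the backward mapping. The cycle consistency loss is
$$\mathcal{L}_{cyc} = \| G_{Y \to X}(G_{X \to Y}(x)) - x \|_1 + \| G_{X \to Y}(G_{Y \to X}(y)) - y \|_1 \tag{2}$$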
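A minimal sketch of Eq. (2) on flat lists of numbers may help make the two terms concrete. The toy generators (`double`, `halve`, `add_one`) are hypothetical stand-ins for the two mapping networks, not anything taken from the paper.

```python
def l1_norm(a, b):
    """L1 distance between two equally sized flat vectors."""
    return sum(abs(u - v) for u, v in zip(a, b))

def cycle_loss(g_xy, g_yx, x, y):
    """Eq. (2): reconstruction error after a round trip in each direction."""
    forward = l1_norm(g_yx(g_xy(x)), x)   # X -> Y -> X
    backward = l1_norm(g_xy(g_yx(y)), y)  # Y -> X -> Y
    return forward + backward

# Toy generators standing in for the two networks.
double = lambda v: [2.0 * t for t in v]
halve = lambda v: [0.5 * t for t in v]
add_one = lambda v: [t + 1.0 for t in v]

# Exact inverses reconstruct the input, so the loss vanishes.
perfect = cycle_loss(double, halve, [1.0, 2.0], [4.0, 6.0])
# Mappings that are not inverses of each other are penalized.
imperfect = cycle_loss(double, add_one, [1.0, 2.0], [4.0, 6.0])
```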
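The total objective function of CycleGAN is defined as
$$\mathcal{L}(G_{X \to Y}, G_{Y \to X}, D_X, D_Y) = \mathcal{L}_{GAN}(G_{X \to Y}, D_Y) + \mathcal{L}_{GAN}(G_{Y \to X}, D_X) + \mathcal{L}_{cyc} \tag{3}$$
In this paper, $X$ is the real face domain and $Y$ is the cartoon face domain.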
3.2. Cartoon Face Landmark Assisted CycleGAN
3.2.1 Landmark Consistency Loss
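$$\mathcal{L}_{c}(G_{(X,L) \to Y}) = \| R_Y(G_{(X,L) \to Y}(x, l)) - l \|_2 \tag{4}$$
Here $l \in L$ is the input landmark heatmap, $R$ is a pre-trained U-Net that predicts landmark heatmaps, and $R_Y$ denotes the landmark regressor for domain $Y$.
Eq. (4) means the following: a real face image $x$ and its landmark heatmap $l$ are fed to the generator $G_{(X,L) \to Y}$ to produce an image, $R_Y$ predicts the landmarks of that generated image, and the prediction should be as close to $l$ as possible.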
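The pipeline behind Eq. (4) can be sketched as follows. The generator and regressor here are toy placeholders (an identity generator and a regressor that marks the brightest pixel), chosen only so the example runs; the real $R_Y$ is a pre-trained U-Net.

```python
import math

def l2_distance(pred, target):
    """Euclidean norm of the difference between two heatmaps (2-D lists)."""
    return math.sqrt(sum((p - t) ** 2
                         for prow, trow in zip(pred, target)
                         for p, t in zip(prow, trow)))

def landmark_consistency_loss(generator, regressor, x, l):
    """Eq. (4): compare landmarks predicted on the generated image with l."""
    fake = generator(x, l)
    return l2_distance(regressor(fake), l)

def toy_regressor(img):
    """Hypothetical regressor: one-hot heatmap at the brightest pixel."""
    _, br, bc = max((v, r, c) for r, row in enumerate(img)
                    for c, v in enumerate(row))
    return [[1.0 if (r, c) == (br, bc) else 0.0 for c in range(len(row))]
            for r, row in enumerate(img)]

identity_generator = lambda x, l: x  # placeholder for G_{(X,L)->Y}

l = [[0.0, 1.0], [0.0, 0.0]]                 # landmark at row 0, col 1
aligned = [[0.1, 0.9], [0.2, 0.3]]           # brightest pixel matches l
misaligned = [[0.1, 0.2], [0.3, 0.9]]        # brightest pixel elsewhere

aligned_loss = landmark_consistency_loss(identity_generator, toy_regressor, aligned, l)
misaligned_loss = landmark_consistency_loss(identity_generator, toy_regressor, misaligned, l)
```

The loss is zero only when the landmarks found on the generated image coincide with the input heatmap.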

3.2.2 Landmark Matched Global Discriminator
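As shown in Figure 2, the translation $X \to Y$ uses two global discriminators. The unconditional global discriminator $D_Y$ pushes the generator toward more realistic cartoon faces, while the conditional global discriminator $D_Y^{gc}$ takes the landmark heatmap $l \in L$ as part of its input and pushes the generator toward cartoon faces that match the landmarks.
$$\mathcal{L}_{GAN}(G_{(X,L) \to Y}, D_Y^{gc}) = \mathbb{E}_{y}[\log D_Y^{gc}(y, l)] + \mathbb{E}_{x}[\log(1 - D_Y^{gc}(G_{(X,L) \to Y}(x, l), l))] \tag{5}$$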
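One common way to give a discriminator the heatmap "as part of input" is to stack it with the image along the channel axis. The notes do not say how the conditioning is implemented, so the sketch below is an assumption; `concat_channels` is a hypothetical helper using a channel-first layout of nested lists.

```python
def concat_channels(image, heatmap):
    """Stack heatmap channels after image channels (channel-first layout).

    image:   list of C_img channels, each an H x W nested list
    heatmap: list of C_lm channels, each an H x W nested list
    """
    height, width = len(image[0]), len(image[0][0])
    for channel in heatmap:
        if len(channel) != height or any(len(row) != width for row in channel):
            raise ValueError("image and heatmap must share the same spatial size")
    return image + heatmap

# A 3-channel 2x2 image and a single-channel landmark heatmap.
rgb = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(3)]
heat = [[[0.0, 1.0], [0.0, 0.0]]]

d_input = concat_channels(rgb, heat)  # what a conditional discriminator would see
```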

3.2.3 Landmark Guided Local Discriminator
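Three local discriminators are introduced for the eye, nose, and mouth regions. Their adversarial loss is defined as
$$\mathcal{L}_{GAN_{local}}^{X \to Y} = \sum_{i=1}^{3} \lambda_{l_i} \cdot \mathcal{L}_{GAN_{patch}}(G_{(X,L) \to Y}, D_Y^{l_i}) = \sum_{i=1}^{3} \lambda_{l_i} \left\{ \mathbb{E}_{y}[\log D_Y^{l_i}(y_p)] + \mathbb{E}_{x}[\log(1 - D_Y^{l_i}([G_{(X,L) \to Y}(x)]_p))] \right\} \tag{6}$$
where $y_p$ and $[G_{(X,L) \to Y}(x)]_p$ denote local patches of the real cartoon image and the generated cartoon image, respectively.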
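The two ingredients of Eq. (6), cropping a patch around a landmark and summing the weighted per-region losses, can be sketched as below. The patch size, the clamping at image borders, and the weight values are assumptions made for illustration; the notes do not specify them.

```python
def crop_patch(image, center, size):
    """Crop a size x size patch centred at (row, col), clamped to the image.

    Assumes size is no larger than either image dimension.
    """
    h, w = len(image), len(image[0])
    half = size // 2
    top = min(max(center[0] - half, 0), h - size)
    left = min(max(center[1] - half, 0), w - size)
    return [row[left:left + size] for row in image[top:top + size]]

def local_gan_loss(patch_losses, weights):
    """Eq. (6): weighted sum of the per-region patch GAN losses."""
    return sum(w * loss for w, loss in zip(weights, patch_losses))

# 6x6 test image whose pixel value encodes its position (row * 10 + col).
image = [[r * 10 + c for c in range(6)] for r in range(6)]

corner_patch = crop_patch(image, (0, 0), 3)   # window is shifted to stay inside
centre_patch = crop_patch(image, (3, 3), 3)

# Hypothetical per-region losses (eyes, nose, mouth) and weights.
total_local = local_gan_loss([-1.0, -2.0, -3.0], [0.5, 0.5, 1.0])
```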
3.3. Network Training
3.3.1 Two Stage Training
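Stage I: remove the local discriminators from the framework and train for 100K iterations to obtain coarse results.
Stage II: use the pre-trained landmark prediction network to predict landmarks on the coarse images, extract local patches at those landmarks, and feed the patches to the local discriminators to obtain more refined results.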
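The two-stage schedule amounts to switching on the local discriminator loss after the first 100K iterations. A minimal sketch, in which the loss names are hypothetical labels rather than identifiers from the paper:

```python
STAGE1_ITERS = 100_000  # length of Stage I, as stated in the notes

def active_losses(iteration, stage1_iters=STAGE1_ITERS):
    """Return the loss terms used at a given training iteration."""
    base = ["gan_global", "gan_global_conditional", "cycle", "landmark_consistency"]
    if iteration < stage1_iters:
        return base              # Stage I: no local discriminators
    return base + ["gan_local"]  # Stage II: add landmark-guided local patches

stage1 = active_losses(0)
stage2 = active_losses(STAGE1_ITERS)
```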