Generative Adversarial Network 3 WGAN/ EBGAN

Background review

JS divergence (Jensen-Shannon Divergence)

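The JS divergence measures how similar two probability distributions are. Its value lies between 0 and 1 when the logarithm is base 2 (between 0 and log 2 with the natural logarithm).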
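However, if P and Q are far apart and do not overlap at all, the KL divergence is not meaningful and the JS divergence is a constant, which means the gradient at that point is 0.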

JS divergence is not suitable

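  • In most cases, $P_G$ and $P_{data}$ do not overlap:
    1. $P_G$ and $P_{data}$ are low-dimensional manifolds in a high-dimensional space, so their overlap is negligible.
    2. Even if $P_G$ and $P_{data}$ do overlap, the samples drawn from them will not overlap unless you sample enough points.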

What is the problem with JS divergence?

The JS divergence is log 2 whenever the two distributions do not overlap, so a generator that is close to $P_{data}$ and one that is far from it get the same objective value.
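
To make the constant-value problem concrete, here is a minimal sketch on discrete histograms. The helper names `kl` and `js` and the six-bin example are illustrative assumptions, not part of the original notes. The natural logarithm is used, so the non-overlapping value is log 2.

```python
import math

def kl(p, q):
    """KL(p || q) for discrete distributions given as equal-length lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js(p, q):
    """Jensen-Shannon divergence with the natural log (range 0 to log 2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# P_data sits in bin 0; one generator sits in bin 1 (near), another in bin 5 (far).
p_data = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
p_g_near = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
p_g_far = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]

js_near = js(p_data, p_g_near)
js_far = js(p_data, p_g_far)
print(js_near, js_far, math.log(2))  # all three values coincide
```

Because both generators score exactly log 2, the objective gives no signal that the "near" generator is better than the "far" one.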

I. Wasserstein GAN (WGAN)

Earth Mover’s Distance


  • There are many possible “moving plans” for turning one distribution into the other.
  • The earth mover’s distance is defined by the “moving plan” with the smallest average distance.
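
As a rough illustration, the earth mover's distance can be computed in closed form for one-dimensional histograms. The sketch below assumes equal-length histograms with the same total mass and unit spacing between bins; the function name `emd_1d` is hypothetical. Under those assumptions the cheapest plan simply carries the surplus from left to right.

```python
def emd_1d(p, q):
    """Earth mover's distance between two histograms on bins 0, 1, 2, ...

    In one dimension with unit bin spacing, the cheapest moving plan costs
    the sum of absolute differences between the two running totals.
    """
    assert len(p) == len(q)
    carried, cost = 0.0, 0.0
    for pi, qi in zip(p, q):
        carried += pi - qi   # earth that still has to be moved past this bin
        cost += abs(carried)
    return cost

p_data = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
p_g_near = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
p_g_far = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]

print(emd_1d(p_data, p_g_near))  # 1.0
print(emd_1d(p_data, p_g_far))   # 5.0
```

Unlike the JS divergence, the value shrinks as the generator's mass moves closer to the data, even when the two histograms never overlap.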


Why earth mover’s distance

Evaluating the Wasserstein distance between $P_G$ and $P_{data}$
The discriminator D must be sufficiently smooth, so that its output cannot run off to positive or negative infinity.
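
For reference, the objective usually written for this step is sketched below. The symbol $V(G, D)$ is notation assumed here; the notes themselves only state the smoothness requirement.

```latex
V(G, D) = \max_{D \in 1\text{-Lipschitz}}
  \left\{ \mathbb{E}_{x \sim P_{data}}[D(x)] - \mathbb{E}_{x \sim P_G}[D(x)] \right\}
```

Without the 1-Lipschitz restriction, the maximum could be made arbitrarily large whenever the two sample sets do not overlap, by pushing D up on real samples and down on generated ones. That is why the smoothness condition is needed.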

Lipschitz Function
$\| f(x_1) - f(x_2) \| \leq K \| x_1 - x_2 \|$

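  • This guarantees that the difference between outputs cannot be too large: it is at most K times the difference between inputs.
  • Setting K = 1 gives a “1-Lipschitz” function.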
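
One way to get a feel for the definition is to estimate K empirically from random pairs of inputs. This is only a sketch: `empirical_lipschitz` is a hypothetical helper, and sampling gives a lower bound on the true constant, not a proof.

```python
import math
import random

def empirical_lipschitz(f, lo, hi, n_pairs=10000, seed=0):
    """Largest observed |f(x1) - f(x2)| / |x1 - x2| over random pairs in [lo, hi].

    This is only a lower bound on the true Lipschitz constant K.
    """
    rng = random.Random(seed)
    best = 0.0
    for _ in range(n_pairs):
        x1, x2 = rng.uniform(lo, hi), rng.uniform(lo, hi)
        if x1 != x2:
            best = max(best, abs(f(x1) - f(x2)) / abs(x1 - x2))
    return best

k_sin = empirical_lipschitz(math.sin, -3.0, 3.0)              # sin is 1-Lipschitz
k_steep = empirical_lipschitz(lambda x: 3.0 * x, -3.0, 3.0)   # K = 3, too steep
print(k_sin, k_steep)
```

The estimate for `sin` stays at or below 1, while the linear function with slope 3 violates the 1-Lipschitz condition.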

How to fulfill this constraint

1. WGAN
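
The notes leave this item empty. The original WGAN approximates the constraint by weight clipping: after every discriminator update, each parameter is clamped to the range [-c, c]. A minimal sketch, assuming the parameters are held in a flat list and using c = 0.01 purely as an illustrative value:

```python
def clip_weights(weights, c=0.01):
    """Clamp every parameter to [-c, c] after a discriminator update."""
    return [max(-c, min(c, w)) for w in weights]

clipped = clip_weights([0.5, -0.003, -2.0])
print(clipped)  # [0.01, -0.003, -0.01]
```

Clipping limits how steep D can become, but it does not make D exactly 1-Lipschitz, which motivates the approaches that follow.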

Improved WGAN (WGAN-GP)

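  • D being 1-Lipschitz is equivalent to the gradient of D(x) with respect to x having norm at most 1 for every x.
  • Compromise: this cannot be guaranteed for all x, so it is enforced only for the x drawn from the penalty region.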

The gradient constraint is applied only to the region between $P_G$ and $P_{data}$, because that region determines how $P_G$ moves toward $P_{data}$.
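
The penalty can be sketched as follows. Points are sampled on the straight line between a real and a generated sample, and the squared deviation of the gradient norm from 1 is averaged, which is the two-sided form used by WGAN-GP. The function names are hypothetical, and a finite-difference gradient stands in for automatic differentiation so that the sketch needs only the standard library.

```python
import math
import random

def numerical_grad(D, x, h=1e-5):
    """Central-difference gradient of a scalar function D at point x (a list)."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((D(xp) - D(xm)) / (2 * h))
    return grad

def gradient_penalty(D, real_batch, fake_batch, rng):
    """Average (||grad D(x_hat)|| - 1)^2 over points between real and fake samples."""
    total = 0.0
    for x_real, x_fake in zip(real_batch, fake_batch):
        eps = rng.random()  # random position on the segment from fake to real
        x_hat = [eps * r + (1 - eps) * f for r, f in zip(x_real, x_fake)]
        norm = math.sqrt(sum(g * g for g in numerical_grad(D, x_hat)))
        total += (norm - 1.0) ** 2
    return total / len(real_batch)

rng = random.Random(0)
real = [[1.0, 2.0], [0.5, -1.0]]
fake = [[0.0, 0.0], [2.0, 2.0]]
steep = lambda x: 3.0 * x[0] + 4.0 * x[1]   # gradient norm 5
smooth = lambda x: 0.6 * x[0] + 0.8 * x[1]  # gradient norm 1
print(gradient_penalty(steep, real, fake, rng))   # close to 16
print(gradient_penalty(smooth, real, fake, rng))  # close to 0
```

In training, this term is multiplied by a weight and subtracted from the discriminator objective, so a discriminator that is too steep between the two distributions is penalized.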

2. Spectral norm

Spectral normalization keeps the gradient norm smaller than 1 everywhere.
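
The core operation is dividing a weight matrix by its largest singular value, which can be estimated by power iteration. The sketch below uses plain nested lists and hypothetical helper names. It assumes the all-ones starting vector is not orthogonal to the top singular direction.

```python
import math

def matvec(W, v):
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in W]

def transpose(W):
    return [list(col) for col in zip(*W)]

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def spectral_norm(W, n_iter=50):
    """Estimate the largest singular value of W by power iteration."""
    Wt = transpose(W)
    v = normalize([1.0] * len(W[0]))
    for _ in range(n_iter):
        u = normalize(matvec(W, v))
        v = normalize(matvec(Wt, u))
    return math.sqrt(sum(x * x for x in matvec(W, v)))

def spectrally_normalize(W, n_iter=50):
    """Rescale W so that its largest singular value is approximately 1."""
    sigma = spectral_norm(W, n_iter)
    return [[wij / sigma for wij in row] for row in W]

W = [[1.0, 2.0], [3.0, 4.0]]
W_sn = spectrally_normalize(W)
print(spectral_norm(W), spectral_norm(W_sn))
```

A linear layer with the rescaled matrix is 1-Lipschitz, and stacking such layers with 1-Lipschitz activations (ReLU, for example) keeps the whole network 1-Lipschitz.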

The algorithm of WGAN
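
The loop structure can be illustrated with a deliberately tiny one-dimensional toy. Everything here is an assumption made for the sketch: the critic is D(x) = w * x, the generator is G(z) = z + theta, the real data are drawn from N(4, 1), and plain gradient steps with weight clipping are used. It is not the full algorithm with neural networks.

```python
import random

def train_toy_wgan(steps=400, n_critic=5, batch=64, clip=1.0,
                   lr_d=0.1, lr_g=0.05, seed=0):
    rng = random.Random(seed)
    w = 0.0      # critic parameter, D(x) = w * x
    theta = 0.0  # generator parameter, G(z) = z + theta
    for _ in range(steps):
        # Several critic updates per generator update.
        for _ in range(n_critic):
            real = [rng.gauss(4.0, 1.0) for _ in range(batch)]
            fake = [rng.gauss(0.0, 1.0) + theta for _ in range(batch)]
            # Gradient w.r.t. w of mean D(real) - mean D(fake); ascend it.
            grad_w = sum(real) / batch - sum(fake) / batch
            w += lr_d * grad_w
            w = max(-clip, min(clip, w))  # weight clipping keeps D smooth
        # Generator descends -mean D(G(z)); its gradient w.r.t. theta is -w.
        theta += lr_g * w
    return theta, w

theta, w = train_toy_wgan()
print(theta, w)  # theta should end up near the real mean of 4
```

The critic is trained for several steps with its output left unbounded (no sigmoid) and kept smooth by clipping, then the generator takes one step to raise the critic's score on generated samples.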


II. Energy-based GAN (EBGAN)


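  • The discriminator can be pre-trained, and positive (real) samples alone are enough for that.
  • The scores of negative samples do not have to be very negative: actually reducing the error is hard in practice, so setting a threshold (margin) is enough.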
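
The threshold idea can be sketched as a margin loss. As background not stated in these notes, the EBGAN discriminator is an autoencoder whose reconstruction error acts as the score (energy). The function below assumes those errors have already been computed, and its name and arguments are hypothetical.

```python
def ebgan_discriminator_loss(recon_err_real, recon_err_fake, margin):
    """Lower the error on real samples; raise it on fakes only up to the margin."""
    real_term = sum(recon_err_real) / len(recon_err_real)
    # A fake sample whose error already exceeds the margin contributes nothing.
    fake_term = sum(max(0.0, margin - e) for e in recon_err_fake) / len(recon_err_fake)
    return real_term + fake_term

loss = ebgan_discriminator_loss([0.1, 0.3], [0.5, 2.0], margin=1.0)
print(loss)  # approximately 0.45
```

In this example the second fake sample already has an error above the margin, so only the first one is pushed further. The discriminator therefore spends its effort on reconstructing real samples well instead of endlessly degrading generated ones.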