Learning Transferable Features with Deep Adaptation Networks

Notes on the classic DAN paper.



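Overall objective:

$$
\min_{\Theta} \; \frac{1}{n_a} \sum_{i=1}^{n_a} J\big(\theta(x_i^a),\, y_i^a\big) \;+\; \lambda \sum_{l=l_1}^{l_2} d_k^2\big(\mathcal{D}_s^l,\, \mathcal{D}_t^l\big)
$$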

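In the formula above, $J$ is the loss over the set of labeled samples, and $d_k^2$ is the MK-MMD distance between the source and target representations at layer $l$.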
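To make the two terms concrete, here is a minimal NumPy sketch of a supervised loss plus a λ-weighted multi-kernel MMD penalty summed over the adapted layers. The function names (`gaussian_kernel`, `mk_mmd2`, `dan_objective`), the bandwidths, the uniform β weights, and the biased quadratic-time estimator are all assumptions chosen for illustration. This is not the estimator or the β-learning procedure from the paper.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth):
    """Kernel matrix with entries exp(-||x_i - y_j||^2 / bandwidth)."""
    sq_dists = (
        np.sum(x ** 2, axis=1)[:, None]
        + np.sum(y ** 2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / bandwidth)


def mk_mmd2(source, target, bandwidths, betas):
    """Biased estimate of squared MMD under the kernel k = sum_u beta_u * k_u."""
    total = 0.0
    for bandwidth, beta in zip(bandwidths, betas):
        k_ss = gaussian_kernel(source, source, bandwidth).mean()
        k_tt = gaussian_kernel(target, target, bandwidth).mean()
        k_st = gaussian_kernel(source, target, bandwidth).mean()
        total += beta * (k_ss + k_tt - 2.0 * k_st)
    return total


def dan_objective(supervised_loss, source_feats, target_feats, lam, bandwidths, betas):
    """supervised_loss + lam * (sum of MK-MMD^2 over the adapted layers).

    source_feats and target_feats are lists with one activation matrix per
    adapted layer, each of shape (n_samples, n_features).
    """
    penalty = sum(
        mk_mmd2(s, t, bandwidths, betas)
        for s, t in zip(source_feats, target_feats)
    )
    return supervised_loss + lam * penalty


# Illustrative usage with synthetic "activations" (hypothetical values).
rng = np.random.default_rng(0)
bandwidths = [0.5, 1.0, 2.0, 4.0]
betas = [0.25, 0.25, 0.25, 0.25]  # uniform weights, assumed rather than learned
src = rng.normal(0.0, 1.0, size=(64, 8))
tgt = rng.normal(1.5, 1.0, size=(64, 8))  # target drawn from a shifted distribution
print(mk_mmd2(src, src, bandwidths, betas))  # identical samples
print(mk_mmd2(src, tgt, bandwidths, betas))  # shifted samples give a larger value
```

The penalty is zero when the two sets of activations coincide and grows as the distributions move apart, which is what pushes the adapted layers toward domain-invariant features.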

The objective optimizes the parameters Θ, which should cover layers 1-8 (layers 1-3 are frozen, layers 4-5 are fine-tuned, and layers 6-8 are learned).
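One possible way to picture the three groups is as per-layer learning-rate multipliers applied during SGD. The sketch below is only an illustration. The layer names assume an AlexNet-style network (conv1-conv5, fc6-fc8), and the multiplier values, `LAYER_POLICY`, `sgd_step`, and `trainable_layers` are hypothetical, not settings taken from the paper.

```python
# Hypothetical learning-rate multipliers for an AlexNet-style network.
# 0.0 means frozen; the 1.0 vs 10.0 split is an assumed reading of
# "fine-tune" (small steps) vs "learn" (larger steps).
LAYER_POLICY = {
    "conv1": 0.0, "conv2": 0.0, "conv3": 0.0,  # frozen
    "conv4": 1.0, "conv5": 1.0,                # fine-tuned
    "fc6": 10.0, "fc7": 10.0, "fc8": 10.0,     # learned
}


def sgd_step(params, grads, base_lr, policy=LAYER_POLICY):
    """Return updated parameters, scaling each layer's step by its multiplier."""
    return {
        name: value - base_lr * policy[name] * grads[name]
        for name, value in params.items()
    }


def trainable_layers(policy=LAYER_POLICY):
    """Names of the layers that receive any update at all."""
    return [name for name, mult in policy.items() if mult > 0.0]


# Illustrative usage: scalar stand-ins for the weight tensors.
params = {name: 1.0 for name in LAYER_POLICY}
grads = {name: 1.0 for name in LAYER_POLICY}
updated = sgd_step(params, grads, base_lr=0.1)
print(trainable_layers())
print(updated)
```

Under this reading, frozen layers keep their pretrained weights, while the other two groups differ only in how far each step moves them.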

Open question: what is the difference between "fine-tune" and "learn"?


Open question: how is the kernel parameter β learned?

Contributions of this paper:

(Reference: "An Interpretation of the DAN Method: Learning Transferable Features with Deep Adaptation Networks")

1. Multi-layer adaptation

2. MK-MMD (multi-kernel maximum mean discrepancy)