[Adversarial Attack Reading Notes] Taylor Expansion! Attack on Trust Region! Trust Region Based Adversarial Attack on Neural Networks

Paper link: https://arxiv.org/abs/1812.06371?context=cs.LG

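1 Core idea:

Compute the ratio of the actual change in label probability before and after the perturbation to the change predicted by the model's Taylor expansion under that perturbation, and use it to judge whether the current constraint size (the trust region radius) is appropriate. If the ratio is large, the region can be trusted and the perturbation magnitude is increased; otherwise the perturbation magnitude is decreased.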
[Image: formula for the ratio]
where the denominator is:
[Image: formula for the denominator]
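
As a sketch of the ratio described above, written in standard trust-region notation (an assumption; the symbols are not taken from the paper's own formulas): f is the attack objective, g and H are its gradient and Hessian at x, Δx is the perturbation, and ε is the trust region radius.

```latex
\rho = \frac{f(x + \Delta x) - f(x)}{m(\Delta x)},
\qquad
m(\Delta x) = g^{\top} \Delta x + \tfrac{1}{2}\, \Delta x^{\top} H\, \Delta x,
\qquad
\|\Delta x\|_p \le \epsilon
```

A ratio near 1 means the Taylor model predicted the real change well, so ε can grow; a ratio near 0 means the model was too optimistic, so ε should shrink.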

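2 Contributions:

The adversarial attack problem is recast as a trust region (TR) optimization problem. The results are better than FGSM and DeepFool, with smaller perturbations. (DeepFool is faster, though.)
The TR-based attack adjusts the perturbation magnitude automatically during the iterations, so no manual tuning is needed.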

3 Algorithm:

[Image: the algorithm]
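
Since the algorithm figure is not available here, the following is only a minimal sketch of a trust-region style attack loop, not the paper's exact procedure. It assumes a toy linear softmax classifier, a first-order Taylor model with an L2 constraint, and illustrative hyperparameters (`eta`, `sigma1`, `sigma2`, `eps_min`, `eps_max` are hypothetical names and values chosen for this example).

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def loss_and_grad(W, b, x, label):
    """Cross-entropy loss of a linear softmax model and its gradient w.r.t. x."""
    p = softmax(W @ x + b)
    onehot = np.zeros_like(p)
    onehot[label] = 1.0
    return -np.log(p[label]), W.T @ (p - onehot)

def tr_attack(W, b, x, label, eps=0.01, eta=2.0, sigma1=0.25, sigma2=0.75,
              eps_min=1e-4, eps_max=1.0, max_iter=200):
    """Toy trust-region attack; all hyperparameter values are illustrative."""
    x_adv = x.copy()
    for _ in range(max_iter):
        if np.argmax(W @ x_adv + b) != label:
            break                                      # label flipped: done
        f0, g = loss_and_grad(W, b, x_adv, label)
        # Maximizer of the first-order model g^T dx subject to ||dx||_2 <= eps.
        step = eps * g / (np.linalg.norm(g) + 1e-12)
        predicted = g @ step                           # model's predicted gain
        actual = loss_and_grad(W, b, x_adv + step, label)[0] - f0
        rho = actual / (predicted + 1e-12)             # trust ratio
        if rho > sigma2:                               # model is reliable: grow radius
            eps = min(eta * eps, eps_max)
        elif rho < sigma1:                             # model is unreliable: shrink radius
            eps = max(eps / eta, eps_min)
        x_adv = x_adv + step
    return x_adv, eps

# Toy usage on random data (not from the paper).
rng = np.random.default_rng(0)
W, b, x = rng.normal(size=(3, 5)), np.zeros(3), rng.normal(size=5)
label = int(np.argmax(W @ x + b))
x_adv, eps = tr_attack(W, b, x, label)
print("original label:", label, "adversarial label:", int(np.argmax(W @ x_adv + b)))
print("L2 perturbation:", np.linalg.norm(x_adv - x), "final radius:", eps)
```

On this toy model the loss is convex in x, so the first-order model never overestimates the gain and the radius only grows. On a real network the ratio can drop below `sigma1`, which is when the shrinking branch comes into play.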

4 Results:

[Image: results]