Adversarial Training Method Enhanced by a DSSIM-Based Non-Norm Constraint (基于DSSIM非范数约束增强的对抗训练方法)

Wang Baoli (王保利)* **; Fan Xinxin (范鑫鑫)**; Jing Quanliang (景全亮)* **; Bi Jingping (毕经平)**

Article Abstract

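Wang Baoli* **, Fan Xinxin**, Jing Quanliang* **, Bi Jingping**. Improving adversarial training with DSSIM based non-norm constraint [J]. 高技术通讯 (Chinese edition), 2023, 33(4): 339-351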

Improving adversarial training with DSSIM based non-norm constraint

DOI: 10.3772/j.issn.1002-0470.2023.04.001

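Chinese keywords: 对抗攻击 (adversarial attack); 对抗防御 (adversarial defense); 对抗训练 (adversarial training, AT); 非范数约束 (non-norm constraint)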

English keywords: adversarial attack, adversarial defense, adversarial training (AT), non-norm constraint

Funding:

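Author	Affiliation
Wang Baoli (王保利)* **	(School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049) (*Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190)
Fan Xinxin (范鑫鑫)**	(School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049) (*Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190)
Jing Quanliang (景全亮)* **	(School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049) (*Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190)
Bi Jingping (毕经平)**	(School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049) (*Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190)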

Chinese abstract (translated):

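To address the robust overfitting problem in current adversarial training (AT), in which a network model's adversarial defense capability declines rather than improves once adversarial training exceeds a certain number of epochs, this paper proposes an adversarial training method enhanced by a non-norm constraint based on structural dissimilarity (DSSIM-AT). The method introduces a non-norm constraint into adversarial example generation during adversarial training and removes non-semantic features from adversarial examples according to the structural dissimilarity between examples, making the generated adversarial examples better suited to adversarial training. The method further designs an asynchronous gradient update mechanism to reduce the time cost of adversarial example generation and model parameter updates. Experimental results show that the method effectively alleviates robust overfitting in adversarial training; compared with existing adversarial training methods, it improves recognition accuracy on clean examples on the CIFAR-10 dataset by about 3% and recognition accuracy on adversarial examples by about 4% to 8%.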

English abstract:

To address the robust overfitting problem in adversarial training (AT), i.e., the adversarial defense performance of the network model declines rather than improves once the number of adversarial training epochs exceeds a certain point, this work proposes a novel adversarial training method that leverages a non-norm constraint based on structural dissimilarity, named DSSIM-AT. For the first time, non-norm constraints are introduced to remove non-semantic features from generated adversarial examples from the perspective of structural dissimilarity, making the examples more suitable for AT. The proposed method further designs an asynchronous gradient update mechanism, which reduces the time cost of adversarial example generation and model parameter updates. The experimental results show that DSSIM-AT can effectively alleviate the robust overfitting problem. Compared with existing baseline methods, DSSIM-AT improves the recognition accuracy on clean examples of the CIFAR-10 dataset by approximately 3%, and improves the recognition accuracy on adversarial examples by 4%-8%.
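The abstract does not spell out how structural dissimilarity is computed, so the following is only a minimal sketch of the quantity the method's name refers to. It assumes the commonly used definition DSSIM = (1 - SSIM) / 2 and a simplified single-window SSIM over the whole image with the usual constants K1 = 0.01 and K2 = 0.03. Standard SSIM averages over local windows, and the paper's exact constraint may differ. The function names `ssim_global` and `dssim` are hypothetical.

```python
def ssim_global(x, y, dynamic_range=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM over two equal-length sequences of pixel intensities.

    Assumption: one global window instead of the usual sliding local windows.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    # Stabilizing constants from the standard SSIM definition.
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def dssim(x, y, **kwargs):
    """Structural dissimilarity, assuming DSSIM = (1 - SSIM) / 2."""
    return (1.0 - ssim_global(x, y, **kwargs)) / 2.0


if __name__ == "__main__":
    clean = [0.0, 0.25, 0.5, 0.75, 1.0]
    perturbed = [0.05, 0.2, 0.55, 0.7, 0.95]
    print("DSSIM(clean, clean)     =", dssim(clean, clean))
    print("DSSIM(clean, perturbed) =", dssim(clean, perturbed))
```

Under these assumptions, identical inputs give a dissimilarity of zero and the value grows as the structure of the two inputs diverges. This is what lets such a measure act as a perceptual bound on adversarial perturbations, unlike a norm bound that treats every pixel change equally.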
