Adversarial training (AT) is one of the most effective defenses to improve the adversarial robustness of deep learning models. In order to promote the reliability of the adversarially trained models, we propose to boost AT via incorporating hypersphere embedding (HE), which can regularize the adversarial features onto compact hypersphere manifolds. We formally demonstrate that AT and HE are well coupled, which tunes up the learning dynamics of AT from several aspects. We comprehensively validate the effectiveness and universality of HE by embedding it into the popular AT frameworks including PGD-AT, ALP, and TRADES, as well as the FreeAT and FastAT strategies. In experiments, we evaluate our methods on the CIFAR-10 and ImageNet datasets, and verify that integrating HE can consistently enhance the performance of the models trained by each AT framework with little extra computation.
Speakers: Tianyu Pang, Xiao Yang, Yinpeng Dong, Kun Xu, Hang Su, Jun Zhu