[KDD 2020] RayS: A Ray Searching Method for Hard-label Adversarial Attack
Aug 13, 20202 views
Deep neural networks are vulnerable to adversarial attacks. Among,different attack settings, the most challenging yet the most practical,one is the hard-label setting where the attacker only has access to,the hard-label output (prediction label) of the target model. Previous attempts are neither effective enough in terms of attack success,rate nor efficient enough in terms of query complexity under the,widely used,𝐿,∞,norm threat model. In this paper, we present the,Ray Searching attack (RayS), which greatly improves the hard-label,attack effectiveness as well as efficiency. Unlike previous works,,we reformulate the continuous problem of finding the closest decision boundary into a discrete problem that does not require any,zeroth-order gradient estimation. In the meantime, all unnecessary,searches are eliminated via a fast check step. This significantly,reduces the number of queries needed for our hard-label attack.,Moreover, interestingly, we found that the proposed RayS attack,can also be used as a sanity check for possible “falsely robust” models. On several recently proposed defenses that claim to achieve the,state-of-the-art robust accuracy, our attack method demonstrates,that the current white-box/black-box attacks could still give a false,sense of security and the robust accuracy drop between the most,popular PGD attack and RayS attack could be as large as,28%,. We,believe that our proposed RayS attack could help identify falsely,robust models that beat most white-box/black-box attacks.