Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples, Anish Athalye, Nicholas Carlini, David Wagner, 2018. Proceedings of the 35th International Conference on Machine Learning (ICML), Vol. 80 (PMLR). DOI: 10.5555/3305381.3305469 - A core paper that systematically identifies and categorizes gradient obfuscation as a common failure mode across many adversarial defenses. It introduces adaptive attack techniques, notably Backward Pass Differentiable Approximation (BPDA) and Expectation over Transformation (EOT), that circumvent such defenses, and it has strongly shaped how adversarial robustness is evaluated.
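As a rough illustration of the BPDA idea described above, here is a minimal sketch: a non-differentiable preprocessing defense is applied on the forward pass but approximated by the identity on the backward pass, so a standard PGD loop can still obtain usable gradients. The hard-quantization defense, the generic `model` classifier, and the PGD hyperparameters (`eps`, `alpha`, `steps`) are illustrative assumptions, not the paper's exact setup.

```python
import torch

class BPDAIdentity(torch.autograd.Function):
    """Forward: apply the (non-differentiable) defense; backward: treat it as the identity."""
    @staticmethod
    def forward(ctx, x):
        # Illustrative non-differentiable defense: hard input quantization.
        return torch.round(x * 255.0) / 255.0

    @staticmethod
    def backward(ctx, grad_output):
        # Identity approximation of the defense's gradient (the BPDA step).
        return grad_output

def pgd_with_bpda(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Untargeted L_inf PGD attacking model(defense(x)) through the BPDA wrapper."""
    x = x.detach()
    x_adv = x.clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        logits = model(BPDAIdentity.apply(x_adv))
        loss = torch.nn.functional.cross_entropy(logits, y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = (x_adv + alpha * grad.sign()).detach()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project back into the eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv
```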
Towards Evaluating the Robustness of Neural Networks, Nicholas Carlini, David A. Wagner, 2017. 2017 IEEE Symposium on Security and Privacy (SP) (IEEE Computer Society). DOI: 10.1109/SP.2017.49 - Introduces a set of strong, optimization-based adversarial attacks (the C&W attacks) designed to overcome gradient masking and other defensive strategies. This work underscored the need for rigorous, adaptive evaluation.
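A minimal sketch of the core idea behind the C&W L2 attack: optimize a perturbation in tanh-space (so pixel bounds hold by construction) to minimize the squared L2 distortion plus a logit-margin hinge loss. The single fixed constant `c`, the small step count, and the untargeted formulation are simplifying assumptions; the paper binary-searches over `c` and runs far more iterations.

```python
import torch

def cw_l2_attack(model, x, y, c=1.0, kappa=0.0, steps=100, lr=0.01):
    x = x.detach()
    # Re-parameterize x' = 0.5 * (tanh(w) + 1), which always lies in [0, 1].
    w = torch.atanh((2 * x - 1).clamp(-0.999999, 0.999999)).detach().requires_grad_(True)
    optimizer = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        x_adv = 0.5 * (torch.tanh(w) + 1)
        logits = model(x_adv)
        true_logit = logits.gather(1, y.unsqueeze(1)).squeeze(1)
        # Largest logit among the wrong classes.
        other_logit = logits.masked_fill(
            torch.nn.functional.one_hot(y, logits.size(1)).bool(), float("-inf")
        ).max(dim=1).values
        # f(x') = max(Z(x')_y - max_{i != y} Z(x')_i, -kappa): push the true class below the rest.
        f = torch.clamp(true_logit - other_logit, min=-kappa)
        loss = ((x_adv - x) ** 2).flatten(1).sum(1) + c * f
        optimizer.zero_grad()
        loss.sum().backward()
        optimizer.step()
    return (0.5 * (torch.tanh(w) + 1)).detach()
```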
Black-box Adversarial Attacks with Limited Queries and Information, Andrew Ilyas, Logan Engstrom, Anish Athalye, Jessy Lin, 2018. Proceedings of the International Conference on Machine Learning (ICML). DOI: 10.48550/arXiv.1804.08598 - Develops query-efficient black-box attacks that estimate gradients from model outputs alone using a natural evolution strategies (NES) estimator, and extends them to partial-information and label-only threat models. Because such attacks need only query access rather than true gradients, they also serve as adaptive attacks against defenses whose gradients are masked or non-differentiable.
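A minimal sketch of query-based gradient estimation in the spirit of this paper's NES estimator: sample antithetic Gaussian perturbations, query the loss at each, and form a finite-difference estimate of the gradient with no backward pass. The `loss_fn` callable, sample count, and `sigma` are illustrative assumptions; the usage lines at the end show one way the estimate might feed a sign-step attack.

```python
import torch

@torch.no_grad()
def nes_gradient_estimate(loss_fn, x, n_samples=50, sigma=0.001):
    """Estimate grad_x loss_fn(x) from 2 * n_samples forward queries only."""
    grad = torch.zeros_like(x)
    for _ in range(n_samples):
        u = torch.randn_like(x)
        # Antithetic pair: +u and -u share one random draw and reduce variance.
        grad += (loss_fn(x + sigma * u) - loss_fn(x - sigma * u)) * u
    return grad / (2 * n_samples * sigma)

# Hypothetical usage: the attacker only observes the outputs of `model`.
# def loss_fn(x):
#     return torch.nn.functional.cross_entropy(model(x), y)
# g = nes_gradient_estimate(loss_fn, x)
# x_adv = (x + alpha * g.sign()).clamp(0.0, 1.0)
```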