Intriguing properties of neural networks, Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus, 2013, arXiv preprint arXiv:1312.6199, DOI: 10.48550/arXiv.1312.6199 - This seminal paper introduced the concept of adversarial examples, demonstrating that small, imperceptible perturbations to an input can cause deep neural networks to misclassify it.
Poisoning Attacks Against Support Vector Machines, Battista Biggio, Blaine Nelson, and Pavel Laskov, 2012, Proceedings of the 29th International Conference on Machine Learning (ICML '12) (JMLR, Inc.) - A foundational work detailing data poisoning attacks, specifically targeting support vector machines, illustrating how training data can be manipulated to degrade model performance.
Membership Inference Attacks Against Machine Learning Models, Reza Shokri, Marco Stronati, Congzheng Song, and Vitaly Shmatikov, 2017, 2017 IEEE Symposium on Security and Privacy (SP) (IEEE), DOI: 10.1109/SP.2017.41 - This paper presented the first practical membership inference attack, showing how an adversary can determine whether a specific data record was part of a model's training set.