Poisoning Attacks against Support Vector Machines, Battista Biggio, Blaine Nelson, and Pavel Laskov, 2012. Proceedings of the 29th International Conference on Machine Learning (ICML), JMLR Workshop and Conference Proceedings (Omnipress) - A seminal work on data poisoning, demonstrating how an attacker can inject malicious training samples to degrade a model's performance (availability attack).
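A minimal sketch of the attack's outer loop on a hypothetical toy 2-D dataset. The paper differentiates the SVM solution exactly through its KKT conditions; here that gradient is approximated by finite differences with full retraining, which only scales to tiny problems:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Hypothetical toy data: half for training, half as the attacker's
# surrogate validation set whose loss the poison point should maximize.
X, y = make_classification(n_samples=200, n_features=2, n_redundant=0, random_state=0)
X_tr, y_tr, X_val, y_val = X[:100], y[:100], X[100:], y[100:]

def val_hinge_loss(x_c, y_c):
    """Retrain with the poison point included, return validation hinge loss."""
    clf = SVC(kernel="linear", C=1.0)
    clf.fit(np.vstack([X_tr, x_c]), np.append(y_tr, y_c))
    margins = (2 * y_val - 1) * clf.decision_function(X_val)  # labels {0,1} -> {-1,+1}
    return np.maximum(0.0, 1.0 - margins).mean()

x_c = X_tr[0].copy()   # initial poison point
y_c = 1 - y_tr[0]      # attacker assigns it the wrong label, as in the paper
eps, lr = 1e-3, 0.5
for _ in range(50):    # gradient ascent on the validation loss
    grad = np.array([(val_hinge_loss(x_c + eps * e, y_c)
                      - val_hinge_loss(x_c - eps * e, y_c)) / (2 * eps)
                     for e in np.eye(2)])
    x_c += lr * grad   # move the poison point to increase validation loss
```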
Understanding Black-box Predictions via Influence Functions, Pang Wei Koh and Percy Liang, 2017. Proceedings of the 34th International Conference on Machine Learning (ICML), Vol. 70 (PMLR). DOI: 10.5555/3305890.3305963 - Introduces influence functions as a method to understand the impact of individual training data points on model predictions, relevant for analyzing poison data effects.
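The core quantity is I_up,loss(z, z_test) = -∇L(z_test, θ)ᵀ H⁻¹ ∇L(z, θ): a large positive value flags a training point whose up-weighting would raise the test loss. A minimal sketch for logistic regression, assuming a small model where the Hessian can be formed explicitly (the paper instead uses Hessian-vector products; the damping term here is a hypothetical stabilizer):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_loss(theta, x, y):
    """Gradient of the log loss at a single point (label y in {0, 1})."""
    return (sigmoid(x @ theta) - y) * x

def hessian(theta, X):
    """Hessian of the average training log loss for logistic regression."""
    p = sigmoid(X @ theta)
    return (X.T * (p * (1 - p))) @ X / len(X)

def influence(theta, X_tr, x_i, y_i, x_test, y_test, damping=1e-3):
    """I_up,loss(z_i, z_test) = -grad L(z_test)^T H^{-1} grad L(z_i)."""
    H = hessian(theta, X_tr) + damping * np.eye(len(theta))
    return -grad_loss(theta, x_test, y_test) @ np.linalg.solve(
        H, grad_loss(theta, x_i, y_i))
```

Ranking training points by this score against a misclassified test input is the paper's recipe for surfacing likely poison samples.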
Network Dissection: Quantifying Interpretability of Deep Visual Representations, David Bau, Bolei Zhou, Aditya Khosla, Aude Oliva, Antonio Torralba, 2017. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (IEEE). DOI: 10.1109/CVPR.2017.693 - Proposes a method for quantitatively evaluating the interpretability of individual neurons in deep neural networks, applicable for analyzing internal representation changes due to backdoors.
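The method's core score is the intersection-over-union between a unit's thresholded activation map and a labeled concept mask. A minimal sketch over hypothetical arrays; the paper additionally upsamples activations to mask resolution and sets the threshold from the top activation quantile over a whole dataset:

```python
import numpy as np

def dissection_iou(act_map, concept_mask, quantile=0.995):
    """IoU of {activation > threshold} against a binary concept mask.

    quantile=0.995 mirrors the paper's choice of a threshold T_k with
    P(a_k > T_k) = 0.005 over the probe dataset.
    """
    thresh = np.quantile(act_map, quantile)
    binary_act = act_map > thresh
    inter = np.logical_and(binary_act, concept_mask).sum()
    union = np.logical_or(binary_act, concept_mask).sum()
    return inter / max(union, 1)

# A unit is reported as a detector for a concept when its IoU exceeds a
# small cutoff (0.04 in the paper); a backdoored model may show units
# aligning with trigger-like concepts instead of natural ones.
```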