Machine learning systems are susceptible to distinct classes of security failure. This chapter introduces the concepts needed to understand these vulnerabilities and the principles behind attacking and defending ML models.
We begin by reviewing common security weaknesses in typical machine learning pipelines. You will learn to define structured threat models in terms of an attacker's objectives, knowledge, and capabilities, and to contrast the attack opportunities available during training with those available at inference.
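To make these three dimensions concrete, the sketch below records one well-known threat model as a small Python data structure. It is illustrative only: the class, field names, and example values are hypothetical, chosen to mirror the dimensions just described rather than any fixed standard.

```python
from dataclasses import dataclass

# Illustrative sketch: field names and values are hypothetical,
# mirroring the three threat-model dimensions discussed above.
@dataclass(frozen=True)
class ThreatModel:
    objective: str   # what the attacker wants to achieve
    knowledge: str   # what the attacker knows about the system
    capability: str  # what the attacker can actually do

# A classic white-box evasion attacker, described in these terms:
whitebox_evasion = ThreatModel(
    objective="cause a chosen input to be misclassified",
    knowledge="white-box: full access to model architecture and weights",
    capability="perturb inputs at inference within a small budget",
)

print(whitebox_evasion)
```

Writing a threat model down this explicitly is useful discipline: every attack and defense discussed in later chapters implicitly fixes a value for each of these three fields.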
The chapter then presents the mathematical basis for adversarial examples, explaining how small, often imperceptible input perturbations, typically constrained under an Lp norm such as L2 or L∞, can cause a model to misclassify. We establish a taxonomy for classifying adversarial attacks and give a high-level overview of the main categories of defense mechanisms developed to counter them. This grounding prepares you for the detailed study of specific attack and defense techniques in subsequent chapters.
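As a preview of the formulation developed in Section 1.4, the untargeted adversarial example problem is often stated as a constrained search. The notation below, a classifier f, an input x with true label y, a loss L, and a perturbation budget ε, is one standard presentation and is assumed here for illustration:

```latex
% One standard statement of the (untargeted) adversarial example problem:
% find a perturbation \delta that changes the prediction while remaining
% small under a chosen L_p norm with budget \epsilon.
\[
\text{find } \delta \ \text{ such that } \ f(x + \delta) \neq y,
\qquad \text{subject to } \ \lVert \delta \rVert_p \le \epsilon .
\]

% In practice this is often relaxed to maximizing the model's loss
% \mathcal{L} within the same perturbation budget:
\[
\max_{\lVert \delta \rVert_p \le \epsilon} \;
\mathcal{L}\big( f(x + \delta),\, y \big).
\]
```

Attacks differ mainly in which norm p they constrain, how they search for δ, and whether the resulting misclassification is attacker-chosen (targeted) or arbitrary (untargeted); Section 1.5 builds its taxonomy around these distinctions.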
1.1 Review of Machine Learning Security Vulnerabilities
1.2 Threat Models in Machine Learning
1.3 Attack Surfaces: Training vs. Inference
1.4 Mathematical Formulation of Adversarial Examples
1.5 Taxonomy of Adversarial Attacks
1.6 Overview of Defense Strategies