Following the introduction to adversarial machine learning concepts, this chapter concentrates on evasion attacks. These attacks are performed during the model's inference phase and aim to cause misclassifications by introducing carefully crafted perturbations to the input data. The objective is typically to find a small perturbation $\delta$ such that an input $x$ is modified to $x_{\text{adv}} = x + \delta$, causing the model $f$ to output an incorrect prediction, $f(x_{\text{adv}}) \neq f(x)$, while satisfying a constraint on the perturbation size, often defined using an $L_p$ norm such as $\|\delta\|_p \leq \epsilon$.
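To make the formulation concrete, here is a minimal PyTorch sketch of a single-step, $L_\infty$-bounded perturbation (the FGSM idea analyzed in Section 2.1). The names `model`, `x`, `y`, and the budget `epsilon` are placeholders for your own classifier, input batch, true labels, and perturbation budget, and inputs are assumed to lie in [0, 1]; treat this as an illustration of the constraint $\|\delta\|_\infty \leq \epsilon$, not a finished attack implementation.

```python
import torch
import torch.nn.functional as F

def fgsm_perturbation(model, x, y, epsilon=0.03):
    """Craft x_adv = x + delta with ||delta||_inf <= epsilon (single-step FGSM sketch)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)      # loss the attacker wants to increase
    loss.backward()
    delta = epsilon * x.grad.sign()          # perturbation aligned with the loss gradient
    x_adv = (x + delta).clamp(0.0, 1.0)      # keep inputs in the assumed valid [0, 1] range
    return x_adv.detach()

# Usage sketch: the evasion succeeds if the predicted label changes.
# x_adv = fgsm_perturbation(model, x, y)
# success = model(x_adv).argmax(dim=1) != model(x).argmax(dim=1)
```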
This chapter examines several advanced methods for generating these adversarial examples. We will analyze the progression from fundamental gradient-based attacks (like FGSM, BIM) to more effective iterative methods such as Projected Gradient Descent (PGD). You will study optimization-based approaches, exemplified by the Carlini & Wagner (C&W) attacks, which often find highly effective, low-distortion perturbations. We will also cover techniques applicable under limited attacker knowledge, including score-based attacks (using model confidence scores) and decision-based attacks (using only the final prediction label). Furthermore, we will investigate the transferability of attacks between different models and specific strategies for attacking ensemble models. The chapter concludes with a practical section where you will implement some of these advanced evasion attack techniques.
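As a preview of the iterative methods examined in Section 2.1, the sketch below outlines a basic $L_\infty$ Projected Gradient Descent loop under the same assumptions as above (hypothetical `model`, `x`, `y`; inputs in [0, 1]); the step size, budget, and iteration count are illustrative defaults rather than prescribed values.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, epsilon=0.03, alpha=0.007, steps=10):
    """Iterative L_inf PGD sketch: repeated signed-gradient steps, each followed by
    projection of the perturbation back into the epsilon-ball around x."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()           # gradient ascent step on the loss
            delta = (x_adv - x).clamp(-epsilon, epsilon)  # project onto the L_inf epsilon-ball
            x_adv = (x + delta).clamp(0.0, 1.0)           # stay in the valid input range
    return x_adv.detach()
```

The projection step keeps every intermediate iterate inside the $\epsilon$-ball around the original input, which is what distinguishes the method as projected gradient descent; common variants also initialize from a random point inside that ball.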
2.1 Gradient-Based Attacks: FGSM, BIM, PGD Analysis
2.2 Optimization-Based Attacks: Carlini & Wagner Methods
2.3 Score-Based Attack Techniques
2.4 Decision-Based Attack Techniques
2.5 Transferability of Adversarial Examples
2.6 Attacking Ensemble Models
2.7 Implementing Evasion Attacks: Hands-on Practical