While gradient-based and optimization-based attacks leverage detailed model information (gradients), and score-based attacks use output probabilities, there exists an even more restrictive scenario: what if the attacker only receives the final prediction label (e.g., "cat" or "dog") from the model? This is known as the hard-label black-box setting. Decision-based attacks are designed specifically for this challenging situation, making them relevant for attacking systems where output scores are withheld or quantized.
The fundamental difficulty is finding a direction in which to modify the input x so that it becomes adversarial, with no gradient or score information to guide the search. Decision-based methods therefore perform a structured search, often issuing many queries to the model to infer where the decision boundary lies.
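A basic primitive in this setting is locating the decision boundary by bisection: starting from a clean input and any misclassified one, repeatedly query the midpoint and keep the half of the segment that straddles the boundary. The sketch below assumes a hypothetical `model_predict` callable that returns only the predicted label; all names are illustrative, not part of any specific library.

```python
import numpy as np

def is_adversarial(model_predict, x, original_label):
    """Query the hard-label model and check whether the label changed."""
    return model_predict(x) != original_label

def boundary_binary_search(model_predict, x_orig, x_adv, original_label, tol=1e-3):
    """Bisect the segment between a clean input and an adversarial one to
    find a point near the decision boundary, using only label queries."""
    low, high = 0.0, 1.0  # interpolation weights: 0 -> x_orig, 1 -> x_adv
    while high - low > tol:
        mid = (low + high) / 2
        x_mid = (1 - mid) * x_orig + mid * x_adv
        if is_adversarial(model_predict, x_mid, original_label):
            high = mid  # still adversarial: move closer to the original
        else:
            low = mid   # back on the original side: move toward x_adv
    # Return the closest point found that is still adversarial
    return (1 - high) * x_orig + high * x_adv
```

Each loop iteration costs exactly one model query and halves the search interval, so reaching tolerance `tol` takes about log2(1/tol) queries.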
One of the most prominent decision-based techniques is the Boundary Attack. It operates based on a simple yet effective intuition:
The core idea is to "walk" along the decision boundary between the original class and the target adversarial class. At each step t, the algorithm attempts to find a new point x_adv^(t+1) by taking a small step from x_adv^(t). This step is carefully chosen:

- A random perturbation is drawn orthogonal to the direction pointing back toward the original input x, so the point explores along the boundary rather than straight through it.
- The candidate is then contracted slightly toward x, shrinking the perturbation.
- The candidate is kept only if the model still assigns it the adversarial label; otherwise the step is rejected and the current point is retained.
This iterative process gradually reduces the perturbation magnitude ‖x_adv^(t) − x‖_p, finding adversarial examples that are progressively closer to the original input.
2D illustration of the Boundary Attack. Starting from an initial adversarial point (red), the attack iteratively moves along the decision boundary (dashed line) towards the original input (blue), maintaining the adversarial classification while reducing the perturbation distance.
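The walk described above can be sketched as a single accept/reject step. This is a simplified illustration, not a faithful implementation of the published algorithm (which also adapts the step sizes over time); `model_predict` and all parameter names are assumptions for the example.

```python
import numpy as np

def boundary_attack_step(model_predict, x_orig, x_adv, original_label,
                         orth_step=0.1, toward_step=0.05, rng=None):
    """One simplified Boundary-Attack-style step.

    1. Sample a random perturbation orthogonal to the direction toward x_orig.
    2. Contract the candidate slightly toward the original input.
    3. Accept only if the candidate is still misclassified.
    """
    rng = np.random.default_rng() if rng is None else rng
    diff = x_orig - x_adv
    dist = np.linalg.norm(diff)

    # Random direction with the radial component (toward x_orig) projected out
    eta = rng.standard_normal(x_adv.shape)
    eta -= (eta @ diff) / (dist ** 2) * diff
    eta *= orth_step * dist / np.linalg.norm(eta)  # rescale relative to distance

    # Orthogonal exploration step, then a small contraction toward x_orig
    candidate = x_adv + eta
    candidate = candidate + toward_step * (x_orig - candidate)

    if model_predict(candidate) != original_label:
        return candidate  # accept: still adversarial, and closer to x_orig
    return x_adv          # reject: keep the current adversarial point
```

Because every accepted candidate lies strictly closer to the original input, iterating this step monotonically shrinks the perturbation while never leaving the adversarial region.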
A significant characteristic of decision-based attacks is their query complexity. Because they lack direct gradient or score information, they must query the model many times (often thousands or even millions) to make progress. Each query involves sending a candidate input to the model and observing the output label. This makes them potentially slow and expensive, especially if the model has rate limits or query costs.
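When evaluating or defending against such attacks, it is useful to count queries explicitly. A minimal wrapper (the class name and interface are illustrative) might look like:

```python
class QueryCounter:
    """Wrap a hard-label model to track how many queries an attack issues."""

    def __init__(self, model_predict):
        self._predict = model_predict
        self.count = 0  # total number of label queries observed

    def __call__(self, x):
        self.count += 1
        return self._predict(x)
```

An attack can then be run against `QueryCounter(model)` instead of the raw model, and `counter.count` reports its query budget; a defender could apply the same idea to rate-limit or flag suspiciously query-heavy clients.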
While the Boundary Attack is foundational, other decision-based methods exist, such as:

- HopSkipJumpAttack, which estimates the gradient direction at the boundary from the outcomes of label queries.
- Opt-Attack and Sign-OPT, which reformulate the hard-label problem as a continuous optimization over search directions.
- GeoDA, which exploits the local geometry of the decision boundary to reduce the number of queries.
These methods generally attempt to improve the search strategy to reduce the number of required queries.
Advantages:

- They require only the final predicted label, the weakest feedback a classifier can give, so they apply to nearly any deployed model, including commercial prediction APIs.
- They are unaffected by gradient masking, since they never rely on gradients or output scores.
Disadvantages:

- Very high query complexity: thousands to millions of model queries are typically needed, which is slow, costly, and easier for a defender to detect through rate monitoring.
- For a fixed query budget, the resulting perturbations are usually larger than those found by white-box or score-based attacks.
Decision-based attacks represent the extreme end of the spectrum in evasion attacks concerning attacker knowledge. Understanding these methods is important for evaluating security in scenarios where model internals and outputs are heavily guarded. They demonstrate that even with minimal feedback, vulnerabilities can still be exploited, albeit often at a higher computational cost.
© 2025 ApX Machine Learning