Having examined attacks that manipulate model inputs (evasion) or the training process (poisoning), we now turn to methods that extract information from or about a trained model. This chapter addresses attacks on the confidentiality of the model itself and of the data used to train it, attacks that often require only standard query access.
You will study several inference techniques:

- Membership inference: determining whether a specific record was part of a model's training set (a minimal sketch follows this list).
- Attribute inference: deducing sensitive attributes of a record from a model's outputs.
- Model inversion and reconstruction: recovering representative inputs, or approximations of training data, from a trained model.
- Model stealing: extracting a functional copy of a model through its query interface.
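As a preview of the hands-on work in Section 4.6, the sketch below shows the simplest membership inference heuristic: flagging inputs the model classifies with unusually high confidence as likely training members. This is a minimal illustration under stated assumptions, not the chapter's full method; the synthetic victim model, the helper name `confidence_threshold_attack`, and the 0.9 threshold are all placeholders, and a real attack would calibrate the threshold, for example with shadow models.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def confidence_threshold_attack(model, samples, threshold=0.9):
    """Guess membership: flag samples the model classifies with
    confidence at or above `threshold` as likely training members."""
    top_confidence = model.predict_proba(samples).max(axis=1)
    return top_confidence >= threshold

# Synthetic stand-in for a victim model trained on private data.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
victim = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Training points (members) should trip the threshold more often
# than held-out points (non-members) when the model overfits.
member_rate = confidence_threshold_attack(victim, X_train).mean()
nonmember_rate = confidence_threshold_attack(victim, X_test).mean()
print(f"Flagged as members: train={member_rate:.2f}, test={nonmember_rate:.2f}")
```

The gap between the two flag rates is itself a rough measure of leakage: the more the model's confidence separates members from non-members, the more it reveals about its training set.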
These attacks bear directly on data privacy. Understanding them is necessary for evaluating how much information a deployed model can leak. We will also examine how they connect to formal privacy guarantees such as Differential Privacy (defined below). By the end of this chapter, you will understand the principles behind these inference methods and their security implications.
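Since Section 4.5 builds on it, it helps to have the standard definition in view. A randomized mechanism $\mathcal{M}$ is $(\varepsilon, \delta)$-differentially private if, for every pair of datasets $D$ and $D'$ differing in a single record and every set of outputs $S$:

```latex
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[\mathcal{M}(D') \in S] + \delta
```

A small $\varepsilon$ bounds how much any single record can shift the mechanism's output distribution, which in turn limits the advantage of the membership inference attacks studied in Section 4.1.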
4.1 Membership Inference Attacks: Theory and Methods
4.2 Attribute Inference Techniques
4.3 Model Inversion and Reconstruction Attacks
4.4 Model Stealing: Functionality Extraction Methods
4.5 Relationship to Differential Privacy
4.6 Implementing Membership Inference: Hands-on Practical