As highlighted in the chapter introduction, while federated learning avoids direct sharing of raw data, the model updates themselves can still leak sensitive information about the client's underlying dataset. An adversary observing these updates might infer properties about the data or even reconstruct parts of it. To provide mathematically rigorous privacy guarantees against such threats, we turn to Differential Privacy (DP).
Differential Privacy offers a strong standard for privacy protection. At its core, it ensures that the outcome of an analysis (like computing an aggregate model update in FL) is statistically similar whether any particular individual's data is included in the dataset or not. This prevents an adversary from confidently inferring the presence or specific details of any single participant's data by observing the output.
Formally, a randomized algorithm M satisfies (ϵ,δ)-differential privacy if, for any two adjacent datasets D1 and D2 that differ by only one individual's data, and for any possible set of outputs S, the following inequality holds:
$$
P[M(D_1) \in S] \;\le\; e^{\epsilon}\, P[M(D_2) \in S] + \delta
$$

Let's break down the components:

- M is the randomized algorithm (or mechanism) whose output an adversary can observe.
- D1 and D2 are adjacent datasets: identical except for one individual's data.
- S is any set of possible outputs of M.
- ϵ (epsilon) is the privacy budget. Smaller values force the two output distributions to be closer together, giving a stronger guarantee.
- δ (delta) is a small probability with which the ϵ bound is allowed to fail. It is typically set far below the inverse of the dataset size.
The intuition is that if ϵ and δ are small, an observer seeing the output M(D) cannot confidently determine whether any specific individual's data was used in the computation.
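To see what the bound means numerically, consider the pure-DP case with δ = 0, where the definition bounds the ratio of output probabilities directly:

$$
\frac{P[M(D_1) \in S]}{P[M(D_2) \in S]} \;\le\; e^{\epsilon}
$$

With ϵ = 0.1 this factor is at most e^{0.1} ≈ 1.11, so no observable event can become more than about 11% more likely when one individual's data is included; with ϵ = 1 the factor grows to about 2.72.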
The most common way to make an algorithm differentially private is to inject carefully calibrated random noise into its output. The amount of noise required depends on how much the output could change if one individual's data were different. This maximum possible change is called the sensitivity of the function.
Consider a function f that maps datasets to real-valued vectors (like computing an average gradient). Its sensitivity is the maximum change in its output over any pair of adjacent datasets:

$$
\Delta f = \max_{D_1, D_2 \ \text{adjacent}} \lVert f(D_1) - f(D_2) \rVert
$$

measured in the L1 norm for the Laplace mechanism and in the L2 norm for the Gaussian mechanism described below.
To ensure the sensitivity is bounded, a common prerequisite in FL is gradient clipping. Before noise is added, each client's update vector ui is scaled down if its L2 norm exceeds a predefined threshold C:

$$
u_i \leftarrow \frac{u_i}{\max\!\left(1,\ \lVert u_i \rVert_2 / C\right)}
$$

This bounds the influence of any single client on the aggregate: if adjacency is defined by adding or removing one client's entire clipped update, the L2 sensitivity of the sum is at most C.
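As a concrete illustration, here is a minimal NumPy sketch of this clipping rule; the array values and the threshold C = 1.0 are arbitrary illustrative choices, not tied to any particular FL framework.

```python
import numpy as np

def clip_update(update: np.ndarray, clip_norm: float) -> np.ndarray:
    """Scale an update so its L2 norm is at most clip_norm (never enlarge it)."""
    return update / max(1.0, np.linalg.norm(update) / clip_norm)

C = 1.0  # illustrative clipping threshold
client_updates = [
    np.array([3.0, 4.0]),  # norm 5.0 -> rescaled to norm 1.0
    np.array([0.3, 0.4]),  # norm 0.5 -> left unchanged
]
for u in client_updates:
    print(np.linalg.norm(clip_update(u, C)))  # prints 1.0, then 0.5
```

Because every clipped update has L2 norm at most C, adding or removing one client's contribution changes the sum by at most C in L2 norm, which is exactly the sensitivity value the noise is calibrated to.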
Two primary noise-adding mechanisms are widely used:

- The Laplace mechanism adds noise drawn from a Laplace distribution with scale Δf/ϵ, calibrated to the L1 sensitivity, and provides pure (ϵ, 0)-differential privacy.
- The Gaussian mechanism adds noise drawn from a normal distribution whose standard deviation is proportional to the L2 sensitivity and grows as ϵ and δ shrink. It provides (ϵ, δ)-differential privacy and is the usual choice for high-dimensional model updates in FL.
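The sketch below applies the Gaussian mechanism with the classic calibration σ = Δ₂ · √(2 ln(1.25/δ)) / ϵ for a single release (this closed form assumes ϵ ≤ 1); the function name and arguments are illustrative rather than taken from any particular DP library, and tighter accountants are normally used over many releases.

```python
import numpy as np

def gaussian_mechanism(value: np.ndarray, l2_sensitivity: float,
                       epsilon: float, delta: float,
                       rng: np.random.Generator) -> np.ndarray:
    """Release `value` with (epsilon, delta)-DP via calibrated Gaussian noise."""
    # Classic calibration: sigma >= sensitivity * sqrt(2 * ln(1.25/delta)) / epsilon
    sigma = l2_sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return value + rng.normal(loc=0.0, scale=sigma, size=value.shape)

# Illustrative use: privatize a clipped sum whose L2 sensitivity is C = 1.0
rng = np.random.default_rng(0)
clipped_sum = np.array([0.8, -0.2, 1.3])
print(gaussian_mechanism(clipped_sum, l2_sensitivity=1.0,
                         epsilon=1.0, delta=1e-5, rng=rng))
```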
Differential privacy can be applied in FL in two main ways, differing in where the noise is added:
Central Differential Privacy (CDP): Clients send their clipped updates to the server, which is trusted to handle them honestly. The server aggregates the updates and adds noise once, to the aggregate, before it is used or released. Because noise is injected only a single time, CDP typically preserves model utility well, but the guarantee protects only against observers of the aggregated output, not against the server itself.
Local Differential Privacy (LDP): Each client perturbs its own update before it ever leaves the device, so the server never observes an unprotected update and does not need to be trusted. The cost is that noise from every client accumulates in the aggregate, which usually degrades model accuracy more than CDP at the same privacy level.
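To make the difference in noise placement concrete, here is a simplified, self-contained sketch of one aggregation round under each model; the function names, toy updates, and parameter values are illustrative assumptions, not part of any FL framework.

```python
import numpy as np

def clip_update(u, C):
    """Scale u so that ||u||_2 <= C (same clipping rule as above)."""
    return u / max(1.0, np.linalg.norm(u) / C)

def gaussian_noise(shape, C, epsilon, delta, rng):
    """Gaussian noise calibrated to L2 sensitivity C for one release."""
    sigma = C * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return rng.normal(0.0, sigma, size=shape)

def central_dp_round(updates, C, epsilon, delta, rng):
    """CDP: a trusted server clips, sums, and adds noise once to the aggregate."""
    total = np.sum([clip_update(u, C) for u in updates], axis=0)
    return (total + gaussian_noise(total.shape, C, epsilon, delta, rng)) / len(updates)

def local_dp_round(updates, C, epsilon, delta, rng):
    """LDP: each client noises its own clipped update before sending it."""
    noisy = [clip_update(u, C) + gaussian_noise(u.shape, C, epsilon, delta, rng)
             for u in updates]
    return np.mean(noisy, axis=0)

rng = np.random.default_rng(0)
updates = [rng.normal(size=4) for _ in range(10)]  # toy client updates
print(central_dp_round(updates, C=1.0, epsilon=1.0, delta=1e-5, rng=rng))
print(local_dp_round(updates, C=1.0, epsilon=1.0, delta=1e-5, rng=rng))
```

With the same per-mechanism parameters, averaging the independently noised client updates leaves roughly n times the noise variance of the single centrally added noise term, which is the utility gap behind the trade-off discussed next.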
The choice between Central and Local DP involves a fundamental trade-off between the strength of the privacy guarantee (especially regarding trust in the server) and the impact on model performance.
Comparison of Central DP and Local DP workflows in Federated Learning. In CDP, noise is added centrally after aggregation. In LDP, noise is added locally by each client before sending updates.
Understanding these DP mechanisms and application models is foundational for building FL systems that offer quantifiable privacy protections alongside collaborative model training. The next sections will delve into applying these concepts, specifically adding noise to gradient updates and managing the privacy budget over multiple training rounds.