Understanding the potential vulnerabilities of a federated learning system is essential before designing robust defenses. A threat model defines the capabilities and goals of potential adversaries, helping us analyze risks and evaluate countermeasures. The distributed nature of FL introduces attack surfaces that do not exist in traditional centralized machine learning.
We typically categorize threats based on the adversary's position within the system and their objectives. Adversaries might be malicious clients participating in the training, the central server coordinating the process, or even external eavesdroppers intercepting communication. Their goals can range from degrading the global model's performance to extracting sensitive information about participants' private data.
Let's examine the primary locations from which an adversary might operate:
The FL protocol assumes that participating clients follow it faithfully, but some clients may be malicious or compromised. These "insider" adversaries control their own local data and computation: they can manipulate the data they train on, craft arbitrary model updates to send to the server, and observe each version of the global model they receive.
A common assumption is that the adversary controls a fraction of the total clients, often denoted by f. The adversary might coordinate these malicious clients (collusion) or act independently.
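To make this assumption concrete, the short sketch below simulates which clients fall under adversarial control. It is an illustration only: the client count, the value of f, the fixed seed, and the helper name `is_malicious` are arbitrary choices, not part of any particular FL framework.

```python
import numpy as np

# Sketch: simulate an adversary controlling a fraction f of the clients.
# All numbers here are arbitrary choices for illustration.
rng = np.random.default_rng(seed=0)

num_clients = 100
f = 0.2  # fraction of clients under adversarial control

num_malicious = int(f * num_clients)
malicious_ids = set(rng.choice(num_clients, size=num_malicious, replace=False))

def is_malicious(client_id: int) -> bool:
    """True if this client is under adversarial control in the simulation."""
    return client_id in malicious_ids

print(f"Adversary controls {num_malicious} of {num_clients} clients")
```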
The central server, while not having direct access to raw client data, orchestrates the FL process, so a compromised or malicious server possesses significant power: it observes every individual client update, chooses which clients participate in each round, controls how updates are aggregated, and decides which global model is sent back to clients.
A common model here is the "honest-but-curious" server. This server follows the FL protocol correctly but attempts to infer information from the legitimate updates it receives. A fully malicious server might actively tamper with the process.
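A minimal sketch of this distinction, assuming a plain FedAvg-style server written in Python with NumPy: the `CuriousServer` class below follows the protocol correctly (it returns the exact average of the updates) while quietly keeping a copy of every individual update it receives. The class and attribute names are hypothetical.

```python
import numpy as np

class CuriousServer:
    """Follows the aggregation protocol correctly, but records what it sees."""

    def __init__(self):
        self.observed = []  # per-round copies of individual client updates

    def aggregate(self, client_updates: dict[int, np.ndarray]) -> np.ndarray:
        # "Curious" behaviour: retain each client's update for later inference.
        self.observed.append(dict(client_updates))
        # "Honest" behaviour: return the plain average, exactly as the protocol expects.
        return np.mean(list(client_updates.values()), axis=0)
```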
An attacker outside the core FL system (clients and server) might attempt to intercept communication between clients and the server.
Modern FL systems often assume secure communication channels (e.g., TLS/SSL), making passive eavesdropping on raw updates less of a primary concern compared to threats from malicious participants or the server itself. However, metadata leakage (e.g., timing, frequency of updates) might still be possible.
Adversaries can exist as malicious clients, a compromised server, or external eavesdroppers, each targeting different parts of the FL process.
Based on their capabilities and position, adversaries pursue different goals:
Primarily orchestrated by malicious clients, poisoning attacks aim to corrupt the training process or the final global model, either by manipulating the local training data (data poisoning) or by directly manipulating the submitted updates (model poisoning), as in the sketch below.
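For instance, a single malicious client can mount a simple model-poisoning attack by altering the update it sends. The sketch below (NumPy; the sign flip and scaling factor are just one illustrative strategy) turns an honest local update into a boosted, inverted one:

```python
import numpy as np

def poison_update(honest_update: np.ndarray, boost: float = 10.0) -> np.ndarray:
    """Illustrative untargeted model poisoning: flip and scale the honest update."""
    # Sending the negated, boosted update pushes a plain average away from
    # the direction the honest clients are collectively moving in.
    return -boost * honest_update
```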
Detecting and mitigating poisoning attacks is challenging, especially in heterogeneous (Non-IID) settings where deviating updates might resemble legitimate updates from clients with unusual data distributions. Robust aggregation rules, discussed in Chapter 2, are a primary defense mechanism.
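The toy comparison below shows why such rules help: with three honest updates and one boosted, sign-flipped update, plain averaging is dragged far off course, while the coordinate-wise median (one example of a robust rule) stays close to the honest consensus. The specific numbers are arbitrary.

```python
import numpy as np

# Three honest client updates and one poisoned update (as in the sketch above).
honest = [np.array([0.10, -0.20]), np.array([0.12, -0.18]), np.array([0.09, -0.22])]
poisoned = -10.0 * honest[0]
updates = np.stack(honest + [poisoned])

fedavg = updates.mean(axis=0)        # pulled far from the honest updates by one outlier
robust = np.median(updates, axis=0)  # coordinate-wise median ignores the extreme value

print("Plain average:         ", fedavg)
print("Coordinate-wise median:", robust)
```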
These attacks aim to extract sensitive information about clients' private data and are often perpetrated by a curious server, or potentially by other clients or eavesdroppers if updates are not properly secured. Since raw data ideally never leaves the client device, attackers instead try to infer information from the shared model updates (gradients or weights).
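To see why updates leak information at all, consider the simplest possible case: a linear model trained on a single example with squared-error loss. The gradient with respect to the weights is an exact scalar multiple of the private input, so anyone observing that gradient recovers the input's direction. The sketch below (NumPy, arbitrary values) verifies this; realistic gradient-inversion attacks generalize the same idea to deep networks and mini-batches.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

x = rng.normal(size=5)   # a client's private input
y = 1.0                  # its private label
w = rng.normal(size=5)   # current global model weights

error = w @ x - y        # prediction residual
grad_w = error * x       # gradient of 0.5 * (w @ x - y)**2 with respect to w

# The gradient is proportional to x, so its direction reveals the private input.
print(np.allclose(np.abs(grad_w / np.linalg.norm(grad_w)),
                  np.abs(x / np.linalg.norm(x))))  # True
```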
These attacks highlight that simply not sharing raw data is insufficient for privacy. The model updates themselves carry information that can be exploited. Privacy-enhancing techniques like Differential Privacy (DP) and Secure Multi-Party Computation (SMC), covered in detail in Chapter 3, are designed to formally limit such information leakage.
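As a preview of the DP mechanics covered in Chapter 3, the sketch below applies a clip-and-noise step to a single client update before it leaves the device (central DP would instead add noise at the server after aggregation). The clipping norm and noise multiplier are illustrative placeholders; a real deployment calibrates them to a target (epsilon, delta) privacy budget.

```python
import numpy as np

def privatize_update(update: np.ndarray,
                     clip_norm: float = 1.0,
                     noise_multiplier: float = 1.1,
                     rng: np.random.Generator | None = None) -> np.ndarray:
    """Clip an update's L2 norm, then add Gaussian noise scaled to that bound."""
    rng = rng or np.random.default_rng()
    # 1. Bound each client's influence on the aggregate.
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    # 2. Add noise proportional to the clipping bound.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise
```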
Understanding these threat models is fundamental. When designing or analyzing advanced FL techniques (like new aggregation rules, privacy mechanisms, or communication strategies), we must constantly evaluate their resilience against these potential attacks. The assumptions made about the adversary (e.g., their computational power, knowledge of the system, fraction of controlled clients) significantly influence the effectiveness and applicability of different defense mechanisms.