Having reviewed the fundamental workflow and challenges inherent in federated learning, particularly data and systems heterogeneity, we now establish a more rigorous mathematical framework. Defining the optimization problem precisely is essential for understanding, analyzing, and developing advanced federated algorithms.
At its core, federated learning aims to train a single global model using data distributed across numerous clients without centralizing that data. The objective is typically formulated as minimizing a global loss function, which represents an aggregation of individual loss functions computed locally on each client's data.
The standard goal in federated optimization is to find model parameters $w$ that minimize a global objective function $F(w)$. This function is usually defined as a weighted average of the local objective functions $F_k(w)$ from each client $k$:
$$F(w) = \sum_{k=1}^{N} p_k F_k(w)$$

Let's break down the components of this equation:

- $N$ is the total number of participating clients.
- $p_k \ge 0$ is the aggregation weight of client $k$, with $\sum_{k=1}^{N} p_k = 1$. A common choice is $p_k = n_k / n$, where $n_k$ is the number of samples held by client $k$ and $n = \sum_k n_k$ is the total number of samples across all clients.
- $F_k(w)$ is the local objective function of client $k$, defined next.
The local objective function $F_k(w)$ is typically the average loss of the model with parameters $w$ over client $k$'s local data $D_k$. For a supervised learning task with data points $(x_j, y_j)$, where $x_j$ is the input feature vector and $y_j$ is the target label, $F_k(w)$ can be expressed as:
$$F_k(w) = \frac{1}{n_k} \sum_{j \in D_k} \ell(w; x_j, y_j)$$

Here, $\ell(w; x_j, y_j)$ is the loss function chosen for the specific task, such as cross-entropy loss for classification or mean squared error for regression. It measures the prediction error for a single data point.
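To make these definitions concrete, here is a minimal NumPy sketch that evaluates a local objective $F_k(w)$ and the weighted global objective $F(w)$ with $p_k = n_k / n$. The linear model, squared-error loss, and synthetic client datasets are illustrative assumptions, not part of the formulation above.

```python
import numpy as np

def example_loss(w, x, y):
    # Hypothetical per-example loss: squared error of a simple linear model.
    return 0.5 * (np.dot(w, x) - y) ** 2

def local_objective(w, client_data):
    # F_k(w): average loss over client k's local dataset D_k.
    return np.mean([example_loss(w, x, y) for x, y in client_data])

def global_objective(w, all_client_data):
    # F(w) = sum_k p_k F_k(w), with p_k = n_k / n (weights proportional to data size).
    n_total = sum(len(d) for d in all_client_data)
    return sum(
        (len(d) / n_total) * local_objective(w, d)
        for d in all_client_data
    )

# Synthetic example: two clients holding different amounts of data.
rng = np.random.default_rng(0)
clients = [
    [(rng.normal(size=3), rng.normal()) for _ in range(5)],   # client 1: 5 samples
    [(rng.normal(size=3), rng.normal()) for _ in range(20)],  # client 2: 20 samples
]
w = np.zeros(3)
print(global_objective(w, clients))
```

Note how the weighting makes the larger client contribute more to $F(w)$, which mirrors the common choice of weighting clients by their local sample counts.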
A critical aspect, revisited from the challenges section, is that the data distributions across clients ($D_k$) are often not independent and identically distributed (Non-IID). This statistical heterogeneity means that the local objective functions $F_k(w)$ can differ significantly from one another. The parameters that are optimal for one client's data might perform poorly on another's.
The ultimate goal of the federated optimization process is to find the set of global parameters $w^*$ that minimizes the global objective function $F(w)$:
$$w^* = \arg\min_{w} F(w) = \arg\min_{w} \sum_{k=1}^{N} p_k F_k(w)$$

Solving this minimization problem presents unique difficulties compared to traditional centralized machine learning:

- Statistical heterogeneity: because the $F_k(w)$ are built from Non-IID data, their individual minimizers can pull the global model in conflicting directions.
- Systems heterogeneity: clients differ in compute, memory, and connectivity, so not every client can participate in, or complete, every round.
- Communication constraints: raw data never leaves the clients, so optimization must proceed through relatively few, bandwidth-limited exchanges of model updates.
Federated optimization algorithms, like the widely used Federated Averaging (FedAvg), are specifically designed to find an approximate solution to this problem under these constraints. They typically involve rounds of local computation on clients (e.g., performing multiple steps of stochastic gradient descent on the local objective $F_k(w)$) followed by aggregation of updates (e.g., model parameters or gradients) at the server to update the global model $w$.
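As a concrete illustration of this local-computation-plus-aggregation pattern, the sketch below runs weighted-averaging rounds in the style of FedAvg, reusing the linear model, squared-error loss, and synthetic clients from the previous snippet. The learning rate, number of local epochs, and full client participation are illustrative assumptions rather than recommended settings.

```python
import numpy as np

def local_gradient(w, x, y):
    # Gradient of the illustrative squared-error loss for a linear model.
    return (np.dot(w, x) - y) * x

def local_update(w_global, client_data, epochs=1, lr=0.1):
    # Client side: several SGD steps on the local objective F_k(w),
    # starting from the current global model.
    w = w_global.copy()
    for _ in range(epochs):
        for x, y in client_data:
            w -= lr * local_gradient(w, x, y)
    return w

def fedavg_round(w_global, all_client_data):
    # Server side: aggregate client models, weighted by local dataset size
    # (p_k = n_k / n), to form the new global model.
    n_total = sum(len(d) for d in all_client_data)
    updates = [local_update(w_global, d) for d in all_client_data]
    return sum((len(d) / n_total) * w_k
               for d, w_k in zip(all_client_data, updates))

# Synthetic clients as in the previous sketch, then a few communication rounds.
rng = np.random.default_rng(0)
clients = [
    [(rng.normal(size=3), rng.normal()) for _ in range(5)],
    [(rng.normal(size=3), rng.normal()) for _ in range(20)],
]
w = np.zeros(3)
for round_idx in range(10):
    w = fedavg_round(w, clients)
```

In practice the server usually samples only a subset of clients per round and clients may run many local steps; this sketch keeps all clients and a single epoch purely to keep the structure of the algorithm visible.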
This mathematical formulation provides a clear objective for federated learning. Understanding this objective is the first step towards appreciating the design and analysis of the advanced algorithms discussed in subsequent chapters, which aim to solve this problem more efficiently, robustly, and privately.