Having explored the theoretical underpinnings of constrained optimization, including Lagrangian duality and KKT conditions, let's put the theory into practice by implementing the Projected Gradient Descent (PGD) algorithm. PGD is an intuitive extension of standard gradient descent designed to handle problems where the solution must lie within a specific feasible set $C$.
The core idea is simple: perform a standard gradient descent step, which might take you outside the feasible set, and then project the resulting point back onto $C$. This ensures that every iterate $x_k$ remains feasible.
Recall the PGD update rule:
$$x_{k+1} = \Pi_C\big(x_k - \alpha \nabla f(x_k)\big),$$
where $\alpha$ is the step size (learning rate) and $\Pi_C$ denotes the Euclidean projection onto the feasible set $C$.
The effectiveness of PGD often depends on how efficiently the projection $\Pi_C$ can be computed. Fortunately, for many common constraint sets used in machine learning (like box constraints, norm balls, or probability simplexes), the projection can be calculated analytically and efficiently.
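For example, the Euclidean projection onto the $\ell_2$ ball of radius $r$ centered at the origin has a simple closed form: $\Pi_C(y) = y$ if $\|y\|_2 \le r$, and $\Pi_C(y) = r\,y / \|y\|_2$ otherwise, requiring only a norm computation and a rescaling.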
Let's consider minimizing a simple quadratic function subject to box constraints. This type of constraint ensures each variable $x_i$ stays within a lower bound $l_i$ and an upper bound $u_i$.
Problem: minimize $f(x_1, x_2) = (x_1 - 5)^2 + (x_2 - 4)^2$ subject to $0 \le x_1 \le 3$ and $0 \le x_2 \le 2$.
The unconstrained minimum is clearly at $(5, 4)$. However, this point lies outside our feasible region, defined by the box $C = [0, 3] \times [0, 2]$. We expect PGD to converge to a point on the boundary of this box.
The gradient is $\nabla f(x) = [\,2(x_1 - 5),\; 2(x_2 - 4)\,]$.
The Projection Operator for Box Constraints: For a box defined by lower bounds $l = [l_1, \dots, l_n]$ and upper bounds $u = [u_1, \dots, u_n]$, the projection $\Pi_C(y)$ is computed element-wise:
$$(\Pi_C(y))_i = \max\big(l_i, \min(y_i, u_i)\big).$$
This simply clips each component of $y$ to fit within its corresponding interval $[l_i, u_i]$.
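To make the update concrete, here is a single PGD iteration computed by hand from the starting point $(0, 0)$ with step size $\alpha = 0.1$, the same values used in the implementation below. NumPy's clip performs the same element-wise clipping as the max/min formula above:
import numpy as np
x = np.array([0.0, 0.0])                           # starting point
grad = np.array([2 * (x[0] - 5), 2 * (x[1] - 4)])  # gradient at x: [-10, -8]
y = x - 0.1 * grad                                 # gradient step: [1.0, 0.8]
x_next = np.clip(y, [0.0, 0.0], [3.0, 2.0])        # box projection (a no-op here, since y is already feasible)
print(x_next)                                      # [1.  0.8]
A few iterations later the intermediate point leaves the box (first in the $x_2$ coordinate), and the projection becomes active, which is exactly the behavior visible in the plot further down.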
Let's implement PGD in Python using NumPy.
import numpy as np
import plotly.graph_objects as go
# 1. Define Objective Function
def objective_function(x):
    """Objective: f(x1, x2) = (x1 - 5)^2 + (x2 - 4)^2."""
    return (x[0] - 5)**2 + (x[1] - 4)**2
# 2. Define Gradient Function
def gradient(x):
    """Gradient: nabla f(x) = [2(x1 - 5), 2(x2 - 4)]."""
    return np.array([2 * (x[0] - 5), 2 * (x[1] - 4)])
# 3. Define Projection Operator for Box Constraints
def project_onto_box(y, lower_bounds, upper_bounds):
    """Projects point y onto the box defined by lower and upper bounds."""
    return np.maximum(lower_bounds, np.minimum(y, upper_bounds))
# 4. Implement Projected Gradient Descent
def projected_gradient_descent(
    start_point,
    gradient_func,
    project_func,
    lower_bounds,
    upper_bounds,
    learning_rate=0.1,
    max_iterations=100,
    tolerance=1e-6
):
    """Performs Projected Gradient Descent."""
    if not isinstance(start_point, np.ndarray):
        x = np.array(start_point, dtype=float)
    else:
        x = np.copy(start_point).astype(float)

    # Ensure starting point is feasible
    x = project_func(x, lower_bounds, upper_bounds)
    history = [np.copy(x)]  # Store iteration history
    print(f"Starting PGD from: {x}")

    for i in range(max_iterations):
        grad = gradient_func(x)
        if np.linalg.norm(grad) < tolerance * 10:  # Early check on gradient norm
            print(f"Gradient norm small ({np.linalg.norm(grad):.2e}), potentially near optimum.")
            # No break here: the projection step below still runs

        # Gradient step
        y_next = x - learning_rate * grad
        # Projection step
        x_next = project_func(y_next, lower_bounds, upper_bounds)

        # Check for convergence (change in x)
        step_size = np.linalg.norm(x_next - x)
        if step_size < tolerance:
            print(f"Converged after {i+1} iterations. Step size: {step_size:.2e}")
            x = x_next
            history.append(np.copy(x))
            break

        x = x_next
        history.append(np.copy(x))  # Store feasible point
    else:  # No break from loop
        print(f"Reached maximum {max_iterations} iterations.")

    return x, np.array(history)
# --- Problem Setup ---
initial_x = np.array([0.0, 0.0]) # Start at a feasible corner
bounds_lower = np.array([0.0, 0.0])
bounds_upper = np.array([3.0, 2.0])
learn_rate = 0.1
iterations = 50
# --- Run the Algorithm ---
optimal_x, trajectory = projected_gradient_descent(
initial_x,
gradient,
project_onto_box,
bounds_lower,
bounds_upper,
learning_rate=learn_rate,
max_iterations=iterations
)
print(f"Constrained optimum found at: {optimal_x}")
print(f"Objective function value at optimum: {objective_function(optimal_x):.4f}")
# --- Visualization Setup ---
# Generate grid for contour plot
x1_vals = np.linspace(-1, 6, 50)
x2_vals = np.linspace(-1, 5.5, 50)
X1, X2 = np.meshgrid(x1_vals, x2_vals)
Z = objective_function([X1, X2])
# Create contour plot figure
fig = go.Figure(data=go.Contour(
z=Z, x=x1_vals, y=x2_vals,
colorscale='Blues',
contours=dict(coloring='lines', start=0, end=50, size=2.5),
line_width=1
))
# Add feasible region (box constraints) as a rectangle shape
fig.add_shape(
type="rect",
x0=bounds_lower[0], y0=bounds_lower[1],
x1=bounds_upper[0], y1=bounds_upper[1],
line=dict(color="#f03e3e", width=2), # Red border
fillcolor="rgba(250, 82, 82, 0.1)", # Light red fill
layer='below'
)
# Add optimization trajectory
fig.add_trace(go.Scatter(
x=trajectory[:, 0],
y=trajectory[:, 1],
mode='lines+markers',
name='PGD Path',
line=dict(color='#1c7ed6', width=2), # Blue line
marker=dict(color='#4263eb', size=5, symbol='circle-open')
))
# Highlight start and end points
fig.add_trace(go.Scatter(
x=[trajectory[0, 0]], y=[trajectory[0, 1]],
mode='markers', name='Start',
marker=dict(color='#37b24d', size=10, symbol='circle') # Green start
))
fig.add_trace(go.Scatter(
x=[trajectory[-1, 0]], y=[trajectory[-1, 1]],
mode='markers', name='Constrained Optimum',
marker=dict(color='#f76707', size=12, symbol='star') # Orange star end
))
# Add unconstrained optimum for reference
fig.add_trace(go.Scatter(
x=[5], y=[4],
mode='markers', name='Unconstrained Optimum',
marker=dict(color='#495057', size=10, symbol='x-thin', line=dict(width=2)) # Gray cross
))
# Update layout for clarity
fig.update_layout(
title_text='Projected Gradient Descent Optimization Path',
xaxis_title='Parameter x1',
yaxis_title='Parameter x2',
width=700, height=600,
xaxis=dict(range=[-1, 6], scaleanchor='y', scaleratio=1), # Maintain aspect ratio
yaxis=dict(range=[-1, 5.5]),
legend=dict(yanchor="top", y=0.99, xanchor="left", x=0.01, bgcolor='rgba(255,255,255,0.7)')
)
# Generate single-line JSON for embedding
plotly_json_string = fig.to_json()  # to_json() serializes NumPy arrays; a plain json.dumps(fig.to_dict()) would fail on them
Visualization of PGD minimizing $f(x_1, x_2) = (x_1 - 5)^2 + (x_2 - 4)^2$ subject to $0 \le x_1 \le 3$ and $0 \le x_2 \le 2$. The path starts at $(0, 0)$, moves towards the unconstrained minimum at $(5, 4)$, but is repeatedly projected back onto the feasible region (red box). It converges to the corner $(3, 2)$, the feasible point closest to the unconstrained minimum.
As the visualization shows, the PGD iterates move in the direction of the negative gradient but are forced back into the feasible box whenever a step takes them outside. The algorithm successfully finds the constrained minimum at $(3, 2)$, which is the point within the box closest to the unconstrained minimum $(5, 4)$, as expected for this quadratic objective.
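This outcome can be verified directly: since $f$ is the squared Euclidean distance to $(5, 4)$ and the box constraints act on each coordinate independently, the constrained minimizer is the coordinate-wise clip of $(5, 4)$ to the box, $x^* = (\min(5, 3), \min(4, 2)) = (3, 2)$, with objective value $f(3, 2) = (3 - 5)^2 + (2 - 4)^2 = 8$.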
Key points about this implementation:
- Every point stored in trajectory (after the initial projection of the start point) lies within the feasible box constraints.
- The project_onto_box function is computationally inexpensive, involving only element-wise min and max operations. The efficiency of this projection step is significant for the overall performance of PGD.

This hands-on example demonstrates the core mechanics of Projected Gradient Descent. While simple, it forms the basis for tackling more complex constrained optimization problems encountered in areas like regularized regression (e.g., LASSO with non-negativity constraints), optimal control, and training models with specific parameter requirements. The primary challenge often lies in designing or finding an efficient projection operator for the specific constraint set $C$ of interest.
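As an illustration of swapping in a different constraint set, here is a minimal sketch of the projection onto the non-negative orthant, the constraint set used in LASSO with non-negativity constraints. The function name and the extra ignored parameters are choices made for this sketch (not part of the code above); they simply mirror the signature of project_onto_box so the same PGD loop could call it unchanged:
import numpy as np

def project_onto_nonnegative(y, lower_bounds=None, upper_bounds=None):
    """Projection onto the non-negative orthant: clip negative entries to zero.

    The unused bound arguments only match the signature of project_onto_box,
    so projected_gradient_descent above could call this function without modification.
    """
    return np.maximum(y, 0.0)

# Hypothetical usage with the same quadratic objective (bound arguments are ignored):
# x_opt, path = projected_gradient_descent(
#     [0.0, 0.0], gradient, project_onto_nonnegative, None, None, learning_rate=0.1
# )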