Let's put the theory of Singular Value Decomposition into practice. In the previous sections, we explored how SVD breaks down a matrix A into the product A = UΣVᵀ. This decomposition is incredibly useful for understanding matrix properties, dimensionality reduction, and even data compression. Here, we'll use Python's NumPy library to perform SVD and see how it works on actual data.
First, ensure you have NumPy installed. If you're working in a standard scientific Python environment, you likely already have it. We'll primarily use the linalg submodule within NumPy. Let's import it:
import numpy as np
# We might use matplotlib later if visualizing images, but for now, just NumPy.
# import matplotlib.pyplot as plt
# from scipy import misc # Example for loading images
print(f"Using NumPy version: {np.__version__}")
Let's start with a basic matrix and compute its SVD.
# Define a simple 2x3 matrix
A = np.array([[1, 2, 3],
              [4, 5, 6]])
print("Original Matrix A:")
print(A)
# Perform SVD
# 'full_matrices=False' computes the 'thin' SVD, which is often more efficient.
# U will have shape (m, k), s will have shape (k,), Vh will have shape (k, n)
# where k = min(m, n)
U, s, Vh = np.linalg.svd(A, full_matrices=False)
print("\nU matrix (Left Singular Vectors):")
print(U)
print(f"Shape of U: {U.shape}")
print("\nsingular values (diagonal of Sigma):")
print(s)
print(f"Shape of s: {s.shape}")
print("\nVh matrix (V transpose, Right Singular Vectors Transposed):")
print(Vh)
print(f"Shape of Vh: {Vh.shape}")
Notice that np.linalg.svd returns the singular values s as a one-dimensional array. To reconstruct the original matrix A, we need to form the diagonal matrix Σ from these singular values. The dimensions must align correctly for matrix multiplication: U is (m×k), Σ must be (k×k), and Vᵀ (which is Vh) is (k×n), where k = min(m, n).
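As a quick aside, calling np.linalg.svd with the default full_matrices=True returns square U (m×m) and Vh (n×n) matrices instead, in which case Σ has to be a rectangular (m×n) matrix for the shapes to line up. A small sketch of the difference, reusing A from above:
# Full SVD for comparison: U is (m, m) and Vh is (n, n)
U_full, s_full, Vh_full = np.linalg.svd(A, full_matrices=True)
print(U_full.shape, s_full.shape, Vh_full.shape)  # (2, 2) (2,) (3, 3)
# With full_matrices=False (used above), Vh was only (2, 3)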
Let's reconstruct A from U, s, and Vh to verify the decomposition. We need to create the Σ matrix. Since s only contains the diagonal elements, we first create a zero matrix of the correct shape (k×k) and then fill the diagonal.
# Create the Sigma matrix from the singular values s
# k is the number of singular values, which is min(m, n)
k = len(s)
Sigma = np.zeros((k, k))
Sigma[:k, :k] = np.diag(s)
print("\nConstructed Sigma matrix:")
print(Sigma)
print(f"Shape of Sigma: {Sigma.shape}")
# Reconstruct A using U @ Sigma @ Vh
# The '@' operator performs matrix multiplication
A_reconstructed = U @ Sigma @ Vh
print("\nReconstructed Matrix A:")
print(A_reconstructed)
# Check if the reconstructed matrix is close to the original
# np.allclose checks element-wise equality within a tolerance
print(f"\nReconstruction successful? {np.allclose(A, A_reconstructed)}")
As you can see, the reconstructed matrix is extremely close to the original A, confirming that A = UΣVᵀ. The small differences are due to floating-point arithmetic.
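Incidentally, because Σ is diagonal you don't strictly need to build it as a full matrix. A minimal sketch of two equivalent reconstructions, using the variables from above:
# Equivalent reconstructions that avoid building Sigma by hand
A_alt_1 = U @ np.diag(s) @ Vh   # np.diag turns s into the (k x k) Sigma
A_alt_2 = (U * s) @ Vh          # broadcasting scales each column of U by its singular value
print(np.allclose(A, A_alt_1), np.allclose(A, A_alt_2))  # True True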
One of the most significant applications of SVD is creating low-rank approximations of a matrix. This is achieved by keeping only the largest k singular values and their corresponding singular vectors. The idea is that the largest singular values capture the most "important" information or variance in the data represented by the matrix.
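In symbols, keeping the largest k singular values means approximating A by the first k terms of its outer-product expansion, A ≈ σ1·u1v1ᵀ + σ2·u2v2ᵀ + ... + σk·ukvkᵀ, where ui and vi are the i-th columns of U and V and the singular values are sorted so that σ1 ≥ σ2 ≥ .... A classical result (the Eckart-Young theorem) says this truncation is the best possible rank-k approximation of A in the Frobenius-norm sense.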
Let's try this on a slightly larger matrix.
# Create a slightly larger matrix, perhaps with some dependencies
B = np.array([[1, 2, 3, 4],
              [2, 4, 6, 8],
              [5, 6, 7, 8],
              [9, 8, 7, 6]])
print("\nOriginal Matrix B:")
print(B)
# Perform SVD on B
U_b, s_b, Vh_b = np.linalg.svd(B)
print("\nSingular values of B:")
print(s_b)
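# Optional check: NumPy can report the numerical rank directly.
# For this B it is 2, because row 2 is exactly twice row 1 and
# row 4 is a linear combination of rows 1 and 3, so only two
# singular values are substantially non-zero.
print(f"Numerical rank of B: {np.linalg.matrix_rank(B)}")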
# Choose the number of singular values to keep (rank for approximation)
rank_k = 2 # Let's try a rank-2 approximation
print(f"\nApproximating B using rank k={rank_k}")
# Construct the truncated Sigma_k matrix
Sigma_k = np.zeros_like(B, dtype=float) # Start with a zero matrix of B's shape
Sigma_k[:rank_k, :rank_k] = np.diag(s_b[:rank_k]) # Fill top-left with k singular values
print("\nTruncated Sigma matrix (Sigma_k):")
print(Sigma_k)
# Reconstruct the rank-k approximation
B_approx = U_b @ Sigma_k @ Vh_b
print("\nRank-k Approximated Matrix B_approx:")
print(B_approx)
# Calculate the difference (approximation error)
error = np.linalg.norm(B - B_approx, 'fro') # Frobenius norm of the difference
print(f"\nApproximation error (Frobenius norm): {error:.4f}")
# Let's try rank-3
rank_k_3 = 3
Sigma_k_3 = np.zeros_like(B, dtype=float)
Sigma_k_3[:rank_k_3, :rank_k_3] = np.diag(s_b[:rank_k_3])
B_approx_3 = U_b @ Sigma_k_3 @ Vh_b
error_3 = np.linalg.norm(B - B_approx_3, 'fro')
print(f"\nApproximation error for rank k=3: {error_3:.4f}")
Notice how the singular values drop off. Because B was deliberately built with dependent rows (row 2 is twice row 1, and row 4 is a linear combination of rows 1 and 3), only the first two singular values are substantially non-zero. As a result, the rank-2 approximation already reproduces B almost exactly and the remaining error is near machine precision; moving to rank 3 adds essentially nothing. For a generic full-rank matrix, the error would instead shrink gradually as k increases. The general principle is the same either way: by choosing k below the (effective) rank of B, we reduce the amount of data needed to represent the structure that matters in B.
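A common, slightly more compact way to form the same rank-k approximation is to slice the factors directly instead of zero-padding a full-size Σ. A sketch, reusing U_b, s_b, and Vh_b from above:
# Rank-k approximation by slicing the SVD factors directly
B_approx_sliced = U_b[:, :rank_k] @ np.diag(s_b[:rank_k]) @ Vh_b[:rank_k, :]
print(np.allclose(B_approx, B_approx_sliced))  # True: same matrix as before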
A classic example demonstrating the power of SVD approximation is image compression. A grayscale image can be represented as a matrix where each element is a pixel intensity. Color images are typically represented by three such matrices (Red, Green, Blue).
Let's consider a grayscale image matrix. By performing SVD on this matrix, we often find that the singular values decrease rapidly. This means we can potentially reconstruct a visually acceptable version of the image using only a fraction of the original singular values (and corresponding vectors).
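To make "a fraction" concrete: storing the truncated factors (the first k columns of U, the k singular values, and the first k rows of Vᵀ) costs k(m + n + 1) numbers instead of the m·n pixel values. A quick back-of-the-envelope calculation for a hypothetical 512×512 grayscale image with k = 50 (illustrative sizes only):
# Storage comparison for a hypothetical 512 x 512 grayscale image
m_px, n_px, k_keep = 512, 512, 50
full_storage = m_px * n_px                 # 262,144 pixel values
svd_storage = k_keep * (m_px + n_px + 1)   # 51,250 values for U_k, s_k, Vh_k
print(f"Compression ratio: {full_storage / svd_storage:.1f}x")  # about 5.1x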
While we won't load and display images directly here, let's simulate the process and visualize the singular value decay for a representative matrix. Imagine this represents a simplified image.
# Generate a matrix (e.g., simulating image data)
# Let's use a matrix whose structure can be captured by a few components
np.random.seed(42)
m, n = 50, 60
base = np.random.rand(m, 2) @ np.random.rand(2, n) # Low-rank base
noise = np.random.rand(m, n) * 0.1 # Add some noise
image_matrix = base + noise
# Perform SVD
U_img, s_img, Vh_img = np.linalg.svd(image_matrix, full_matrices=False)
# Plot the singular values
num_singular_values = len(s_img)
singular_value_indices = np.arange(1, num_singular_values + 1)
# Create a Plotly chart definition for singular values
plotly_singular_values_chart = {
    "data": [{
        "type": "scatter",  # Plotly uses the "scatter" trace type for line charts
        "x": singular_value_indices.tolist(),
        "y": s_img.tolist(),
        "mode": "lines+markers",
        "marker": {"color": "#228be6"},
        "line": {"color": "#228be6"}
    }],
    "layout": {
        "title": "Singular Values Decay",
        "xaxis": {"title": "Index (k)"},
        "yaxis": {"title": "Singular Value Magnitude", "type": "log"},  # Log scale often helps visualization
        "height": 350,
        "margin": {"l": 50, "r": 20, "t": 50, "b": 40}
    }
}
# Display the chart (in a real web environment, this JSON would be rendered)
import json
print("\nPlotly chart JSON for singular values:")
print(json.dumps(plotly_singular_values_chart))
Singular values for the generated matrix, plotted on a log scale. Notice the sharp drop after the first few values.
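If you are following along in a local Python session rather than a web environment that renders Plotly JSON, a quick matplotlib equivalent (assuming matplotlib is installed) produces the same picture:
# Matplotlib alternative for plotting the singular value decay
import matplotlib.pyplot as plt
plt.semilogy(singular_value_indices, s_img, marker="o")  # log scale on the y-axis
plt.xlabel("Index (k)")
plt.ylabel("Singular Value Magnitude")
plt.title("Singular Values Decay")
plt.show()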
The plot clearly shows that the singular values decrease rapidly. The first few values are significantly larger than the rest. For image compression, you would:
1. Treat the grayscale image (or each color channel) as a matrix and compute its SVD.
2. Keep only the largest k singular values, together with the first k columns of U and the first k rows of Vᵀ.
3. Store these truncated factors instead of the full pixel matrix.
4. Reconstruct an approximate image from the truncated factors whenever it needs to be displayed.
Trying different values of k would show a trade-off: smaller k gives more compression but a less accurate image reconstruction, while larger k improves visual quality at the cost of less compression.
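As a rough sketch of that trade-off on the simulated matrix from above (not a real image), you could compare the reconstruction error and the number of stored values for a few choices of k, reusing U_img, s_img, and Vh_img:
# Error vs. storage for several truncation ranks
for k_try in (1, 2, 5, 10, 20):
    approx = U_img[:, :k_try] @ np.diag(s_img[:k_try]) @ Vh_img[:k_try, :]
    err = np.linalg.norm(image_matrix - approx, 'fro')
    stored = k_try * (m + n + 1)   # values needed for the truncated factors
    print(f"k={k_try:2d}  Frobenius error={err:7.4f}  stored values={stored:5d} (full: {m * n})")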
This hands-on exercise demonstrates how to compute SVD using NumPy and how truncated SVD can be used to create low-rank approximations of matrices, a technique fundamental to dimensionality reduction and data compression applications in machine learning.