While the Singular Value Decomposition (SVD) formula, $A = U \Sigma V^T$, might look abstract, it describes a fundamental geometric process. Understanding this geometry provides intuition for why SVD is so effective in applications like dimensionality reduction and the analysis of linear transformations.
Consider a linear transformation represented by an $m \times n$ matrix $A$. When we apply this transformation to a vector $x$ to get $y = Ax$, SVD tells us this transformation can be broken down into three distinct geometric steps:
Rotation/Reflection ($V^T$): The matrix $V^T$ (the transpose of $V$) is an orthogonal matrix. When applied to the input vector $x$, $V^T x$ performs a rotation and possibly a reflection of the input space $\mathbb{R}^n$. It doesn't change the lengths of vectors or the angles between them. Think of this step as aligning the input space along a special set of orthogonal directions, given by the columns of $V$ (the right singular vectors). These vectors $v_1, v_2, \dots, v_n$ form an orthonormal basis for the input space. $V^T$ essentially rotates the space so that these principal input directions align with the standard coordinate axes.
Scaling ($\Sigma$): The matrix $\Sigma$ is an $m \times n$ rectangular diagonal matrix. Its diagonal entries are the singular values $\sigma_1, \sigma_2, \dots, \sigma_r$ (where $r$ is the rank of $A$), and all other entries are zero. This matrix scales the coordinates of the rotated vector $V^T x$. Specifically, it scales the $i$-th coordinate (which corresponds to the direction of $v_i$) by the singular value $\sigma_i$. Directions corresponding to zero singular values are effectively squashed to zero. This step stretches or shrinks the space along the newly aligned axes.
Rotation/Reflection ($U$): The matrix $U$ is also an orthogonal matrix ($m \times m$). This final step takes the scaled vector $\Sigma V^T x$ and performs another rotation and possibly a reflection in the output space $\mathbb{R}^m$. The columns of $U$, $u_1, u_2, \dots, u_m$ (the left singular vectors), form an orthonormal basis for the output space. This step rotates the scaled vectors from the axes-aligned orientation (after step 2) into their final positions in the output space, aligning them with the principal output directions defined by the columns of $U$.
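To make the three steps concrete, here is a minimal NumPy sketch. The specific matrix, vector, and shapes are chosen only for illustration; it applies $V^T$, then $\Sigma$, then $U$ in sequence and checks that the result matches $Ax$.

```python
import numpy as np

# An illustrative 3x2 matrix A and input vector x (values are arbitrary).
A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])
x = np.array([1.0, -2.0])

# full_matrices=True returns U as m x m and Vt as n x n, matching the text.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Build the m x n rectangular diagonal Sigma from the singular values.
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

# Step 1: rotate/reflect the input space.
step1 = Vt @ x
# Step 2: scale each coordinate by its singular value (padding to m dims).
step2 = Sigma @ step1
# Step 3: rotate/reflect into the output space.
step3 = U @ step2

# The three steps reproduce the direct transformation y = A x.
print(np.allclose(step3, A @ x))   # True
```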
Imagine applying the transformation $A$ to all the points on a unit circle in 2D (or a unit sphere in 3D). Geometrically, the circle is first rotated by $V^T$, then stretched along the coordinate axes by $\Sigma$ into an ellipse, and finally rotated by $U$ into its final orientation.
The SVD essentially tells us that any linear transformation $A$ maps orthonormal basis vectors in the input space (the columns of $V$) to orthogonal vectors in the output space (the columns of $U$ scaled by the singular values $\sigma_i$). That is, $A v_i = \sigma_i u_i$ for $i = 1, \dots, r$.
The transformation defined by the matrix $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ maps the unit circle (gray) to an ellipse (red). SVD decomposes this overall transformation into a sequence of a rotation ($V^T$), an axis-aligned scaling ($\Sigma$), and another rotation ($U$).
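As a sketch of this relationship, the snippet below computes the SVD of the 2×2 shear matrix described in the figure caption (its exact entries are assumed from that caption), checks that $A v_i = \sigma_i u_i$ for each singular direction, and confirms that the singular values give the semi-axis lengths of the resulting ellipse.

```python
import numpy as np

# The 2x2 shear matrix from the figure caption (entries assumed from the caption).
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

U, s, Vt = np.linalg.svd(A)

# Check A v_i = sigma_i u_i for each right singular vector v_i.
for i in range(len(s)):
    v_i = Vt[i, :]   # rows of V^T are the right singular vectors
    u_i = U[:, i]    # columns of U are the left singular vectors
    print(np.allclose(A @ v_i, s[i] * u_i))   # True, True

# Map points on the unit circle; the image is an ellipse whose semi-axis
# lengths are the singular values sigma_1 and sigma_2.
theta = np.linspace(0.0, 2.0 * np.pi, 400)
circle = np.vstack([np.cos(theta), np.sin(theta)])   # shape (2, 400)
ellipse = A @ circle
print(s)                                             # semi-axis lengths
print(np.linalg.norm(ellipse, axis=0).max())         # ≈ sigma_1
```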
This geometric view is particularly insightful for understanding dimensionality reduction using SVD. The singular values σi quantify the importance of each principal direction. Larger singular values correspond to directions where the data (or the transformation) has the most variance or "spread". By keeping only the components corresponding to the largest singular values, we retain the most significant geometric features of the transformation while potentially discarding dimensions associated with small singular values (which might represent noise or less important variations).
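As a rough illustration of this idea, the sketch below builds a synthetic, approximately low-rank data matrix (the sizes, rank, and noise level are arbitrary choices for this example) and reconstructs it from only its largest singular values; the small relative error shows how much structure the leading directions capture.

```python
import numpy as np

rng = np.random.default_rng(0)

# A synthetic data matrix: 100 samples in 50 dimensions lying close to a
# 5-dimensional subspace, plus a small amount of noise.
low_rank = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 50))
X = low_rank + 0.01 * rng.normal(size=(100, 50))

U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Keep only the k largest singular values and their directions.
k = 5
X_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# The rank-k reconstruction captures almost all of the structure; the
# relative error is on the order of the noise level.
rel_error = np.linalg.norm(X - X_k) / np.linalg.norm(X)
print(rel_error)
```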