In the previous section, we defined the overarching structure of a vector space. Now, let's consider smaller, self-contained structures within a given vector space. Imagine the familiar 3D space, R3. Inside this space, you can visualize planes and lines that pass through the origin. These aren't just arbitrary collections of points; they possess the same fundamental properties as the larger space they inhabit. These special subsets are called subspaces.
Formally, a subset W of a vector space V is called a subspace if W itself forms a vector space under the same addition and scalar multiplication operations defined on V. While checking all vector space axioms might seem tedious, there's a simpler test. A non-empty subset W of V is a subspace if and only if it satisfies three conditions:
- Contains the Zero Vector: The zero vector of V must also be in W. (0∈W)
- Closure under Addition: For any two vectors u and v in W, their sum u+v must also be in W.
- Closure under Scalar Multiplication: For any vector u in W and any scalar c, the scalar multiple c⋅u must also be in W.
Let's unpack these conditions with some geometric intuition.
Geometric Viewpoint: Lines and Planes Through the Origin
Think about R2, the standard 2D Cartesian plane.
- A line passing through the origin: Consider the set W of all vectors (x,y) such that y=2x.
- Does it contain the zero vector? Yes, (0,0) satisfies 0=2×0.
- Is it closed under addition? Let u=(x1,2x1) and v=(x2,2x2) be two vectors on the line. Their sum is u+v=(x1+x2,2x1+2x2)=(x1+x2,2(x1+x2)). This resulting vector also has its y-component equal to twice its x-component, so it lies on the line. W is closed under addition.
- Is it closed under scalar multiplication? Let u=(x1,2x1) be on the line and c be any scalar. Then c⋅u=(cx1,c(2x1))=(cx1,2(cx1)). This vector also satisfies the condition y=2x, so it's on the line. W is closed under scalar multiplication.
Since all three conditions hold, the line y=2x is a subspace of R2.
A line passing through the origin in R2 satisfies the subspace conditions. It contains the origin, and adding any two vectors on the line or scaling a vector on the line results in another vector on the same line.
- What about a line not passing through the origin? For example, the set W′ of vectors (x,y) such that y=2x+1. This set fails the first condition: the zero vector (0,0) is not in W′ because 0≠2(0)+1. Therefore, this line is not a subspace of R2. The requirement that the zero vector must belong to any subspace is fundamental; both lines are spot-checked numerically in the sketch following this list.
- The entire space R2 is also a subspace of itself; it trivially satisfies all three conditions.
- The set containing only the zero vector, {0}, is also a subspace. This is often called the trivial subspace.
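As a quick sanity check on these examples, here is a minimal NumPy sketch that samples random vectors from each line and tests the three conditions numerically. The helper names (spot_check, sample_line) are invented for this illustration, and random sampling is only a spot check, not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_line(intercept=0.0):
    """Return a random vector on the line y = 2x + intercept."""
    x = rng.normal()
    return np.array([x, 2 * x + intercept])

def spot_check(sample, member, n_trials=1000):
    """Numerically spot-check the three subspace conditions for a candidate set."""
    zero_ok = member(np.zeros(2))
    closed = True
    for _ in range(n_trials):
        u, v, c = sample(), sample(), rng.normal()
        closed = closed and member(u + v) and member(c * u)
    return zero_ok, closed

# The line y = 2x passes all checks ...
on_line = lambda v: abs(v[1] - 2 * v[0]) < 1e-9
print(spot_check(lambda: sample_line(0.0), on_line))        # (True, True)

# ... while the shifted line y = 2x + 1 fails them.
on_shifted = lambda v: abs(v[1] - (2 * v[0] + 1)) < 1e-9
print(spot_check(lambda: sample_line(1.0), on_shifted))     # (False, False)
```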
Similarly, in R3, lines and planes passing through the origin (0,0,0) are subspaces. A plane defined by ax+by+cz=0 is a subspace, while a plane ax+by+cz=d (where d≠0) is not, because it doesn't contain the origin.
Algebraic Example
Consider the vector space V=R3. Let W be the set of all vectors of the form (x,y,0), where x and y are any real numbers. Geometrically, this represents the xy-plane within the 3D space. Let's rigorously check the conditions:
- Zero Vector: The zero vector in R3 is (0,0,0). This fits the form (x,y,0) with x=0 and y=0. So, the zero vector is in W.
- Closure under Addition: Let u=(x1,y1,0) and v=(x2,y2,0) be two arbitrary vectors in W. Their sum is u+v=(x1+x2,y1+y2,0+0)=(x1+x2,y1+y2,0). The result is still a vector with a zero in the third component, confirming that u+v is also in W.
- Closure under Scalar Multiplication: Let u=(x1,y1,0) be in W and c be any scalar (a real number). Then c⋅u=(cx1,cy1,c⋅0)=(cx1,cy1,0). This resulting vector also has a zero in the third component, so c⋅u is in W.
Since all three conditions are met, the set W (the xy-plane) is indeed a subspace of R3.
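The same check can be written out directly in NumPy: build arbitrary vectors of the form (x,y,0), then confirm that the zero vector, sums, and scalar multiples all keep a zero in the third component. The helper in_xy_plane is a name chosen for this sketch.

```python
import numpy as np

def in_xy_plane(v, tol=1e-12):
    """Membership test for W = {(x, y, 0)} inside R^3."""
    return abs(v[2]) < tol

u = np.array([1.5, -2.0, 0.0])   # an arbitrary vector in W
v = np.array([0.3,  4.0, 0.0])   # another arbitrary vector in W
c = -2.7                         # an arbitrary scalar

print(in_xy_plane(np.zeros(3)))  # True: the zero vector lies in W
print(in_xy_plane(u + v))        # True: closed under addition
print(in_xy_plane(c * u))        # True: closed under scalar multiplication
```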
Why Subspaces Matter in Machine Learning
Understanding subspaces is not just a theoretical exercise; it provides valuable insights into data structure and algorithm behavior:
- Feature Subspaces: Imagine your dataset has many features (e.g., hundreds of measurements for each data point), meaning your data vectors live in a high-dimensional space Rn. If you perform feature selection or feature engineering, you might choose to work with a smaller set of features. The data, when represented using only this subset of features, effectively resides within a subspace of the original feature space. The properties of this subspace, such as its dimension (which we'll discuss soon), relate directly to the complexity and redundancy of the selected features. For instance, Principal Component Analysis (PCA) explicitly seeks a lower-dimensional subspace that captures most of the variance in the data (a brief sketch of this idea appears at the end of this section).
- Solution Spaces: When solving systems of linear equations, particularly homogeneous systems of the form Ax=0, the set of all possible solution vectors x forms a subspace. This is known as the null space (or kernel) of the matrix A. Understanding this subspace is important for analyzing the uniqueness of solutions to linear systems that arise in model fitting, like in linear regression; a small numerical sketch of a null space follows this list.
- Manifold Hypothesis: While often dealing with non-linear structures, the manifold hypothesis in machine learning suggests that complex, high-dimensional data (like images or natural language embeddings) often lies concentrated near a lower-dimensional manifold embedded within the higher-dimensional ambient space. Although manifolds are generally curved, their local structure can sometimes be approximated by flat subspaces (tangent spaces). Subspaces provide the essential linear algebraic foundation for thinking about data that exhibits lower-dimensional structure, even if that structure isn't perfectly flat globally.
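To make the null-space idea concrete, here is a minimal NumPy sketch. The matrix A is a toy example chosen so its null space is non-trivial; the basis is extracted from the SVD, and a linear combination of basis vectors is checked to still solve Ax=0, illustrating closure.

```python
import numpy as np

# A small example matrix; its rows are linearly dependent, so Ax = 0
# has non-trivial solutions.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])

# SVD-based null space: right singular vectors whose singular values are ~0
# span the solution set of Ax = 0.
_, s, Vt = np.linalg.svd(A)
tol = 1e-10
rank = int(np.sum(s > tol))
null_basis = Vt[rank:].T          # columns form a basis of the null space

print(null_basis.shape)           # (3, 2): a 2-dimensional subspace of R^3

# Any linear combination of null-space basis vectors is again a solution,
# illustrating closure under addition and scalar multiplication.
x = null_basis @ np.array([1.5, -0.7])
print(np.allclose(A @ x, 0))      # True
```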
Identifying whether a given set of vectors forms a subspace involves verifying the three conditions above. If any one of them fails (e.g., the set doesn't contain the zero vector, or adding two vectors in the set gives a vector outside the set), then it is not a subspace. This systematic check helps us categorize and analyze structured subsets within larger vector spaces, which is fundamental for understanding data transformations, dimensionality reduction techniques, and the behavior of various machine learning models.
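Finally, the PCA idea mentioned earlier can be sketched in a few lines: center the data, take the top right singular vectors of the data matrix, and project onto the subspace they span. The data here is synthetic and generated only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data in R^5 that, by construction, lies near a 2-dimensional subspace.
latent = rng.normal(size=(200, 2))            # hidden 2-D coordinates
mixing = rng.normal(size=(2, 5))              # embeds them into R^5
X = latent @ mixing + 0.01 * rng.normal(size=(200, 5))

# PCA via the SVD of the centered data matrix.
Xc = X - X.mean(axis=0)
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)
print(s**2 / np.sum(s**2))                    # first two values dominate

# The top two right singular vectors span the best-fitting 2-D subspace;
# projecting onto it loses very little of the data.
components = Vt[:2]
X_proj = Xc @ components.T @ components
print(np.linalg.norm(Xc - X_proj) / np.linalg.norm(Xc))   # close to 0
```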