In the previous sections, we explored how vectors can be combined (linear combinations) and how a set of vectors can generate a whole space or subspace (span). We also examined when vectors within a set are redundant (linear dependence) or when each contributes unique directional information (linear independence). Now, we combine these ideas to define the most efficient way to describe a vector space: using a basis.
Think of a basis as a set of fundamental building blocks, or a coordinate system, for a vector space. It's the smallest set of directions you need in order to reach any point in that space, without any of those directions being expressible in terms of the others.
What is a Basis?
Formally, a set of vectors B={v1,v2,…,vn} in a vector space V is called a basis for V if it satisfies two conditions:
- Linear Independence: The vectors in B are linearly independent. This means no vector in the set can be written as a linear combination of the other vectors in the set. Each vector provides unique directional information. The only solution to c1v1+c2v2+⋯+cnvn=0 is c1=c2=⋯=cn=0.
- Spanning Property: The set B spans the vector space V. This means that every vector w in V can be expressed as a linear combination of the vectors in B. That is, for any w∈V, there exist scalars c1,c2,…,cn such that w=c1v1+c2v2+⋯+cnvn.
If a set of vectors spans the space but isn't linearly independent, it contains redundant vectors. If it's linearly independent but doesn't span the space, it can't describe all the vectors within that space. A basis strikes the perfect balance.
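To make the two conditions concrete: for n candidate vectors in Rn, stacking them as the columns of a matrix reduces both checks to a single rank computation, since a square matrix has full rank exactly when its columns are linearly independent and span the space. Below is a minimal numpy sketch; the helper name is_basis is our own, not a standard library function.

```python
import numpy as np

def is_basis(vectors, dim):
    """Return True if `vectors` form a basis of R^dim.

    A basis of R^dim must contain exactly dim vectors, and the square
    matrix with those vectors as columns must have full rank (which
    covers both linear independence and the spanning property).
    """
    A = np.column_stack(vectors)
    return A.shape == (dim, dim) and np.linalg.matrix_rank(A) == dim

print(is_basis([np.array([1, 0]), np.array([0, 1])], dim=2))  # True
print(is_basis([np.array([1, 2]), np.array([2, 4])], dim=2))  # False: dependent
```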
Example: The Standard Basis
The most familiar example is the standard basis for Rn.
- For R2, the standard basis is {e1,e2}={[1,0]T,[0,1]T}. These two vectors are clearly linearly independent (neither is a multiple of the other), and any vector [x,y]T can be written as xe1+ye2.
- For R3, the standard basis is {e1,e2,e3}={[1,0,0]T,[0,1,0]T,[0,0,1]T}. Again, these are linearly independent, and any vector [x,y,z]T can be written as xe1+ye2+ze3.
[Figure: The standard basis vectors e1 (blue) and e2 (red) in R2. They are orthogonal, have unit length, and span the entire 2D plane.]
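In the standard basis, the coordinates of a vector are simply its entries, which the following small numpy check confirms (the vector v is just an example):

```python
import numpy as np

e1, e2, e3 = np.eye(3)           # rows of the identity: the standard basis of R^3
v = np.array([4.0, -2.0, 7.0])

# In the standard basis, the coordinates of v are simply its entries.
reconstruction = v[0] * e1 + v[1] * e2 + v[2] * e3
print(np.allclose(reconstruction, v))  # True
```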
Non-Standard Bases
A vector space can have many different bases. For example, in R2, the set B′={[1,1]T,[1,−1]T} is also a basis.
- Linear Independence: Set c1[1,1]T+c2[1,−1]T=[0,0]T. This gives the system c1+c2=0 and c1−c2=0. Adding the two equations yields 2c1=0, so c1=0, and substituting back gives c2=0. Since only the trivial solution exists, the vectors are linearly independent.
- Spanning: We need to show that any vector [x,y]T can be written as c1[1,1]T+c2[1,−1]T. This leads to the system c1+c2=x and c1−c2=y. Solving this yields c1=(x+y)/2 and c2=(x−y)/2. Since we can find c1,c2 for any x,y, the set spans R2.
Since both conditions hold, B′ is a valid basis for R2.
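Finding the coordinates c1, c2 of a vector with respect to B′ amounts to solving a small linear system. A sketch using numpy (the specific vector w is just an illustration):

```python
import numpy as np

# Columns are the basis vectors of B' = {[1,1]^T, [1,-1]^T}.
B = np.array([[1.0, 1.0],
              [1.0, -1.0]])
w = np.array([3.0, 1.0])

# Solving B c = w gives the coordinates of w with respect to B'.
c = np.linalg.solve(B, w)
print(c)                      # [2. 1.]  matches c1=(x+y)/2, c2=(x-y)/2
print(np.allclose(B @ c, w))  # True: w = 2*[1,1]^T + 1*[1,-1]^T
```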
Uniqueness of Representation
A significant consequence of the definition is that every vector in the space V can be expressed as a unique linear combination of the basis vectors. To see why, suppose some w had two representations, w=c1v1+⋯+cnvn and w=d1v1+⋯+dnvn. Subtracting them gives (c1−d1)v1+⋯+(cn−dn)vn=0, and linear independence forces ci=di for every i. This unique set of coefficients (c1,c2,…,cn) forms the coordinates of the vector w with respect to the basis B.
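To see why independence matters here, a short numpy contrast: with a dependent spanning set, the same vector has many coefficient representations, while a basis (a square, full-rank system) admits exactly one. The specific vectors below are illustrative:

```python
import numpy as np

w = np.array([3.0, 1.0])

# Dependent spanning set of R^2: columns are e1, e2, and e1 + e2.
S = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
c_a = np.array([3.0, 1.0, 0.0])   # w = 3*e1 + 1*e2
c_b = np.array([2.0, 0.0, 1.0])   # w = 2*e1 + (e1 + e2)
print(np.allclose(S @ c_a, w), np.allclose(S @ c_b, w))  # True True: not unique

# With a basis, the coefficients are the unique solution of a square,
# full-rank linear system, so coordinates are well defined.
B = np.array([[1.0, 1.0],
              [1.0, -1.0]])
print(np.linalg.solve(B, w))      # [2. 1.], the coordinates of w w.r.t. B'
```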
Dimension
One of the fundamental properties of vector spaces is that all bases for a given vector space contain the same number of vectors. This consistent number is called the dimension of the vector space, often denoted as dim(V).
- The dimension of R2 is 2, because its bases (like the standard basis or B′) contain two vectors.
- The dimension of R3 is 3.
- In general, the dimension of Rn is n.
A plane through the origin in R3 is a subspace of dimension 2. A line through the origin in R3 is a subspace of dimension 1. The space containing only the zero vector, {0}, has dimension 0; by convention, its basis is the empty set.
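Numerically, the dimension of the subspace spanned by a set of vectors is the rank of the matrix with those vectors as columns. A short numpy illustration with three vectors that all lie in a common plane of R3:

```python
import numpy as np

# Three vectors in R^3 that all lie in the plane z = 0. The dimension of
# the subspace they span equals the rank of the matrix they form.
A = np.column_stack([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [1.0, 1.0, 0.0]])
print(np.linalg.matrix_rank(A))  # 2: a plane through the origin
```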
If a vector space cannot be spanned by a finite set of vectors, it is called infinite-dimensional. Function spaces often fall into this category, but in most typical machine learning contexts dealing with feature vectors, we work with finite-dimensional vector spaces.
Relevance to Machine Learning
Understanding basis and dimension is important in machine learning for several reasons:
- Feature Space Interpretation: The set of all possible feature vectors for a dataset often forms a vector space (or a subset of one). The dimension of this space (or a relevant subspace) corresponds to the number of features, or more accurately, the number of independent features needed to describe the data.
- Dimensionality Reduction: Techniques like Principal Component Analysis (PCA) explicitly seek a lower-dimensional basis (a basis for a subspace) that captures the most significant variation in the original high-dimensional data. The goal is to represent the data effectively using fewer dimensions (basis vectors); see the sketch after this list.
- Identifying Redundancy: If a set of feature vectors is linearly dependent, some features can be expressed as combinations of others and add no new direction to the span of the data. Finding a basis for the space spanned by the feature vectors identifies a core, non-redundant set of features or underlying factors, and the dimension tells us the intrinsic number of independent factors driving the data.
- Model Complexity: The dimension of the input feature space often influences the complexity required for a machine learning model (e.g., the number of parameters in a linear model). Understanding the underlying dimension can guide model selection and regularization strategies.
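As a concrete illustration of the dimensionality-reduction point, the sketch below runs PCA via the SVD on synthetic data that is effectively two-dimensional inside R3. The data-generating setup (the latent factors, mixing matrix, and noise level) is invented purely for demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 points in R^3 that lie near a 2-D plane, so the
# intrinsic dimension is 2 even though the ambient dimension is 3.
latent = rng.normal(size=(200, 2))
mixing = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])
X = latent @ mixing.T + 0.01 * rng.normal(size=(200, 3))

# PCA via SVD of the centered data: the top right-singular vectors form
# an orthonormal basis for the subspace capturing the most variance.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
print(s)               # two large singular values, one tiny: ~2-D data
basis = Vt[:2]         # orthonormal basis for the 2-D principal subspace
coords = Xc @ basis.T  # coordinates of each point in that basis
print(coords.shape)    # (200, 2): same data, fewer dimensions
```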
In essence, basis and dimension provide a formal way to quantify the "size" and structure of feature spaces, helping us analyze data, reduce complexity, and build more efficient models. In the next sections, we'll look at specific subspaces tied to matrices and how their dimensions relate.