Okay, we've seen how vectors can represent individual data points or features (like pixel intensities, user ratings, or sensor measurements) and how matrices can represent entire datasets or transformations. But what are the fundamental rules governing how these vectors behave together? Where do these vectors "live" mathematically? This leads us to the concept of a Vector Space.
Think of a vector space as a collection of objects, which we call vectors, along with two fundamental operations: vector addition and scalar multiplication. For a collection to qualify as a vector space, these operations must satisfy a specific set of rules, often called axioms. These rules ensure that the operations behave in a consistent and predictable way, much like the rules of arithmetic govern numbers.
Why is this formal definition important? Because it provides a solid mathematical foundation. When we know our data vectors live in a vector space, we automatically know they obey these reliable rules. This allows us to build powerful and general techniques for data analysis and machine learning that work consistently across different types of vector data.
Let V be a set of vectors, and let u, v, and w be any vectors in V. Let c and d be any scalars (usually real numbers in machine learning contexts). For V to be a vector space, the following ten axioms must hold:
Properties of Vector Addition:
Closure under Addition: If u and v are in V, then their sum u+v must also be in V.
Commutativity of Addition: u+v=v+u.
Associativity of Addition: (u+v)+w=u+(v+w).
Existence of a Zero Vector: There exists a unique vector 0 in V, called the zero vector, such that u+0=u for all u in V.
Existence of Additive Inverses: For every vector u in V, there exists a unique vector −u in V, called the additive inverse, such that u+(−u)=0.
Properties of Scalar Multiplication:
Closure under Scalar Multiplication: If u is in V and c is a scalar, then the product cu must also be in V.
Associativity of Scalar Multiplication: (cd)u=c(du).
Properties Connecting Addition and Scalar Multiplication:
Distributivity over Vector Sum: c(u+v)=cu+cv.
Distributivity over Scalar Sum: (c+d)u=cu+du.
Scalar Multiplication Identity: 1u=u, where 1 is the multiplicative identity scalar.
The most common vector space you'll encounter in machine learning is R^n, the space of all n-dimensional vectors with real-number components.
For example, consider R^2, the space of all vectors of the form [x, y], where x and y are real numbers. This corresponds to the familiar 2D Cartesian plane. Let's quickly check a couple of axioms. Closure under addition holds because [x1, y1] + [x2, y2] = [x1+x2, y1+y2], and sums of real numbers are again real numbers, so the result is still in R^2. Closure under scalar multiplication holds for the same reason: c[x, y] = [cx, cy] is still a pair of real numbers. Commutativity of addition follows directly from the commutativity of real-number addition.
You can verify that R^2, and indeed R^n for any positive integer n, satisfies all ten axioms using the standard definitions of vector addition and scalar multiplication you learned in Chapter 1.
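If you want to spot-check these properties numerically, here is a minimal NumPy sketch that tests several of the axioms for a few arbitrary vectors and scalars in R^2. The specific values are made up for illustration, and passing these checks for particular vectors is only a sanity check, not a proof; the proof is the algebraic argument above.

```python
import numpy as np

# Arbitrary example vectors in R^2 and example scalars (illustrative values only)
u = np.array([1.0, -2.0])
v = np.array([3.5, 0.5])
w = np.array([-4.0, 2.0])
c, d = 2.0, -0.5

zero = np.zeros(2)

# Commutativity of addition: u + v == v + u
assert np.allclose(u + v, v + u)

# Associativity of addition: (u + v) + w == u + (v + w)
assert np.allclose((u + v) + w, u + (v + w))

# Zero vector: u + 0 == u
assert np.allclose(u + zero, u)

# Additive inverse: u + (-u) == 0
assert np.allclose(u + (-u), zero)

# Associativity of scalar multiplication: (cd)u == c(du)
assert np.allclose((c * d) * u, c * (d * u))

# Distributivity over vector sum: c(u + v) == cu + cv
assert np.allclose(c * (u + v), c * u + c * v)

# Distributivity over scalar sum: (c + d)u == cu + du
assert np.allclose((c + d) * u, c * u + d * u)

# Scalar multiplication identity: 1u == u
assert np.allclose(1.0 * u, u)

print("All spot checks passed.")
```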
When we represent data points as feature vectors in R^n, we are implicitly working within a vector space. This means we can rely on these properties when manipulating data, building models (like linear regression, which heavily relies on vector addition and scalar multiplication), or analyzing relationships between data points.
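To make that concrete, the short sketch below shows how a linear regression prediction and a simple data manipulation are built entirely from the two vector-space operations. The feature values, weights, and bias are hypothetical numbers chosen purely for illustration.

```python
import numpy as np

# Hypothetical feature vector for one data point in R^3 (illustrative values)
x1 = np.array([2.0, -1.0, 0.5])

# Hypothetical learned weights (one per feature) and bias term
w = np.array([0.8, 1.2, -0.3])
b = 0.1

# A linear model's prediction is a sum of scalar multiples of the features,
# i.e. it uses nothing beyond scalar multiplication and addition:
# y_hat = w[0]*x1[0] + w[1]*x1[1] + w[2]*x1[2] + b
y_hat = np.dot(w, x1) + b
print(y_hat)  # approximately 0.35

# Averaging two data points is also a vector-space operation:
# 0.5 * (x1 + x2) stays in R^3 because of closure under addition and scaling.
x2 = np.array([1.0, 0.0, 3.5])
midpoint = 0.5 * (x1 + x2)
print(midpoint)  # approximately [ 1.5 -0.5  2. ]
```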
Understanding these formal properties is the first step towards exploring more complex concepts like subspaces, linear independence, basis, and dimension, which are essential for analyzing the structure within datasets and the behavior of machine learning algorithms.