In machine learning, we work with data. Lots of data. This data could be anything from customer purchase histories and sensor readings to images and text documents. To apply mathematical algorithms to this diverse data, we first need a consistent way to represent it. This is where vectors come in.
Think of a vector as an ordered list of numbers. Each number in this list represents a specific characteristic or feature of a single data point. By organizing these features into a vector, we create a mathematical object that algorithms can process.
Imagine you have data about houses, and you want to predict their prices. For each house, you might record several features:
A specific house, say House A, might have the following characteristics: 1500 sq ft, 3 bedrooms, 20 years old, and 0.5 miles from a school. We can represent House A as a vector:
vA=15003200.5
This column vector vA is often called a feature vector. Each element corresponds to a specific feature in a predefined order. Another house, House B (2100 sq ft, 4 bedrooms, 5 years old, 1.2 miles from school), would be represented by a different vector:
vB=2100451.2
This representation is incredibly versatile:
Representing data as vectors allows us to think geometrically. If a data point has two features (like height and weight), we can represent it as a 2-dimensional vector. We can visualize this vector as an arrow starting from the origin (0, 0) and ending at the point defined by the feature values (height, weight) in a 2D plane.
Two data points, A=(2, 5) and B=(4.5, 3), represented as vectors originating from the origin in a 2-dimensional feature space.
If we have three features, each data point becomes a vector in 3D space. For data with n features (like our house example with n=4), each data point corresponds to a vector in an n-dimensional space, often called the feature space. While we can't easily visualize spaces beyond three dimensions, the mathematical principles remain the same.
This vector representation is the foundation upon which most machine learning algorithms are built. It provides a structured format that allows us:
In the following sections, we will explore the fundamental operations you can perform on these vectors and how these operations are relevant to machine learning tasks.
© 2025 ApX Machine Learning