In machine learning, we constantly work with data. Think about a simple spreadsheet or a database table. Each row might represent a specific item, like a customer, a house, or an image, while each column represents a characteristic or 'feature' of that item, such as age, size, price, or pixel intensity. Linear algebra provides a powerful and efficient way to represent and manipulate this kind of structured data using vectors and matrices.
Let's start with a single data point. Imagine you're collecting data about houses to predict their prices. For one specific house, you might record:

- Size (square feet)
- Number of bedrooms
- Age (years)
- Distance to the nearest school (miles)
We can represent this single house as an ordered list of numbers, which is precisely what a vector is in linear algebra. If the house has 1500 sq ft, 3 bedrooms, is 10 years old, and is 0.5 miles from a school, we can represent it as the vector x:
$$x = \begin{bmatrix} 1500 \\ 3 \\ 10 \\ 0.5 \end{bmatrix}$$

Here, $x$ is a column vector. Each number in the vector corresponds to one of the features we measured. The order matters; the first element always represents size, the second always represents the number of bedrooms, and so on. This vector $x$ now encapsulates all the measured information about that specific house in a single mathematical object.
In machine learning terminology:

- The vector $x$ is called a feature vector, because it collects all the features of one data point.
- Each individual measurement (size, bedrooms, age, distance) is a feature.
- The house itself is a sample (also called an example or observation).
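In practice, a feature vector is typically stored as a one-dimensional array. Here is a minimal sketch using NumPy (the values mirror the house above; the variable name `x` is illustrative):

```python
import numpy as np

# Feature vector for one house:
# [size (sq ft), bedrooms, age (years), distance to school (miles)]
x = np.array([1500, 3, 10, 0.5])

print(x.shape)  # (4,): one vector holding 4 features
print(x[0])     # the first feature, size
```

Note that NumPy stores all four numbers with a single numeric type (here floating point, because of the 0.5), which is part of what makes array operations fast.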
We could also think of a single feature across all houses as a vector. For example, if we had 5 houses in our dataset, the sizes of all 5 houses could form a vector:
$$\text{Sizes} = \begin{bmatrix} 1500 \\ 2100 \\ 1200 \\ 1800 \\ 2400 \end{bmatrix}$$

This vector represents the 'size' feature column from our data table.
Now, what if we have data for multiple houses? Let's say we have data for three houses:

| House | Size (sq ft) | Bedrooms | Age (years) | Distance (miles) |
|-------|--------------|----------|-------------|------------------|
| 1     | 1500         | 3        | 10          | 0.5              |
| 2     | 2100         | 4        | 5           | 1.2              |
| 3     | 1200         | 2        | 20          | 0.8              |
We can represent each house as a vector, as shown before:
$$x^{(1)} = \begin{bmatrix} 1500 \\ 3 \\ 10 \\ 0.5 \end{bmatrix}, \quad x^{(2)} = \begin{bmatrix} 2100 \\ 4 \\ 5 \\ 1.2 \end{bmatrix}, \quad x^{(3)} = \begin{bmatrix} 1200 \\ 2 \\ 20 \\ 0.8 \end{bmatrix}$$

(Note: We use $x^{(i)}$ to denote the vector for the $i$-th data sample, to avoid confusion with $x_j$, which often denotes the $j$-th feature within a vector.)
To represent the entire dataset, we can stack these individual vectors together as columns (or more commonly, as rows) in a grid. This rectangular grid of numbers is called a matrix. If we arrange each house vector as a row in our matrix (a common convention in machine learning), our dataset becomes:
$$X = \begin{bmatrix} 1500 & 3 & 10 & 0.5 \\ 2100 & 4 & 5 & 1.2 \\ 1200 & 2 & 20 & 0.8 \end{bmatrix}$$

This matrix $X$ neatly organizes our entire dataset.
In this matrix:

- Each row corresponds to one house, so row $i$ is the feature vector $x^{(i)}$.
- Each column corresponds to one feature (size, bedrooms, age, distance) across all houses.
- The matrix has shape $3 \times 4$: 3 samples and 4 features.
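A two-dimensional NumPy array captures this row-and-column structure directly, and indexing recovers either a sample (a row) or a feature column. A short sketch with the same values as the matrix above:

```python
import numpy as np

# Rows are houses (samples); columns are features:
# size, bedrooms, age, distance
X = np.array([
    [1500, 3, 10, 0.5],
    [2100, 4,  5, 1.2],
    [1200, 2, 20, 0.8],
])

print(X.shape)   # (3, 4): 3 samples, 4 features
print(X[0])      # row 0: the feature vector of the first house
print(X[:, 0])   # column 0: the 'size' feature across all houses
```

The slice `X[:, 0]` is exactly the 'size' feature vector discussed earlier, extracted from the dataset matrix.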
This matrix representation is fundamental. It allows us to apply mathematical operations (which we'll cover in later chapters) to the entire dataset simultaneously. Operations like calculating averages for each feature, transforming features, or feeding the data into a machine learning model often rely on this matrix structure. Libraries like NumPy in Python are specifically designed to work efficiently with these vector and matrix representations of data.
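As one concrete example of operating on the whole dataset at once, the average of every feature can be computed in a single call by collapsing the sample dimension (a sketch, using the same matrix as above):

```python
import numpy as np

X = np.array([
    [1500, 3, 10, 0.5],
    [2100, 4,  5, 1.2],
    [1200, 2, 20, 0.8],
])

# axis=0 averages down each column, giving one mean per feature:
# mean size, mean bedrooms, mean age, mean distance
feature_means = X.mean(axis=0)
print(feature_means)
```

One call produces all four feature averages; no loop over columns is needed, which is typical of how matrix-structured data is processed.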
© 2025 ApX Machine Learning