In the previous chapter, we saw how vectors serve as fundamental building blocks, often representing a single data point or a set of features for one observation in a machine learning context. For instance, a vector could hold the pixel values for one image, the word counts for one document, or the features (like square footage, number of bedrooms) for one house.
However, real-world machine learning tasks almost always involve working with collections of data points, not just one. We need a way to organize and manipulate these collections efficiently. This is where matrices come in.
A matrix is a rectangular grid of numbers, arranged in rows and columns. If a matrix has m rows and n columns, we say it has dimensions m×n. You can think of it as a generalization of a vector; a vector is essentially a matrix with only one column (a column vector) or one row (a row vector).
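The row/column relationship is easy to see from NumPy array shapes. A minimal sketch (assuming NumPy is installed) builds the same numbers as a plain vector, a row vector, and a column vector:

```python
import numpy as np

v = np.array([1500, 3])      # a plain 1-D vector, shape (2,)
row = v.reshape(1, 2)        # a 1x2 matrix: one row, two columns (row vector)
col = v.reshape(2, 1)        # a 2x1 matrix: two rows, one column (column vector)

print(v.shape, row.shape, col.shape)  # (2,) (1, 2) (2, 1)
```

All three hold the same two numbers; only the shape, and therefore how they behave in matrix operations, differs.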
The most common way matrices arise in machine learning is by representing entire datasets. Imagine a typical dataset stored in a table, like a spreadsheet or a CSV file:
| Size (sq ft) | Bedrooms | Price ($1000s) |
|---|---|---|
| 1500 | 3 | 300 |
| 1200 | 2 | 250 |
| 1800 | 4 | 380 |
| 1350 | 3 | 290 |
We can directly translate this tabular data into a matrix. Conventionally, each row of the matrix corresponds to a single sample or data point (like one house), and each column corresponds to a specific feature or attribute (like size, number of bedrooms, or price).
For the housing data above, if we are using 'Size' and 'Bedrooms' as our input features (X) to predict 'Price' (y), we can represent the features as a matrix:
$$X = \begin{bmatrix} 1500 & 3 \\ 1200 & 2 \\ 1800 & 4 \\ 1350 & 3 \end{bmatrix}$$

This matrix X has dimensions 4×2 because there are 4 samples (houses) and 2 features. Each row, like [1500, 3], is the feature vector for one sample. The target variable, price, would typically be stored as a separate vector:

$$y = \begin{bmatrix} 300 \\ 250 \\ 380 \\ 290 \end{bmatrix}$$

The matrix X is often called the feature matrix or design matrix. It's a standard structure used in many machine learning algorithms.
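As a concrete sketch, the feature matrix X and target vector y for the housing data can be created directly in NumPy:

```python
import numpy as np

# Feature matrix: one row per house (sample), one column per feature
X = np.array([
    [1500, 3],
    [1200, 2],
    [1800, 4],
    [1350, 3],
])

# Target vector: price in $1000s for each house
y = np.array([300, 250, 380, 290])

print(X.shape)  # (4, 2): 4 samples, 2 features
print(X[0])     # [1500    3]: the feature vector of the first house
print(y.shape)  # (4,)
```

Indexing a single row, as in `X[0]`, recovers the feature vector for one sample, matching the row-per-sample convention described above.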
This diagram shows how a dataset table, where rows are samples and columns are features, is commonly represented as a feature matrix X. Each row in the matrix is a feature vector for a single sample.
Matrices aren't limited to simple tabular data. Other data types common in machine learning fit naturally as well: a grayscale image, for example, is a matrix whose entries are pixel intensities, and a collection of documents can be represented as a matrix of word counts, with one row per document and one column per vocabulary term.
Organizing data into matrices provides several significant advantages. Chief among them is computational efficiency: numerical libraries store matrices as dense arrays (called ndarrays, or N-dimensional arrays, in NumPy), and performing an operation on a whole matrix using NumPy is drastically faster than looping through individual data points in Python. This "vectorization" is essential for performance in machine learning.

By representing collections of data points as rows (or sometimes columns) within a matrix, we create a structured format that is ideal for mathematical manipulation and computational processing, paving the way for the linear transformations and system solving we'll cover next. We'll rely heavily on NumPy to create and work with these matrix representations in Python.
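As a small illustration of vectorization, here is a sketch comparing an explicit Python loop with a single whole-matrix NumPy expression for scaling each feature by its column maximum; both compute the same result, but the vectorized form executes in optimized compiled code rather than the Python interpreter:

```python
import numpy as np

X = np.array([
    [1500, 3],
    [1200, 2],
    [1800, 4],
    [1350, 3],
], dtype=float)

# Loop version: visit every entry individually in Python
scaled_loop = np.empty_like(X)
for i in range(X.shape[0]):
    for j in range(X.shape[1]):
        scaled_loop[i, j] = X[i, j] / X[:, j].max()

# Vectorized version: one expression over the whole matrix
scaled_vec = X / X.max(axis=0)

print(np.allclose(scaled_loop, scaled_vec))  # True
```

On a dataset with millions of rows, the difference between these two versions is the difference between milliseconds and minutes.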
© 2025 ApX Machine Learning