Okay, let's expand on the concept introduced in the chapter overview. We've seen that vectors are ordered lists of numbers. Now, imagine organizing numbers not just in a list, but in a rectangular grid, like a spreadsheet or a checkerboard. That's essentially what a matrix is.
A matrix (plural: matrices) is a rectangular array or table of numbers, symbols, or expressions, arranged in rows (horizontal lines) and columns (vertical lines). Matrices are fundamental tools in linear algebra and are used extensively in machine learning for organizing data, representing transformations, and much more.
Think of a matrix as a collection of numbers organized neatly. For example, consider this matrix:
$$
A = \begin{bmatrix} 1 & 0 & 7 \\ -3 & 4 & -2 \\ 5 & 2 & 0 \end{bmatrix}
$$

This matrix has three rows: [1 0 7], [-3 4 -2], and [5 2 0].

It also has three columns: [1 -3 5]ᵀ, [0 4 2]ᵀ, and [7 -2 0]ᵀ. (The T means transpose, often used to write column vectors horizontally.)

The dimensions or size of a matrix are given by the number of rows and the number of columns, typically written as "rows × columns". The matrix A above is a 3×3 matrix (read as "three by three").
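To make the rows, columns, and dimensions concrete, here is a minimal sketch that stores the matrix A above as a NumPy array. It assumes NumPy is installed; the variable name `A` simply mirrors the text.

```python
import numpy as np

# The 3×3 matrix A from the text, entered row by row.
A = np.array([
    [1, 0, 7],
    [-3, 4, -2],
    [5, 2, 0],
])

print(A.shape)   # (rows, columns) -> (3, 3)
print(A[0])      # first row: 1, 0, 7
print(A[:, 0])   # first column: 1, -3, 5
```

Here `A.shape` reports the size in the same "rows × columns" order used in the text.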
Here's another example:
$$
B = \begin{bmatrix} 2.1 & -0.5 \\ 1.0 & 3.7 \\ 0 & 1.1 \\ 4.2 & 0 \end{bmatrix}
$$

This matrix B has 4 rows and 2 columns, so it is a 4×2 matrix.
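You can think of vectors (which we discussed in the previous chapter) as special cases of matrices: a row vector with n elements is a 1×n matrix, and a column vector with n elements is an n×1 matrix.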
This connection helps in understanding how operations involving vectors and matrices relate to each other.
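One way to see this connection in code is to reshape a one-dimensional NumPy array into a single-row or single-column matrix. This is only an illustrative sketch; the names `v`, `row`, and `col` are made up for the example.

```python
import numpy as np

v = np.array([1, -3, 5])   # a plain 1-D array with 3 entries, shape (3,)

row = v.reshape(1, -1)     # the same numbers as a 1×3 matrix (row vector)
col = v.reshape(-1, 1)     # the same numbers as a 3×1 matrix (column vector)

print(v.shape, row.shape, col.shape)  # (3,) (1, 3) (3, 1)

# Transposing the row vector gives the column vector.
print(np.array_equal(row.T, col))     # True
```

Note that NumPy distinguishes a 1-D array of shape `(3,)` from the 2-D shapes `(1, 3)` and `(3, 1)`, which matters once vectors and matrices are combined in operations.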
Matrices provide a powerful way to represent and manipulate collections of numbers simultaneously. In machine learning, they are used constantly, for example to organize data and to represent transformations.
The structured nature of matrices allows us to define consistent operations (like addition and multiplication, which we'll cover soon) that apply to entire blocks of data efficiently.
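As a small illustration of working on a whole block of data at once, the sketch below stores a tiny dataset as a matrix and averages every column in a single call. The data values and the convention of one sample per row and one feature per column are assumptions chosen for this example, not something fixed by the text.

```python
import numpy as np

# Hypothetical data: 3 samples (rows), 2 features (columns).
X = np.array([
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
])

# One call averages each column; no explicit loop over the rows is needed.
feature_means = X.mean(axis=0)

print(X.shape)        # (3, 2)
print(feature_means)  # [3. 4.]
```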
We often use capital letters (like A, B, C) to denote matrices. To refer to a specific element within a matrix, we use lowercase letters with two subscripts: $a_{ij}$. The first subscript i indicates the row number, and the second subscript j indicates the column number.
So, for a general m×n matrix A (meaning m rows and n columns), we can write it as:

$$
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
$$

Here, $a_{21}$ is the element in the 2nd row and 1st column, and $a_{mn}$ is the element in the last row (m) and last column (n).
A general m×n matrix A, showing the element $a_{ij}$ located at the i-th row and j-th column.
Important Note on Indexing: In mathematics, matrix indices typically start from 1 (so the top-left element is $a_{11}$). However, in many programming languages, including Python (and its library NumPy, which we'll use extensively), indexing starts from 0. So, the element $a_{11}$ in math corresponds to `A[0, 0]` in NumPy, and $a_{ij}$ corresponds to `A[i-1, j-1]`. This is a common point of confusion, so keep it in mind when translating mathematical concepts into code.
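The off-by-one shift can be checked directly on the 3×3 example matrix A from earlier. The helper `element` below is a hypothetical convenience written for this sketch, not a NumPy function; it simply subtracts 1 from each math-style index.

```python
import numpy as np

A = np.array([
    [1, 0, 7],
    [-3, 4, -2],
    [5, 2, 0],
])

def element(M, i, j):
    """Return the math-style (1-based) element m_ij of a 2-D NumPy array."""
    if i < 1 or j < 1:
        # Without this check, index 0 would become -1 and silently wrap around.
        raise IndexError("math-style indices start at 1")
    return M[i - 1, j - 1]

print(A[0, 0])           # a_11 -> 1
print(element(A, 2, 1))  # a_21 -> -3
print(element(A, 3, 3))  # a_33 -> 0
```

The explicit check is a design choice for the sketch: NumPy accepts negative indices and counts them from the end, so an accidental 0 passed as a math-style index would otherwise return the wrong element instead of failing.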
Now that we understand what a matrix is and how it's structured, we'll look at how to denote matrix dimensions more formally and explore some important special types of matrices.
© 2025 ApX Machine Learning