We've seen how matrices can organize data, like holding multiple feature vectors as rows. We also explored how multiplying a matrix by a vector transforms that vector. Now, let's combine these ideas to see how matrices provide a powerful and concise way to represent systems of linear equations. This representation is fundamental in many areas, including solving for parameters in machine learning models.
Recall that a system of linear equations consists of multiple equations that share the same set of variables. For example, consider finding two unknown values, $x_1$ and $x_2$, that satisfy the following two conditions simultaneously:
$$\begin{cases} 2x_1 + 3x_2 = 7 \\ 1x_1 - 1x_2 = 1 \end{cases}$$

This is a simple system with two equations and two variables. As systems grow larger, with more equations and variables, writing them out this way becomes cumbersome.
Linear algebra offers a much neater way to express such systems using matrix multiplication. We can separate the system into three distinct components:
The Coefficients: The numbers multiplying the variables. We arrange these into a matrix, typically denoted as $A$. Each row in the matrix corresponds to an equation, and each column corresponds to a variable. For our example: $A = \begin{bmatrix} 2 & 3 \\ 1 & -1 \end{bmatrix}$
The Variables: The unknown values we want to solve for. We arrange these into a column vector, typically denoted as $x$. For our example: $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$
The Constants: The values on the right-hand side of the equations. We arrange these into another column vector, typically denoted as $b$. For our example: $b = \begin{bmatrix} 7 \\ 1 \end{bmatrix}$
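These three components translate directly to code. Below is a minimal sketch, assuming NumPy is available, that builds the coefficient matrix and the constants vector from the example (the variable names `A` and `b` follow the notation above):

```python
import numpy as np

# Coefficient matrix A: one row per equation, one column per variable
A = np.array([[2.0, 3.0],
              [1.0, -1.0]])

# Right-hand-side constants b, one entry per equation
b = np.array([7.0, 1.0])

print(A.shape)  # (2, 2): two equations, two variables
print(b.shape)  # (2,): two constants
```

The shapes make the bookkeeping explicit: a system with $m$ equations and $n$ variables gives an $m \times n$ matrix $A$ and a length-$m$ vector $b$.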
Now, we can represent the entire system of equations using a single matrix equation:
$$Ax = b$$
Substituting our matrices and vectors:
$$\begin{bmatrix} 2 & 3 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 7 \\ 1 \end{bmatrix}$$
Let's perform the matrix multiplication $Ax$ according to the rules we learned earlier (row-by-column dot products):

The first element of the resulting vector is the dot product of the first row of $A$ and the vector $x$: $(2 \times x_1) + (3 \times x_2)$. The second element is the dot product of the second row of $A$ and the vector $x$: $(1 \times x_1) + (-1 \times x_2)$.
So, the matrix multiplication yields:
$$\begin{bmatrix} 2x_1 + 3x_2 \\ 1x_1 - 1x_2 \end{bmatrix}$$
Setting this equal to the vector $b$, we get:

$$\begin{bmatrix} 2x_1 + 3x_2 \\ 1x_1 - 1x_2 \end{bmatrix} = \begin{bmatrix} 7 \\ 1 \end{bmatrix}$$
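You can reproduce this symbolic multiplication in code. The sketch below assumes SymPy is available and carries out $Ax$ with symbolic variables, recovering the same left-hand-side expressions:

```python
import sympy as sp

# Symbolic variables for the unknowns
x1, x2 = sp.symbols('x1 x2')

# The coefficient matrix and variable vector from the example
A = sp.Matrix([[2, 3], [1, -1]])
x = sp.Matrix([x1, x2])

# Row-by-column multiplication yields the left-hand sides of both equations
product = A * x
print(product)
```

The first entry of `product` is $2x_1 + 3x_2$ and the second is $x_1 - x_2$, exactly the expressions derived by hand above.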
For two vectors to be equal, their corresponding elements must be equal. This gives us back our original system of equations:
$$\begin{cases} 2x_1 + 3x_2 = 7 \\ 1x_1 - 1x_2 = 1 \end{cases}$$

This confirms that the matrix equation $Ax = b$ is indeed a compact representation of the original system.
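Although methods for solving $Ax = b$ come later, we can already use the matrix form to *check* a candidate solution numerically. Substitution shows that $x_1 = 2$, $x_2 = 1$ satisfies both equations, and a single matrix-vector product confirms it (a minimal sketch, assuming NumPy):

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [1.0, -1.0]])
b = np.array([7.0, 1.0])

# Candidate solution, found by substitution: x1 = 2, x2 = 1
x = np.array([2.0, 1.0])

# A @ x computes both row-by-column dot products at once,
# so comparing it with b checks every equation simultaneously
print(A @ x)                  # [7. 1.]
print(np.allclose(A @ x, b))  # True
```

This is one payoff of the matrix form: verifying a solution for any number of equations is a single product and a single comparison.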
Representing systems of linear equations in the $Ax = b$ format offers several advantages: it is compact, it scales to systems of any size without extra notation, and it lets us bring general matrix tools to bear on analyzing and solving the system.
In machine learning, systems of linear equations frequently appear when fitting models to data. For instance, finding the optimal weights for a linear regression model often involves solving an equation in the $Ax = b$ form, where $A$ relates to the input features, $x$ represents the model weights we want to find, and $b$ relates to the target values.
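To make the regression connection concrete, here is a minimal sketch on a tiny hypothetical dataset (the data values and the use of `np.linalg.lstsq` are illustrative assumptions, not from the text above). Each row of $A$ holds one sample's features, with a column of ones for the intercept, and $b$ holds the targets:

```python
import numpy as np

# Hypothetical dataset: three samples with one feature each.
# First column is all ones (intercept term), second is the feature value.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])

# Targets generated by the line y = 1 + 2 * feature
b = np.array([3.0, 5.0, 7.0])

# Least-squares solution of Ax ≈ b recovers the model weights
weights, *_ = np.linalg.lstsq(A, b, rcond=None)
print(weights)  # approximately [1. 2.]: intercept 1, slope 2
```

Here $A$ has more rows (samples) than columns (weights), so an exact solution may not exist; least squares finds the $x$ that makes $Ax$ as close to $b$ as possible, which is the usual situation when fitting models to data.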
This section focused on representing systems of linear equations using matrices. In the next chapter, we will explore methods for actually solving the equation $Ax = b$ to find the unknown vector $x$.
© 2025 ApX Machine Learning