A system of linear equations is a collection of relationships between several unknown quantities. For instance, you might have two unknowns, x1 and x2, linked by two equations:
2x1 + 3x2 = 8
4x1 − x2 = 2

Solving this system means finding a value for x1 and a value for x2 that make both equations true at the same time. While you may have solved small systems like this by hand using substitution or elimination, this approach does not scale well. Imagine a system with hundreds of equations and variables, a common scenario in machine learning. We need a more systematic and computationally friendly method.
This is where linear algebra provides a powerful way to organize the problem. We can rewrite any system of linear equations into the compact and elegant form Ax=b.
Let's look at our example again and separate it into its three main components: the coefficients (2, 3, 4, and −1), the unknown variables (x1 and x2), and the constants on the right-hand side (8 and 2).
The core idea is to group each of these component types into its own structure: the coefficients into a matrix A, the variables into a vector x, and the constants into a vector b.
1. The Coefficient Matrix A
We create a matrix A by arranging the coefficients in the same layout as they appear in the equations. Each row in the matrix corresponds to an equation, and each column corresponds to a variable. For our system, the coefficient matrix A is:
A = [ 2   3 ]
    [ 4  −1 ]

The first row, [2 3], contains the coefficients from the first equation (2x1 + 3x2 = 8). The second row, [4 −1], contains the coefficients from the second equation (4x1 − x2 = 2).
2. The Variable Vector x
Next, we group our unknown variables into a column vector x. The order must match the order of the columns in matrix A. Since our first column in A corresponds to x1 and the second to x2, our vector x is:
x = [ x1 ]
    [ x2 ]

3. The Constant Vector b
Finally, we collect the constants from the right-hand side of the equations into another column vector, b:
b = [ 8 ]
    [ 2 ]

The following diagram shows how the system of equations is translated into these three distinct parts.
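Building A and b as concrete arrays is a one-liner in NumPy. Here is a minimal sketch of the setup (x stays symbolic at this point, since it is the unknown we are solving for):

```python
import numpy as np

# Coefficient matrix A: each row holds one equation's coefficients,
# each column corresponds to one variable (x1, then x2).
A = np.array([[2.0, 3.0],
              [4.0, -1.0]])

# Constant vector b: the right-hand sides of the equations.
b = np.array([8.0, 2.0])

print(A.shape)  # (2, 2): two equations, two variables
print(b.shape)  # (2,): one constant per equation
```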
The components of a system of equations are organized into a coefficient matrix A, a variable vector x, and a constant vector b.
Now we have our matrix equation:
[ 2   3 ] [ x1 ]   [ 8 ]
[ 4  −1 ] [ x2 ] = [ 2 ]

But how do we know this is the same as our original system? We can verify it by performing the matrix-vector multiplication on the left side, as we learned in Chapter 3. Remember, we calculate the dot product of each row of the matrix A with the column vector x:
This multiplication results in a new vector:
[ 2x1 + 3x2 ]
[ 4x1 − x2  ]

And since we state that Ax = b, we are saying:
[ 2x1 + 3x2 ]   [ 8 ]
[ 4x1 − x2  ] = [ 2 ]

For two vectors to be equal, their corresponding elements must be equal. This gives us back our original two equations: 2x1 + 3x2 = 8 and 4x1 − x2 = 2. This confirms that the matrix form Ax = b is a perfectly valid and compact representation of our original system.
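We can run the same check numerically. The pair x1 = 1, x2 = 2 satisfies both equations (2·1 + 3·2 = 8 and 4·1 − 2 = 2, as you can verify by hand), so multiplying A by that vector should reproduce b exactly:

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [4.0, -1.0]])
b = np.array([8.0, 2.0])

# A candidate solution: x1 = 1, x2 = 2.
x = np.array([1.0, 2.0])

# The @ operator performs matrix-vector multiplication:
# the dot product of each row of A with x.
result = A @ x
print(result)                    # [8. 2.]
print(np.allclose(result, b))    # True
```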
Translating a system of equations into the form Ax=b is more than just a notational trick. It provides several significant advantages, especially in the context of computing and machine learning.
Conciseness and Scalability: A system with hundreds of equations and variables can be written down just as simply as Ax=b. The underlying matrix A and vectors x and b would be much larger, but the representation remains clean. This allows us to think about the problem at a higher level of abstraction.
A Path to the Solution: This form suggests a way to solve for x. If A, x, and b were just numbers, you would solve for x by dividing b by A. In linear algebra, the equivalent operation is multiplying by the matrix inverse, which we will explore in the next sections.
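To preview that idea, here is a sketch of the "divide by A" analogy in NumPy, using the matrix inverse (covered in detail in the upcoming sections). For our small system the arithmetic works out to x1 = 1, x2 = 2:

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [4.0, -1.0]])
b = np.array([8.0, 2.0])

# The matrix analogue of "dividing b by A": multiply b by A's inverse.
A_inv = np.linalg.inv(A)
x = A_inv @ b
print(x)  # [1. 2.]
```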
Computational Efficiency: Modern numerical computing libraries like NumPy are highly optimized for matrix and vector operations. Representing problems in this form allows us to use fast, pre-built functions to find solutions, rather than writing slow, manual loops. Solving Ax = b is a standard operation in these libraries.
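In practice, the standard routine for this is `np.linalg.solve`, which solves Ax = b directly via a factorization rather than forming the inverse explicitly; this is generally faster and more numerically stable than the inverse-based approach:

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [4.0, -1.0]])
b = np.array([8.0, 2.0])

# Solve the system Ax = b in one call.
x = np.linalg.solve(A, b)
print(x)  # [1. 2.]

# Sanity check: substituting x back into Ax reproduces b.
print(np.allclose(A @ x, b))  # True
```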
By converting systems of equations into this universal matrix form, we can apply the full power of linear algebra and computation to find solutions efficiently. In the following sections, we will learn about the tools needed to solve for the vector x.
© 2026 ApX Machine Learning