Many problems in computation involve finding values for a set of unknown variables that simultaneously satisfy several linear relationships. Think about finding the right price for a product based on supply and demand equations, or determining the optimal allocation of resources under budget constraints. These situations often lead to systems of linear equations.
Let's consider a simple example system with two equations and two unknown variables, $x_1$ and $x_2$:
$$
\begin{aligned}
2x_1 + 3x_2 &= 7 \\
1x_1 - 1x_2 &= 1
\end{aligned}
$$
Our goal is to find the values of $x_1$ and $x_2$ that make both equations true. While we could solve this using methods like substitution or elimination from basic algebra, linear algebra provides a more systematic and scalable approach, especially when dealing with many equations and variables. The first step is to represent this system using matrices and vectors.
We can separate the system into three distinct parts:

1. The coefficients that multiply the variables (2, 3, 1, and -1).
2. The unknown variables themselves ($x_1$ and $x_2$).
3. The constants on the right-hand side of each equation (7 and 1).
Let's arrange these parts into matrix and vector structures.
We collect the coefficients of the variables into a matrix, which we'll call A. Each row in the matrix corresponds to one equation, and each column corresponds to one variable.
For our example system:

$$
\begin{aligned}
2x_1 + 3x_2 &= 7 \\
1x_1 - 1x_2 &= 1
\end{aligned}
$$
The coefficients are 2, 3, 1, and -1. We arrange them as:
$$A = \begin{bmatrix} 2 & 3 \\ 1 & -1 \end{bmatrix}$$

The first row $\begin{bmatrix} 2 & 3 \end{bmatrix}$ contains the coefficients from the first equation. The second row $\begin{bmatrix} 1 & -1 \end{bmatrix}$ contains the coefficients from the second equation. The first column $\begin{bmatrix} 2 \\ 1 \end{bmatrix}$ contains the coefficients of $x_1$, and the second column $\begin{bmatrix} 3 \\ -1 \end{bmatrix}$ contains the coefficients of $x_2$.
The unknown variables are arranged into a column vector, typically denoted as $x$:

$$x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$

The constants on the right-hand side of the equations are also arranged into a column vector, often denoted as $b$:
$$b = \begin{bmatrix} 7 \\ 1 \end{bmatrix}$$

Now, how do these pieces fit together? Recall the definition of matrix multiplication from the previous chapter. Let's multiply our coefficient matrix $A$ by our variable vector $x$:
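To make this concrete in code, here is a minimal sketch of how $A$ and $b$ can be stored as arrays. It uses Python with NumPy, which this text has not introduced; treat it as an illustration rather than a required tool.

```python
import numpy as np

# Coefficient matrix A: one row per equation, one column per variable
A = np.array([[2, 3],
              [1, -1]])

# Constant vector b: the right-hand side of each equation
b = np.array([7, 1])

print(A.shape)  # (2, 2): 2 equations, 2 unknowns
print(b.shape)  # (2,): one constant per equation
```

Notice that the array's nesting mirrors the matrix layout directly: each inner list is one row of $A$, i.e., one equation.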
$$Ax = \begin{bmatrix} 2 & 3 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$

Performing the matrix-vector multiplication (taking the dot product of each row of $A$ with the column vector $x$):

$$Ax = \begin{bmatrix} (2 \times x_1) + (3 \times x_2) \\ (1 \times x_1) + (-1 \times x_2) \end{bmatrix} = \begin{bmatrix} 2x_1 + 3x_2 \\ x_1 - x_2 \end{bmatrix}$$

Look closely at the resulting vector. Its first element, $2x_1 + 3x_2$, is exactly the left-hand side of our first original equation. Its second element, $x_1 - x_2$, is the left-hand side of our second original equation.
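If you want to check this multiplication numerically, the sketch below (again assuming NumPy as a tool) plugs arbitrary trial values into $x_1$ and $x_2$ and confirms that $Ax$ matches the two expressions:

```python
import numpy as np

A = np.array([[2, 3],
              [1, -1]])

# Arbitrary trial values for the unknowns; any values work for this check
x1, x2 = 5.0, -2.0
x = np.array([x1, x2])

result = A @ x  # matrix-vector product
print(result)                # [4. 7.]
print(2*x1 + 3*x2, x1 - x2)  # 4.0 7.0 -- the same expressions, computed directly
```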
The original system stated that these expressions must equal the constants 7 and 1, respectively. We captured these constants in the vector b. Therefore, we can write the entire system of equations as a single matrix equation:
$$Ax = b$$

Substituting our matrices and vectors:

$$\begin{bmatrix} 2 & 3 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 7 \\ 1 \end{bmatrix}$$

This compact form, $Ax = b$, perfectly represents the original system of linear equations.
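Once the system is in the form $Ax = b$, a numerical library can solve it directly. Here is a minimal sketch with NumPy (an assumed tool, not part of the original text):

```python
import numpy as np

A = np.array([[2, 3],
              [1, -1]])
b = np.array([7, 1])

# Solve A x = b for the unknown vector x
x = np.linalg.solve(A, b)
print(x)      # [2. 1.], i.e. x1 = 2, x2 = 1

# Substituting back reproduces the constants
print(A @ x)  # [7. 1.]
```

You can verify by hand that $x_1 = 2$, $x_2 = 1$ satisfies both original equations: $2(2) + 3(1) = 7$ and $2 - 1 = 1$.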
This representation isn't limited to two equations. Any system of $m$ linear equations in $n$ unknowns:

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\ \ \vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}
$$
can be written in the matrix form $Ax = b$, where:

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}, \qquad x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}$$

Here $A$ is the $m \times n$ matrix of coefficients, $x$ is the column vector of $n$ unknowns, and $b$ is the column vector of $m$ constants.
This $Ax = b$ format is standard in linear algebra. It separates the known coefficients ($A$), the unknown variables ($x$), and the target outcomes ($b$). This structure is advantageous because it allows us to apply the powerful operations and concepts of matrix algebra, such as matrix inversion (which we'll discuss shortly), to analyze and solve these systems efficiently. Representing data and relationships in this form is common in various fields, including machine learning, where you might encounter it in algorithms like linear regression.
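As a glimpse of the linear-regression connection, the sketch below sets up an overdetermined system (more equations than unknowns, as happens when there are more data points than model parameters) and finds the least-squares solution with NumPy. The data values are hypothetical, chosen for illustration so that the fit happens to be exact:

```python
import numpy as np

# Three equations, two unknowns (intercept and slope), as in fitting a line
# y = c0 + c1 * t to three data points. Hypothetical data for illustration.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 3.0, 5.0])

# An exact solution may not exist when m > n; lstsq finds the x
# that minimizes the length of the residual vector Ax - b
x, residuals, rank, sv = np.linalg.lstsq(A, b, rcond=None)
print(x)  # approximately [1. 2.]: intercept 1, slope 2
```

The same $Ax = b$ structure carries over unchanged; only the solution method differs when the system has no exact solution.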
© 2025 ApX Machine Learning