Matrices can play a dynamic role as functions that transform space. Thinking of a matrix as an operation or a transformation is fundamental to understanding many of its applications in machine learning.
When we perform matrix-vector multiplication, such as $Ax = b$, we can interpret it as the matrix $A$ acting on the vector $x$ to produce a new vector $b$. The matrix $A$ is a function that takes a vector as input and maps it to a new vector as output.
A matrix acts as a function, taking an input vector $v$ and transforming it into an output vector $Av$.
This transformation isn't random. It's a linear transformation, which has two important properties: the origin (0,0) remains fixed, and grid lines remain parallel and evenly spaced. Essentially, the matrix can stretch, shrink, rotate, or shear the entire coordinate space, but it won't curve or warp it.
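A small NumPy sketch can make both ideas concrete: a matrix acting on a vector, and the linearity property itself. The matrix and vectors here are arbitrary examples chosen only for illustration.
import numpy as np
# An arbitrary example matrix, acting as a function on vectors
A = np.array([
    [2, 1],
    [0, 3]
])
x = np.array([1, 4])
# b = A(x): the matrix maps the input vector to an output vector
b = A @ x
print(b)  # [ 6 12]
# Linearity: transforming a combination of vectors equals
# combining the transformed vectors
u = np.array([1, 0])
w = np.array([0, 1])
c, d = 2.0, -1.5
print(np.allclose(A @ (c * u + d * w), c * (A @ u) + d * (A @ w)))  # True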
To understand what a transformation does, we only need to track what happens to the basis vectors. In a standard two-dimensional plane, the basis vectors are $\hat{i}$, a unit vector along the x-axis, and $\hat{j}$, a unit vector along the y-axis.
$$\hat{i} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad \hat{j} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$
The columns of any 2x2 matrix tell us exactly where these basis vectors land after the transformation. For a matrix $A$:
$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$
The first column, $\begin{bmatrix} a \\ c \end{bmatrix}$, is the new position of $\hat{i}$. The second column, $\begin{bmatrix} b \\ d \end{bmatrix}$, is the new position of $\hat{j}$. Let's look at a few examples to make this clear.
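Before the examples, it's worth confirming this column rule numerically. In this minimal sketch, the matrix is an arbitrary example: multiplying by each standard basis vector simply reads off the corresponding column.
import numpy as np
A = np.array([
    [3, -2],
    [1,  4]
])
i_hat = np.array([1, 0])
j_hat = np.array([0, 1])
# Each product matches a column of A
print(A @ i_hat)  # [3 1], the first column of A
print(A @ j_hat)  # [-2  4], the second column of A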
A scaling transformation stretches or shrinks space along the axes. Consider the matrix:
$$S = \begin{bmatrix} 2 & 0 \\ 0 & 0.5 \end{bmatrix}$$
Here, the first column tells us that the basis vector $\hat{i}$ is transformed to $\begin{bmatrix} 2 \\ 0 \end{bmatrix}$. It's been stretched to twice its original length along the x-axis. The second column shows that $\hat{j}$ is transformed to $\begin{bmatrix} 0 \\ 0.5 \end{bmatrix}$, shrinking it to half its length along the y-axis.
Any other vector in the space is transformed accordingly. For example, let's see what happens to the vector $v = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$:
$$Sv = \begin{bmatrix} 2 & 0 \\ 0 & 0.5 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} (2 \cdot 1) + (0 \cdot 2) \\ (0 \cdot 1) + (0.5 \cdot 2) \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$$
The vector is stretched horizontally and compressed vertically, just like the underlying grid.
The vector $v$ is transformed into $Sv$, showing a stretch along the x-axis and a compression along the y-axis.
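The same computation carries over directly to NumPy; this quick sketch reproduces the scaling example above:
import numpy as np
S = np.array([
    [2, 0],
    [0, 0.5]
])
v = np.array([1, 2])
# Stretch x by a factor of 2, shrink y by a factor of 0.5
print(S @ v)  # [2. 1.]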
A rotation transformation pivots the entire space around the origin. A matrix for a 90-degree counter-clockwise rotation is:
$$R = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$$
Looking at the columns, we see that $\hat{i} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ moves to $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ (the original position of $\hat{j}$), and $\hat{j} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$ moves to $\begin{bmatrix} -1 \\ 0 \end{bmatrix}$.
Let's apply this transformation to our vector $v = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$:
$$Rv = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} (0 \cdot 1) + (-1 \cdot 2) \\ (1 \cdot 1) + (0 \cdot 2) \end{bmatrix} = \begin{bmatrix} -2 \\ 1 \end{bmatrix}$$
The vector is rotated 90 degrees without changing its length.
The vector $v$ is rotated 90 degrees counter-clockwise by the matrix $R$.
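We can verify the rotation numerically as well. As a sketch, the snippet also builds the standard rotation matrix from sines and cosines for an arbitrary angle theta; setting theta to pi/2 recovers $R$ up to floating-point round-off.
import numpy as np
# 90-degree counter-clockwise rotation
R = np.array([
    [0, -1],
    [1,  0]
])
v = np.array([1, 2])
print(R @ v)  # [-2  1]
# General rotation by an angle theta (in radians)
theta = np.pi / 2
R_general = np.array([
    [np.cos(theta), -np.sin(theta)],
    [np.sin(theta),  np.cos(theta)]
])
# theta = pi/2 gives the same result as R, up to round-off
print(np.allclose(R_general @ v, R @ v))  # True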
A shear transformation slants the space, as if pushing one layer of a deck of cards. A horizontal shear matrix looks like this:
$$H = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$$
Here, the basis vector $\hat{i}$ stays at $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$, but $\hat{j}$ is transformed to $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$. This means the y-axis is tilted to the right.
Transforming our vector $v = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$:
$$Hv = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} (1 \cdot 1) + (1 \cdot 2) \\ (0 \cdot 1) + (1 \cdot 2) \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix}$$
The y-coordinate of the vector remains the same, but its x-coordinate is pushed to the right by an amount equal to its original y-coordinate.
We can perform these transformations easily in Python using NumPy. Let's replicate the shear transformation.
import numpy as np
# Define the shear matrix H
H = np.array([
    [1, 1],
    [0, 1]
])
# Define the original vector v
v = np.array([1, 2])
# Apply the transformation using the @ operator for matrix multiplication
transformed_v = H @ v
print(f"Original vector: {v}")
print(f"Shear matrix H:\n{H}")
print(f"Transformed vector Hv: {transformed_v}")
Running this code produces the following output:
Original vector: [1 2]
Shear matrix H:
[[1 1]
[0 1]]
Transformed vector Hv: [3 2]
This simple operation is the foundation of many complex algorithms. By viewing matrices as transformations, we can develop a much deeper feel for what is happening to our data. This perspective is what allows us to pose the central question of this chapter: during a transformation, do any vectors manage to maintain their direction, changing only in length? These special vectors, which are only scaled, are the eigenvectors we will formally define next.
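As a small preview of that question, the shear matrix $H$ from above already has such a vector hiding in plain sight: any vector along the x-axis keeps its direction. A quick check:
import numpy as np
H = np.array([
    [1, 1],
    [0, 1]
])
# A vector along the x-axis is unchanged by the shear:
# it keeps its direction (and here, even its length)
e1 = np.array([1, 0])
print(H @ e1)  # [1 0]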