Matrix-vector multiplication is an operation that fundamentally transforms a vector. Think of a matrix as a function or a machine: you feed it a vector, and it outputs a new vector, possibly with a different length and pointing in a new direction. This operation is at the heart of many machine learning algorithms, from running a neural network layer to performing geometric transformations on data.
To multiply a matrix A by a vector v, there is one important rule of compatibility: the number of columns in the matrix must equal the number of elements in the vector.
If you have an m×n matrix (m rows, n columns) and an n×1 vector (n rows, 1 column), the result will be a new m×1 vector.
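As a small plain-Python sketch of this rule (the values here are arbitrary and only illustrate the shape check):
import numpy as np  # used later in this section; not needed for the check itself
A = [[3, 1, 4],
     [0, 2, 5]]   # m = 2 rows, n = 3 columns
v = [2, 6, 1]     # n = 3 elements: compatible with A
w = [2, 6]        # only 2 elements: not compatible with A
# The product Av is defined only when the column count of A matches the length of v,
# and the result then has one element per row of A.
print(len(A[0]) == len(v))  # True  -- Av is defined; the result has 2 elements
print(len(A[0]) == len(w))  # False -- Aw is not defined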
Each element in the resulting vector is calculated by taking the dot product of a row from the matrix and the input vector. Let's see this with an example.
Suppose we have a 2×3 matrix A and a 3×1 vector v:
$$A = \begin{bmatrix} 3 & 1 & 4 \\ 0 & 2 & 5 \end{bmatrix}, \quad v = \begin{bmatrix} 2 \\ 6 \\ 1 \end{bmatrix}$$
To find the first element of our new vector, we take the dot product of the first row of A and the vector v:
$$(3 \cdot 2) + (1 \cdot 6) + (4 \cdot 1) = 6 + 6 + 4 = 16$$
To find the second element, we take the dot product of the second row of A and the vector v:
$$(0 \cdot 2) + (2 \cdot 6) + (5 \cdot 1) = 0 + 12 + 5 = 17$$
So, the resulting vector is:
$$Av = \begin{bmatrix} 16 \\ 17 \end{bmatrix}$$
Notice how we transformed a 3-dimensional vector into a 2-dimensional one. This ability to change the dimensionality of data is a direct consequence of matrix-vector multiplication.
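If it helps to see the row-wise recipe spelled out step by step, here is a small sketch in plain Python (no NumPy yet) that computes each output element as an explicit dot product:
A = [[3, 1, 4],
     [0, 2, 5]]
v = [2, 6, 1]
result = []
for row in A:
    # dot product of this row with v gives one element of the output
    result.append(sum(r * x for r, x in zip(row, v)))
print(result)  # [16, 17]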
There's another, more insightful way to look at the same operation. The resulting vector is actually a linear combination of the columns of the matrix, where the elements of the input vector act as the weights.
Using the same matrix A and vector v:
$$A = \begin{bmatrix} 3 & 1 & 4 \\ 0 & 2 & 5 \end{bmatrix}, \quad v = \begin{bmatrix} 2 \\ 6 \\ 1 \end{bmatrix}$$
We can rewrite the multiplication as:
$$Av = 2\begin{bmatrix} 3 \\ 0 \end{bmatrix} + 6\begin{bmatrix} 1 \\ 2 \end{bmatrix} + 1\begin{bmatrix} 4 \\ 5 \end{bmatrix}$$
Let's calculate this:
$$Av = \begin{bmatrix} 6 \\ 0 \end{bmatrix} + \begin{bmatrix} 6 \\ 12 \end{bmatrix} + \begin{bmatrix} 4 \\ 5 \end{bmatrix} = \begin{bmatrix} 6 + 6 + 4 \\ 0 + 12 + 5 \end{bmatrix} = \begin{bmatrix} 16 \\ 17 \end{bmatrix}$$
We get the exact same result. This perspective is powerful because it tells us that the product Av is a point in the space spanned by the columns of A. In machine learning, this often translates to combining features (the columns) according to input weights (the vector).
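If you want to verify the column view numerically, here is a short sketch that weights each column of A by the matching element of v and adds them up:
A = [[3, 1, 4],
     [0, 2, 5]]
v = [2, 6, 1]
# Weight each column of A by the matching element of v, then add the results
combo = [sum(v[j] * A[i][j] for j in range(len(v))) for i in range(len(A))]
print(combo)  # [16, 17] -- the same vector as before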
One of the most intuitive ways to understand matrix-vector multiplication is to see it as a geometric transformation. A matrix can rotate, scale, or shear a vector in space.
Let's take a simple 2D vector and a matrix that performs a "shear" transformation. A shear transformation slants the space, making squares into parallelograms.
Our vector is $v = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$ and our shear matrix is $M = \begin{bmatrix} 1 & 0.5 \\ 0 & 1 \end{bmatrix}$.
Let's compute the product Mv:
$$Mv = \begin{bmatrix} 1 & 0.5 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 2 \\ 3 \end{bmatrix} = \begin{bmatrix} (1 \cdot 2) + (0.5 \cdot 3) \\ (0 \cdot 2) + (1 \cdot 3) \end{bmatrix} = \begin{bmatrix} 2 + 1.5 \\ 0 + 3 \end{bmatrix} = \begin{bmatrix} 3.5 \\ 3 \end{bmatrix}$$
The matrix M has transformed our original vector, pushing its head to the right while keeping its y-coordinate the same.
The vector v (blue) is transformed by matrix M into the vector Mv (red). The shear transformation has shifted the vector horizontally.
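If you want to reproduce this shear numerically, here is a small sketch in plain Python using the same row-by-row recipe as before:
M = [[1, 0.5],
     [0, 1]]   # shear matrix: slants space horizontally
v = [2, 3]
# Each output element is the dot product of a row of M with v
Mv = [sum(m * x for m, x in zip(row, v)) for row in M]
print(Mv)  # [3.5, 3] -- x shifted from 2 to 3.5, y unchanged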
NumPy makes matrix-vector multiplication straightforward. The modern and recommended operator for this is @ (the matrix multiplication operator).
Let's perform the same calculation from our first example in Python.
import numpy as np
# Define the 2x3 matrix A
A = np.array([
[3, 1, 4],
[0, 2, 5]
])
# Define the vector v (a 1-D NumPy array with 3 elements)
v = np.array([2, 6, 1])
# Perform matrix-vector multiplication
result = A @ v
print(f"Matrix A:\n{A}")
print(f"\nVector v:\n{v}")
print(f"\nResult of A @ v:\n{result}")
print(f"\nShape of the result: {result.shape}")
Running this code will produce the following output:
Matrix A:
[[3 1 4]
[0 2 5]]
Vector v:
[2 6 1]
Result of A @ v:
[16 17]
Shape of the result: (2,)
This matches our manual calculation perfectly. NumPy handles the dot products for each row automatically. You might also see np.dot(A, v) used in older codebases, which achieves the same result for this operation. However, using @ often makes the code more readable, as it is designed specifically for matrix multiplication.
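As a quick sanity check that both spellings agree for this matrix and vector:
import numpy as np
A = np.array([[3, 1, 4],
              [0, 2, 5]])
v = np.array([2, 6, 1])
# np.dot and the @ operator produce the same result for a 2-D matrix times a 1-D vector
print(np.array_equal(np.dot(A, v), A @ v))  # True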