Matrix-matrix multiplication composes two transformations into one. This operation is fundamental in machine learning, especially in neural networks, where passing data through successive layers is essentially a series of matrix multiplications.
Before you can multiply two matrices, they must be compatible. This is the most important rule to remember. If you have a matrix A with dimensions m×n (meaning m rows and n columns) and a matrix B with dimensions n×p, you can multiply them to get a new matrix C with dimensions m×p.
The rule is simple: The number of columns in the first matrix must equal the number of rows in the second matrix. We call these the "inner dimensions."
$$A_{m \times n} \cdot B_{n \times p} = C_{m \times p}$$
The "outer dimensions," m and p, determine the shape of the final matrix. If the inner dimensions do not match, the multiplication is undefined.
The number of columns in the first matrix must align with the number of rows in the second matrix for the multiplication to be valid. The resulting matrix inherits the rows of the first and the columns of the second.
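Before working through the arithmetic, here is a minimal sketch of the compatibility check itself. The helper name can_multiply and the use of nested Python lists are illustrative choices for this sketch, not a standard API:
def can_multiply(A, B):
    # A is stored as a list of rows, so len(A[0]) is its number of columns,
    # and len(B) is the number of rows in B: these are the inner dimensions.
    return len(A[0]) == len(B)

A = [[1, 2, 3], [4, 5, 6]]      # 2x3
B = [[7, 8], [9, 1], [2, 3]]    # 3x2
print(can_multiply(A, B))       # True: inner dimensions are 3 and 3
print(can_multiply(A, A))       # False: 3 columns vs. 2 rows, undefined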
The actual calculation is an extension of the matrix-vector multiplication you saw earlier. To get the element in the i-th row and j-th column of the resulting matrix C, you calculate the dot product of the i-th row of matrix A and the j-th column of matrix B.
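That rule translates directly into code. The following is a minimal sketch in plain Python, assuming the matrices are stored as nested lists of rows; the function name matmul is just a placeholder for illustration:
def matmul(A, B):
    m, n = len(A), len(A[0])            # A is m x n
    p = len(B[0])                       # B is n x p (we assume len(B) == n)
    C = [[0] * p for _ in range(m)]     # result C is m x p
    for i in range(m):
        for j in range(p):
            # C[i][j] is the dot product of row i of A and column j of B
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C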
Let's walk through an example. Suppose we want to compute the product C=AB, where:
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 7 & 8 \\ 9 & 1 \\ 2 & 3 \end{pmatrix}$$
First, check the dimensions. A is a 2×3 matrix and B is a 3×2 matrix. The inner dimensions match (3 and 3), so the operation is valid. The resulting matrix C will have dimensions 2×2.
$$C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}$$
Now let's compute each element of C:
To find $C_{11}$ (1st row, 1st column): Take the dot product of the 1st row of A and the 1st column of B.
$$C_{11} = (1 \cdot 7) + (2 \cdot 9) + (3 \cdot 2) = 7 + 18 + 6 = 31$$
To find $C_{12}$ (1st row, 2nd column): Take the dot product of the 1st row of A and the 2nd column of B.
$$C_{12} = (1 \cdot 8) + (2 \cdot 1) + (3 \cdot 3) = 8 + 2 + 9 = 19$$
To find $C_{21}$ (2nd row, 1st column): Take the dot product of the 2nd row of A and the 1st column of B.
$$C_{21} = (4 \cdot 7) + (5 \cdot 9) + (6 \cdot 2) = 28 + 45 + 12 = 85$$
To find $C_{22}$ (2nd row, 2nd column): Take the dot product of the 2nd row of A and the 2nd column of B.
$$C_{22} = (4 \cdot 8) + (5 \cdot 1) + (6 \cdot 3) = 32 + 5 + 18 = 55$$
Putting it all together, our final matrix is:
$$C = \begin{pmatrix} 31 & 19 \\ 85 & 55 \end{pmatrix}$$
Matrix multiplication has some properties that are different from the multiplication of regular numbers (scalars).
For scalars, 3 × 5 is the same as 5 × 3. This is not true for matrices: in general, $AB \neq BA$. This is one of the most significant differences.
Using our previous example, let's try to compute BA:
$$B_{3 \times 2} \cdot A_{2 \times 3}$$
The inner dimensions (2 and 2) match, so we can perform this multiplication. The result will be a 3×3 matrix, which is a different shape than the 2×2 matrix we got from AB. Since the results have different shapes, they cannot be equal. Even if the shapes were the same, the values would likely be different.
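A quick numerical check makes this concrete. The sketch below reuses A and B from the example (NumPy is introduced more fully further down); the shapes alone already show that AB and BA cannot be equal:
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])     # 2x3
B = np.array([[7, 8], [9, 1], [2, 3]])   # 3x2

print((A @ B).shape)   # (2, 2)
print((B @ A).shape)   # (3, 3) -- a different shape, so AB != BA here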
While the order of the matrices cannot be swapped, the way you group the multiplications does not matter as long as the matrices stay in the same sequence. This is called the associative property:
$$(AB)C = A(BC)$$
This is useful because it means you can group matrix multiplications in any way that is computationally efficient without changing the final result.
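As a sanity check, the sketch below multiplies three small random matrices both ways; the shapes (2×3, 3×4, 4×5) are arbitrary choices for illustration, and np.allclose is used because floating-point rounding can introduce tiny differences between the two groupings:
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 4))
C = rng.standard_normal((4, 5))

left = (A @ B) @ C    # multiply A and B first
right = A @ (B @ C)   # multiply B and C first
print(left.shape, right.shape)   # both (2, 5)
print(np.allclose(left, right))  # True: same result either way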
NumPy makes matrix multiplication straightforward. The modern and recommended way is to use the @ operator, which was introduced specifically for matrix multiplication.
import numpy as np
# Define our matrices A and B from the example
A = np.array([
[1, 2, 3],
[4, 5, 6]
])
B = np.array([
[7, 8],
[9, 1],
[2, 3]
])
# Perform matrix multiplication using the @ operator
C = A @ B
print("Matrix A (2x3):\n", A)
print("\nMatrix B (3x2):\n", B)
print("\nResult of A @ B (2x2):\n", C)
Output:
Matrix A (2x3):
[[1 2 3]
[4 5 6]]
Matrix B (3x2):
[[7 8]
[9 1]
[2 3]]
Result of A @ B (2x2):
[[31 19]
[85 55]]
The result matches our manual calculation perfectly.
You might also see the np.dot() function used for matrix multiplication. It works for both vector dot products and matrix multiplication, which can sometimes be confusing.
# Using np.dot() also works
C_dot = np.dot(A, B)
print("\nResult using np.dot(A, B):\n", C_dot)
For clarity and readability, it's best to use @ for matrix multiplication and np.dot() when you are explicitly calculating the dot product of two vectors.
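To illustrate that convention, the short sketch below uses np.dot for a vector dot product and @ for the matrix product; the vectors v and w are arbitrary examples chosen for this sketch:
import numpy as np

v = np.array([1, 2, 3])
w = np.array([4, 5, 6])
print(np.dot(v, w))    # 32: the scalar dot product of two vectors

A = np.array([[1, 2, 3], [4, 5, 6]])
B = np.array([[7, 8], [9, 1], [2, 3]])
print(A @ B)           # the 2x2 matrix product from the example above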