Matrix multiplication has several important characteristics. Unlike the multiplication of regular numbers (scalars), matrix multiplication exhibits unique behaviors that are especially relevant for machine learning formulas.
This is perhaps the most significant difference from scalar multiplication. For scalars a and b, we know that ab = ba. However, for matrices A and B, it is generally not true that AB = BA.
There are two main reasons for this:
- Dimension compatibility: AB may be defined while BA is not. For example, if A is 2×3 and B is 3×4, the product AB exists, but BA does not.
- Different results: even when both products are defined and have the same shape (as with square matrices), the resulting entries are generally different.
Let's illustrate with a NumPy example:
import numpy as np
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])
# Calculate AB
AB = np.dot(A, B)
print("Matrix A:\n", A)
print("\nMatrix B:\n", B)
print("\nProduct AB:\n", AB)
# Calculate BA
BA = np.dot(B, A)
print("\nProduct BA:\n", BA)
# Check if they are equal
are_equal = np.array_equal(AB, BA)
print("\nAre AB and BA equal?", are_equal)
Executing this code will output:
Matrix A:
[[1 2]
[3 4]]
Matrix B:
[[5 6]
[7 8]]
Product AB:
[[19 22]
[43 50]]
Product BA:
[[23 34]
[31 46]]
Are AB and BA equal? False
As you can see, even for these simple 2×2 matrices, AB and BA produce different results. This non-commutativity is fundamental. Always pay attention to the order of multiplication in matrix equations.
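Dimension compatibility is part of why order matters: with non-square matrices, reversing the order can change the result's shape, or make the product undefined entirely. A short sketch (matrix values are arbitrary, chosen only to illustrate the shapes):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])      # Shape (2, 3)
B = np.array([[1, 0],
              [0, 1],
              [1, 1]])         # Shape (3, 2)

# Both orders are defined here, but the shapes differ:
print(np.dot(A, B).shape)      # (2, 2)
print(np.dot(B, A).shape)      # (3, 3)

# With two (2, 3) matrices, the product is not defined in either order:
C = np.array([[1, 1, 1],
              [2, 2, 2]])      # Shape (2, 3)
try:
    np.dot(A, C)
except ValueError as e:
    print("AC is undefined:", e)
```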
While the order matters for which matrices are multiplied, the grouping does not, provided the order is maintained. For compatible matrices A, B, and C, the associative property holds:
(AB)C = A(BC)
This means you can multiply A and B first, then multiply the result by C, or you can multiply B and C first, and then multiply A by the result. The final matrix will be the same. This property is heavily utilized in computations and mathematical proofs, as it allows flexibility in how sequences of matrix multiplications are performed.
Let's verify this with NumPy:
import numpy as np
A = np.array([[1, 2]])  # Shape (1, 2)
B = np.array([[3, 4],
              [5, 6]])  # Shape (2, 2)
C = np.array([[7],
              [8]])     # Shape (2, 1)
# Calculate (AB)C
AB = np.dot(A, B) # Shape (1, 2)
ABC1 = np.dot(AB, C) # Shape (1, 1)
print("Matrix A:\n", A)
print("\nMatrix B:\n", B)
print("\nMatrix C:\n", C)
print("\n(AB)C:\n", ABC1)
# Calculate A(BC)
BC = np.dot(B, C) # Shape (2, 1)
ABC2 = np.dot(A, BC) # Shape (1, 1)
print("\nA(BC):\n", ABC2)
# Check if they are equal
are_equal = np.array_equal(ABC1, ABC2)
print("\nAre (AB)C and A(BC) equal?", are_equal)
The output confirms the results are identical:
Matrix A:
[[1 2]]
Matrix B:
[[3 4]
[5 6]]
Matrix C:
[[7]
[8]]
(AB)C:
[[219]]
A(BC):
[[219]]
Are (AB)C and A(BC) equal? True
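The flexibility associativity provides is not just theoretical: multiplying a p×q matrix by a q×r matrix costs roughly p·q·r scalar multiplications, so the two groupings of a chain ABC can have very different costs even though the result is identical. A rough sketch of this cost comparison (the shapes here are illustrative, not from the example above):

```python
import numpy as np

# Illustrative chain: A is (1000, 2), B is (2, 1000), C is (1000, 2)
p, q, r, s = 1000, 2, 1000, 2

# Cost of (AB)C: p*q*r multiplications for AB, then p*r*s for (AB)C
cost_left = p * q * r + p * r * s    # 4,000,000

# Cost of A(BC): q*r*s multiplications for BC, then p*q*s for A(BC)
cost_right = q * r * s + p * q * s   # 8,000

print("FLOP estimate for (AB)C:", cost_left)
print("FLOP estimate for A(BC):", cost_right)

# Both groupings still produce the same matrix (up to float rounding):
rng = np.random.default_rng(0)
A = rng.standard_normal((p, q))
B = rng.standard_normal((q, r))
C = rng.standard_normal((r, s))
print("Same result?",
      np.allclose(np.dot(np.dot(A, B), C), np.dot(A, np.dot(B, C))))
```

Choosing the cheaper grouping is exactly the matrix chain ordering problem, and libraries expose helpers for it (for instance, NumPy's `np.linalg.multi_dot`).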
Matrix multiplication distributes over matrix addition and subtraction, similar to scalar multiplication. For compatible matrices A, B, and C:
A(B + C) = AB + AC
(A + B)C = AC + BC
Remember that matrix addition requires matrices to have the same dimensions. This property is essential for expanding and simplifying matrix expressions in derivations.
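A quick NumPy check of the left-distributive case, in the same style as the examples above (matrix values are arbitrary):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])
C = np.array([[1, 0],
              [0, 1]])

left = np.dot(A, B + C)              # A(B + C)
right = np.dot(A, B) + np.dot(A, C)  # AB + AC
print("A(B + C):\n", left)
print("\nAB + AC:\n", right)
print("\nEqual?", np.array_equal(left, right))
```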
Just as the number 1 is the identity element for scalar multiplication (1 · a = a), there is an identity matrix, denoted as I, for matrix multiplication. The identity matrix is a square matrix with 1s on the main diagonal and 0s everywhere else.
For any matrix A with dimensions m × n, multiplying by the appropriate identity matrix leaves A unchanged:
A I_n = A
I_m A = A
where I_n is the n × n identity matrix and I_m is the m × m identity matrix.
import numpy as np
A = np.array([[1, 2, 3],
              [4, 5, 6]])  # Shape (2, 3)
# Identity matrix for right multiplication (3x3)
I_n = np.identity(3)
print("Matrix A:\n", A)
print("\nIdentity I_n (3x3):\n", I_n)
print("\nA * I_n:\n", np.dot(A, I_n))
# Identity matrix for left multiplication (2x2)
I_m = np.identity(2)
print("\nIdentity I_m (2x2):\n", I_m)
print("\nI_m * A:\n", np.dot(I_m, A))
The output will show that both A I_n and I_m A result in the original matrix A (printed with float entries, since np.identity returns a floating-point array).
Similar to scalar multiplication, where a · 0 = 0, multiplying any matrix A by a compatible zero matrix (a matrix containing only zeros, denoted as O) results in a zero matrix:
AO = O
OA = O
The dimensions of the resulting zero matrix depend on the dimensions of A and of the zero matrix used.
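This can be checked with np.zeros (the shapes below are chosen arbitrarily for illustration):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # Shape (2, 3)
O = np.zeros((3, 4))        # Zero matrix, shape (3, 4)

result = np.dot(A, O)       # Zero matrix of shape (2, 4)
print("A O:\n", result)
print("\nShape:", result.shape)
print("All zeros?", np.all(result == 0))
```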
An important property relates matrix multiplication and the transpose operation. For compatible matrices A and B:
(AB)^T = B^T A^T
Notice the reversal of the order when the transpose is applied to the product. This property frequently appears when working with optimization problems and derivatives in machine learning.
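A NumPy check of the transpose rule, including a demonstration that forgetting the order reversal gives the wrong answer (matrix values are arbitrary):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

left = np.dot(A, B).T        # (AB)^T
right = np.dot(B.T, A.T)     # B^T A^T
print("(AB)^T:\n", left)
print("\nB^T A^T:\n", right)
print("\nEqual?", np.array_equal(left, right))

# Note the order reversal: A^T B^T generally does NOT equal (AB)^T
wrong = np.dot(A.T, B.T)
print("\nDoes A^T B^T equal (AB)^T?", np.array_equal(left, wrong))
```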
Scalar multiplication interacts predictably with matrix multiplication. For a scalar c and compatible matrices A and B:
c(AB) = (cA)B = A(cB)
You can multiply the scalar with either matrix before performing the matrix multiplication, or multiply the result of the matrix multiplication by the scalar. The outcome is the same.
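A final NumPy check that all three placements of the scalar agree (the scalar and matrix values are arbitrary):

```python
import numpy as np

c = 3
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

r1 = c * np.dot(A, B)    # c(AB)
r2 = np.dot(c * A, B)    # (cA)B
r3 = np.dot(A, c * B)    # A(cB)
print("c(AB):\n", r1)
print("\nAll three equal?",
      np.array_equal(r1, r2) and np.array_equal(r2, r3))
```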
Understanding these properties is essential not just for performing calculations correctly, but also for manipulating matrix equations encountered in machine learning algorithms, such as linear regression, principal component analysis (PCA), and neural networks. They form the grammatical rules for the language of linear algebra.