Now that you understand how to perform matrix multiplication, let's look at some important characteristics of this operation. Unlike the multiplication of regular numbers (scalars), matrix multiplication has some unique behaviors you need to be aware of, especially when working with machine learning formulas.
Matrix multiplication is not commutative, and this is perhaps the most significant difference from scalar multiplication. For scalars a and b, we know that ab = ba. However, for matrices A and B, it is generally not true that AB = BA.
There are two main reasons for this:

1. The dimensions may not be compatible in both orders: AB can be defined while BA is not, and even when both products exist, they can have different shapes.
2. Even when AB and BA are both defined with the same shape (for example, when A and B are square matrices of the same size), their entries are generally different.
Let's illustrate with a NumPy example:
import numpy as np
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])
# Calculate AB
AB = np.dot(A, B)
print("Matrix A:\n", A)
print("\nMatrix B:\n", B)
print("\nProduct AB:\n", AB)
# Calculate BA
BA = np.dot(B, A)
print("\nProduct BA:\n", BA)
# Check if they are equal
are_equal = np.array_equal(AB, BA)
print("\nAre AB and BA equal?", are_equal)
Executing this code will output:
Matrix A:
[[1 2]
[3 4]]
Matrix B:
[[5 6]
[7 8]]
Product AB:
[[19 22]
[43 50]]
Product BA:
[[23 34]
[31 46]]
Are AB and BA equal? False
As you can see, even for these simple 2x2 matrices, AB and BA produce different results. This non-commutativity is fundamental. Always pay attention to the order of multiplication in matrix equations.
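The dimension issue mentioned earlier can also be seen concretely. Here is a small sketch with rectangular matrices where both products are defined but cannot possibly be equal, because they have different shapes:

```python
import numpy as np

# A is 2x3 and B is 3x2, so both AB and BA are defined,
# but AB is 2x2 while BA is 3x3: they cannot be equal.
A = np.array([[1, 2, 3],
              [4, 5, 6]])  # Shape (2, 3)
B = np.array([[1, 0],
              [0, 1],
              [1, 1]])     # Shape (3, 2)

AB = np.dot(A, B)  # Shape (2, 2)
BA = np.dot(B, A)  # Shape (3, 3)
print("Shape of AB:", AB.shape)
print("Shape of BA:", BA.shape)
```
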
While the order matters for which matrices are multiplied, the grouping does not, provided the order is maintained. For compatible matrices A, B, and C, the associative property holds:
(AB)C = A(BC)

This means you can multiply A and B first, then multiply the result by C, or you can multiply B and C first, and then multiply A by the result. The final matrix will be the same. This property is heavily utilized in computations and mathematical proofs, as it allows flexibility in how sequences of matrix multiplications are performed.
Let's verify this with NumPy:
import numpy as np
A = np.array([[1, 2]]) # Shape (1, 2)
B = np.array([[3, 4],
              [5, 6]]) # Shape (2, 2)
C = np.array([[7],
              [8]]) # Shape (2, 1)
# Calculate (AB)C
AB = np.dot(A, B) # Shape (1, 2)
ABC1 = np.dot(AB, C) # Shape (1, 1)
print("Matrix A:\n", A)
print("\nMatrix B:\n", B)
print("\nMatrix C:\n", C)
print("\n(AB)C:\n", ABC1)
# Calculate A(BC)
BC = np.dot(B, C) # Shape (2, 1)
ABC2 = np.dot(A, BC) # Shape (1, 1)
print("\nA(BC):\n", ABC2)
# Check if they are equal
are_equal = np.array_equal(ABC1, ABC2)
print("\nAre (AB)C and A(BC) equal?", are_equal)
The output confirms the results are identical:
Matrix A:
[[1 2]]
Matrix B:
[[3 4]
[5 6]]
Matrix C:
[[7]
[8]]
(AB)C:
[[131]]
A(BC):
[[131]]
Are (AB)C and A(BC) equal? True
Matrix multiplication distributes over matrix addition and subtraction, similar to scalar multiplication. For compatible matrices A, B, and C:

A(B + C) = AB + AC
(A + B)C = AC + BC

Remember that matrix addition requires matrices to have the same dimensions. This property is essential for expanding and simplifying matrix expressions in derivations.
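The distributive property can be checked numerically as well. Here is a minimal sketch verifying both the left and right distributive identities for a set of 2x2 matrices:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])
C = np.array([[1, 0],
              [0, 1]])

# Left distributivity: A(B + C) = AB + AC
left = np.dot(A, B + C)
right = np.dot(A, B) + np.dot(A, C)
print("A(B + C) == AB + AC?", np.array_equal(left, right))

# Right distributivity: (A + B)C = AC + BC
left2 = np.dot(A + B, C)
right2 = np.dot(A, C) + np.dot(B, C)
print("(A + B)C == AC + BC?", np.array_equal(left2, right2))
```
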
Just as the number 1 is the identity element for scalar multiplication (a×1=1×a=a), there is an identity matrix, denoted as I, for matrix multiplication. The identity matrix I is a square matrix with 1s on the main diagonal and 0s everywhere else.
For any matrix A with dimensions m×n, multiplying by the appropriate identity matrix leaves A unchanged:
AIn = A and ImA = A

where In is the n×n identity matrix and Im is the m×m identity matrix.
import numpy as np
A = np.array([[1, 2, 3],
              [4, 5, 6]]) # Shape (2, 3)
# Identity matrix for right multiplication (3x3)
I_n = np.identity(3)
print("Matrix A:\n", A)
print("\nIdentity I_n (3x3):\n", I_n)
print("\nA * I_n:\n", np.dot(A, I_n))
# Identity matrix for left multiplication (2x2)
I_m = np.identity(2)
print("\nIdentity I_m (2x2):\n", I_m)
print("\nI_m * A:\n", np.dot(I_m, A))
The output will show that both A * I_n and I_m * A result in the original matrix A (printed with float entries, since np.identity returns a float array by default).
Similar to scalar multiplication where a×0=0, multiplying any matrix A by a compatible zero matrix (a matrix containing only zeros, denoted as 0) results in a zero matrix:
A0 = 0 and 0A = 0

The dimensions of the resulting zero matrix depend on the dimensions of A and of the zero matrix used.
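A quick NumPy check makes the shape behavior explicit. In this sketch, right-multiplying a 2x3 matrix by a 3x2 zero matrix yields a 2x2 zero matrix, while left-multiplying by a 2x2 zero matrix yields a 2x3 zero matrix:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])  # Shape (2, 3)

# Right multiplication by a 3x2 zero matrix gives a 2x2 zero matrix
right_zero = np.dot(A, np.zeros((3, 2)))
print("A times zero matrix:\n", right_zero)

# Left multiplication by a 2x2 zero matrix gives a 2x3 zero matrix
left_zero = np.dot(np.zeros((2, 2)), A)
print("Zero matrix times A:\n", left_zero)
```
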
An important property relates matrix multiplication and the transpose operation. For compatible matrices A and B:
(AB)ᵀ = BᵀAᵀ

Notice the reversal of the order when the transpose is applied to the product. This property frequently appears when working with optimization problems and derivatives in machine learning.
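The order reversal is easy to confirm with NumPy's .T attribute. This sketch compares the transpose of a product with the product of the transposes in reversed order:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# Transpose of the product vs. product of transposes in reversed order
lhs = np.dot(A, B).T       # (AB)^T
rhs = np.dot(B.T, A.T)     # B^T A^T
print("(AB)^T:\n", lhs)
print("B^T A^T:\n", rhs)
print("Equal?", np.array_equal(lhs, rhs))
```

Note that np.dot(A.T, B.T), without reversing the order, would generally give a different result.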
Scalar multiplication interacts predictably with matrix multiplication. For a scalar c and compatible matrices A and B:
c(AB) = (cA)B = A(cB)

You can multiply the scalar with either matrix before performing the matrix multiplication, or multiply the result of the matrix multiplication by the scalar. The outcome is the same.
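As a final check, this sketch verifies that all three placements of the scalar produce the same matrix:

```python
import numpy as np

c = 3
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

r1 = c * np.dot(A, B)   # c(AB)
r2 = np.dot(c * A, B)   # (cA)B
r3 = np.dot(A, c * B)   # A(cB)
print("All three agree?",
      np.array_equal(r1, r2) and np.array_equal(r2, r3))
```
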
Understanding these properties is essential not just for performing calculations correctly, but also for manipulating matrix equations encountered in machine learning algorithms, such as linear regression, principal component analysis (PCA), and neural networks. They form the grammatical rules for the language of linear algebra.
© 2025 ApX Machine Learning