Matrix multiplication is one of the most fundamental operations in linear algebra, particularly when working with data transformations and machine learning models. Unlike element-wise multiplication (often called the Hadamard product), matrix multiplication has a specific definition that allows us to combine linear transformations or process data in sophisticated ways.
The core idea behind multiplying two matrices, say A and B, to get a product C=AB, is based on computing dot products between the rows of A and the columns of B.
For this multiplication to be defined, the number of columns in the first matrix (A) must exactly match the number of rows in the second matrix (B). If A is an m×n matrix (meaning m rows and n columns) and B is an n×p matrix (n rows and p columns), their product C=AB will be an m×p matrix. The "inner" dimensions (n) must match, and the "outer" dimensions (m and p) determine the shape of the result.
The element in the i-th row and j-th column of the resulting matrix C, denoted as Cij, is calculated by taking the dot product of the i-th row of A and the j-th column of B.
Mathematically, if A = [a_ik] and B = [b_kj], then the element c_ij of the product C = AB is given by:

c_ij = Σ_{k=1}^{n} a_ik · b_kj

This means you multiply corresponding elements from the i-th row of A and the j-th column of B and then sum up those products.
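This summation translates directly into code. Here is a minimal sketch (the function name `matmul` is our own choice) using a triple loop that mirrors the formula:

```python
import numpy as np

def matmul(A, B):
    """Multiply two matrices with an explicit triple loop
    that mirrors c_ij = sum over k of a_ik * b_kj."""
    m, n = A.shape
    n2, p = B.shape
    if n != n2:
        raise ValueError(f"Inner dimensions must match: {n} != {n2}")
    C = np.zeros((m, p))
    for i in range(m):          # rows of A
        for j in range(p):      # columns of B
            for k in range(n):  # shared inner dimension
                C[i, j] += A[i, k] * B[k, j]
    return C
```

In practice you would call NumPy's optimized routines instead, but writing the loops once makes the row-times-column structure concrete.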
Let's compute the product of two matrices, A and B:
A = [ 1  2 ]      B = [ 5  6 ]
    [ 3  4 ]          [ 7  8 ]
Here, A is 2×2 and B is 2×2. The inner dimensions match (both are 2), so the multiplication is defined. The resulting matrix C=AB will be 2×2.
Let's calculate the elements of C = [ c11  c12 ; c21  c22 ] one at a time:

c11 = (1)(5) + (2)(7) = 5 + 14 = 19
c12 = (1)(6) + (2)(8) = 6 + 16 = 22
c21 = (3)(5) + (4)(7) = 15 + 28 = 43
c22 = (3)(6) + (4)(8) = 18 + 32 = 50
So, the resulting matrix is:

C = AB = [ 19  22 ]
         [ 43  50 ]
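You can confirm this result with NumPy's `@` operator, which performs matrix multiplication (equivalent to `np.matmul`):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

C = A @ B  # matrix product, equivalent to np.matmul(A, B)
print(C)
# [[19 22]
#  [43 50]]
```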
We can visualize the calculation of a single element, like c11, as combining the first row of A and the first column of B:
Calculation of the top-left element (C₁₁) by taking the dot product of the first row of A (blue) and the first column of B (red).
Dimension Compatibility: Remember, the product AB is only defined if the number of columns of A equals the number of rows of B. If the dimensions don't align, the multiplication cannot be performed.
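NumPy enforces this rule and raises an error when the inner dimensions disagree. For example, a 2×3 matrix cannot be multiplied by another 2×3 matrix, but transposing the second operand makes the shapes compatible:

```python
import numpy as np

M = np.ones((2, 3))  # 2 rows, 3 columns
N = np.ones((2, 3))  # inner dims disagree: 3 (cols of M) vs 2 (rows of N)

try:
    M @ N
except ValueError as e:
    print("Shapes are incompatible:", e)

# Transposing N gives a 3x2 matrix: (2x3)(3x2) -> 2x2
print((M @ N.T).shape)  # (2, 2)
```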
Non-Commutativity: Unlike multiplication of scalar numbers, matrix multiplication is generally not commutative. That is, AB ≠ BA in most cases. Let's swap the order for our example matrices A and B:
BA = [ 5  6 ] [ 1  2 ]
     [ 7  8 ] [ 3  4 ]
BA = [ 23  34 ]
     [ 31  46 ]

Clearly,

AB = [ 19  22 ]  ≠  [ 23  34 ]  = BA
     [ 43  50 ]     [ 31  46 ]

The order of multiplication matters significantly.
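A quick numerical check of non-commutativity for these two matrices:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# AB and BA differ, so these matrices do not commute
print(np.array_equal(A @ B, B @ A))  # False
```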
Associativity: Matrix multiplication is associative: (AB)C=A(BC), provided the dimensions are compatible for all multiplications. This property is useful because it means we can group sequences of matrix multiplications without changing the final result.
Distributivity: Matrix multiplication distributes over matrix addition: A(B+C)=AB+AC and (A+B)C=AC+BC, again assuming compatible dimensions.
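Both properties are easy to verify numerically with random matrices (using `np.allclose` rather than exact equality to tolerate floating-point round-off):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 4))
C = rng.standard_normal((4, 5))
D = rng.standard_normal((3, 4))  # same shape as B, for the sum B + D

# Associativity: (AB)C == A(BC)
assert np.allclose((A @ B) @ C, A @ (B @ C))

# Distributivity: A(B + D) == AB + AD
assert np.allclose(A @ (B + D), A @ B + A @ D)
print("associativity and distributivity hold")
```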
Matrix multiplication isn't just an abstract mathematical rule; it's central to many machine learning operations. For example, a fully connected (dense) layer in a neural network computes its outputs as a matrix product of inputs and weights, and linear transformations such as PCA project data by multiplying it with a matrix of components.
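As a small illustration (the layer shapes here are our own choice, not from any particular model), applying a dense layer to a batch of inputs is one matrix multiplication plus a bias:

```python
import numpy as np

rng = np.random.default_rng(42)

batch_size, in_features, out_features = 4, 3, 2
X = rng.standard_normal((batch_size, in_features))    # batch of inputs
W = rng.standard_normal((in_features, out_features))  # layer weights
b = np.zeros(out_features)                            # layer bias

Y = X @ W + b  # (4x3)(3x2) -> 4x2: one output row per input row
print(Y.shape)  # (4, 2)
```

Every row of `X` is transformed by the same weight matrix in a single operation, which is why these products dominate the compute cost of training and inference.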
Understanding the mechanics and properties of matrix multiplication is therefore essential for comprehending how many machine learning algorithms process and transform data. In the practical sections that follow, you'll use libraries like NumPy, which implement these operations highly efficiently.
© 2025 ApX Machine Learning