We've established that eigenvalues (λ) and eigenvectors (x) satisfy the fundamental relationship Ax=λx. This equation tells us that when the matrix A transforms the vector x, the result is simply the original vector x scaled by the factor λ. The eigenvector's direction remains unchanged.
Our goal now is to find a systematic way to determine the possible values of λ for a given square matrix A. We can start by rearranging the eigenvalue equation:
Ax=λx
Subtract λx from both sides:
Ax−λx=0
Here, 0 represents the zero vector. To factor out the vector x, we need to express λx as a matrix-vector product. We can do this using the identity matrix I of the same size as A. Recall that Ix=x. Therefore, λx=λIx. Substituting this into our equation gives:
Ax−λIx=0
Now we can factor out the vector x:
(A−λI)x=0
This equation is significant. It's a homogeneous system of linear equations of the form Mx=0, where the matrix M is (A−λI). We are looking for the eigenvalues λ, which are the scalars that allow this equation to have non-zero solutions for x (because eigenvectors, by definition, cannot be the zero vector).
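We can check this relationship numerically. The sketch below uses a hypothetical 2×2 matrix chosen so that λ=3 with eigenvector (1, 1) is a known eigenpair; the residual (A−λI)x should come out as the zero vector:

```python
import numpy as np

# Hypothetical example: A has eigenvalue 3 with eigenvector [1, 1]
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam = 3.0
x = np.array([1.0, 1.0])

# Since Ax = λx, the vector (A - λI)x must be the zero vector
residual = (A - lam * np.eye(2)) @ x
print(residual)  # [0. 0.]
```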
Think back to solving systems of linear equations. A homogeneous system Mx=0 has only the trivial solution (x=0) if the matrix M is invertible. Conversely, it has non-trivial solutions (which is what we need for eigenvectors) if and only if the matrix M is singular (not invertible).
Applying this to our eigenvalue problem, the matrix M=(A−λI) must be singular for non-zero eigenvectors x to exist.
How do we determine if a matrix is singular? A fundamental property of matrices is that a square matrix is singular if and only if its determinant is zero.
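As a quick illustration of this property, consider a matrix whose rows are linearly dependent (the second row below is twice the first, a made-up example); its determinant is zero, confirming it is singular:

```python
import numpy as np

# A singular matrix: the second row is twice the first
M = np.array([[1.0, 2.0],
              [2.0, 4.0]])

det_M = np.linalg.det(M)
print(det_M)  # 0.0 (up to floating-point error)
```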
Therefore, to find the eigenvalues λ, we must find the values of λ that make the matrix (A−λI) singular. This leads us directly to the condition:
det(A−λI)=0
This equation is called the characteristic equation of the matrix A.
When you compute the determinant det(A−λI), you'll find that it results in a polynomial in the variable λ. This polynomial is known as the characteristic polynomial. The degree of this polynomial is equal to the dimension of the square matrix A. The roots of the characteristic polynomial are precisely the eigenvalues of the matrix A.
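NumPy can make this concrete: `np.poly` returns the coefficients of a matrix's characteristic polynomial, and `np.roots` finds that polynomial's roots. For the (arbitrarily chosen) triangular matrix below, the eigenvalues are simply its diagonal entries, so we can verify that the roots match:

```python
import numpy as np

# Example 3x3 matrix; it is triangular, so its eigenvalues are 2, 3, and 5
A = np.array([[2.0, 0.0, 0.0],
              [1.0, 3.0, 0.0],
              [0.0, 0.0, 5.0]])

# Coefficients of the characteristic polynomial, highest degree first.
# A 3x3 matrix gives a degree-3 polynomial (4 coefficients).
coeffs = np.poly(A)

# The roots of the characteristic polynomial are the eigenvalues
eigs = np.sort(np.roots(coeffs))
print(eigs)  # [2. 3. 5.]
```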
Let's find the characteristic equation for a general 2×2 matrix:
A = [ a  b ]
    [ c  d ]
First, form the matrix (A−λI):
A − λI = [ a  b ] − λ [ 1  0 ] = [ a−λ    b  ]
         [ c  d ]     [ 0  1 ]   [  c    d−λ ]
Next, calculate the determinant of this matrix:
det(A−λI)=(a−λ)(d−λ)−(b)(c)
Finally, set the determinant equal to zero to get the characteristic equation:
(a−λ)(d−λ)−bc=0
Expanding this gives:
λ²−aλ−dλ+ad−bc=0
λ²−(a+d)λ+(ad−bc)=0
Notice that (a+d) is the trace of matrix A (the sum of its diagonal elements), denoted tr(A), and (ad−bc) is the determinant of A, denoted det(A). So, for a 2×2 matrix, the characteristic equation is always:
λ²−tr(A)λ+det(A)=0
This is a quadratic equation in λ. Solving it (with the quadratic formula, for instance) gives the eigenvalues of the 2×2 matrix A. For an n×n matrix, you would obtain an nth-degree polynomial, and finding its roots yields the n eigenvalues (which may be real or complex, and may not all be distinct).
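The quadratic-formula route can be sketched directly. Using a hypothetical 2×2 matrix, we compute the trace and determinant, solve λ²−tr(A)λ+det(A)=0, and compare against NumPy's general-purpose eigenvalue routine:

```python
import numpy as np

# Hypothetical 2x2 example matrix
A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

tr = np.trace(A)           # a + d = 7
det = np.linalg.det(A)     # ad - bc = 10

# Quadratic formula applied to λ² − tr(A)λ + det(A) = 0
disc = tr**2 - 4 * det
lam1 = (tr + np.sqrt(disc)) / 2
lam2 = (tr - np.sqrt(disc)) / 2
print(lam1, lam2)  # 5.0 2.0

# Cross-check with NumPy's eigenvalue solver
print(np.sort(np.linalg.eigvals(A)))  # [2. 5.]
```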
While solving the characteristic equation is the standard theoretical way to find eigenvalues, numerically finding the roots of high-degree polynomials can be challenging and prone to instability for large matrices. In practice, computational libraries like NumPy use more sophisticated and stable iterative algorithms (often based on matrix factorizations like QR decomposition) to find eigenvalues and eigenvectors, especially for larger matrices encountered in machine learning. However, understanding the characteristic equation provides the essential theoretical foundation for what these algorithms achieve.
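In practice, then, you would call `np.linalg.eig` rather than form the characteristic polynomial yourself. A minimal sketch, using an arbitrary symmetric example matrix, shows how the returned eigenvalues and eigenvector columns pair up and satisfy Ax=λx:

```python
import numpy as np

# Example matrix (symmetric, so its eigenvalues are real)
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eig returns the eigenvalues and a matrix whose i-th COLUMN
# is the eigenvector paired with the i-th eigenvalue
eigenvalues, eigenvectors = np.linalg.eig(A)

for i, lam in enumerate(eigenvalues):
    x = eigenvectors[:, i]
    # Each pair satisfies the defining relationship Ax = λx
    assert np.allclose(A @ x, lam * x)

print(eigenvalues)
```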
© 2025 ApX Machine Learning