While vector addition, subtraction, and scalar multiplication change a vector's position or scale, the dot product (also known as the scalar product or inner product) provides a way to "multiply" two vectors, yielding a single scalar number. This scalar tells us about the relationship between the directions of the two vectors, making it fundamental for measuring similarity and understanding geometric relationships in feature spaces.
For two vectors $a = [a_1, a_2, \dots, a_n]$ and $b = [b_1, b_2, \dots, b_n]$ in $\mathbb{R}^n$, their dot product, denoted as $a \cdot b$ or $a^T b$, is calculated by multiplying corresponding elements and summing the results:
$$a \cdot b = \sum_{i=1}^{n} a_i b_i = a_1 b_1 + a_2 b_2 + \dots + a_n b_n$$
The result is always a scalar, not a vector.
Example: Let $a = [1, 2, 3]$ and $b = [-2, 0, 4]$. Their dot product is:
$$a \cdot b = (1 \times -2) + (2 \times 0) + (3 \times 4) = -2 + 0 + 12 = 10$$
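The summation in the formula maps directly onto plain Python. The snippet below is a small illustrative check of the hand calculation above, before switching to NumPy:
a = [1, 2, 3]
b = [-2, 0, 4]
# Multiply corresponding elements and sum the results
dot_product = sum(x * y for x, y in zip(a, b))
print(dot_product)  # 10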
In Python using NumPy, you can calculate the dot product efficiently using the numpy.dot function, the dot method of a NumPy array, or the @ operator (available in Python 3.5+).
import numpy as np
a = np.array([1, 2, 3])
b = np.array([-2, 0, 4])
# Method 1: np.dot()
dot_product_1 = np.dot(a, b)
# Method 2: .dot() method
dot_product_2 = a.dot(b)
# Method 3: @ operator
dot_product_3 = a @ b
print(f"Dot product using np.dot: {dot_product_1}")
print(f"Dot product using .dot method: {dot_product_2}")
print(f"Dot product using @ operator: {dot_product_3}")
# Expected output: 10 for all methods
The dot product has a powerful geometric interpretation related to the angle θ between the two vectors (when placed tail-to-tail):
$$a \cdot b = \|a\|_2 \, \|b\|_2 \cos(\theta)$$
Here, $\|a\|_2$ and $\|b\|_2$ are the Euclidean (L2) norms (magnitudes) of the vectors $a$ and $b$. This formula connects the algebraic calculation ($\sum a_i b_i$) to the geometry (lengths and the angle between vectors).
We can rearrange this formula to find the angle:
$$\cos(\theta) = \frac{a \cdot b}{\|a\|_2 \, \|b\|_2} \qquad \theta = \arccos\left(\frac{a \cdot b}{\|a\|_2 \, \|b\|_2}\right)$$
The sign of the dot product tells us about the angle θ: a positive dot product means θ is less than 90° (the vectors point in generally the same direction), a zero dot product means θ is exactly 90° (the vectors are orthogonal), and a negative dot product means θ is greater than 90° (the vectors point in generally opposite directions).
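To make this concrete, the short NumPy snippet below (an illustrative sketch reusing the vectors from the earlier example) recovers cos(θ) and θ from the dot product:
import numpy as np
a = np.array([1, 2, 3])
b = np.array([-2, 0, 4])
# cos(theta) = (a . b) / (||a||_2 * ||b||_2)
cos_theta = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
theta_degrees = np.degrees(np.arccos(cos_theta))
print(f"cos(theta): {cos_theta:.4f}")
print(f"theta: {theta_degrees:.2f} degrees")
# For these vectors, cos(theta) is about 0.60, so theta is roughly 53 degrees (less than 90)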
The geometric interpretation directly leads to a common measure of similarity between vectors used in machine learning: cosine similarity. It measures the cosine of the angle between two non-zero vectors.
$$\text{Cosine Similarity}(a, b) = \cos(\theta) = \frac{a \cdot b}{\|a\|_2 \, \|b\|_2}$$
Cosine similarity ranges from -1 to 1: a value of 1 means the vectors point in exactly the same direction (θ = 0°), 0 means they are orthogonal (θ = 90°), and -1 means they point in exactly opposite directions (θ = 180°).
This measure is widely used in areas like natural language processing (NLP) to compare document embeddings or word vectors, and in recommendation systems to find similar users or items, because it focuses on the orientation (direction) of the vectors rather than their magnitudes. For instance, two documents discussing the same topic might have vector representations pointing in a similar direction, even if one document is much longer (larger vector magnitude) than the other.
import numpy as np
# Example: Document vectors (e.g., word counts)
doc1 = np.array([2, 1, 0, 3]) # Counts for words A, B, C, D
doc2 = np.array([4, 2, 1, 6]) # Similar topic, longer document
doc3 = np.array([0, 0, 5, 1]) # Different topic
def cosine_similarity(vec1, vec2):
    dot_prod = vec1 @ vec2
    norm1 = np.linalg.norm(vec1)  # L2 norm by default
    norm2 = np.linalg.norm(vec2)
    if norm1 == 0 or norm2 == 0:
        return 0  # Avoid division by zero if a vector is zero
    return dot_prod / (norm1 * norm2)
sim_1_2 = cosine_similarity(doc1, doc2)
sim_1_3 = cosine_similarity(doc1, doc3)
print(f"Similarity between doc1 and doc2: {sim_1_2:.4f}") # Should be close to 1
print(f"Similarity between doc1 and doc3: {sim_1_3:.4f}") # Should be much lower
# Expected output (approx):
# Similarity between doc1 and doc2: 0.9912
# Similarity between doc1 and doc3: 0.1572
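Because cosine similarity depends only on direction, rescaling a vector leaves it unchanged. The short check below (reusing the cosine_similarity function defined above) illustrates this:
# Scaling doc1 changes its magnitude but not its direction,
# so its cosine similarity with doc2 stays the same
doc1_scaled = 10 * doc1
sim_scaled = cosine_similarity(doc1_scaled, doc2)
print(f"Similarity between scaled doc1 and doc2: {sim_scaled:.4f}")
# Prints the same value as for doc1 and doc2 (approx 0.9912)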
Another key application of the dot product is calculating the projection of one vector onto another. Imagine a light shining perpendicularly onto the line containing vector b; the shadow cast by vector a onto that line is the vector projection of a onto b. This projection tells us "how much" of vector a points in the direction of vector b.
Figure: Projection of vector a onto vector b. The dashed purple vector proj_b(a) is the component of a that lies along the direction of b.
There are two related quantities:
Scalar Projection: This is the length of the vector projection. It's the signed magnitude of the "shadow". From trigonometry ($\|a\|_2 \cos(\theta)$) and the dot product formula ($a \cdot b = \|a\|_2 \|b\|_2 \cos(\theta)$), we get:
$$\text{scalar projection of } a \text{ onto } b = \|a\|_2 \cos(\theta) = \frac{a \cdot b}{\|b\|_2}$$
The sign indicates whether the projection points in the same (+) or opposite (-) direction as b.
Vector Projection: This is the actual vector representing the projection. To get this vector, we take the scalar projection (the length) and multiply it by the unit vector in the direction of b, which is $\frac{b}{\|b\|_2}$:
$$\text{proj}_b(a) = (\text{scalar projection}) \times (\text{unit vector of } b)$$
$$\text{proj}_b(a) = \left(\frac{a \cdot b}{\|b\|_2}\right) \frac{b}{\|b\|_2} = \frac{a \cdot b}{\|b\|_2^2} \, b$$
Since $\|b\|_2^2 = b \cdot b$, a common alternative form is:
$$\text{proj}_b(a) = \left(\frac{a \cdot b}{b \cdot b}\right) b$$
Projections are essential in algorithms that need to decompose vectors into components, such as the Gram-Schmidt process for creating orthogonal bases, and they are implicitly used in dimensionality reduction techniques like Principal Component Analysis (PCA), where data is projected onto directions of maximum variance.
Example Calculation: Let $a = [2, 3]$ and $b = [4, 0]$.
$$a \cdot b = (2)(4) + (3)(0) = 8 \qquad b \cdot b = (4)(4) + (0)(0) = 16 \qquad \|b\|_2 = \sqrt{16} = 4$$
Scalar projection of $a$ onto $b$: $\frac{a \cdot b}{\|b\|_2} = \frac{8}{4} = 2$. Vector projection of $a$ onto $b$: $\left(\frac{a \cdot b}{b \cdot b}\right) b = \frac{8}{16}[4, 0] = \frac{1}{2}[4, 0] = [2, 0]$.
This makes sense intuitively: vector a=[2,3] has a component of length 2 along the x-axis (the direction of b=[4,0]).
import numpy as np
a = np.array([2, 3])
b = np.array([4, 0])
# Scalar projection
scalar_proj = (a @ b) / np.linalg.norm(b)
# Vector projection
vector_proj = ( (a @ b) / (b @ b) ) * b
# Alternative for vector projection using the scalar projection:
# unit_b = b / np.linalg.norm(b)
# vector_proj_alt = scalar_proj * unit_b
print(f"Vector a: {a}")
print(f"Vector b: {b}")
print(f"Scalar projection of a onto b: {scalar_proj:.4f}")
print(f"Vector projection of a onto b: {vector_proj}")
# Expected output:
# Scalar projection of a onto b: 2.0000
# Vector projection of a onto b: [2. 0.]
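As mentioned above, repeated projections are the core step of the Gram-Schmidt process: each new basis vector is obtained by subtracting from a vector its projections onto the basis vectors already found, then normalizing the remainder. The function below is a minimal illustrative sketch of that idea, not a numerically robust, production-ready routine:
import numpy as np

def gram_schmidt(vectors):
    # Build an orthonormal basis for the span of the input vectors
    basis = []
    for v in vectors:
        w = v.astype(float)
        for u in basis:
            w = w - (w @ u) * u  # subtract the projection of w onto u (u has unit length)
        norm = np.linalg.norm(w)
        if norm > 1e-10:  # skip vectors that are (numerically) linearly dependent
            basis.append(w / norm)
    return basis

vectors = [np.array([2, 3]), np.array([4, 0])]
orthonormal_basis = gram_schmidt(vectors)
print(orthonormal_basis)
# The resulting vectors have unit length and are mutually orthogonal (dot product 0)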
The dot product and projections are indispensable tools. They allow us to move beyond simple vector arithmetic to analyze the geometric relationships between vectors, measure similarity, and decompose vectors into meaningful components, all of which are frequent operations when working with data in machine learning.