We've seen how to represent vectors and perform basic operations like addition and scalar multiplication. But often, we need a way to measure the "size" or "length" of a vector. Think about a simple vector like $v = [3, 4]$. Geometrically, this represents an arrow starting at the origin $(0, 0)$ and ending at the point $(3, 4)$ in a 2D plane. How long is this arrow? We can use the Pythagorean theorem: $\sqrt{3^2 + 4^2} = \sqrt{9 + 16} = \sqrt{25} = 5$. This intuitive idea of length is formalized using the concept of a norm.
A norm is a function that assigns a non-negative length, or magnitude, to each vector in a vector space; every vector has a strictly positive length except the zero vector, whose length is zero. There isn't just one way to measure length. We'll look at two of the most common norms used in machine learning: the L2 norm and the L1 norm.
The most common way to measure vector length is the L2 norm, also known as the Euclidean norm. This corresponds exactly to our intuitive understanding of distance in Euclidean space (the straight-line distance).
For a vector $v$ with $n$ elements, $v = [v_1, v_2, \ldots, v_n]$, the L2 norm is calculated as the square root of the sum of the squared vector elements:

$$\|v\|_2 = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2} = \sqrt{\sum_{i=1}^{n} v_i^2}$$

Notice how for our 2D example $v = [3, 4]$, this formula gives $\sqrt{3^2 + 4^2} = 5$, matching the Pythagorean theorem.
In NumPy, you can calculate the L2 norm using the np.linalg.norm() function, which computes the L2 norm by default.
import numpy as np
# Define a vector
v = np.array([3, 4])
# Calculate the L2 norm
l2_norm = np.linalg.norm(v)
print(f"Vector: {v}")
print(f"L2 Norm (Euclidean Length): {l2_norm}")
# Another example (3D vector)
w = np.array([1, 2, -2])
l2_norm_w = np.linalg.norm(w)
# Calculation: sqrt(1^2 + 2^2 + (-2)^2) = sqrt(1 + 4 + 4) = sqrt(9) = 3
print(f"\nVector: {w}")
print(f"L2 Norm: {l2_norm_w}")
The output will be:
Vector: [3 4]
L2 Norm (Euclidean Length): 5.0
Vector: [ 1 2 -2]
L2 Norm: 3.0
The L2 norm is widely used in machine learning, for instance, in regularization techniques (like Ridge regression) to penalize large coefficient values and in calculating distance metrics.
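As an illustrative sketch of that idea, here is how a ridge-style L2 penalty might be computed on a set of model coefficients. The variable names (`coefficients`, `alpha`) are illustrative, not tied to any particular library:

```python
import numpy as np

# Hypothetical model coefficients and regularization strength
coefficients = np.array([0.5, -1.2, 3.0])
alpha = 0.1

# Ridge-style penalty: alpha times the squared L2 norm of the coefficients
l2_penalty = alpha * np.linalg.norm(coefficients) ** 2
# 0.1 * (0.25 + 1.44 + 9.0) = 0.1 * 10.69 = 1.069
print(f"L2 penalty: {l2_penalty}")
```

Because the penalty grows with the square of each coefficient, the optimizer is pushed toward solutions with smaller coefficient values overall.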
Another useful way to measure the size of a vector is the L1 norm, sometimes called the Manhattan norm or Taxicab norm. Instead of squaring the elements, the L1 norm sums the absolute values of the elements.
For the same vector $v = [v_1, v_2, \ldots, v_n]$, the L1 norm is calculated as:

$$\|v\|_1 = |v_1| + |v_2| + \cdots + |v_n| = \sum_{i=1}^{n} |v_i|$$

Why "Manhattan norm"? Imagine you are in a city like Manhattan where streets form a grid. To get from point A to point B, you can't travel in a straight line (like the L2 norm measures). You have to travel along the streets (horizontally and vertically). The L1 norm represents this total distance traveled along the grid axes.
For our example vector $v = [3, 4]$, the L1 norm is $|3| + |4| = 3 + 4 = 7$.
To calculate the L1 norm in NumPy, you use the same np.linalg.norm() function but specify the order parameter ord=1.
import numpy as np
# Define a vector
v = np.array([3, 4])
# Calculate the L1 norm
l1_norm = np.linalg.norm(v, ord=1)
print(f"Vector: {v}")
print(f"L1 Norm (Manhattan Length): {l1_norm}")
# Another example (with negative values)
w = np.array([1, -2, -2])
l1_norm_w = np.linalg.norm(w, ord=1)
# Calculation: |1| + |-2| + |-2| = 1 + 2 + 2 = 5
print(f"\nVector: {w}")
print(f"L1 Norm: {l1_norm_w}")
The output will be:
Vector: [3 4]
L1 Norm (Manhattan Length): 7.0
Vector: [ 1 -2 -2]
L1 Norm: 5.0
The L1 norm is also significant in machine learning. It tends to produce sparse results (meaning many values are zero) when used in optimization contexts, such as in Lasso regression for feature selection.
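To see how an L1 penalty produces exact zeros, consider the soft-thresholding operator, which appears in coordinate-descent solvers for Lasso. This is a minimal sketch with illustrative names, not a full Lasso implementation:

```python
import numpy as np

def soft_threshold(x, lam):
    """Shrink each value toward zero by lam; values within lam of zero become exactly 0."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

coeffs = np.array([2.5, -0.3, 0.05, -1.8])
shrunk = soft_threshold(coeffs, lam=0.5)
print(shrunk)  # the two small coefficients are set exactly to zero
```

Unlike the L2 penalty, which only shrinks coefficients toward zero, this L1-driven update sets small coefficients exactly to zero, which is why Lasso is useful for feature selection.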
The difference between the L1 and L2 norms is easiest to see geometrically in 2D. The L2 norm is the direct "as the crow flies" distance, while the L1 norm is the distance traveled along the grid lines.
Visualization comparing the L1 (Manhattan, dashed orange) and L2 (Euclidean, solid blue) paths for the vector [3, 4] from the origin (0,0). The L2 norm represents the direct distance (5.0), while the L1 norm represents the distance traveled along the axes (3 + 4 = 7.0).
Understanding norms gives us a fundamental tool for measuring the magnitude of vectors, a concept that appears frequently when working with data and machine learning algorithms. While L1 and L2 are the most common, other norms exist (like the Lp norm, a generalization, or the L∞ norm, the maximum absolute value component), but L1 and L2 provide a solid foundation.
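Those other norms can be computed with the same NumPy function by changing the ord parameter. A brief sketch:

```python
import numpy as np

v = np.array([1, -4, 2])

# L-infinity norm: the maximum absolute component
linf = np.linalg.norm(v, ord=np.inf)  # max(|1|, |-4|, |2|) = 4.0

# A general Lp norm, e.g. p = 3: (|1|^3 + |-4|^3 + |2|^3)^(1/3)
l3 = np.linalg.norm(v, ord=3)  # (1 + 64 + 8)^(1/3) = 73^(1/3)

print(linf, l3)
```

Setting ord=2 (or omitting it) and ord=1 recovers the L2 and L1 norms covered above, so all of these norms share one interface.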
© 2025 ApX Machine Learning