While we often think of vectors as arrows pointing in space, a critical question is, "How long is that arrow?" This "length" or "magnitude" is a fundamental property of a vector, and in linear algebra, we use a function called a norm to measure it. A norm takes a vector as input and returns a single, non-negative number representing its size.
Understanding norms is important because they form the basis for many machine learning applications, from measuring error in model predictions to regularizing models to prevent overfitting. We will look at the two most common vector norms: the L2 norm and the L1 norm.
The L2 norm is what most people intuitively think of as "distance." It calculates the length of a vector as a straight line from the origin to its endpoint. You might remember this from geometry as the Pythagorean theorem. For a 2D vector $v = [x, y]$, its L2 norm is $\sqrt{x^2 + y^2}$. This idea extends to any number of dimensions.
The formula for the L2 norm of a vector v with n components is:
$$||v||_2 = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}$$

The double vertical bars $||\cdot||$ are the standard notation for a norm, and the subscript 2 indicates it's the L2 norm. Because it's so common, the subscript is often omitted, so if you see $||v||$, it usually refers to the L2 norm.
Let's take the vector $v = [3, 4]$. Its L2 norm is:

$$||v||_2 = \sqrt{3^2 + 4^2} = \sqrt{9 + 16} = \sqrt{25} = 5$$

This means the vector has a length of 5 units. Geometrically, it forms the hypotenuse of a right triangle with sides of length 3 and 4.
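This arithmetic translates directly into code. Here is a minimal Python sketch that computes the L2 norm of $[3, 4]$ straight from the formula (NumPy's built-in function is covered later in this section):

```python
import math

def l2_norm(v):
    """Sum the squared components, then take the square root."""
    return math.sqrt(sum(x * x for x in v))

v = [3, 4]
print(l2_norm(v))  # 5.0
```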
The L2 norm of the vector [3,4] is the straight-line distance from the origin, which is 5.
An interesting property is that the L2 norm is directly related to the dot product. The squared L2 norm of a vector is equal to its dot product with itself: ∣∣v∣∣22=v⋅v.
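This relationship is easy to check numerically. A quick sketch using NumPy:

```python
import numpy as np

v = np.array([3, 4])

# The squared L2 norm, computed two ways
norm_squared = np.linalg.norm(v) ** 2  # (L2 norm)^2
dot_with_self = np.dot(v, v)           # v . v

print(norm_squared)   # 25.0
print(dot_with_self)  # 25
```

Both quantities equal 25, confirming $||v||_2^2 = v \cdot v$.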
The L1 norm offers a different way to measure a vector's length. Instead of measuring the direct "as-the-crow-flies" distance, it sums the absolute values of each component. It's often called the Manhattan norm or taxicab distance. Imagine you are in a city with a grid-like street layout. You can't travel through buildings (diagonally); you must move along the streets (the axes).
The formula for the L1 norm of a vector v is:
$$||v||_1 = |v_1| + |v_2| + \cdots + |v_n| = \sum_{i=1}^{n} |v_i|$$

Let's use our same vector, $v = [3, 4]$. Its L1 norm is calculated as:
$$||v||_1 = |3| + |4| = 7$$

Geometrically, this is the total distance you would travel to get from the origin to the point $(3, 4)$ by moving only along the x-axis and then the y-axis.
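As with the L2 norm, the formula maps directly onto a short Python sketch. The absolute value matters when components are negative:

```python
def l1_norm(v):
    """Sum the absolute values of the components."""
    return sum(abs(x) for x in v)

print(l1_norm([3, 4]))   # 7
print(l1_norm([3, -4]))  # also 7: signs do not affect the norm
```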
The L1 norm measures the path along the axes. For the vector $[3, 4]$, the L1 distance is 7, whereas the L2 distance is 5.
You might be wondering why we need more than one way to measure distance. In machine learning, different norms have different properties that make them useful for specific tasks.
L2 Norm: This is the most common norm for measuring error. For example, in linear regression, models are often trained by minimizing the L2 norm of the error vector (the difference between predicted and actual values). This is called minimizing the "sum of squared errors." The L2 norm is also used in a regularization technique called Ridge Regression, which penalizes large model weights, encouraging solutions where weight values are small and spread out.
L1 Norm: The L1 norm is also used for measuring error and for regularization. A regularization technique called Lasso Regression uses the L1 norm to penalize weights. The L1 norm has a unique property: it tends to produce sparse solutions, meaning it drives many model weights to become exactly zero. This effectively performs feature selection by telling you which features are not important for the prediction.
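To build intuition for why the L1 penalty encourages sparsity, compare two hypothetical weight vectors with (nearly) the same L2 norm: one concentrated in a single feature, one spread evenly across two. The sparse vector is cheaper under an L1 penalty but not under an L2 penalty, so an L1-regularized model has an incentive to push weight into fewer features. (These vectors are illustrative, not taken from a trained model.)

```python
import numpy as np

# Two weight vectors with (nearly) the same L2 norm
w_sparse = np.array([1.0, 0.0])        # weight concentrated in one feature
w_spread = np.array([0.7071, 0.7071])  # weight spread across two features

# L2 norms are essentially equal (~1.0 each)
print(np.linalg.norm(w_sparse), np.linalg.norm(w_spread))

# L1 norms differ: 1.0 for the sparse vector vs ~1.41 for the spread one
print(np.linalg.norm(w_sparse, ord=1), np.linalg.norm(w_spread, ord=1))
```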
As you would expect, NumPy provides a simple and efficient way to calculate vector norms using the numpy.linalg.norm() function. This function calculates the L2 norm by default, but you can specify other norms using the ord parameter.
Let's see it in action with a new vector, v=[−5,12].
First, let's calculate the L2 norm: $||v||_2 = \sqrt{(-5)^2 + 12^2} = \sqrt{25 + 144} = \sqrt{169} = 13$
Next, the L1 norm: $||v||_1 = |-5| + |12| = 5 + 12 = 17$
Now, let's verify these results with NumPy.
```python
import numpy as np

# Create our example vector
v = np.array([-5, 12])

# Calculate the L2 norm (the default)
l2_norm = np.linalg.norm(v)

# Calculate the L1 norm by setting ord=1
l1_norm = np.linalg.norm(v, ord=1)

print(f"Vector: {v}")
print(f"L2 Norm (Euclidean): {l2_norm}")
print(f"L1 Norm (Manhattan): {l1_norm}")
```

Output:

```
Vector: [-5 12]
L2 Norm (Euclidean): 13.0
L1 Norm (Manhattan): 17.0
```
The NumPy results match our manual calculations perfectly. Being able to compute the "size" of vectors is a building block you will use frequently as you work with more complex machine learning algorithms.