As highlighted in the chapter introduction, performing an exact nearest neighbor search across millions or billions of high-dimensional vectors is often too slow for practical applications like real-time semantic search or Retrieval-Augmented Generation (RAG). Hierarchical Navigable Small World (HNSW) emerges as a highly effective and widely adopted Approximate Nearest Neighbor (ANN) algorithm, offering an excellent balance between search speed, recall (accuracy), and memory usage. It is a graph-based algorithm inspired by the principles of Navigable Small World (NSW) graphs, but with a significant enhancement: a hierarchical structure.
Imagine trying to find a specific address in a large country. You wouldn't drive down every single local street. Instead, you'd likely use highways to get close quickly, then regional roads, and finally local streets for the last part of the trip. HNSW employs a similar strategy using a multi-layer graph.
Simplified visualization of an HNSW graph structure with three layers. Connections exist within layers (solid lines) and points also exist on lower layers (dashed lines indicating presence, not direct inter-layer edges). Searches typically start at the top layer entry point(s) (E) and navigate down.
Building an HNSW graph is an incremental process where points are inserted one by one. When a new vector p is added:
1. Layer assignment: p is given a maximum layer drawn at random from an exponentially decaying distribution governed by the level normalization factor mL: P(layer) ∝ exp(−layer / mL). Most points land only on Layer 0; a smaller mL makes the decay steeper and the upper layers sparser.
2. Greedy search for neighbors (efConstruction): Starting from the top layer, the algorithm searches for the points closest to p. During this search-for-neighbors phase at construction time, a dynamic list (priority queue) of the closest candidates found so far is maintained; its size is controlled by the parameter efConstruction. A larger efConstruction means exploring more potential neighbors at each step, leading to a potentially higher-quality graph structure (better recall later) but increasing the index build time.
3. Linking: At each layer from p's assigned layer down to Layer 0, once the nearest candidates have been found (guided by efConstruction), connections are established between p and a subset of these neighbors. Each point keeps at most M connections per layer; a separate limit, Mmax0, might be used for Layer 0, often set higher (e.g., 2M) to improve connectivity at the base layer.

This layered construction with controlled connectivity (M) and guided search (efConstruction) aims to create a graph that allows efficient navigation from coarse to fine-grained detail.
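The layer-assignment step can be sketched in a few lines of Python. This is a toy illustration, not any particular library's implementation; the assign_layer helper is hypothetical, and the choice mL = 1/ln(M) is a commonly cited heuristic rather than a requirement:

```python
import math
import random

def assign_layer(m_l: float, rng: random.Random) -> int:
    # Draw a maximum layer from the exponentially decaying distribution
    # P(layer >= l) = exp(-l / mL): most points stay on Layer 0.
    u = 1.0 - rng.random()          # uniform in (0, 1], avoids log(0)
    return int(-math.log(u) * m_l)

rng = random.Random(42)
M = 16
m_l = 1.0 / math.log(M)             # common heuristic: mL = 1 / ln(M)
layers = [assign_layer(m_l, rng) for _ in range(100_000)]

# Upper layers receive exponentially fewer points than Layer 0.
share_layer0 = sum(1 for l in layers if l == 0) / len(layers)
print(f"tallest layer: {max(layers)}, share on Layer 0 only: {share_layer0:.3f}")
```

With M = 16, roughly 1 − 1/16 ≈ 94% of points never rise above Layer 0, which is what keeps the upper layers sparse and cheap to traverse.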
Searching for the k nearest neighbors of a query vector q mirrors the insertion process but focuses solely on finding the closest points without modifying the graph:
1. Descend the hierarchy: Starting from the entry point on the top layer, greedily move toward q, dropping down a layer whenever no closer neighbor can be found at the current layer.
2. Candidate list (efSearch): On the lower layers, similar to construction, maintain a dynamic priority queue of the candidate nearest neighbors found so far. The size of this list is controlled by the parameter efSearch, a critical parameter for tuning the trade-off between search speed and accuracy (recall):
- efSearch must be at least k (the number of neighbors requested).
- A larger efSearch explores more paths in the graph, increasing the probability of finding the true nearest neighbors (higher recall) but taking more time (higher latency).
- A smaller efSearch results in a faster search but might miss some true neighbors (lower recall), settling for "good enough" approximate neighbors.
3. Termination: Once the stopping criterion is met (e.g., the priority queue cannot be improved further), the algorithm returns the k closest vectors from the final candidate list.

Effectively using HNSW often involves tuning a few important parameters:
- M (Max Connections per Layer): Controls the density of the graph within each layer (except potentially Layer 0). A larger M increases graph connectivity, potentially improving recall and robustness, but it also significantly increases the memory footprint of the index and the graph construction time.
- efConstruction (Construction Candidate List Size): Controls the depth of the search performed during index building when finding neighbors for a new point. A higher efConstruction leads to a better-quality graph structure (better recall during actual searches) but substantially increases index build time; it has less impact on index size than M.
- efSearch (Search Candidate List Size): Controls the depth of the search performed at query time. A higher efSearch increases recall but slows down queries; it must be ≥ k.
- mL (Level Normalization Factor): Influences the probability distribution used to assign layers during construction.
Tuning these parameters is essential for optimizing HNSW for specific dataset characteristics and application requirements (e.g., prioritizing low latency vs. maximizing recall).
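To make the efSearch trade-off concrete, here is a toy best-first search over a single layer. It is a simplified sketch of the bounded candidate list, not a full HNSW implementation; the search_layer helper, the chain graph, and the 1-D "vectors" are purely illustrative:

```python
import heapq

def search_layer(graph, vectors, dist, query, entry, ef):
    """Best-first search on one layer with an ef-bounded result list."""
    visited = {entry}
    d0 = dist(vectors[entry], query)
    candidates = [(d0, entry)]   # min-heap: next nodes to expand
    results = [(-d0, entry)]     # max-heap (negated): best `ef` found so far
    while candidates:
        d, node = heapq.heappop(candidates)
        if d > -results[0][0]:   # nearest unexpanded node is already worse
            break                # than the worst kept result: stop searching
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d_nb = dist(vectors[nb], query)
            if len(results) < ef or d_nb < -results[0][0]:
                heapq.heappush(candidates, (d_nb, nb))
                heapq.heappush(results, (-d_nb, nb))
                if len(results) > ef:
                    heapq.heappop(results)   # drop the worst, keep size ef
    return sorted((-d, n) for d, n in results)

# Toy data: eight 1-D "vectors" on a chain graph 0-1-2-...-7.
graph = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)}
vectors = [float(i) for i in range(8)]
found = search_layer(graph, vectors, lambda a, b: abs(a - b), 3.2, 0, ef=2)
print([node for _, node in found])   # prints [3, 4]
```

A larger ef keeps more provisional results alive, so the stopping condition fires later and more of the graph is explored: exactly the recall-versus-latency dial that efSearch controls.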
Advantages:
- An excellent balance between search speed and recall, with strong performance even at the scale of millions to billions of vectors.
- Incremental construction: points are inserted one by one, so the index can be built and extended over time.

Disadvantages:
- Index construction, especially with a high efConstruction, can be time-consuming for very large datasets.
- A larger M significantly increases the memory footprint of the index.
- Tuning M, efConstruction, and efSearch can require careful experimentation and evaluation.

HNSW stands as a cornerstone algorithm for modern vector search due to its impressive performance characteristics. Understanding its layered structure, construction process, and search mechanism is fundamental to building efficient and effective semantic search and RAG systems. In the following sections, we will explore other important ANN techniques like IVF and PQ, and later discuss how they can sometimes be combined with HNSW for further optimization.
© 2025 ApX Machine Learning