While Product Quantization (PQ), discussed in the previous section, offers significant memory reduction by compressing vector subspaces independently, its effectiveness hinges on how the vector is initially split. Standard PQ typically splits the vector into contiguous blocks. If the data distribution has strong correlations across these arbitrary boundaries, or if variance is concentrated in only a few subspaces, this splitting strategy can be suboptimal, leading to higher quantization errors and reduced search accuracy.
Optimized Product Quantization (OPQ) addresses this limitation by introducing a preprocessing step: it learns an optimal rotation of the vector space before applying PQ. The goal is to transform the data so that the variance is distributed more evenly across the subspaces that PQ will create. This rotation aims to align the data distribution better with the subspace divisions, minimizing the information lost during quantization.
At the heart of OPQ is an orthogonal matrix R. An orthogonal matrix represents a rotation (or reflection) in space, preserving distances and angles between vectors. Before a vector x is quantized using PQ, it's first transformed by this matrix:
x′ = Rx

This rotated vector x′ is then split into sub-vectors, and standard PQ is applied to these sub-vectors. Why does this help? Imagine your data points form an elongated cloud in space. If you split this cloud along the original axes using standard PQ, some sub-quantizers might deal with very spread-out data (high variance), while others deal with tightly clustered data (low variance), leading to inefficient codebook usage. OPQ learns a rotation R that ideally orients this cloud so that when x′ = Rx is split, each resulting subspace captures a similar amount of variance, allowing the PQ codebooks for each subspace to be used more effectively.
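To make the variance-balancing idea concrete, here is a minimal NumPy sketch (not part of any library) that applies an orthogonal matrix to a correlated data cloud and compares how much variance each contiguous subspace holds before and after. A random orthogonal matrix stands in for R here, so it only illustrates the mechanics; OPQ would learn R specifically to even out the subspaces.

import numpy as np

rng = np.random.default_rng(0)
d, n, m = 8, 10000, 4                  # dimension, samples, number of subspaces

# Build a correlated, elongated data cloud: variance is concentrated
# in a few directions rather than spread evenly over the coordinates.
mixing = rng.normal(size=(d, d))
x = rng.normal(size=(n, d)) @ mixing

# A stand-in orthogonal matrix from a QR decomposition; OPQ would learn R instead.
R, _ = np.linalg.qr(rng.normal(size=(d, d)))
x_rot = x @ R.T                        # x' = Rx, applied row-wise

def subspace_variance(data, m):
    # Total variance captured by each contiguous block of d/m dimensions
    blocks = np.split(np.arange(data.shape[1]), m)
    return [data[:, b].var(axis=0).sum() for b in blocks]

print("per-subspace variance, original:", np.round(subspace_variance(x, m), 1))
print("per-subspace variance, rotated :", np.round(subspace_variance(x_rot, m), 1))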
Comparison showing OPQ adding a rotation step before standard PQ processing.
The rotation matrix R isn't arbitrary; it's learned from a representative sample of the training data. The objective is to find the orthogonal matrix R that minimizes the overall quantization error after applying PQ to the rotated vectors Rx.
This learning process is typically iterative, alternating between two steps:

1. Fix the rotation R, rotate the training vectors, and train the PQ codebooks (k-means within each subspace) on the rotated data.
2. Fix the codebooks and update R by solving an orthogonal Procrustes problem: find the orthogonal matrix that best maps the original vectors onto their current PQ reconstructions, which has a closed-form solution via an SVD.
This iterative process fine-tunes both the rotation and the PQ codebooks together, leading to a combined transformation that better preserves the original vector information within the compressed representation.
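A minimal NumPy sketch of this alternating loop is shown below, assuming the dimension is evenly divisible by the number of subspaces and a small training set. The function name train_opq, the tiny per-subspace k-means, and all parameters are illustrative rather than any library's API; Faiss performs the equivalent training internally when the factory string includes an OPQ prefix, as in the example further down.

import numpy as np

def train_opq(x, m, ks=256, n_iter=10, seed=0):
    """Illustrative non-parametric OPQ training loop (small data only)."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    ds = d // m                          # dimensions per subspace
    R = np.eye(d, dtype=x.dtype)         # start from the identity rotation
    for _ in range(n_iter):
        xr = x @ R.T                     # step 1: rotate the training data, x' = Rx
        y = np.empty_like(xr)            # PQ reconstruction of the rotated data
        for j in range(m):
            sub = xr[:, j * ds:(j + 1) * ds]
            cb = sub[rng.choice(n, ks, replace=False)]    # init codebook from samples
            for _ in range(5):           # a few Lloyd (k-means) iterations per subspace
                assign = ((sub[:, None, :] - cb[None]) ** 2).sum(-1).argmin(1)
                for c in range(ks):
                    pts = sub[assign == c]
                    if len(pts):
                        cb[c] = pts.mean(0)
            assign = ((sub[:, None, :] - cb[None]) ** 2).sum(-1).argmin(1)
            y[:, j * ds:(j + 1) * ds] = cb[assign]
        # Step 2: fix the codebooks and re-solve for R (orthogonal Procrustes):
        # minimize ||x R^T - y||_F over orthogonal R via an SVD of x^T y.
        u, _, vt = np.linalg.svd(x.T @ y)
        R = (u @ vt).T
    return R

In practice the loop runs until the overall quantization error stops improving, and the final R together with the final codebooks defines the index.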
Indexing: Once the rotation matrix R and the corresponding PQ codebooks are learned, indexing a database vector x involves:

1. Rotating it: x′ = Rx.
2. (When combined with IVF) Assigning x′ to a coarse cell.
3. Encoding x′ (or its residual from the cell centroid) with the learned PQ codebooks and storing the compact code.
Search: At query time, the same rotation must be applied to the query vector q:

1. Rotate the query: q′ = Rq.
2. (When combined with IVF) Select the coarse cells closest to q′.
3. Compute approximate distances between q′ and the stored PQ codes using precomputed lookup tables.
Libraries like Faiss abstract much of this complexity, allowing you to specify OPQ as a preprocessing step within an index configuration string.
# Conceptual Faiss Example demonstrating OPQ integration
import faiss
import numpy as np
d = 128 # Vector dimension
nb = 100000 # Database size
nt = 20000 # Training set size
nq = 1000 # Number of queries
# Generate some random data
xt = np.random.rand(nt, d).astype('float32') * 10
xb = np.random.rand(nb, d).astype('float32') * 10
xq = np.random.rand(nq, d).astype('float32') * 10
# PQ parameters
m = 16 # Number of subspaces
bits = 8 # Bits per code
# IVF parameters (often used with OPQ/PQ)
nlist = 256 # Number of coarse quantization cells
# Faiss index factory string incorporating OPQ
# Format: "OPQ<m_opq>,[IVF<nlist>,]PQ<m>x<bits>"
# m_opq specifies the number of subspaces used when training the OPQ rotation (usually set equal to the PQ m)
index_string = f"OPQ{m},IVF{nlist},PQ{m}x{bits}"
# Build the index
print(f"Building Faiss index: {index_string}")
index = faiss.index_factory(d, index_string, faiss.METRIC_L2)
# Check if index needs training
if not index.is_trained:
    print("Training index...")
    # Training learns the IVF centroids, the OPQ rotation matrix R,
    # and the PQ codebooks based on the rotated data.
    index.train(xt)
else:
    print("Index does not require separate training step.")  # e.g., for Flat indexes
print("Adding database vectors...")
# During add, vectors are automatically processed:
# 1. Rotated: x' = Rx
# 2. Assigned to IVF cell based on x'
# 3. PQ encoded based on residual (x' - centroid)
index.add(xb)
print(f"Index contains {index.ntotal} vectors.")
# Set IVF search parameter (number of cells to visit).
# The OPQ prefix makes index_factory return an IndexPreTransform wrapping the
# IVF index, so set nprobe on the underlying IVF index rather than the wrapper.
faiss.extract_index_ivf(index).nprobe = 16  # higher nprobe: better accuracy, slower search
print("Performing search...")
k = 5 # Find 5 nearest neighbors
# During search, the query vector q is:
# 1. Rotated: q' = Rq
# 2. IVF cells selected based on q'
# 3. Distances computed using PQ codes in selected cells
distances, indices = index.search(xq, k)
print(f"Search complete. Example results for first query:")
print(f"Indices: {indices[0]}")
print(f"Distances: {distances[0]}")
Implementing OPQ involves balancing its benefits against its costs:

- Benefits: lower quantization error for the same code size, which typically translates into better recall at a given memory budget.
- Costs: a longer training phase (the rotation is learned alongside the codebooks), an extra d×d matrix multiplication for every vector at indexing time and every query at search time, and the need to store the rotation matrix itself.
OPQ is particularly beneficial when:

- The dimensions of your vectors are strongly correlated, so variance is unevenly distributed across the contiguous subspaces standard PQ would use.
- Vectors are built by concatenating heterogeneous features or embeddings, a common source of exactly such imbalances.
For datasets where dimensions are already reasonably decorrelated or when index build time is highly constrained, the gains from OPQ might not outweigh its costs compared to simpler techniques like standard PQ. As with many optimization techniques, evaluating its effectiveness on your specific data and application requirements through experimentation (covered in Chapter 5) is essential.
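One hedged way to run such an experiment with Faiss is sketched below. It reuses the variables from the earlier example (d, m, bits, nlist, xt, xb, xq and the faiss import) and compares recall against an exact flat index for the same PQ configuration with and without the OPQ prefix. recall_at_k is an illustrative helper, not a Faiss function; on the uniform random data above the two variants will score similarly, and differences usually show up on real, correlated embeddings.

# Compare recall@5 of PQ vs. OPQ+PQ against exact (flat) search.
# Reuses d, m, bits, nlist, xt, xb, xq from the example above.
ground_truth = faiss.IndexFlatL2(d)
ground_truth.add(xb)
_, gt = ground_truth.search(xq, 1)            # exact nearest neighbor per query

def recall_at_k(factory_string, k=5, nprobe=16):
    idx = faiss.index_factory(d, factory_string, faiss.METRIC_L2)
    idx.train(xt)
    idx.add(xb)
    faiss.extract_index_ivf(idx).nprobe = nprobe
    _, found = idx.search(xq, k)
    # Fraction of queries whose true nearest neighbor appears in the top k
    return float((found == gt).any(axis=1).mean())

print("PQ only  recall@5:", recall_at_k(f"IVF{nlist},PQ{m}x{bits}"))
print("With OPQ recall@5:", recall_at_k(f"OPQ{m},IVF{nlist},PQ{m}x{bits}"))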