You've successfully implemented vector search and keyword search, obtaining two distinct lists of ranked documents for a given query. How do you combine these lists into a single, coherent ranking that leverages the strengths of both semantic relevance (from vectors) and term matching (from keywords)? This is the challenge addressed by result fusion algorithms. Simply concatenating or alternating results often produces suboptimal rankings. Effective fusion requires a principled approach to merge the information contained in the individual ranked lists.
Several techniques exist, ranging from simple heuristics to more sophisticated methods. A common challenge is that the relevance scores produced by different systems (e.g., cosine similarity for vectors, BM25 scores for keywords) are often not directly comparable. Vector similarities might range from -1 to 1 or 0 to 1, while BM25 scores can be unbounded positive numbers. Fusion methods that rely directly on these scores often require careful normalization, which can be difficult to get right.
Reciprocal Rank Fusion (RRF) offers an elegant solution that sidesteps the score normalization problem by focusing solely on the rank of each document within the individual result lists. It operates on the principle that documents consistently ranked higher across multiple systems are likely more relevant.
The RRF score for a document d is calculated by summing the reciprocal of its rank in each list, adjusted by a constant k:
$$\text{RRF\_Score}(d) = \sum_{i=1}^{N} \frac{1}{k + \text{rank}_i(d)}$$

Where:

- $N$ is the number of result lists being fused.
- $\text{rank}_i(d)$ is the rank of document $d$ in list $i$ (1 for the top result); if $d$ does not appear in list $i$, that list contributes nothing.
- $k$ is a smoothing constant (60 is a common default) that dampens the influence of differences among very high ranks.
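The formula translates directly into a short function. Below is a minimal sketch in Python; the function name `reciprocal_rank_fusion` and the input format (each result list given as a sequence of document IDs, best first) are illustrative choices, not a specific library's API.

```python
from collections import defaultdict

def reciprocal_rank_fusion(result_lists, k=60):
    """Fuse several ranked lists of document IDs using RRF.

    result_lists: iterable of lists, each ordered best-first.
    k: smoothing constant; 60 is a commonly used default.
    Returns (doc_id, score) pairs sorted by descending RRF score.
    """
    scores = defaultdict(float)
    for ranking in result_lists:
        for position, doc_id in enumerate(ranking, start=1):
            # Documents missing from a list simply receive no
            # contribution from that list.
            scores[doc_id] += 1.0 / (k + position)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```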
How RRF Works
Imagine you have results from a vector search (List V) and a keyword search (List K):

- List V (vector search), best first: DocA, DocB, DocC
- List K (keyword search), best first: DocB, DocD, DocA
Let's calculate the RRF scores using k=60:
| Document | Rank (List V) | Rank (List K) | Contribution V (1/(60+rank)) | Contribution K (1/(60+rank)) | RRF Score (Sum) |
|---|---|---|---|---|---|
| DocA | 1 | 3 | 1/61 ≈ 0.01639 | 1/63 ≈ 0.01587 | 0.03226 |
| DocB | 2 | 1 | 1/62 ≈ 0.01613 | 1/61 ≈ 0.01639 | 0.03252 |
| DocC | 3 | Not present | 1/63 ≈ 0.01587 | 0 | 0.01587 |
| DocD | Not present | 2 | 0 | 1/62 ≈ 0.01613 | 0.01613 |
The final fused list is ordered by descending RRF score: DocB, DocA, DocD, DocC.
RRF calculation results for the example documents. DocB slightly outscores DocA because its ranks (2 and 1) contribute more in total than DocA's ranks (1 and 3), even though both documents appear in both lists.
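Using the `reciprocal_rank_fusion` sketch from earlier (a hypothetical helper, not a library function), the worked example can be reproduced directly:

```python
list_v = ["DocA", "DocB", "DocC"]   # vector search results, best first
list_k = ["DocB", "DocD", "DocA"]   # keyword search results, best first

for doc_id, score in reciprocal_rank_fusion([list_v, list_k], k=60):
    print(f"{doc_id}: {score:.5f}")

# Expected ordering: DocB, DocA, DocD, DocC,
# with scores matching the table above up to rounding.
```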
Advantages of RRF:

- No score normalization is needed: only ranks are used, so incompatible score scales (cosine similarity vs. BM25) are not a problem.
- It is simple to implement and has a single parameter, $k$.
- It handles documents that appear in only some of the lists gracefully, as DocC and DocD illustrate above.
- It rewards documents that are ranked consistently well across systems, which tends to make it robust in practice.
Considerations for RRF:

- The constant $k$ controls how quickly contributions decay with rank; $k=60$ is a common default, but it can be tuned for your data.
- Because RRF ignores the original scores, it discards information about how confident each system was: rank 1 counts the same whether it was a near-perfect match or merely the best of a weak set.
- All contributing systems are weighted equally unless you extend the formula with per-system weights (see weighted fusion below).
While RRF is a popular and effective choice, other fusion methods exist:
Score-Based Combination (CombSUM, CombMNZ): These methods combine the original relevance scores rather than ranks. CombSUM sums a document's (normalized) scores across lists; CombMNZ multiplies that sum by the number of lists in which the document appears, rewarding agreement between systems. Both depend on score normalization (e.g., min-max scaling) being done well, which, as noted above, can be difficult. A sketch follows below.
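Here is a minimal sketch of CombSUM and CombMNZ, assuming per-list min-max normalization; the function names and the dict-based input format (document ID mapped to raw score) are illustrative, not a particular library's interface.

```python
def min_max_normalize(scores):
    """Scale one list's raw scores into [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:  # all scores equal; avoid division by zero
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def comb_sum_mnz(score_lists):
    """score_lists: list of dicts mapping doc_id -> raw score.

    Returns (combsum, combmnz) dicts of fused scores.
    """
    combsum, hit_counts = {}, {}
    for raw in score_lists:
        for doc, s in min_max_normalize(raw).items():
            combsum[doc] = combsum.get(doc, 0.0) + s
            hit_counts[doc] = hit_counts.get(doc, 0) + 1
    combmnz = {doc: combsum[doc] * hit_counts[doc] for doc in combsum}
    return combsum, combmnz
```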
Rank-Based Combination (Borda Count): Similar to RRF, Borda Count uses ranks. For a list of size M, it assigns points based on rank: M points for rank 1, M−1 points for rank 2, ..., 1 point for rank M. Scores are summed across lists for each document. Like RRF, it avoids score normalization but uses a different weighting scheme based on rank position.
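A Borda count sketch, for comparison with the RRF function above (again, the function name and input format are illustrative):

```python
def borda_count(result_lists):
    """Fuse ranked lists of document IDs with Borda count.

    Each list of size M awards M points to rank 1, M-1 to rank 2,
    ..., 1 point to rank M; points are summed per document.
    """
    points = {}
    for ranking in result_lists:
        m = len(ranking)
        for position, doc_id in enumerate(ranking, start=1):
            points[doc_id] = points.get(doc_id, 0) + (m - position + 1)
    return sorted(points.items(), key=lambda item: item[1], reverse=True)
```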
Weighted Fusion: A straightforward approach is to assign weights to each search system and compute a weighted sum of (normalized) scores or ranks.
$$\text{WeightedScore}(d) = \sum_{i=1}^{N} w_i \times \text{normalized\_score}_i(d)$$

Or using ranks (similar to a weighted RRF):
$$\text{WeightedRankScore}(d) = \sum_{i=1}^{N} \frac{w_i}{k + \text{rank}_i(d)}$$

The challenge lies in determining the optimal weights ($w_i$). These might be set heuristically or, more systematically, learned using machine learning techniques (often called "Learning to Rank" or LTR), where a model is trained on labeled data to predict the best combination. LTR is powerful but significantly more complex to implement and maintain.
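A weighted RRF variant is a small change to the earlier sketch. The weights below (0.7 and 0.3) are purely illustrative values you would tune or learn for your own systems:

```python
def weighted_rrf(result_lists, weights, k=60):
    """Weighted RRF: each list's contribution is scaled by its weight."""
    scores = {}
    for ranking, w in zip(result_lists, weights):
        for position, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + w / (k + position)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Example, trusting the vector ranking more than the keyword ranking:
fused = weighted_rrf([["DocA", "DocB", "DocC"], ["DocB", "DocD", "DocA"]],
                     weights=[0.7, 0.3])
```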
For many hybrid search applications, RRF provides a strong baseline due to its simplicity and robustness. It's often the recommended starting point.
When implementing fusion:

- Make sure every retrieval system returns documents under the same identifiers, so results can be matched up when merging.
- Decide how many candidates to retrieve from each system before fusing; too few can starve the fused list, while too many adds latency.
- Validate the fused ranking, and any constants such as $k$ or per-system weights, against representative queries rather than relying on defaults.
Fusion adds a computational step after retrieving results from individual systems. While typically fast compared to the search itself, ensure the implementation is efficient, especially for latency-sensitive applications. By thoughtfully combining results using methods like RRF, you can build hybrid search systems that significantly outperform any single retrieval method alone.