The Probabilistic Relevance Framework: BM25 and Beyond, Stephen Robertson, Hugo Zaragoza, 2009. Foundations and Trends in Information Retrieval, Vol. 3, No. 4 (Now Publishers). DOI: 10.1561/1500000019 - Explains the BM25 retrieval function, a lexical search method that complements semantic search by excelling at exact term matching.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova, 2019. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (Association for Computational Linguistics). DOI: 10.18653/v1/N19-1423 - Describes the BERT model and its subword tokenization, which underlie how dense embeddings are formed and how out-of-vocabulary terms are handled, and which implicitly point to the limitations of such embeddings.
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks, Nils Reimers, Iryna Gurevych, 2019. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (Association for Computational Linguistics). DOI: 10.18653/v1/D19-1410 - Introduces Sentence-BERT, a widely used method for generating dense sentence embeddings. Useful background on how these embeddings are created, and on their strengths and limitations in capturing fine-grained semantic distinctions.