Revisiting Vector Embeddings and Search Fundamentals
Introduction to Information Retrieval, Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze, 2008 (Cambridge University Press) - This textbook provides fundamental insights into vector space models, similarity metrics, and the principles of information retrieval and semantic search, explaining how text is represented as vectors and similarity is measured.
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks, Nils Reimers and Iryna Gurevych, 2019, Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (Association for Computational Linguistics), DOI: 10.18653/v1/D19-1410 - This paper introduces Sentence-BERT, a model for generating high-quality sentence embeddings directly applicable to semantic similarity tasks in modern LLM applications, addressing how contextualized embeddings are derived.
Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality, Piotr Indyk, Rajeev Motwani, 1998, Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing (Association for Computing Machinery (ACM)), DOI: 10.1145/276698.276876 - A seminal theoretical paper that establishes the foundations for Approximate Nearest Neighbor (ANN) search, outlining the computational challenges of exact search in high-dimensional spaces and motivating approximate methods.