Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, Douwe Kiela, 2020. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), Vol. 33 (Curran Associates, Inc.). DOI: 10.48550/arXiv.2005.11401 - Defines the original Retrieval-Augmented Generation (RAG) framework, detailing its architecture and benefits for incorporating external knowledge into language models.
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks, Nils Reimers and Iryna Gurevych, 2019. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (Association for Computational Linguistics). DOI: 10.18653/v1/D19-1410 - Introduces the Sentence-BERT model, which efficiently generates high-quality sentence embeddings, forming the basis for libraries like sentence-transformers used in the retriever.
ChromaDB Documentation, Chroma, 2024 - Official guide for Chroma, an open-source embedding database. It provides details on installation, client usage, collection management, and querying, relevant to implementing the retriever.
Sentence-Transformers Documentation, Nils Reimers, Iryna Gurevych, 2024 - Official guide for the sentence-transformers Python library, detailing how to load pre-trained models and generate sentence embeddings, crucial for converting queries into vectors.