Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, Douwe Kiela, 2020. Advances in Neural Information Processing Systems (NeurIPS 2020), Vol. 33 (Neural Information Processing Systems Foundation). DOI: 10.48550/arXiv.2005.11401 - This seminal paper introduced the Retrieval-Augmented Generation (RAG) framework, whose reliance on retrieved passages motivates careful data preparation methods such as text chunking.
Text splitters, LangChain Documentation, 2024 - The official documentation provides explanations and implementation details for various text splitting strategies, including fixed-size and recursive methods, as used in practical RAG applications.