Retrieval-Augmented Generation for Large Language Models: A Survey, Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yi Dai, Jiawei Sun, Meng Wang, Haofen Wang, 2023. arXiv preprint arXiv:2312.10997. DOI: 10.48550/arXiv.2312.10997 - A comprehensive survey of Retrieval-Augmented Generation (RAG) systems, with sections dedicated to data preprocessing, retrieval strategies, and the effect of document chunking on RAG performance.