LoRA: Low-Rank Adaptation of Large Language Models, Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen, 2021. International Conference on Learning Representations (ICLR). DOI: 10.48550/arXiv.2106.09685 - Introduces LoRA, a parameter-efficient fine-tuning technique that has become a cornerstone of adapting LLMs without full retraining, directly relevant to LLMOps methodologies.
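To make the low-rank idea concrete, here is a minimal PyTorch sketch of a LoRA-adapted linear layer: the pretrained weight W is frozen, and the trainable update is factored as BA with rank r much smaller than the layer dimensions. The module and hyperparameter names are illustrative, not taken from the paper's reference implementation.

```python
# Minimal sketch of the LoRA idea: freeze the pretrained weight W and learn
# a low-rank update B @ A, so h = W x + (alpha / r) * B A x.
# Names (LoRALinear, lora_A, lora_B) are illustrative, not the paper's code.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: float = 16.0):
        super().__init__()
        # Frozen pretrained projection W (not updated during fine-tuning).
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)
        # Trainable low-rank factors: as in the paper, A starts random and
        # B starts at zero, so the adapted layer initially equals the base.
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base output plus the scaled low-rank correction.
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)
```

Only lora_A and lora_B receive gradients, which is why LoRA cuts trainable parameters so sharply relative to full fine-tuning.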
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, Patrick Lewis, Ethan Perez, Aleksandra Piktus, et al., 2020. Advances in Neural Information Processing Systems 33 (NeurIPS 2020). DOI: 10.48550/arXiv.2005.11401 - Presents the Retrieval-Augmented Generation (RAG) framework, combining information retrieval with generative models, which is a key architectural pattern in LLM applications.
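As a rough illustration of the retrieve-then-generate pattern, the sketch below wires a retriever and a generator together. Note that the paper's actual model marginalizes over retrieved documents (its RAG-Sequence and RAG-Token variants); the concatenate-into-context approach shown here is a common simplification, and the retrieve and generate callables are hypothetical stand-ins for components like the paper's DPR retriever and BART generator.

```python
# Simplified sketch of the RAG pattern: fetch supporting passages for a
# query, then condition the generator on them. `retrieve` and `generate`
# are hypothetical stand-ins, not an API from the paper or any library.
from typing import Callable, List

def rag_answer(
    query: str,
    retrieve: Callable[[str, int], List[str]],  # query, k -> top-k passages
    generate: Callable[[str], str],             # prompt -> generated text
    k: int = 5,
) -> str:
    passages = retrieve(query, k)
    # The paper marginalizes generation over each retrieved document;
    # here we simply concatenate them into one context string.
    context = "\n\n".join(passages)
    prompt = f"context: {context}\nquestion: {query}"
    return generate(prompt)
```

Grounding generation in retrieved passages is what lets RAG-style systems answer knowledge-intensive queries without storing all facts in model weights.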