LoRA: Low-Rank Adaptation of Large Language Models, Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen, 2021. International Conference on Learning Representations (ICLR). DOI: 10.48550/arXiv.2106.09685 - This paper introduces Low-Rank Adaptation (LoRA), a prominent PEFT technique. It details how adding small, trainable low-rank matrices significantly reduces the number of fine-tuned parameters, addressing the computational, memory, and storage challenges described in the section.
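The mechanism can be sketched in a few lines. Below is a minimal, illustrative PyTorch module (not the authors' reference implementation): a frozen linear layer is augmented with a trainable low-rank update BA scaled by alpha/r. The class and parameter names (`LoRALinear`, `rank`, `alpha`) are chosen here purely for illustration.

```python
# Minimal LoRA sketch (illustrative only, not the paper's reference code).
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # freeze the pre-trained weights
            p.requires_grad = False
        # Low-rank factors: A is Gaussian-initialized, B starts at zero,
        # so the adapted layer matches the base layer before training.
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = base(x) + scaling * x A^T B^T; only A and B receive gradients.
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)
```

Only the two small factors are trained, so the number of updated parameters per layer drops from in_features * out_features to rank * (in_features + out_features).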
Parameter-Efficient Transfer Learning for NLP, Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, Sylvain Gelly, 2019. International Conference on Machine Learning (ICML). DOI: 10.48550/arXiv.1902.00751 - This foundational paper introduces adapter modules, demonstrating how adding a small number of new parameters to pre-trained models can achieve performance comparable to full fine-tuning with substantially fewer trainable parameters.
PEFT: Parameter-Efficient Fine-tuning of Foundation Models, Hugging Face, 2024. Official documentation for the Hugging Face PEFT library, offering practical guidance and implementations of various PEFT methods. It serves as an excellent resource for understanding the practical application and benefits discussed in this section.
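As a rough sketch of how the library is typically used: wrap a pre-trained model with a LoRA configuration and train only the injected adapter weights. The base model name, target module names, and hyperparameter values below are placeholders and should be checked against the PEFT documentation for the model you use.

```python
# Illustrative PEFT usage sketch; model name and hyperparameters are placeholders.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder base model

config = LoraConfig(
    r=8,                        # rank of the low-rank update
    lora_alpha=16,              # scaling factor
    target_modules=["c_attn"],  # sub-modules to adapt (model-dependent)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, config)
model.print_trainable_parameters()  # typically a small fraction of all parameters
```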