LoRA: Low-Rank Adaptation of Large Language Models, Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen, 2021. International Conference on Learning Representations (ICLR). DOI: 10.48550/arXiv.2106.09685 - Presents LoRA, a method that greatly reduces the number of trainable parameters by freezing the pre-trained weights and injecting trainable low-rank decomposition matrices into the Transformer architecture.
Parameter-Efficient Transfer Learning for NLP, Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, Sylvain Gelly, 2019. International Conference on Machine Learning (ICML). DOI: 10.48550/arXiv.1902.00751 - Introduces Adapter Tuning, an early PEFT technique where small, task-specific neural modules (adapters) are inserted into a pre-trained model.
Hugging Face PEFT Library Documentation, Hugging Face team - Official documentation for the Hugging Face PEFT library, offering practical guides and explanations for various parameter-efficient fine-tuning methods.
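To make the listed methods concrete, here is a minimal sketch of applying LoRA with the Hugging Face PEFT library's `LoraConfig` and `get_peft_model`. The base model name, rank, and target module names are illustrative assumptions, not values prescribed by the cited papers.

```python
# Minimal LoRA setup with the Hugging Face PEFT library (illustrative sketch).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, TaskType

# Load a pre-trained model; under LoRA its original weights stay frozen.
# "facebook/opt-350m" is an assumed example checkpoint.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

# Configure the low-rank update: r is the rank of the injected matrices,
# lora_alpha scales their contribution, and target_modules names the
# sub-layers (here, attention projections) that receive the adapters.
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type=TaskType.CAUSAL_LM,
)

# Wrap the base model; only the injected LoRA parameters are trainable.
model = get_peft_model(model, config)
model.print_trainable_parameters()  # reports the small trainable fraction
```

The wrapped model can then be trained with a standard training loop or the `transformers` Trainer; only the low-rank matrices receive gradient updates, which is what keeps the number of trainable parameters small.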