LoRA: Low-Rank Adaptation of Large Language Models, Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen, 2021. International Conference on Learning Representations (ICLR). DOI: 10.48550/arXiv.2106.09685 - This paper introduces Low-Rank Adaptation (LoRA), a widely used PEFT technique. It details how adding small, trainable low-rank matrices dramatically reduces the number of parameters that must be fine-tuned, addressing the compute, memory, and storage challenges described in this section (see the sketch after this list).
Parameter-Efficient Transfer Learning for NLP, Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, Sylvain Gelly, 2019. International Conference on Machine Learning (ICML). DOI: 10.48550/arXiv.1902.00751 - This foundational paper introduces adapter modules, showing that inserting a small number of new parameters into a pretrained model can match the performance of full fine-tuning while training only a fraction of the parameters (see the sketch after this list).
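To make the LoRA mechanism concrete, here is a minimal PyTorch sketch of the idea the first annotation describes: a frozen pretrained weight plus a trainable low-rank update. This is an illustration, not the authors' reference implementation; the class name LoRALinear and the hyperparameters r and alpha are illustrative choices.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Illustrative LoRA layer: frozen base weight W plus a trainable
    low-rank update, computing W x + (alpha / r) * B A x."""

    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # pretrained weight stays frozen
        self.base.bias.requires_grad_(False)
        # Low-rank factors: A is small-Gaussian initialized, B starts at zero,
        # so the adapted layer equals the base layer before training begins.
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(768, 768, r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 12,288 trainable parameters vs ~590K frozen ones
```

For a 768x768 layer with r=8, the two factors hold only 2*8*768 = 12,288 trainable parameters against roughly 590K frozen ones, which is the parameter saving the paper exploits.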
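Likewise, a minimal sketch of the bottleneck adapter pattern described in the second reference: a down-projection, a nonlinearity (ReLU here for simplicity), an up-projection, and a residual connection. The names Adapter and bottleneck are illustrative, and this is a sketch of the general pattern rather than the paper's exact module.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Illustrative bottleneck adapter: down-project, nonlinearity,
    up-project, then add the residual. Only these few new parameters
    are trained; the surrounding pretrained layers stay frozen."""

    def __init__(self, hidden_size, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        # Near-identity initialization so inserting the adapter barely
        # perturbs the pretrained network at the start of training.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))
```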