LoRA: Low-Rank Adaptation of Large Language Models, Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen, 2021. arXiv preprint arXiv:2106.09685. DOI: 10.48550/arXiv.2106.09685 - Introduces Low-Rank Adaptation (LoRA), a parameter-efficient fine-tuning method that trains only low-rank update matrices on top of frozen pretrained weights, sharply reducing the number of trainable parameters while maintaining task performance; because the low-rank update can be merged into the frozen weights, it adds no inference latency.
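As a brief illustration of the idea (not the authors' implementation; the class name `LoRALinear` and the values of the rank `r` and scaling `alpha` are illustrative assumptions), the following PyTorch sketch augments a frozen linear weight with a trainable low-rank update `B @ A` that can later be merged back into the weight:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA-style layer: frozen weight W plus trainable low-rank update B @ A."""

    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)  # the pretrained weight stays frozen

        # Low-rank factors: A is small-Gaussian, B is zero, so B @ A == 0 at the start
        # and training begins from the pretrained behaviour.
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        # x @ W^T plus the scaled low-rank correction x @ A^T @ B^T
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

    def merge(self):
        """Fold B @ A into the base weight so inference pays no extra latency."""
        with torch.no_grad():
            self.base.weight += self.scaling * (self.lora_B @ self.lora_A)


# Only the low-rank factors contribute trainable parameters:
layer = LoRALinear(768, 768, r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 2 * 8 * 768 = 12288, versus 768 * 768 = 589824 frozen
```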
Prefix-Tuning: Optimizing Continuous Prompts for Generation, Xiang Lisa Li, Percy Liang, 2021. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL). DOI: 10.18653/v1/2021.acl-long.49 - Proposes prefix-tuning, a parameter-efficient approach that keeps the language model frozen and optimizes a small continuous task-specific prefix to steer it on text generation tasks, updating only a small fraction of the model's parameters.
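A simplified sketch of the frozen-model-plus-trainable-prefix idea follows. Note the paper's actual method injects learned key/value prefixes into every attention layer (via an MLP reparameterization); this embedding-level version, with the assumed class name `PrefixConditionedLM` and an assumed `frozen_lm` that accepts input embeddings directly, only illustrates where the trainable parameters live:

```python
import torch
import torch.nn as nn

class PrefixConditionedLM(nn.Module):
    """Simplified sketch: trainable prefix vectors prepended to a frozen LM's input embeddings."""

    def __init__(self, frozen_lm, embed_dim, prefix_len=10):
        super().__init__()
        self.lm = frozen_lm
        for p in self.lm.parameters():
            p.requires_grad_(False)  # the language model itself is never updated
        # The only trainable parameters: prefix_len continuous "virtual token" embeddings.
        self.prefix = nn.Parameter(torch.randn(prefix_len, embed_dim) * 0.02)

    def forward(self, input_embeds):
        # input_embeds: (batch, seq_len, embed_dim); assumed interface of the frozen LM
        batch = input_embeds.size(0)
        prefix = self.prefix.unsqueeze(0).expand(batch, -1, -1)
        return self.lm(torch.cat([prefix, input_embeds], dim=1))
```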