Scaling Instruction-Finetuned Language Models, Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Yunxuan Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Alex Castro-Ros, Marie Pellat, Kevin Robinson, Dasha Valter, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Zhao, Yanping Huang, Andrew Dai, Hongkun Yu, Slav Petrov, Ed H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, Jason Wei, 2022, arXiv preprint, DOI: 10.48550/arXiv.2210.11416 - Explores the effectiveness of instruction fine-tuning (SFT) across a range of tasks and model sizes, offering insight into how SFT enhances instruction-following capabilities.
Aligning Language Models to Follow Instructions, OpenAI, 2022 (OpenAI Blog) - An accessible blog post explaining the InstructGPT paper, giving a clear overview of the RLHF pipeline and the role SFT plays within it.