Generating Data for Few-Shot and Zero-Shot Learning Scenarios
Language Models are Few-Shot Learners, Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei, 2020. Advances in Neural Information Processing Systems. DOI: 10.48550/arXiv.2005.14165 - Introduces GPT-3 and demonstrates the few-shot and zero-shot learning capabilities of large language models, establishing the foundation for in-context learning.
Finetuned Language Models Are Zero-Shot Learners, Jason Wei, Maarten Bosma, Vincent Y. Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, Quoc V. Le, 2021. International Conference on Learning Representations. DOI: 10.48550/arXiv.2109.01652 - Introduces instruction tuning (FLAN), a method for fine-tuning language models on a collection of tasks framed as natural language instructions, improving their zero-shot generalization.
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models, Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou, 2022. Advances in Neural Information Processing Systems. DOI: 10.48550/arXiv.2201.11903 - Proposes chain-of-thought prompting, which improves the reasoning abilities of LLMs by guiding them to produce intermediate reasoning steps, directly relevant to generating chain-of-thought examples for few-shot learning.