Understanding and Countering Model Performance Degradation
The Curse of Recursion: Training on Generated Data Makes Models Forget, Ilia Shumailov, Zakhar Shumaylov, Yiren Zhao, Yarin Gal, Nicolas Papernot, Ross Anderson, 2023. arXiv preprint arXiv:2305.17493. DOI: 10.48550/arXiv.2305.17493 - Examines model performance degradation, known as model collapse, that occurs when models are iteratively trained on data generated by other models, explaining the mechanisms of knowledge loss and reduced output diversity.
Holistic Evaluation of Language Models, Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christopher Ré, Diana Acosta-Navas, Drew J. Hudson, Eric Zelikman, Esin Durmus, Faisal Ladhak, Frieda Rong, Hongyu Ren, Huaxiu Yao, Jue Wang, Keshav Santhanam, Laurel Orr, Lucia Zheng, Mert Yuksekgonul, Mirac Suzgun, Nathan Kim, Neel Guha, Niladri Chatterji, Omar Khattab, Peter Henderson, Qian Huang, Ryan Chi, Sang Michael Xie, Shibani Santurkar, Surya Ganguli, Tatsunori Hashimoto, Thomas Icard, Tianyi Zhang, Vishrav Chaudhary, William Wang, Xuechen Li, Yifan Mai, Yuhui Zhang, Yuta Koreeda, 2023. Transactions on Machine Learning Research. DOI: 10.48550/arXiv.2211.09110 - Introduces a comprehensive framework and extensive benchmark for evaluating large language models across a wide array of dimensions, which is essential for identifying and tracking performance degradation.