Challenges and Approaches for Mitigating Bias and Harm in Large Language Models, Laura Weidinger, John Mellor, Maribeth Smyth, Tom Mellor, Dinah Gloor, Laura Hughes, Leslie Garcia-Amaya, Matthew N. Rahtz, Jonathan F. Simon, Hannah Sheahan, Mario Lucic, Peter S. Park, Javier Snape, Manu Saraswat, M. F. W. Ver Steeg, Geoffrey Irving, Iason Gabriel, 2021. Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35 (AAAI Press). DOI: 10.1609/aaai.v35i17.17709 - Provides a comprehensive overview of the challenges of bias and harm in large language models and discusses various mitigation approaches and evaluation techniques.
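Fairness in Machine Learning: A Survey, Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, Aram Galstyan, 2021. ACM Computing Surveys (CSUR), Vol. 54 (Association for Computing Machinery (ACM)). DOI: 10.1145/3457607 - Provides a broad survey of fairness definitions, types of bias, and mitigation techniques in machine learning, laying a foundation for understanding the related concepts in large language models.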