Model Cards for Model Reporting, Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, Timnit Gebru, 2019. Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT*) (Association for Computing Machinery). DOI: 10.1145/3287560.3287596 - Introduces model cards as a standardized reporting framework intended to promote transparency and accountability in AI development.
Model cards, Hugging Face, 2024 - Official documentation explaining what model cards on the Hugging Face Hub are and how to create and interpret them.
HELM: Holistic Evaluation of Language Models, Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christopher Ré, Diana Acosta-Navas, Drew A. Hudson, Eric Zelikman, Esin Durmus, Faisal Ladhak, Frieda Rong, Hongyu Ren, Huaxiu Yao, Jue Wang, Keshav Santhanam, Laurel Orr, Lucia Zheng, Mert Yuksekgonul, Mirac Suzgun, Nathan Kim, Neel Guha, Niladri Chatterji, Omar Khattab, Peter Henderson, Qian Huang, Ryan Chi, Sang Michael Xie, Shibani Santurkar, Surya Ganguli, Tatsunori Hashimoto, Thomas Icard, Tianyi Zhang, Vishrav Chaudhary, William Wang, Xuechen Li, Yifan Mai, Yuhui Zhang, Yuta Koreeda, 2023. Transactions on Machine Learning Research (TMLR). DOI: 10.48550/arXiv.2211.09110 - Proposes a framework for holistically evaluating language models across a broad range of metrics, offering deeper insight than the simple benchmark scores reported in model cards.