Model Cards for Model Reporting: Enabling Accountable AI Development, Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, Timnit Gebru, 2019. Proceedings of the Conference on Fairness, Accountability, and Transparency (ACM). DOI: 10.1145/3287560.3287596 - The foundational paper that introduced the "model card" concept, providing a structured way to document an AI model's characteristics, performance, and ethical considerations in order to improve transparency and accountability.
System Cards: Documentation for Complex AI Systems, Maithra Raghu, El Mehdi Faiq, Emily Dinan, Kevin Clark, Alex Hanna, Jamell Dacon, Christina Greer, Joelle Evans, Brian Clinton, Irshad Bhat, Joshua Buyco, Maura Grossman, Kristen Johnson, Michael R. Smith, Jason C. Smith, Eric D. Smith, Ankur Taly, Mahima Suresh, Andrew Zaldivar, 2022. Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT '22) (ACM). DOI: 10.1145/3531146.3534575 - Introduces the "system card" concept, which extends model cards into a documentation framework for the design, development, and deployment of complex AI systems, and is especially important for large language models and their safeguards.
Constitutional AI: Harmlessness from AI Feedback, Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, Jackson Kernion, Andy Jones, Anna Chen, Anna Goldie, Azalia Mirhoseini, Cameron McKinnon, Carol Chen, Catherine Olsson, Christopher Olah, Danny Hernandez, Dawn Drain, Deep Ganguli, Dustin Li, Eli Tran-Johnson, Ethan Perez, Jamie Kerr, Jared Mueller, Jeffrey Ladish, Joshua Landau, Kamal Ndousse, Kamile Lukosuite, Liane Lovitt, Michael Sellitto, Nelson Elhage, Nicholas Schiefer, Noemi Mercado, Nova DasSarma, Robert Lasenby, Robin Larson, Sam Ringer, Scott Johnston, Shauna Kravec, Sheer El Showk, Stanislav Fort, Tamera Lanham, Timothy Telleen-Lawton, Tom Conerly, Tom Henighan, Tristan Hume, Samuel R. Bowman, Zac Hatfield-Dodds, Ben Mann, Dario Amodei, Nicholas Joseph, Sam McCandlish, Tom Brown, Jared Kaplan, 2022. arXiv preprint arXiv:2212.08073. DOI: 10.48550/arXiv.2212.08073 - Describes a method for aligning large language models with human values by documenting and applying a set of principles (a "constitution"), which makes it directly relevant to documenting alignment objectives.