Model Cards for Model Reporting, Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, Timnit Gebru, 2019, Proceedings of the Conference on Fairness, Accountability, and Transparency (ACM), DOI: 10.1145/3287560.3287596 - Foundational paper introducing 'Model Cards', a structured format for documenting a model's intended use, performance characteristics, and ethical considerations to improve transparency and accountability.
Artificial Intelligence Risk Management Framework (AI RMF 1.0), National Institute of Standards and Technology (NIST), 2023 (U.S. Department of Commerce) - Authoritative framework outlining voluntary guidance for managing risks to individuals, organizations, and society associated with AI, emphasizing governance, transparency, and accountability through documentation.
System Cards: Documentation for Complex AI Systems, Maithra Raghu, El Mehdi Faiq, Emily Dinan, Kevin Clark, Alex Hanna, Jamell Dacon, Christina Greer, Joelle Evans, Brian Clinton, Irshad Bhat, Joshua Buyco, Maura Grossman, Kristen Johnson, Michael R. Smith, Jason C. Smith, Eric D. Smith, Ankur Taly, Mahima Suresh, Andrew Zaldivar, 2022, Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT '22) (ACM), DOI: 10.1145/3531146.3534575 - Introduces 'System Cards' as an extension of Model Cards, providing a framework for documenting the design, development, and deployment of complex AI systems, which is crucial for LLMs and their guardrails.
Constitutional AI: Harmlessness from AI Feedback, Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, Jackson Kernion, Andy Jones, Anna Chen, Anna Goldie, Azalia Mirhoseini, Cameron McKinnon, Carol Chen, Catherine Olsson, Christopher Olah, Danny Hernandez, Dawn Drain, Deep Ganguli, Dustin Li, Eli Tran-Johnson, Ethan Perez, Jamie Kerr, Jared Mueller, Jeffrey Ladish, Joshua Landau, Kamal Ndousse, Kamile Lukosuite, Liane Lovitt, Michael Sellitto, Nelson Elhage, Nicholas Schiefer, Noemi Mercado, Nova DasSarma, Robert Lasenby, Robin Larson, Sam Ringer, Scott Johnston, Shauna Kravec, Sheer El Showk, Stanislav Fort, Tamera Lanham, Timothy Telleen-Lawton, Tom Conerly, Tom Henighan, Tristan Hume, Samuel R. Bowman, Zac Hatfield-Dodds, Ben Mann, Dario Amodei, Nicholas Joseph, Sam McCandlish, Tom Brown, Jared Kaplan, 2022, arXiv preprint arXiv:2212.08073, DOI: 10.48550/arXiv.2212.08073 - Describes Constitutional AI, a method for training LLMs to be harmless without human labels identifying harmful outputs: the model critiques and revises its own responses against a written 'constitution' of principles, and AI-generated preference feedback replaces human harmlessness labels. Directly relevant to documenting alignment goals.