Automated Quality Assurance for Synthetic Datasets
Designing Machine Learning Systems, Chip Huyen, 2022 (O'Reilly Media, Inc.) - This book covers principles for designing and implementing robust machine learning systems, including data validation, monitoring, and quality control as integral parts of the MLOps pipeline.
On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜, Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, Shmargaret Shmitchell, 2021, Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (Association for Computing Machinery), DOI: 10.1145/3442188.3445922 - This influential paper discusses ethical and societal risks of large language models, underscoring the importance of data governance, including identifying and mitigating biases and harmful content in training datasets.