TFX: A TensorFlow-Based Production-Scale Machine Learning Platform, Denis Baylor, Eric Breck, Heng-Tze Cheng, Noah Fiedel, Chuan Yu Foo, Zakaria Haque, Salem Haykal, Mustafa Ispir, Vihan Jain, Levent Koc, Chiu-Yuen Koo, Lukasz Lew, Clemens Mewald, Akshay Naresh Modi, Neoklis Polyzotis, Sukriti Ramesh, Sudip Roy, Steven Euijong Whang, Martin Wicke, Jarek Wilkiewicz, Xin Zhang, Martin A. Zinkevich, 2017. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM). DOI: 10.1145/3097983.3098021 - Introduces TensorFlow Extended (TFX) and details its components, such as TensorFlow Data Validation (TFDV), which uses statistical analysis to detect data anomalies and training-serving skew.
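Machine Learning Design Patterns, Valliappa Lakshmanan, Sara Robinson, Michael Munn, 2020 (O'Reilly Media) - Offers practical design patterns for building reliable machine learning systems, including feature engineering consistency, data validation, and strategies for mitigating training-serving skew.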
Hidden Technical Debt in Machine Learning Systems, D. Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-François Crespo, Dan Dennison, 2015. Advances in Neural Information Processing Systems, Vol. 28 (NeurIPS Proceedings) - This widely cited paper identifies forms of technical debt specific to machine learning systems, many of which, such as data dependencies and model update cycles, can lead to online/offline skew.