TensorFlow Extended (TFX) documentation, Google, 2024 - A comprehensive guide to TFX, explaining its architecture and core components like ExampleGen, StatisticsGen, SchemaGen, and ExampleValidator.
Hidden Technical Debt in Machine Learning Systems, D. Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-François Crespo, Dan Dennison, 2015, Advances in Neural Information Processing Systems, Vol. 28 (Neural Information Processing Systems) - A foundational paper discussing the challenges of building and maintaining production ML systems, highlighting data dependencies and system-level issues that TFX components address.
TensorFlow Data Validation (TFDV) documentation, Google, 2024 (Google) - Provides detailed explanations of data statistics computation, schema generation, and anomaly detection using TFDV, the underlying library TFX components leverage for data validation.
Building Machine Learning Pipelines, Hannes Hapke, Catherine Nelson, 2020 (O'Reilly Media) - A practical book covering the design and implementation of production ML pipelines with TFX, including thorough discussions of data ingestion, validation, and schema management.