Developing machine learning models often feels more like experimental science than traditional software engineering. While building a web application involves managing code changes, ML projects add layers of complexity tied to data, parameters, and the inherently stochastic nature of training algorithms. As highlighted in the chapter introduction, simply using Git for your code isn't enough to ensure you can reliably recreate past results or understand how your model evolved. Let's examine the specific difficulties that arise.
A typical ML project involves several interconnected components: the training and evaluation data, the source code for preprocessing and modeling, the hyperparameters and configuration, the software environment with its dependencies, and sources of randomness such as initialization seeds. The resulting model depends on all of them together.
Consider a common scenario: you trained a model three months ago that achieved good performance. Now you need to retrain it on new data or explain its predictions to a stakeholder. You might face questions like:

- Which version of the data was used? Was it the data from source_A dated May 1st, or the cleaned version after applying script_v2.py?
- Which version of the training code produced the model? Was it from the main branch or from the feature/new-loss-function branch?
- Which hyperparameters were set, and in which software environment did training run?

Without a systematic way to track these elements, answering these questions becomes a time-consuming forensic exercise, often ending in guesswork or an inability to reproduce the original result.
Diagram: Interconnected components influencing the output of a machine learning training process. Tracking each element is necessary for reproducibility.
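To make the problem concrete, here is a minimal sketch of what capturing this context for a single run might look like. It assumes a Git repository and a training file at data/train.csv; the paths, keys, and hyperparameter values are hypothetical placeholders rather than a prescribed format.

```python
# Minimal sketch: record the context of one training run so the questions
# above can be answered later. Paths and parameter values are hypothetical.
import hashlib
import json
import subprocess
from datetime import datetime, timezone

def file_sha256(path):
    """Return the SHA-256 hash of a file, identifying the exact data version."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def current_git_commit():
    """Return the current Git commit hash, identifying the exact code version."""
    return subprocess.check_output(["git", "rev-parse", "HEAD"], text=True).strip()

run_record = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "git_commit": current_git_commit(),
    "data_file": "data/train.csv",                 # hypothetical path
    "data_sha256": file_sha256("data/train.csv"),
    "hyperparameters": {"learning_rate": 0.01, "max_depth": 6},  # example values
}

with open("run_record.json", "w") as f:
    json.dump(run_record, f, indent=2)
```

Even this small record pins the code to a commit, the data to a hash, and the parameters to explicit values; the data versioning and experiment tracking concepts introduced later formalize and automate exactly this idea.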
Machine learning thrives on experimentation. You might try dozens or hundreds of variations: different algorithms, feature sets, data subsets, and hyperparameter combinations. This rapid iteration is productive, but it generates a confusing history if not managed properly. Notebook environments like Jupyter, while excellent for exploration, can exacerbate this problem if cells are run out of order or code is frequently overwritten without version control. Manual record-keeping in spreadsheets or text files quickly becomes unmanageable and error-prone.
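As an illustration of that manual approach, the sketch below appends one row per experiment to a CSV file, much like a shared spreadsheet. The file name, columns, and values are hypothetical.

```python
# Minimal sketch of ad-hoc experiment logging: one CSV row per run.
import csv
import os
from datetime import datetime, timezone

LOG_PATH = "experiments.csv"  # hypothetical log file
FIELDS = ["timestamp", "model", "learning_rate", "n_estimators", "val_accuracy", "notes"]

def log_experiment(row):
    """Append one experiment record; write the header if the file is new."""
    is_new = not os.path.exists(LOG_PATH)
    with open(LOG_PATH, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(row)

log_experiment({
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "model": "gradient_boosting",
    "learning_rate": 0.05,
    "n_estimators": 300,
    "val_accuracy": 0.87,      # example value, not a real result
    "notes": "new feature set",
})
```

This works for a handful of runs, but nothing in it records the code commit, the data version, or the environment, and nothing prevents the columns from drifting over time, which is why such logs quickly become unmanageable and error-prone.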
When multiple people collaborate on an ML project, these challenges multiply. How do you ensure everyone is using the same version of the data? How can one team member reproduce another's experiment results? Onboarding new members can be difficult if the project's history and dependencies aren't clearly documented and reproducible. A lack of reproducibility hinders debugging, knowledge sharing, and the reliable handover of projects.
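One small piece of that collaboration puzzle is knowing exactly which software environment a teammate used. The sketch below, assuming Python 3.8 or newer, snapshots the interpreter version and installed packages; the output file name is a hypothetical choice.

```python
# Minimal sketch: snapshot the current Python environment for later comparison.
import json
import sys
from importlib import metadata

environment = {
    "python_version": sys.version,
    "packages": {dist.metadata["Name"]: dist.version for dist in metadata.distributions()},
}

with open("environment_snapshot.json", "w") as f:
    json.dump(environment, f, indent=2)
```

Sharing such a snapshot, or more commonly a pinned requirements or lock file, removes one source of discrepancy when a team member tries to reproduce someone else's results.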
These difficulties underscore the need for practices and tools specifically designed for the ML lifecycle. We need methods that go beyond Git's code versioning capabilities to handle large data, track experimental parameters and results, and manage the complex dependencies inherent in building machine learning models. The following sections will introduce core concepts like data versioning and experiment tracking, which form the foundation for addressing these challenges.