Data Versioning and Experiment Tracking for Machine Learning
Reproducing DVC Pipelines
© 2025 ApX Machine Learning