Data is the starting point for any analysis or machine learning project, but raw data often comes with problems. Errors, missing values, and inconsistencies can significantly skew results and undermine the reliability of any insights derived.
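This chapter introduces the concepts of data cleaning and preprocessing. You will learn what data cleaning and data preprocessing are, where dirty data commonly comes from, how poor data quality affects your results, and how a typical data cleaning workflow is organized.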
Understanding these fundamentals is the first step towards preparing reliable data for your projects.
1.1 What is Data Cleaning?
1.2 What is Data Preprocessing?
1.3 Common Sources of Dirty Data
1.4 Impact of Poor Data Quality
1.5 The Data Cleaning Workflow Overview
© 2025 ApX Machine Learning