Data Mining: Concepts and Techniques, Jiawei Han, Micheline Kamber, Jian Pei, 2011 (Elsevier) - A widely recognized data mining resource whose chapter on data preprocessing covers data cleaning, including strategies and the rationale for resolving inconsistencies and duplicates so that data is fit for analysis.
Data Quality: The Ten Dimensions That Every Data Professional Needs to Know, James D. Price, 2017 (Technics Publications) - A practical framework for understanding and managing data quality across multiple dimensions, detailing how issues such as duplicate records undermine data integrity and affect business operations and analytical outcomes.