Python for Data Analysis: Data Wrangling with pandas, NumPy, and Jupyter, Wes McKinney, 2022 (O'Reilly Media) - An authoritative guide covering data manipulation, cleaning, and preparation using pandas, including techniques for handling missing data, inconsistent formats, and data transformation.
Anomaly Detection: A Survey, Varun Chandola, Arindam Banerjee, and Vipin Kumar, 2009, ACM Computing Surveys, Vol. 41 (Association for Computing Machinery (ACM)), DOI: 10.1145/1541880.1541882 - A highly cited survey paper offering a comprehensive overview and taxonomy of anomaly (outlier) detection methods, including statistical, proximity-based, and model-based approaches.
Data Cleaning, Ihab F. Ilyas and Xu Chu, 2019, Vol. 28 (Association for Computing Machinery and Morgan & Claypool Publishers), DOI: 10.1145/3342502 - An academic book providing a systematic treatment of modern data cleaning challenges and techniques, covering various data quality problems and their solutions.
Pandas User Guide: Working with Text Data, The Pandas Development Team, 2024 - Official documentation detailing Pandas' powerful string methods and regular expression capabilities for cleaning, standardizing, and transforming text data.