Data Cleaning: Handling Missing Values and Outliers
An Introduction to Statistical Learning: with Applications in R, Gareth James, Daniela Witten, Trevor Hastie, Rob Tibshirani, 2013 (Springer) - A foundational textbook covering statistical learning concepts, including essential data preprocessing, handling missing values, and dealing with outliers, which are crucial for preparing data for machine learning models.
DataFrames.jl Documentation, The DataFrames.jl Developers, 2024 - The official documentation for DataFrames.jl, providing comprehensive guides and API references for efficient data manipulation in Julia, with specific examples and functions for handling missing values and filtering data.
MLJ.jl Documentation: Data Processing and Preprocessing, The MLJ Developers, 2024 - Detailed information on data processing components within the MLJ.jl framework, including advanced imputation strategies and transformers that are relevant for building robust machine learning pipelines in Julia.
Data Cleaning: A Practical Approach, Laura Moncion, 2023 (Manning Publications) - A practical guide to methodologies and tools for effective data cleaning, covering techniques for identifying and resolving various data quality issues such as missing values and outliers in depth.
Anomaly Detection: A Survey, Varun Chandola, Arindam Banerjee, and Vipin Kumar, 2009, ACM Computing Surveys (CSUR), Vol. 41 (Association for Computing Machinery (ACM)), DOI: 10.1145/1541880.1541882 - A comprehensive survey of outlier detection techniques, providing a foundational understanding of various statistical and machine learning approaches to identify anomalous data points.