Machine learning models learn patterns from the features they are given. While the previous chapter focused on acquiring and cleaning data, the raw attributes themselves are often not in the most informative format for an algorithm. This chapter concentrates on feature engineering: the practice of using domain knowledge and specific techniques to create input variables that help machine learning algorithms perform better.
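You will learn practical methods to generate new features from several kinds of data: numerical values, categorical variables, and text, along with interaction terms and polynomial features.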
We will also address how to manage the number of features, covering dimensionality reduction using Principal Component Analysis (PCA) and methods for selecting the most relevant features based on statistical significance. The goal is to construct a refined feature set that effectively captures the underlying patterns for your predictive models. Practical exercises will allow you to apply these techniques to prepare data for modeling.
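As a small illustration of what turning raw attributes into more informative inputs can look like, the sketch below derives two features from numerical data: a ratio and a log transform. The records, the field names (`total_price`, `area_sqm`), and the helper `add_derived_features` are hypothetical and chosen only for this example. They are not taken from the chapter's exercises.

```python
import math

# Hypothetical raw records; field names and values are made up for illustration.
listings = [
    {"total_price": 300_000.0, "area_sqm": 100.0},
    {"total_price": 450_000.0, "area_sqm": 90.0},
]


def add_derived_features(record):
    """Return a copy of the record with two engineered features added."""
    enriched = dict(record)
    # Ratio feature: price normalized by size, so listings of different
    # sizes become directly comparable.
    enriched["price_per_sqm"] = record["total_price"] / record["area_sqm"]
    # Log transform: compresses large values, which can help when a
    # numerical attribute is strongly right-skewed.
    enriched["log_price"] = math.log(record["total_price"])
    return enriched


features = [add_derived_features(r) for r in listings]

for row in features:
    print(row["price_per_sqm"], round(row["log_price"], 3))
```

Neither derived column adds new information in a strict sense. Both re-express what is already in the raw attributes in a form that some algorithms may find easier to use. Whether they actually help depends on the model and the data.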
2.1 Generating Features from Numerical Data
2.2 Encoding Categorical Variables Effectively
2.3 Creating Features from Text Data
2.4 Interaction Terms and Polynomial Features
2.5 Dimensionality Reduction with PCA
2.6 Selecting Features using Statistical Methods
2.7 Hands-on: Feature Creation and Selection
© 2025 ApX Machine Learning