Neural networks operate on numerical data, and their ability to learn effectively is significantly influenced by the format and characteristics of the input they receive. Raw datasets seldom arrive in a state ready for direct use. This chapter introduces the necessary steps to prepare and structure data for optimal neural network training.
You will learn about representing different types of input features, the importance of scaling numerical data using techniques like normalization ($x' = \frac{x - \min(x)}{\max(x) - \min(x)}$) and standardization ($x' = \frac{x - \mu}{\sigma}$), and methods for converting categorical features into a numerical format suitable for networks, such as one-hot encoding. Additionally, we will cover how to divide data into batches for efficient processing during training and the standard practice of splitting datasets into training, validation, and test sets for model development and reliable evaluation. By the end of this chapter, you will understand how to transform raw data into a structured format that facilitates effective learning by neural networks.
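As a quick preview of the two scaling formulas mentioned above, the following is a minimal sketch in plain Python. The function names `min_max_normalize` and `standardize`, the sample `ages` list, and the choice to return zeros for a constant feature are illustrative assumptions, not part of any particular library; the standard deviation used here is the population form.

```python
from statistics import fmean, pstdev


def min_max_normalize(values):
    """Rescale values to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Assumption: a constant feature carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def standardize(values):
    """Rescale values to zero mean and unit variance via (x - mu) / sigma."""
    mu = fmean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # Assumption: same fallback as above for a constant feature.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical feature column used only for illustration.
ages = [20.0, 30.0, 40.0, 50.0, 60.0]

normalized = min_max_normalize(ages)
standardized = standardize(ages)

print(normalized)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

In practice, the minimum, maximum, mean, and standard deviation would typically be computed on the training set only and then reused to transform the validation and test sets, so that no information from held-out data leaks into training.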
2.1 Understanding Input Data Representation
2.2 Feature Scaling: Normalization and Standardization
2.3 Handling Categorical Data: Encoding Techniques
2.4 Creating Data Batches for Training
2.5 Splitting Data: Training, Validation, and Test Sets
2.6 Hands-on Practical: Preprocessing Sample Data
© 2025 ApX Machine Learning