Applying NumPy array fundamentals (indexing, mathematical operations, broadcasting, and statistical functions) to concrete problems is the best way to make them stick. Working through exercises solidifies understanding and builds confidence in manipulating numerical data efficiently.

This section provides hands-on problems designed to reinforce the techniques covered in this chapter. We'll work through examples that involve creating arrays, selecting data, performing calculations, and applying statistical methods. These skills apply directly to data preparation and analysis in machine learning.

## Problem 1: Creating and Analyzing Sensor Data

Imagine you have collected temperature readings (in Celsius) from three different sensors over four consecutive time points. The readings are:

- Sensor 1: [22.5, 23.1, 22.8, 23.5]
- Sensor 2: [21.8, 22.2, 22.0, 22.5]
- Sensor 3: [23.0, 23.3, 23.1, 23.6]

Task:

1. Create a 2D NumPy array representing this data, where each row corresponds to a sensor and each column corresponds to a time point.
2. Calculate the average temperature for each sensor across all time points.
3. Calculate the average temperature recorded at each time point across all sensors.
4. Find the maximum temperature recorded overall.

Solution:

```python
import numpy as np

# 1. Create the 2D array
sensor_data = np.array([
    [22.5, 23.1, 22.8, 23.5],
    [21.8, 22.2, 22.0, 22.5],
    [23.0, 23.3, 23.1, 23.6]
])
print("Sensor Data Array:")
print(sensor_data)
print("-" * 20)

# 2. Average temperature per sensor (across columns, axis=1)
avg_per_sensor = np.mean(sensor_data, axis=1)
print("Average Temperature per Sensor:")
print(avg_per_sensor)
print("-" * 20)

# 3. Average temperature per time point (across rows, axis=0)
avg_per_timepoint = np.mean(sensor_data, axis=0)
print("Average Temperature per Time Point:")
print(avg_per_timepoint)
print("-" * 20)

# 4. Overall maximum temperature
max_temp = np.max(sensor_data)
print(f"Overall Maximum Temperature: {max_temp:.1f}°C")
```

Explanation:

- We first create the `sensor_data` array using `np.array()` with a list of lists.
- To calculate the average for each sensor, we use `np.mean()` and specify `axis=1`. This tells NumPy to compute the mean along the horizontal axis (across the columns for each row), producing one value per sensor.
- Similarly, specifying `axis=0` computes the mean along the vertical axis (across the rows for each column), giving the average temperature at each time point.
- `np.max()` without an `axis` argument finds the maximum value in the entire array.

## Problem 2: Data Normalization (Standard Scaling)

Standard scaling (or Z-score normalization) is a common preprocessing step in machine learning. It transforms data so that it has a mean of 0 and a standard deviation of 1. The formula for a data point $x$ is:

$$ Z = \frac{x - \mu}{\sigma} $$

where $\mu$ is the mean of the data and $\sigma$ is the standard deviation.

Task:

1. Create a 1D NumPy array with the following values: [10, 15, 12, 18, 25, 11, 16].
2. Calculate the mean ($\mu$) and standard deviation ($\sigma$) of this array.
3. Apply the standard scaling formula to normalize the array using NumPy operations (leverage broadcasting).

Solution:

```python
import numpy as np

# 1. Create the array
data = np.array([10, 15, 12, 18, 25, 11, 16])
print("Original Data:")
print(data)
print("-" * 20)

# 2. Calculate mean and standard deviation
mean_val = np.mean(data)
std_dev = np.std(data)
print(f"Mean (μ): {mean_val:.2f}")
print(f"Standard Deviation (σ): {std_dev:.2f}")
print("-" * 20)

# 3. Apply standard scaling using broadcasting
normalized_data = (data - mean_val) / std_dev
print("Normalized Data (Z-scores):")
print(normalized_data)
print("-" * 20)

# Verification: Check mean and std dev of normalized data
print(f"Mean of Normalized Data: {np.mean(normalized_data):.2f}")
print(f"Std Dev of Normalized Data: {np.std(normalized_data):.2f}")
```

Explanation:

- We calculate the mean and standard deviation using `np.mean()` and `np.std()`. By default, `np.std()` computes the population standard deviation (`ddof=0`).
- The core of the normalization happens in the line `normalized_data = (data - mean_val) / std_dev`. Here, `mean_val` (a scalar) is subtracted from every element of the `data` array through broadcasting. The resulting array is then divided element-wise by `std_dev` (another scalar, again through broadcasting). This applies the Z-score formula to the entire array without explicit loops.
- The verification step shows that the resulting `normalized_data` has a mean very close to 0 and a standard deviation very close to 1, as expected.

## Problem 3: Selecting and Modifying Data with Boolean Indexing

Consider a dataset representing the scores of students on two different tests: `scores = np.array([[85, 92], [78, 81], [91, 95], [60, 65], [72, 79]])`. Each row is a student, column 0 is the Test 1 score, and column 1 is the Test 2 score.

Task:

1. Select the scores (both tests) for students who scored 80 or above on Test 1.
2. Identify students who scored below 70 on either test.
3. Suppose a curve was applied to Test 2, adding 3 points to everyone who scored below 80 on that test. Create a new array reflecting this curve, without modifying the original `scores` array.

Solution:

```python
import numpy as np

scores = np.array([[85, 92], [78, 81], [91, 95], [60, 65], [72, 79]])
print("Original Scores:")
print(scores)
print("-" * 20)

# 1. Select students with Test 1 score >= 80
high_scorers_test1 = scores[scores[:, 0] >= 80]
print("Scores of Students with Test 1 >= 80:")
print(high_scorers_test1)
print("-" * 20)

# 2. Identify students scoring < 70 on either test
low_scorers_mask = (scores[:, 0] < 70) | (scores[:, 1] < 70)
low_scorers = scores[low_scorers_mask]
print("Scores of Students with < 70 on Either Test:")
print(low_scorers)
print("-" * 20)

# 3. Apply curve to Test 2 for scores < 80
# Create a copy to avoid modifying the original array
curved_scores = scores.copy()
# Create a boolean mask for Test 2 scores < 80
test2_curve_mask = curved_scores[:, 1] < 80
# Add 3 points using the mask for selection on the relevant column
curved_scores[test2_curve_mask, 1] += 3
print("Scores after applying curve to Test 2 (< 80):")
print(curved_scores)
print("-" * 20)
print("Original Scores (unchanged):")
print(scores)
```

Explanation:

- Task 1: `scores[:, 0]` selects all rows (`:`) and the first column (`0`), which corresponds to Test 1 scores. The condition `>= 80` creates a boolean array (`[True, False, True, False, False]`). Using this boolean array as an index for `scores` selects only the rows where the condition is `True`.
- Task 2: We create two boolean conditions: `scores[:, 0] < 70` for Test 1 and `scores[:, 1] < 70` for Test 2. The element-wise OR operator (`|`) combines these, resulting in a mask that is `True` if a student scored below 70 on at least one test. The parentheses around each condition are required because `|` binds more tightly than `<`. This mask is then used to select the relevant rows.
- Task 3: It's important to use `scores.copy()` to create `curved_scores`. Otherwise, modifications would affect the original `scores` array. We create a mask `test2_curve_mask` specifically for Test 2 scores below 80. Then, `curved_scores[test2_curve_mask, 1]` selects the rows indicated by the mask but only in the second column (Test 2 scores). We then use `+= 3` to add 3 directly to these selected elements.

## Problem 4: Reshaping and Basic Linear Algebra

You are given a 1D array representing pixel values from a grayscale image snippet: `pixels = np.arange(1, 13)`.

Task:

1. Reshape this array into a 3x4 matrix (representing 3 rows, 4 columns of pixels).
2. Create a 4x2 matrix `transform_matrix` with values `[[0.5, 0.5], [1, 0], [0, 1], [0.2, 0.8]]`.
3. Perform matrix multiplication between the reshaped pixel matrix and the `transform_matrix`. What is the shape of the resulting matrix?

Solution:

```python
import numpy as np

pixels = np.arange(1, 13)
print("Original Pixel Array:")
print(pixels)
print("-" * 20)

# 1. Reshape the array
pixel_matrix = pixels.reshape((3, 4))
print("Reshaped Pixel Matrix (3x4):")
print(pixel_matrix)
print("-" * 20)

# 2. Create the transformation matrix
transform_matrix = np.array([
    [0.5, 0.5],
    [1, 0],
    [0, 1],
    [0.2, 0.8]
])
print("Transformation Matrix (4x2):")
print(transform_matrix)
print("-" * 20)

# 3. Perform matrix multiplication
result_matrix = np.dot(pixel_matrix, transform_matrix)
# Alternative syntax: result_matrix = pixel_matrix @ transform_matrix
print("Result of Matrix Multiplication (pixel_matrix @ transform_matrix):")
print(result_matrix)
print("-" * 20)
print(f"Shape of the Resulting Matrix: {result_matrix.shape}")
```

Explanation:

- `np.arange(1, 13)` creates an array `[1, 2, ..., 12]`.
- The `reshape((3, 4))` method reorganizes these 12 elements into a matrix with 3 rows and 4 columns. Note that the total number of elements must remain the same (3 * 4 = 12).
- Matrix multiplication requires the inner dimensions to match. We are multiplying a (3x4) matrix by a (4x2) matrix. The inner dimension (4) matches.
- We use `np.dot()` (or the `@` operator) for matrix multiplication. Standard multiplication (`*`) is element-wise and only works when the shapes are compatible via broadcasting. Here the shapes (3, 4) and (4, 2) are not compatible, so `*` would raise an error rather than compute a matrix product.
- The resulting matrix shape is determined by the outer dimensions: (3x4) @ (4x2) results in a (3x2) matrix.

These exercises demonstrate how the NumPy functionality you've learned (creation, indexing, slicing, broadcasting, mathematical operations, and basic linear algebra) works together in practical scenarios. As you proceed to work with libraries like Pandas and Scikit-learn, you'll find that a solid grasp of these NumPy manipulations is indispensable. Continue experimenting with different array shapes, operations, and indexing techniques to build fluency.
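
As one possible follow-up experiment, the sketch below combines the ideas from Problems 1 and 2: it standardizes each sensor's readings separately instead of scaling the whole array at once. This is an illustrative extension and not part of the original exercises. The names `row_means`, `row_stds`, and `scaled` are hypothetical, and the use of `keepdims=True` is one of several ways to make the per-row statistics broadcast against the 2D array.

```python
import numpy as np

# Sensor readings from Problem 1: rows are sensors, columns are time points.
sensor_data = np.array([
    [22.5, 23.1, 22.8, 23.5],
    [21.8, 22.2, 22.0, 22.5],
    [23.0, 23.3, 23.1, 23.6],
])

# keepdims=True keeps the reduced axis with length 1, so the results have
# shape (3, 1) instead of (3,). (Hypothetical names, for illustration only.)
row_means = sensor_data.mean(axis=1, keepdims=True)
row_stds = sensor_data.std(axis=1, keepdims=True)  # population std (ddof=0)

# (3, 4) minus (3, 1): each row's own mean is broadcast across its 4 columns,
# and the division by the (3, 1) standard deviations broadcasts the same way.
scaled = (sensor_data - row_means) / row_stds

print("Shape of per-sensor means:", row_means.shape)
print("Shape of scaled data:", scaled.shape)
print("Per-sensor means close to 0:", np.allclose(scaled.mean(axis=1), 0.0))
print("Per-sensor std devs close to 1:", np.allclose(scaled.std(axis=1), 1.0))
```

Without `keepdims=True`, the means would have shape `(3,)`, which cannot be broadcast against a `(3, 4)` array along the rows. Keeping the length-1 axis is what lets each sensor be scaled by its own statistics.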