Now that you know how to create NumPy arrays, the next step is learning how to access and modify their contents. Just like with standard Python lists, you often need to select specific elements, rows, columns, or subsections of your data. NumPy provides flexible and efficient indexing and slicing mechanisms for this purpose. Understanding these techniques is fundamental for data manipulation in machine learning tasks, such as selecting features or training examples.
NumPy arrays use zero-based indexing, meaning the first element is at index 0, the second at index 1, and so on.
For a one-dimensional array, you access an element by specifying its index in square brackets [].
import numpy as np
# Create a 1D array
vector = np.arange(5, 11) # Creates array([5, 6, 7, 8, 9, 10])
print("Original vector:", vector)
# Access the first element (index 0)
first_element = vector[0]
print("First element:", first_element) # Output: 5
# Access the third element (index 2)
third_element = vector[2]
print("Third element:", third_element) # Output: 7
# Access the last element using negative indexing
last_element = vector[-1]
print("Last element:", last_element) # Output: 10
Negative indices count from the end of the array, so -1 refers to the last element, -2 to the second-to-last, and so forth.
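The same bracket syntax also works on the left-hand side of an assignment, which is how individual elements are modified. The short sketch below reuses the vector from above and also shows what happens when an index falls outside the array; the replacement values are arbitrary.

```python
import numpy as np

vector = np.arange(5, 11)  # array([5, 6, 7, 8, 9, 10])

# Assign through an index to change an element in place
vector[0] = 50     # replace the first element
vector[-1] = 100   # replace the last element via a negative index
print("Modified vector:", vector.tolist())  # Output: [50, 6, 7, 8, 9, 100]

# Unlike slices, a single index outside the valid range raises IndexError
try:
    vector[6]
except IndexError as err:
    print("IndexError:", err)
```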
For two-dimensional arrays (matrices), you need to specify both the row index and the column index to access a single element. The standard way to do this in NumPy is using a single pair of square brackets with the indices separated by a comma: [row, column].
# Create a 2D array (3 rows, 4 columns)
matrix = np.array([
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12]
])
print("Original matrix:\n", matrix)
# Access the element at row 0, column 1
element_0_1 = matrix[0, 1]
print("Element at [0, 1]:", element_0_1) # Output: 2
# Access the element at row 2, column 3
element_2_3 = matrix[2, 3]
print("Element at [2, 3]:", element_2_3) # Output: 12
# Access the element at row 1, column 0
element_1_0 = matrix[1, 0]
print("Element at [1, 0]:", element_1_0) # Output: 5
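While you might sometimes see notation like matrix[row][column], matrix[row, column] is generally preferred. The chained form first builds an intermediate array for matrix[row] and then indexes that result, whereas the comma form performs a single indexing operation and extends naturally to slices over several dimensions.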
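For a single element the two notations give the same value, but they stop being interchangeable once slices are involved. The sketch below, using the 3x4 matrix from above, illustrates the difference; the variable names are only for illustration.

```python
import numpy as np

matrix = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
])

# For a single element, both forms return the same value (7)
same_value = matrix[1][2] == matrix[1, 2]
print("Same element:", bool(same_value))  # Output: True

# With slices, the second [] is applied to the result of the first,
# so it slices the rows a second time instead of the columns
chained = matrix[:2][:2]
combined = matrix[:2, :2]
print("Chained shape:", chained.shape)    # Output: (2, 4)
print("Combined shape:", combined.shape)  # Output: (2, 2)
```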
Slicing allows you to select a range of elements from an array, creating a subarray. The syntax is similar to Python list slicing: start:stop:step. Remember that the start index is inclusive and the stop index is exclusive. The step determines the interval between elements. If omitted, step defaults to 1, and for a positive step, start defaults to 0 and stop defaults to the array length.
# Create a 1D array
vector = np.arange(10, 20) # Creates array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
print("Original vector:", vector)
# Get elements from index 2 up to (but not including) index 5
slice1 = vector[2:5]
print("Slice [2:5]:", slice1) # Output: [12 13 14]
# Get elements from the beginning up to index 4
slice2 = vector[:4]
print("Slice [:4]:", slice2) # Output: [10 11 12 13]
# Get elements from index 6 to the end
slice3 = vector[6:]
print("Slice [6:]:", slice3) # Output: [16 17 18 19]
# Get every second element from the entire array
slice4 = vector[::2]
print("Slice [::2]:", slice4) # Output: [10 12 14 16 18]
# Get every second element from index 1 to index 7
slice5 = vector[1:8:2]
print("Slice [1:8:2]:", slice5) # Output: [11 13 15 17]
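The start:stop:step rules also cover negative steps, negative bounds, and bounds past the end of the array. The following sketch reuses the same vector to show these cases; it follows standard Python slice semantics, which NumPy shares for basic slices.

```python
import numpy as np

vector = np.arange(10, 20)  # array([10, 11, ..., 19])

# A negative step walks backwards; with no bounds it reverses the array
reversed_vector = vector[::-1]
print(reversed_vector.tolist())  # Output: [19, 18, 17, 16, 15, 14, 13, 12, 11, 10]

# Start at index 8 and step back by 3, stopping before index 1
backwards = vector[8:1:-3]
print(backwards.tolist())  # Output: [18, 15, 12]

# Negative bounds count from the end, just like negative indices
last_three = vector[-3:]
print(last_three.tolist())  # Output: [17, 18, 19]

# A stop beyond the end is clipped rather than raising an error
clipped = vector[7:100]
print(clipped.tolist())  # Output: [17, 18, 19]
```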
You can slice 2D arrays along both dimensions by providing slices for rows and columns, separated by a comma: array[row_slice, column_slice].
# Create a 2D array (4 rows, 5 columns)
matrix = np.array([
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29]
])
print("Original matrix:\n", matrix)
# Get the first two rows (rows 0 and 1) and all columns
slice_rows = matrix[0:2, :] # or matrix[:2, :]
print("First two rows:\n", slice_rows)
# Output:
# [[10 11 12 13 14]
# [15 16 17 18 19]]
# Get all rows, but only columns 1 and 2 (indices 1 up to 3)
slice_cols = matrix[:, 1:3]
print("Columns 1 and 2:\n", slice_cols)
# Output:
# [[11 12]
# [16 17]
# [21 22]
# [26 27]]
# Get a submatrix: rows 1 and 2, columns 2, 3, and 4
sub_matrix = matrix[1:3, 2:]
print("Submatrix (rows 1-2, cols 2-end):\n", sub_matrix)
# Output:
# [[17 18 19]
# [22 23 24]]
# Get a single row (returns a 1D array)
row_1 = matrix[1, :] # Equivalent to matrix[1]
print("Row 1:", row_1) # Output: [15 16 17 18 19]
# Get a single column (returns a 1D array)
col_3 = matrix[:, 3]
print("Column 3:", col_3) # Output: [13 18 23 28]
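Selecting a row or column with a single integer drops that dimension, which matters when later code expects a 2D input. One common way to keep the result two-dimensional is to use a length-one slice instead of an integer, as sketched here; the matrix is rebuilt with arange and reshape, which yields the same values as the one above.

```python
import numpy as np

# Same 4x5 matrix as above, built from a range
matrix = np.arange(10, 30).reshape(4, 5)

# An integer index removes the axis it indexes
row_1d = matrix[1, :]
col_1d = matrix[:, 3]
print(row_1d.shape, col_1d.shape)  # Output: (5,) (4,)

# A length-one slice keeps the axis, so the result stays 2D
row_2d = matrix[1:2, :]
col_2d = matrix[:, 3:4]
print(row_2d.shape, col_2d.shape)  # Output: (1, 5) (4, 1)
```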
A significant aspect of NumPy slicing is that basic slices (using start:stop:step) return views of the original array data, not copies. This means the slice is just a different way of looking at the same underlying data in memory. Modifying elements in a view will also modify the original array.
# Create an array
original_array = np.arange(5) # array([0, 1, 2, 3, 4])
print("Original array:", original_array)
# Create a slice (a view)
array_slice = original_array[1:4] # View containing elements [1, 2, 3]
print("Slice (view):", array_slice)
# Modify an element in the slice
array_slice[1] = 99 # Modify the second element of the slice (which corresponds to index 2 of original)
print("Modified slice:", array_slice) # Output: [ 1 99  3]
# Observe the change in the original array
print("Original array after slice modification:", original_array) # Output: [ 0  1 99  3  4]
This behavior is designed for efficiency, especially with large datasets, as it avoids unnecessary data duplication. However, if you need an independent copy of the sliced data that won't affect the original, you must explicitly use the .copy() method.
# Create an array
original_array = np.arange(5) # array([0, 1, 2, 3, 4])
print("Original array:", original_array)
# Create a copy of a slice
array_copy = original_array[1:4].copy()
print("Slice (copy):", array_copy)
# Modify an element in the copy
array_copy[1] = 99
print("Modified copy:", array_copy) # Output: [ 1 99  3]
# The original array remains unchanged
print("Original array after copy modification:", original_array) # Output: [0 1 2 3 4]
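When it is not obvious whether a result is a view or a copy, you can check rather than guess. The sketch below uses np.shares_memory and the .base attribute, which are one reasonable way to inspect this; the variable names follow the examples above.

```python
import numpy as np

original_array = np.arange(5)
array_slice = original_array[1:4]         # basic slice: a view
array_copy = original_array[1:4].copy()   # explicit copy

# np.shares_memory reports whether two arrays overlap in memory
print(np.shares_memory(original_array, array_slice))  # Output: True
print(np.shares_memory(original_array, array_copy))   # Output: False

# A view keeps a reference to the array that owns its data in .base;
# an array that owns its own data has .base set to None
print(array_slice.base is original_array)  # Output: True
print(array_copy.base is None)             # Output: True
```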
NumPy also supports more advanced indexing techniques. While basic slicing extracts contiguous blocks or regularly spaced elements, integer array indexing and boolean array indexing allow for selecting arbitrary elements based on lists of indices or conditions.
You can use lists or other NumPy arrays of integers as indices. This lets you pick elements, rows, or columns in any order you choose.
# Create a 1D array
vector = np.array([10, 20, 30, 40, 50, 60])
print("Original vector:", vector)
# Select elements at indices 1, 3, and 0
selected_elements = vector[[1, 3, 0]]
print("Selected elements [1, 3, 0]:", selected_elements) # Output: [20 40 10]
# Create a 2D array
matrix = np.array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[10, 11, 12]
])
print("Original matrix:\n", matrix)
# Select rows 0, 2, and 1 in that order
selected_rows = matrix[[0, 2, 1], :] # Note the :, selects all columns for these rows
print("Selected rows [0, 2, 1]:\n", selected_rows)
# Output:
# [[1 2 3]
# [7 8 9]
# [4 5 6]]
# Select specific elements using pairs of indices (row, col)
selected_cells = matrix[[0, 1, 3], [2, 0, 1]] # Selects (0,2), (1,0), (3,1)
print("Selected specific cells:", selected_cells) # Output: [ 3  4 11]
Unlike basic slicing, integer array indexing always returns a copy of the data, not a view.
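The copy behavior applies when an integer-array index is used to read data. When the same index appears on the left-hand side of an assignment, NumPy writes into the original array. The sketch below contrasts the two cases; the replacement values are arbitrary.

```python
import numpy as np

vector = np.array([10, 20, 30, 40, 50, 60])

# Reading with an integer array produces an independent copy
picked = vector[[1, 3, 0]]
picked[0] = -1
print(picked.tolist())  # Output: [-1, 40, 10]
print(vector.tolist())  # Output: [10, 20, 30, 40, 50, 60] (unchanged)

# Assigning through an integer array modifies the original in place
vector[[1, 3]] = 0
print(vector.tolist())  # Output: [10, 0, 30, 0, 50, 60]
```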
You can use boolean arrays (containing True and False) to index an array. This is extremely useful for selecting elements that satisfy a certain condition. The boolean array must either have the same shape as the array being indexed or, when it indexes a single axis, the same length as that axis.
# Create a 1D array
data = np.array([1, 5, 2, 8, 3, 7, 4, 6])
print("Original data:", data)
# Create a boolean condition: elements greater than 4
condition = data > 4
print("Boolean condition (data > 4):", condition)
# Output: [False True False True False True False True]
# Select elements where the condition is True
selected_data = data[condition]
print("Elements > 4:", selected_data) # Output: [5 8 7 6]
# You can write this more compactly
selected_data_compact = data[data > 4]
print("Elements > 4 (compact):", selected_data_compact) # Output: [5 8 7 6]
# Example with a 2D array
matrix = np.array([
[1, 6, 2],
[7, 3, 8],
[4, 9, 5]
])
print("Original matrix:\n", matrix)
# Select elements greater than 5
selected_matrix_elements = matrix[matrix > 5]
print("Matrix elements > 5:", selected_matrix_elements) # Output: [6 7 8 9] (returns a flattened 1D array)
# Select rows where the first element is greater than 3
rows_condition = matrix[:, 0] > 3 # Condition based on the first column
print("Rows where first element > 3:\n", matrix[rows_condition, :])
# Output:
# [[7 3 8]
#  [4 9 5]]
Like integer array indexing, boolean array indexing also returns a copy of the data.
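Conditions can be combined with the element-wise operators & (and), | (or), and ~ (not); each comparison needs its own parentheses because these operators bind more tightly than comparisons. As with integer arrays, a boolean mask on the left-hand side of an assignment updates the array in place. The following sketch reuses the data array from above; the thresholds are arbitrary.

```python
import numpy as np

data = np.array([1, 5, 2, 8, 3, 7, 4, 6])

# Combine conditions element-wise; note the parentheses around each comparison
in_range = data[(data > 2) & (data < 7)]
print(in_range.tolist())  # Output: [5, 3, 4, 6]

outside = data[(data <= 2) | (data >= 7)]
print(outside.tolist())  # Output: [1, 2, 8, 7]

# ~ inverts a boolean mask
not_large = data[~(data > 4)]
print(not_large.tolist())  # Output: [1, 2, 3, 4]

# Assigning through a mask changes the matching elements in place
clipped = data.copy()      # work on a copy to keep data intact
clipped[clipped > 4] = 0
print(clipped.tolist())  # Output: [1, 0, 2, 0, 3, 0, 4, 0]
```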
Mastering indexing and slicing is essential for effectively working with data in NumPy. These techniques allow you to precisely access, modify, and filter the array elements you need for analysis and machine learning model inputs.
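As a closing illustration, the sketch below combines these techniques on a small made-up dataset to separate features from labels and to select training examples. The values, the column layout (three feature columns followed by one label column), and the 4/2 split are assumptions chosen only for the example.

```python
import numpy as np

# Hypothetical dataset: 6 samples, 3 feature columns, then 1 label column
dataset = np.array([
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 0],
    [10, 11, 12, 1],
    [13, 14, 15, 0],
    [16, 17, 18, 1],
])

# Slicing: every row, all columns except the last (features) and the last (labels)
features = dataset[:, :-1]
labels = dataset[:, -1]
print(features.shape, labels.shape)  # Output: (6, 3) (6,)

# Slicing rows: first four samples for training, the rest for testing
train_features, test_features = features[:4], features[4:]
print(train_features.shape, test_features.shape)  # Output: (4, 3) (2, 3)

# Boolean indexing: keep only the samples whose label is 1
positive_samples = features[labels == 1]
print(positive_samples.tolist())  # Output: [[4, 5, 6], [10, 11, 12], [16, 17, 18]]
```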
© 2025 ApX Machine Learning