While creating NumPy arrays directly from Python lists is a useful starting point, it's often more convenient and efficient to generate arrays using NumPy's specialized functions. These functions allow you to create arrays with specific structures or initial values without first constructing a Python list. Let's look at some of the most frequently used ones.
arange
Similar to Python's built-in range function, NumPy's arange function creates an array containing a sequence of evenly spaced values within a given interval. However, unlike range, which produces a lazy range object, arange returns a NumPy array with all of its values stored in memory.
The basic syntax is np.arange(start, stop, step), where:
start: The beginning of the interval (inclusive). Defaults to 0 if not provided.
stop: The end of the interval (exclusive).
step: The spacing between values. Defaults to 1.

import numpy as np
# Create an array from 0 up to (but not including) 5
arr1 = np.arange(5)
print(arr1)
# Output: [0 1 2 3 4]
# Create an array from 2 up to (but not including) 8
arr2 = np.arange(2, 8)
print(arr2)
# Output: [2 3 4 5 6 7]
# Create an array from 1 to 10 with a step of 2
arr3 = np.arange(1, 10, 2)
print(arr3)
# Output: [1 3 5 7 9]
Notice that arange, like Python's range, does not include the stop value in the result. Also, arange accepts floating-point steps, but rounding error can make the number of elements hard to predict, and the result may even end with a value at or near stop. For non-integer steps where the exact number of points matters, linspace (covered below) is often preferred.
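The floating-point caveat can be seen in a small sketch. The interval and step below are chosen only for illustration; the point is that the element count from arange depends on rounding, while linspace takes the count as an argument.

```python
import numpy as np

# With a float step, the element count is derived from (stop - start) / step,
# so rounding error in that division can add an unexpected element.
float_steps = np.arange(0.1, 0.4, 0.1)
print(len(float_steps))  # 4 with IEEE 754 doubles, although 3 might be expected

# linspace takes the number of points directly, so the length is never in doubt.
fixed_count = np.linspace(0.1, 0.3, 3)
print(len(fixed_count))  # 3
```

When the count matters more than the exact step, passing it explicitly avoids this kind of surprise.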
Often, you need to initialize an array of a specific size with placeholder values, typically zeros or ones. NumPy provides zeros and ones for this purpose.
np.zeros(shape, dtype=float): Creates an array filled with zeros.
np.ones(shape, dtype=float): Creates an array filled with ones.

The shape argument is an integer or a tuple specifying the dimensions of the array (e.g., 3 or (3,) for a 1D array of size 3, (2, 4) for a 2D array with 2 rows and 4 columns). The dtype argument is optional and specifies the data type (defaulting to float64).
# Create a 1D array of 4 zeros (default dtype is float)
zeros_arr_1d = np.zeros(4)
print(zeros_arr_1d)
# Output: [0. 0. 0. 0.]
# Create a 2x3 array of ones with integer type
ones_arr_2d_int = np.ones((2, 3), dtype=np.int64)
print(ones_arr_2d_int)
# Output:
# [[1 1 1]
#  [1 1 1]]
# Check the data type
print(ones_arr_2d_int.dtype)
# Output: int64
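As a further illustration, a common use of zeros is to preallocate a result array and fill it in place. The following is a minimal sketch; the variable names are illustrative only.

```python
import numpy as np

# An integer shape and a one-element tuple describe the same 1D array.
a = np.zeros(3)
b = np.zeros((3,))
print(a.shape == b.shape)  # True

# Preallocate a result array, then fill it in place.
squares = np.zeros(5, dtype=np.int64)
for i in range(5):
    squares[i] = i * i
print(squares)  # [ 0  1  4  9 16]
```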
linspace
Sometimes, you need an array containing a specific number of evenly spaced points between a start and end value. This is where linspace is useful.
The syntax is np.linspace(start, stop, num=50), where:
start: The starting value of the sequence (inclusive).
stop: The ending value of the sequence (inclusive by default).
num: The number of samples to generate. Defaults to 50.

Unlike arange, linspace includes the stop value in the array by default.
# Create an array with 5 evenly spaced values between 0 and 1 (inclusive)
lin_arr1 = np.linspace(0, 1, 5)
print(lin_arr1)
# Output: [0.   0.25 0.5  0.75 1.  ]
# Create an array with 11 evenly spaced values between 0 and 10
lin_arr2 = np.linspace(0, 10, 11)
print(lin_arr2)
# Output: [ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10.]
# You can exclude the endpoint if needed
lin_arr3 = np.linspace(0, 1, 5, endpoint=False)
print(lin_arr3)
# Output: [0.  0.2 0.4 0.6 0.8]
linspace is particularly handy for generating coordinates for plotting or simulations.
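As a rough sketch of the plotting use case, the snippet below samples a function on a linspace grid. The choice of sin and of the interval [0, pi] is an assumption made for illustration; the retstep=True option asks linspace to also return the spacing it used.

```python
import numpy as np

# Five sample points covering [0, pi], endpoints included.
x, step = np.linspace(0, np.pi, 5, retstep=True)
y = np.sin(x)  # evaluate the function at each sample point

print(x.shape, y.shape)             # (5,) (5,)
print(np.isclose(step, np.pi / 4))  # True
```

The x and y arrays could then be handed to a plotting library as coordinate pairs.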
eye
An identity matrix is a square matrix (number of rows equals number of columns) with ones on the main diagonal (from top-left to bottom-right) and zeros everywhere else. NumPy's eye function creates these.

The syntax is np.eye(N, dtype=float), where N is the number of rows (and columns).
# Create a 3x3 identity matrix
identity_matrix = np.eye(3)
print(identity_matrix)
# Output:
# [[1. 0. 0.]
#  [0. 1. 0.]
#  [0. 0. 1.]]
# Create a 4x4 identity matrix with integer type
identity_matrix_int = np.eye(4, dtype=int)
print(identity_matrix_int)
# Output:
# [[1 0 0 0]
#  [0 1 0 0]
#  [0 0 1 0]
#  [0 0 0 1]]
Identity matrices are fundamental in linear algebra operations.
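One way to see why is that multiplying a matrix by the identity leaves it unchanged. The sketch below checks this for a small example matrix; the values in A are arbitrary.

```python
import numpy as np

A = np.array([[2, 3], [4, 5]])  # arbitrary example matrix
I = np.eye(2, dtype=int)

# Matrix multiplication by the identity leaves A unchanged, on either side.
print(np.array_equal(A @ I, A))  # True
print(np.array_equal(I @ A, A))  # True
```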
full
If you need an array of a given shape filled entirely with a constant value other than 0 or 1, you can use np.full.

The syntax is np.full(shape, fill_value, dtype=None).
# Create a 2x4 array filled with the number 7
full_arr = np.full((2, 4), 7)
print(full_arr)
# Output:
# [[7 7 7 7]
#  [7 7 7 7]]
# Create a 1D array of size 3 filled with pi
pi_arr = np.full(3, np.pi)
print(pi_arr)
# Output: [3.14159265 3.14159265 3.14159265]
The data type is inferred from the fill_value unless explicitly specified with dtype.
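A short sketch of this inference follows. The integer check uses np.issubdtype because the default integer width can differ between platforms.

```python
import numpy as np

int_fill = np.full((2, 2), 7)             # integer fill value
float_fill = np.full((2, 2), 7.0)         # float fill value
forced = np.full((2, 2), 7, dtype=float)  # explicit dtype overrides inference

print(np.issubdtype(int_fill.dtype, np.integer))  # True
print(float_fill.dtype)                           # float64
print(forced.dtype)                               # float64
```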
NumPy also includes a powerful submodule, numpy.random, for creating arrays with random numbers drawn from various distributions. Here are a few common examples:
np.random.rand(d0, d1, ..., dn): Creates an array of the given shape with random samples from a uniform distribution over [0, 1).
np.random.randn(d0, d1, ..., dn): Creates an array of the given shape with random samples from the standard normal distribution (mean 0, variance 1).
np.random.randint(low, high=None, size=None, dtype=int): Creates an array of the specified size with random integers from low (inclusive) to high (exclusive).

# Create a 2x3 array with random values between 0 and 1
rand_arr = np.random.rand(2, 3)
print(rand_arr)
# Example Output (will vary):
# [[0.11150118 0.38348479 0.45066311]
#  [0.86726997 0.13023643 0.80802871]]
# Create a 1D array of size 4 with samples from standard normal distribution
randn_arr = np.random.randn(4)
print(randn_arr)
# Example Output (will vary):
# [-1.04782338 0.88233694 -0.22512731 0.280441 ]
# Create a 1D array of 5 random integers between 10 (inclusive) and 20 (exclusive)
randint_arr = np.random.randint(10, 20, size=5)
print(randint_arr)
# Example Output (will vary):
# [15 11 18 10 13]
These random number functions are essential for simulations, statistical modeling, and initializing parameters in machine learning algorithms.
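Because the outputs above change on every run, results can be hard to reproduce. One common remedy, which goes beyond the functions listed here, is a seeded generator from np.random.default_rng, the interface NumPy's documentation recommends for new code. The sketch below assumes an arbitrary seed of 42.

```python
import numpy as np

rng_a = np.random.default_rng(42)  # 42 is an arbitrary seed
rng_b = np.random.default_rng(42)

uniform = rng_a.random((2, 3))             # uniform samples in [0, 1)
normal = rng_a.standard_normal(4)          # standard normal samples
integers = rng_a.integers(10, 20, size=5)  # integers in [10, 20)

# A second generator with the same seed reproduces the same first draw.
print(np.array_equal(uniform, rng_b.random((2, 3))))  # True
```

The generator methods random, standard_normal, and integers play roughly the roles of rand, randn, and randint, but take the shape as a single argument.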
These built-in functions provide flexible and efficient ways to create NumPy arrays for various computational tasks, forming the foundation for many numerical workflows in Python.
© 2025 ApX Machine Learning