Scalars, vectors, and matrices are the fundamental structures for data, and in Python we create them with the NumPy library. NumPy is the standard library for numerical and scientific computing in Python. It provides powerful tools for working with multi-dimensional arrays, which are exactly what vectors and matrices are.
If you followed the setup guide in the previous section, you should have NumPy installed. The first step in any script that uses it is to import the library. The standard convention, which you will see in almost all data science and machine learning code, is to import it with the alias np.
import numpy as np
This simple line gives us access to all of NumPy's functions through the np prefix.
A vector, as we've learned, is an ordered list of numbers. In machine learning, a vector often represents a single data point or a set of features. For example, we could represent a house as a vector of three features: square footage, number of bedrooms, and number of bathrooms.
To create a vector in NumPy, we use the np.array() function and pass it a Python list.
# A vector representing a house with features:
# [square_footage, num_bedrooms, num_bathrooms]
house_vector = np.array([1500, 3, 2.5])
print(house_vector)
Output:
[1500.     3.     2.5]
This creates a one-dimensional NumPy array. We can verify its dimensions using the .shape attribute, which is an important tool for understanding the structure of our data.
print(house_vector.shape)
Output:
(3,)
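The output (3,) indicates that this is a one-dimensional array with three elements. Strictly speaking, a 1D array has a single axis and is neither a row nor a column. Because it prints horizontally, it is often loosely called a row vector.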
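The trailing dots in the printed vector appear because a NumPy array stores all of its elements with a single data type, so mixing integers with 2.5 promotes every element to floating point. The following is a minimal sketch for inspecting this, reusing the house values from above. The name `int_vector` is only an illustrative addition, not part of the original example.

```python
import numpy as np

house_vector = np.array([1500, 3, 2.5])
int_vector = np.array([1500, 3, 2])  # hypothetical all-integer variant

print(house_vector.dtype)  # a floating-point type, because 2.5 is a float
print(int_vector.dtype)    # an integer type (exact width depends on the platform)
print(house_vector.ndim)   # number of dimensions (axes)
print(house_vector.size)   # total number of elements
```

Checking `.dtype` alongside `.shape` is a quick way to confirm that the data was loaded the way you expected.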
In linear algebra, the distinction between row vectors and column vectors is significant. A column vector is represented as a matrix with multiple rows but only one column. A direct way to create a column vector in NumPy is to pass a nested list, where each inner list represents a row.
# Creating a column vector
house_column_vector = np.array([[1500], [3], [2.5]])
print(house_column_vector)
Output:
[[1500. ]
 [   3. ]
 [   2.5]]
Now, let's check its shape.
print(house_column_vector.shape)
Output:
(3, 1)
The shape (3, 1) confirms this is a two-dimensional object with 3 rows and 1 column. For now, most of our work will use 1D arrays, but knowing how to create column vectors is necessary for future operations like matrix-vector multiplication.
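If you already have a 1D array, you do not need to retype it as a nested list. As a rough sketch, two common ways to turn it into a column are `.reshape(-1, 1)` and indexing with `np.newaxis`. The variable names below are illustrative.

```python
import numpy as np

house_vector = np.array([1500, 3, 2.5])

# -1 asks NumPy to infer that dimension from the number of elements
column_from_reshape = house_vector.reshape(-1, 1)

# np.newaxis inserts a new axis of length 1 at the given position
column_from_newaxis = house_vector[:, np.newaxis]

print(column_from_reshape.shape)
print(column_from_newaxis.shape)

# flatten() returns a 1D copy, undoing the column layout
print(column_from_reshape.flatten().shape)
```

Both approaches should give the same (3, 1) result; which one to use is mostly a matter of readability.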
A matrix organizes data in a two-dimensional grid of rows and columns. In machine learning, a matrix typically represents an entire dataset, where each row is a data sample (a vector) and each column is a specific feature across all samples.
Let's expand our house example to a dataset of three houses. We create a matrix by passing a list of lists to np.array().
# A dataset of three houses
# Each row is a house, each column is a feature
housing_data_matrix = np.array([
[1500, 3, 2.5], # House 1
[2100, 4, 3], # House 2
[1200, 2, 2] # House 3
])
print(housing_data_matrix)
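Output:
[[1.5e+03 3.0e+00 2.5e+00]
 [2.1e+03 4.0e+00 3.0e+00]
 [1.2e+03 2.0e+00 2.0e+00]]
With default print settings, NumPy displays this matrix in scientific notation because its largest value (2100) is more than 1000 times its smallest (2). The stored values are unchanged: 1.5e+03 is simply 1500.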
Checking the shape of our matrix gives us a clear picture of its structure.
print(housing_data_matrix.shape)
Output:
(3, 3)
The shape (3, 3) tells us we have a matrix with 3 rows and 3 columns, perfectly matching our dataset of three houses with three features each. This confirms the connection we made earlier: a matrix is effectively a stack of vectors.
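To make the "stack of vectors" idea concrete, here is a small sketch that builds the same matrix from three separate 1D vectors with `np.vstack()` and then pulls rows, columns, and single elements back out by indexing. The names `house_1`, `house_2`, and `house_3` are assumptions introduced for illustration.

```python
import numpy as np

house_1 = np.array([1500, 3, 2.5])
house_2 = np.array([2100, 4, 3])
house_3 = np.array([1200, 2, 2])

# Stack the three 1D vectors as the rows of a 2D array
housing_data_matrix = np.vstack([house_1, house_2, house_3])

first_house = housing_data_matrix[0]          # row 0: all features of house 1
square_footage = housing_data_matrix[:, 0]    # column 0: one feature across all houses
bedrooms_house_2 = housing_data_matrix[1, 1]  # single element: row 1, column 1

print(housing_data_matrix.shape)
print(first_house)
print(square_footage)
print(bedrooms_house_2)
```

Note that indexing starts at 0, so row 1 is the second house and column 1 is the number of bedrooms.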
Typing out every number is not always practical. NumPy provides several convenient functions for creating common types of matrices.
It is often useful to create arrays initialized with placeholder values, such as all zeros or all ones. The np.zeros() and np.ones() functions are used for this. You provide them with a tuple specifying the desired shape.
# A 2x4 matrix of all zeros
zeros_matrix = np.zeros((2, 4))
print("Zeros Matrix:")
print(zeros_matrix)
# A 3x2 matrix of all ones
ones_matrix = np.ones((3, 2))
print("\nOnes Matrix:")
print(ones_matrix)
Output:
Zeros Matrix:
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]]

Ones Matrix:
[[1. 1.]
 [1. 1.]
 [1. 1.]]
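The dots in the output above show that `np.zeros()` and `np.ones()` produce floating-point values by default. A brief sketch of a few related options follows: the optional `dtype` argument, `np.full()` for an arbitrary constant, and `np.eye()` for an identity matrix. The variable names are illustrative.

```python
import numpy as np

# dtype is optional; without it, np.zeros and np.ones produce floats
int_zeros = np.zeros((2, 4), dtype=int)

# np.full fills an array of the given shape with any constant
sevens = np.full((2, 3), 7.0)

# np.eye(n) builds the n x n identity matrix
identity = np.eye(3)

print(int_zeros)
print(sevens)
print(identity)
```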
For creating test data or sequences, np.arange() is very useful. It creates an array with evenly spaced values within a given interval. It works much like Python's built-in range() function.
# Create a vector with numbers from 0 up to (but not including) 9
sequence_vector = np.arange(9)
print(sequence_vector)
Output:
[0 1 2 3 4 5 6 7 8]
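Like `range()`, `np.arange()` also accepts a start value and a step. When you care about the number of points rather than the step size, `np.linspace()` is a commonly used alternative. The sketch below assumes nothing beyond NumPy itself; the variable names are illustrative.

```python
import numpy as np

# arange(start, stop, step): stop is excluded, as with range()
evens = np.arange(2, 11, 2)

# linspace(start, stop, num): num evenly spaced points, stop included by default
fractions = np.linspace(0.0, 1.0, 5)

print(evens)
print(fractions)
```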
This function becomes even more powerful when combined with .reshape(), which returns the same data arranged in a new shape. The new shape must hold exactly the same number of elements as the original. We can create a sequence and then immediately structure it as a matrix.
# Create a 3x3 matrix with values from 0 to 8
sequence_matrix = np.arange(9).reshape(3, 3)
print(sequence_matrix)
Output:
[[0 1 2]
 [3 4 5]
 [6 7 8]]
This pattern is a fast and common way to generate sample matrices for testing algorithms.
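Two details of `.reshape()` are worth seeing in action: you can pass -1 for one dimension and let NumPy infer it, and a shape with the wrong total number of elements raises a `ValueError`. This is a minimal sketch using a 12-element sequence chosen for illustration.

```python
import numpy as np

sequence_vector = np.arange(12)

# -1 lets NumPy work out the missing dimension: 12 elements / 3 rows = 4 columns
three_rows = sequence_vector.reshape(3, -1)
print(three_rows.shape)

# The original array keeps its own shape
print(sequence_vector.shape)

# A shape whose total size does not match raises ValueError
try:
    sequence_vector.reshape(5, 3)
except ValueError as error:
    print("Cannot reshape:", error)
```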
With these tools, you can now translate abstract mathematical objects into concrete NumPy arrays. You can represent a single data point as a vector and an entire dataset as a matrix. In the next chapter, we will take these arrays and start performing fundamental linear algebra operations on them.
© 2026 ApX Machine Learning