Now that you understand the concept of a Pandas Series as a one-dimensional labeled array, let's look at the common ways to create one. The primary tool for this is the pd.Series() constructor.
We'll need the Pandas library, which is conventionally imported with the alias pd. If you also plan to use NumPy arrays (which work very well with Pandas), import NumPy with the alias np.
import pandas as pd
import numpy as np
The most straightforward way to create a Series is from a standard Python list. Pandas will automatically create a default integer index starting from 0 if you don't specify one.
# Create a simple Python list
data_list = [10, 20, 30, 40, 50]
# Create a Pandas Series from the list
series_from_list = pd.Series(data_list)
# Print the Series
print(series_from_list)
Output:
0 10
1 20
2 30
3 40
4 50
dtype: int64
Notice the output shows two columns: the index on the left (0 to 4) and the data values on the right. Pandas also infers the data type (dtype: int64 in this case, representing 64-bit integers).
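To make dtype inference a little more concrete, here is a minimal sketch. The variable names are illustrative only; dtype is the constructor's optional argument for overriding what Pandas would otherwise infer.

```python
import pandas as pd

# All integers: Pandas infers a 64-bit integer dtype
ints = pd.Series([10, 20, 30])

# Integers mixed with a float: every value is stored as float64
mixed = pd.Series([10, 20.5, 30])

# Passing dtype explicitly overrides the inferred type
forced = pd.Series([10, 20, 30], dtype="float64")

print(ints.dtype)    # int64
print(mixed.dtype)   # float64
print(forced.dtype)  # float64
```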
While the default integer index is useful, the real power of a Series comes from its ability to use meaningful labels for the index. You can provide a list or array of labels using the index argument during creation. The index must be the same length as the data; otherwise Pandas raises a ValueError.
# Data list
data_list = [100, 200, 300, 400]
# Custom index labels
index_labels = ['alpha', 'beta', 'gamma', 'delta']
# Create a Series with custom index
series_custom_index = pd.Series(data=data_list, index=index_labels)
# Print the Series
print(series_custom_index)
Output:
alpha 100
beta 200
gamma 300
delta 400
dtype: int64
Now, instead of 0, 1, 2, 3, the index consists of the string labels 'alpha', 'beta', 'gamma', and 'delta'. This makes accessing specific data points more intuitive, as we'll see in later sections.
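The length requirement on the index is easy to check for yourself. The sketch below deliberately passes three labels for four values (the names values and too_few_labels are made up for this illustration) and catches the resulting error instead of letting the script stop.

```python
import pandas as pd

values = [100, 200, 300, 400]
too_few_labels = ['alpha', 'beta', 'gamma']  # only three labels for four values

try:
    pd.Series(data=values, index=too_few_labels)
    mismatch_raised = False
except ValueError as err:
    # Pandas refuses to build a Series whose index length differs from the data length
    mismatch_raised = True
    print(f"ValueError: {err}")
```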
You can just as easily create a Series from a NumPy array. This is very common since data often originates from numerical computations performed using NumPy. The process is identical to using a list.
# Create a NumPy array
numpy_array = np.array([5.5, 6.6, 7.7, 8.8])
# Create a Series from the NumPy array
series_from_numpy = pd.Series(numpy_array)
# Print the Series
print(series_from_numpy)
Output:
0 5.5
1 6.6
2 7.7
3 8.8
dtype: float64
Again, Pandas creates a default integer index and infers the data type (float64 in this case). You can also provide a custom index when creating a Series from a NumPy array, just like with lists.
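As a brief sketch of that last point, the example below attaches string labels to a NumPy array. The label values and variable names are arbitrary choices for the illustration.

```python
import numpy as np
import pandas as pd

# A NumPy array of floats and one label per element
readings = np.array([5.5, 6.6, 7.7, 8.8])
labels = ['a', 'b', 'c', 'd']

# The index argument works the same way as it does with a list
series_numpy_labeled = pd.Series(readings, index=labels)

print(series_numpy_labeled)
```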
Another convenient method is creating a Series directly from a Python dictionary. In this case, Pandas uses the dictionary keys as the index labels and the dictionary values as the Series data. The order of the Series elements will generally follow the insertion order of the dictionary (for Python 3.7+).
# Create a Python dictionary
data_dict = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
# Create a Series from the dictionary
series_from_dict = pd.Series(data_dict)
# Print the Series
print(series_from_dict)
Output:
Ohio 35000
Texas 71000
Oregon 16000
Utah 5000
dtype: int64
This method is particularly useful when your data is already structured in a key-value format.
You can also explicitly specify an index when creating from a dictionary. If an index label is provided but doesn't exist as a key in the dictionary, Pandas will insert a NaN (Not a Number) value, which is the standard way Pandas represents missing data. If the dictionary contains keys not present in the specified index, those key-value pairs are ignored.
# Dictionary
data_dict = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
# Explicit index - includes 'California' (not in dict) and excludes 'Utah'
states = ['California', 'Ohio', 'Oregon', 'Texas']
# Create Series with explicit index
series_explicit_index = pd.Series(data_dict, index=states)
# Print the Series
print(series_explicit_index)
Output:
California NaN
Ohio 35000.0
Oregon 16000.0
Texas 71000.0
dtype: float64
Notice that 'California' has a NaN value because it wasn't in data_dict. Also, 'Utah' from the original dictionary is excluded because it wasn't in the states index list. The dtype changed to float64 because NaN is a floating-point value, so Pandas upcasts the integers to floats to store it alongside them.
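Once a Series contains NaN, you will usually want to find or drop those entries. The following sketch rebuilds the Series from the example above and uses the isna() and dropna() methods; it is one possible way to inspect missing values, not the only one.

```python
import pandas as pd

data_dict = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
states = ['California', 'Ohio', 'Oregon', 'Texas']
series_explicit_index = pd.Series(data_dict, index=states)

# Boolean Series: True where the value is missing
missing_mask = series_explicit_index.isna()
print(missing_mask)

# New Series with the missing entries removed (the original is unchanged)
without_missing = series_explicit_index.dropna()
print(without_missing)
```

Note that dropna() leaves the dtype as float64; the values are not converted back to integers automatically.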
These methods cover the most common ways to instantiate Pandas Series objects. As you work with data, you'll often find yourself creating Series from existing data structures like lists, dictionaries, or NumPy arrays as a first step in your analysis workflow.
© 2025 ApX Machine Learning