After loading your data and performing initial manipulations such as shifting and calculating rolling statistics, the next logical step in exploring time series data is visualization. Plotting your data is often the quickest way to gain intuition about its underlying structure, including potential trends, seasonal patterns, outliers, and structural breaks. Relying solely on summary statistics can be misleading; a visual inspection provides context that the numbers alone do not.
We'll primarily use Matplotlib and Seaborn, often through the built-in plotting methods of Pandas DataFrames and Series, which use Matplotlib under the hood. Since you're expected to be familiar with basic plotting in Python, we'll focus on interpretations specific to time series.
The most fundamental visualization is a simple line plot, with time on the x-axis and the observed values on the y-axis. This plot forms the basis for identifying the core components discussed earlier.
Let's assume you have a Pandas Series ts_data with a DatetimeIndex. Creating a basic plot is straightforward:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# Sample data creation (replace with your actual data)
np.random.seed(42)  # Fix the seed so the sample data is reproducible
# 'MS' = month start; the month-end alias 'M' is deprecated in pandas 2.2+ in favor of 'ME'
date_rng = pd.date_range(start='2020-01-01', end='2022-12-31', freq='MS')
data = np.sin(np.linspace(0, 2 * np.pi * 3, len(date_rng))) * 10 + \
    np.linspace(5, 15, len(date_rng)) + \
    np.random.randn(len(date_rng)) * 2 + 20
ts_data = pd.Series(data, index=date_rng, name='Value')
# Create the plot
plt.figure(figsize=(12, 6)) # Set figure size for better readability
plt.plot(ts_data.index, ts_data.values)
# Add labels and title for clarity
plt.xlabel("Date")
plt.ylabel("Value")
plt.title("Basic Time Series Plot")
plt.grid(True, linestyle='--', alpha=0.6) # Add a grid
plt.tight_layout() # Adjust layout
plt.show()
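When examining this plot, look for the features mentioned at the start of this section: trends, seasonal patterns, outliers, and structural breaks.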
A typical time series plot showing monthly values over three years. An upward trend and an annual seasonal pattern are visible.
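As noted earlier, Pandas exposes plotting methods that use Matplotlib under the hood. The following is a minimal sketch of the same basic line plot drawn through Series.plot(). The series ts_demo is a hypothetical, noise-free stand-in for ts_data (a linear trend plus a 12-month cycle), chosen only so the example is self-contained and reproducible.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for ts_data: 36 monthly points with a linear trend
# and a 12-month cycle (no noise, so the result is reproducible).
idx = pd.date_range(start="2020-01-01", periods=36, freq="MS")
t = np.arange(len(idx))
ts_demo = pd.Series(25 + 0.3 * t + 10 * np.sin(2 * np.pi * t / 12),
                    index=idx, name="Value")

# Series.plot() draws with Matplotlib and uses the DatetimeIndex for the x-axis.
ax = ts_demo.plot(figsize=(12, 6), title="Basic Time Series Plot (Pandas API)")
ax.set_xlabel("Date")
ax.set_ylabel("Value")
ax.grid(True, linestyle="--", alpha=0.6)
plt.tight_layout()
plt.show()
```

Because Series.plot() returns a Matplotlib Axes object, any further customization (labels, grid, annotations) works exactly as it does with plt.plot.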
As introduced in the previous section on shifting, lagging, and rolling statistics, plotting the rolling mean and standard deviation alongside the original time series is a useful technique, particularly for visually assessing stationarity (which we'll cover formally in Chapter 2). A non-constant rolling mean suggests a trend, while a non-constant rolling standard deviation indicates changing variance (heteroscedasticity).
# Calculate rolling mean and standard deviation
rolling_mean = ts_data.rolling(window=6).mean() # Example: 6-month rolling mean
rolling_std = ts_data.rolling(window=6).std() # Example: 6-month rolling std
# Plot original data and rolling statistics
plt.figure(figsize=(12, 6))
plt.plot(ts_data.index, ts_data.values, color='#4dabf7', label='Original Data')
plt.plot(rolling_mean.index, rolling_mean.values, color='#f76707', label='Rolling Mean (6 Months)')
plt.plot(rolling_std.index, rolling_std.values, color='#ae3ec9', label='Rolling Std Dev (6 Months)')
plt.xlabel("Date")
plt.ylabel("Value")
plt.title("Time Series with Rolling Mean and Standard Deviation")
plt.legend()
plt.grid(True, linestyle='--', alpha=0.6)
plt.tight_layout()
plt.show()
Observing how these rolling statistics evolve gives clues about the stability of the series' properties over time. A rolling mean that rises steadily across the whole series indicates an upward trend.
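How clearly the rolling mean reveals a trend depends on the window length relative to the seasonal period. The sketch below compares a 6-month and a 12-month window numerically. It assumes a hypothetical noise-free series, ts_demo, with a linear trend and a 12-month cycle; real data will be noisier, so treat this as an illustration rather than a test for trend.

```python
import numpy as np
import pandas as pd

# Hypothetical noise-free series: linear trend plus a 12-month cycle.
idx = pd.date_range(start="2020-01-01", periods=36, freq="MS")
t = np.arange(len(idx))
ts_demo = pd.Series(25 + 0.3 * t + 10 * np.sin(2 * np.pi * t / 12), index=idx)

# A 6-month window still carries seasonal swings; a 12-month window spans one
# full cycle, so the seasonal term averages out and only the trend remains.
mean_6 = ts_demo.rolling(window=6).mean().dropna()
mean_12 = ts_demo.rolling(window=12).mean().dropna()

# Month-to-month changes in each rolling mean.
steps_6 = mean_6.diff().dropna()
steps_12 = mean_12.diff().dropna()

print("6-month rolling mean always rising: ", bool((steps_6 > 0).all()))
print("12-month rolling mean always rising:", bool((steps_12 > 0).all()))
```

Under these assumptions, the 12-month rolling mean rises at every step, while the 6-month rolling mean still dips during the seasonal downswing. When a series has a known seasonal period, a window of that length is a reasonable starting point for isolating the trend visually.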
While line plots show seasonality over time, box plots grouped by seasonal periods (e.g., month, quarter) can provide a clearer comparison of distributions across these periods. They help visualize the typical value and spread for each season.
To create these, you typically extract the relevant time period (like the month number) from the DatetimeIndex and use it for grouping.
import seaborn as sns
# Ensure ts_data is a Pandas Series with a DatetimeIndex
df = pd.DataFrame({'Value': ts_data})
df['Month'] = df.index.strftime('%b') # Get month abbreviation (Jan, Feb, ...)
df['Year'] = df.index.year
# Optional: Order months chronologically if needed
month_order = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
plt.figure(figsize=(12, 6))
sns.boxplot(x='Month', y='Value', data=df, order=month_order, palette='Blues') # Use Seaborn for easy boxplots
plt.xlabel("Month")
plt.ylabel("Value")
plt.title("Monthly Distribution of Time Series Values (Box Plot)")
plt.grid(True, axis='y', linestyle='--', alpha=0.6)
plt.tight_layout()
plt.show()
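In a monthly box plot, the line inside each box marks the median (the typical value for that month), the height of the box (the interquartile range) shows the spread, and points beyond the whiskers flag potential outliers. Systematic differences in the medians from month to month suggest seasonality.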
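The per-month comparison that a box plot shows can also be tabulated by grouping on the month extracted from the DatetimeIndex. This is a minimal sketch, again using a hypothetical noise-free series ts_demo (linear trend plus a 12-month cycle starting in January); the choice of median, min, and max as summary columns is an assumption made for brevity.

```python
import numpy as np
import pandas as pd

# Hypothetical series: three years of monthly data with a 12-month cycle.
idx = pd.date_range(start="2020-01-01", periods=36, freq="MS")
t = np.arange(len(idx))
ts_demo = pd.Series(25 + 0.3 * t + 10 * np.sin(2 * np.pi * t / 12),
                    index=idx, name="Value")

# Group by calendar month number (1-12) taken from the DatetimeIndex.
monthly = ts_demo.groupby(ts_demo.index.month).agg(["median", "min", "max"])
monthly.index.name = "Month"
print(monthly.round(2))

# Months with the highest and lowest typical (median) value.
peak_month = int(monthly["median"].idxmax())
trough_month = int(monthly["median"].idxmin())
print("Peak month:", peak_month, "Trough month:", trough_month)
```

Grouping on the month number rather than the month abbreviation keeps the rows in calendar order without needing an explicit ordering list.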
These visualizations are complementary. The line plot gives the overall temporal flow, while rolling statistics and box plots help dissect specific properties like trend stability and seasonality. Effective visualization is not about creating a single "perfect" plot, but rather using multiple angles to understand the rich structure often present in time-dependent data before proceeding to more formal modeling.
© 2025 ApX Machine Learning