Let's put the theory into practice. We'll implement the forward propagation steps for a small feedforward neural network using Python and NumPy. This example will solidify your understanding of how input data travels through the network to produce an output.
Imagine a network designed for a binary classification task. It takes 2 input features, has one hidden layer with 3 neurons (using the ReLU activation function), and one output neuron (using the Sigmoid activation function to produce a probability between 0 and 1).
Network architecture: 2 input neurons, a hidden layer of 3 neurons with ReLU activation, and 1 output neuron with Sigmoid activation.
First, we need NumPy for numerical operations. We'll also define our network parameters (weights and biases) and some sample input data. For reproducibility, we'll use fixed values for weights and biases. In a real training scenario, these would be initialized randomly and then learned.
import numpy as np
# --- Activation Functions ---
def sigmoid(z):
    """Computes the sigmoid activation."""
    return 1 / (1 + np.exp(-z))

def relu(z):
    """Computes the ReLU activation."""
    return np.maximum(0, z)
# --- Network Parameters ---
# Weights connecting Input Layer to Hidden Layer (shape: features x hidden_neurons)
W1 = np.array([[ 0.5, -0.2,  0.8],
               [-0.3,  0.7, -0.1]])  # (2x3)
# Biases for Hidden Layer (shape: 1 x hidden_neurons)
b1 = np.array([[0.1, -0.4, 0.2]]) # (1x3)
# Weights connecting Hidden Layer to Output Layer (shape: hidden_neurons x output_neurons)
W2 = np.array([[ 0.6],
               [-0.4],
               [ 0.9]])  # (3x1)
# Bias for Output Layer (shape: 1 x output_neurons)
b2 = np.array([[-0.1]]) # (1x1)
# --- Sample Input Data ---
# A single data point with 2 features (shape: 1 x features)
X = np.array([[0.8, 0.2]]) # (1x2)
print("Input X (1x2):\n", X)
print("\nWeights W1 (2x3):\n", W1)
print("Biases b1 (1x3):\n", b1)
print("\nWeights W2 (3x1):\n", W2)
print("Biases b2 (1x1):\n", b2)
We compute the weighted sum of inputs plus the bias for the hidden layer. Using matrix multiplication, this is Z1=X⋅W1+b1.
# Calculate the linear combination for the hidden layer
Z1 = np.dot(X, W1) + b1
print("Shape of X:", X.shape)
print("Shape of W1:", W1.shape)
print("Shape of b1:", b1.shape)
print("\nLinear combination Z1 (X * W1 + b1) (1x3):\n", Z1)
print("Shape of Z1:", Z1.shape)
The result Z1 contains the input values for the activation function of each neuron in the hidden layer.
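To see what np.dot is doing here, we can reproduce the pre-activation of the first hidden neuron by hand. This check is purely illustrative and should match Z1[0, 0]:
# Manual check (illustrative): pre-activation of the first hidden neuron
z1_first = X[0, 0] * W1[0, 0] + X[0, 1] * W1[1, 0] + b1[0, 0]
print("Manual pre-activation of first hidden neuron:", z1_first)  # 0.8*0.5 + 0.2*(-0.3) + 0.1 = 0.44
print("Matches Z1[0, 0]:", np.isclose(z1_first, Z1[0, 0]))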
Now, we apply the ReLU activation function element-wise to Z1 to get the hidden layer's output, A1=ReLU(Z1).
# Apply ReLU activation function
A1 = relu(Z1)
print("Hidden Layer Activation A1 = ReLU(Z1) (1x3):\n", A1)
print("Shape of A1:", A1.shape)
A1 represents the output signals from the hidden layer neurons. Notice how any negative values in Z1 have been replaced by 0.
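As a quick, purely illustrative sanity check, we can confirm that ReLU zeroed out exactly the negative entry of Z1:
# Sanity check (illustrative): ReLU clips only the negative entries
print("Negative entries of Z1:", Z1 < 0)                  # [[False  True False]]
print("All activations non-negative:", np.all(A1 >= 0))   # True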
Next, we compute the weighted sum for the output layer using the activations from the hidden layer (A1) as input: Z2=A1⋅W2+b2.
# Calculate the linear combination for the output layer
Z2 = np.dot(A1, W2) + b2
print("Shape of A1:", A1.shape)
print("Shape of W2:", W2.shape)
print("Shape of b2:", b2.shape)
print("\nLinear combination Z2 (A1 * W2 + b2) (1x1):\n", Z2)
print("Shape of Z2:", Z2.shape)
Z2 is the input to the final activation function in the output layer.
Finally, we apply the Sigmoid activation function to Z2 to get the network's final output (prediction), A2=Sigmoid(Z2).
# Apply Sigmoid activation function
A2 = sigmoid(Z2)
print("Output Layer Activation (Prediction) A2 = Sigmoid(Z2) (1x1):\n", A2)
print("Shape of A2:", A2.shape)
The value A2 is the network's prediction for the input X. Since we used a Sigmoid function, this value is between 0 and 1, often interpreted as a probability in classification tasks. For our specific input [[0.8, 0.2]] and the defined weights/biases, the network predicts approximately 0.71.
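A common convention (not part of the forward pass itself) is to threshold this probability at 0.5 to obtain a hard class label:
# Turn the probability into a class label (common convention: threshold at 0.5)
predicted_class = (A2 >= 0.5).astype(int)
print("Predicted class:", predicted_class)  # [[1]], since A2 is above 0.5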
We can wrap these steps into a reusable function:
def forward_propagation(X, W1, b1, W2, b2):
    """
    Performs forward propagation for a 2-layer network.

    Args:
        X (np.array): Input data (batch_size x num_features).
        W1 (np.array): Weights from input to hidden layer (num_features x num_hidden).
        b1 (np.array): Biases for hidden layer (1 x num_hidden).
        W2 (np.array): Weights from hidden to output layer (num_hidden x num_output).
        b2 (np.array): Biases for output layer (1 x num_output).

    Returns:
        tuple: (A1, A2) where A1 is the hidden layer activation and A2 is the output prediction.
    """
    # Hidden Layer
    Z1 = np.dot(X, W1) + b1
    A1 = relu(Z1)

    # Output Layer
    Z2 = np.dot(A1, W2) + b2
    A2 = sigmoid(Z2)

    return A1, A2
# --- Test the function with our data ---
hidden_output, final_prediction = forward_propagation(X, W1, b1, W2, b2)
print("\n--- Using the forward_propagation function ---")
print("Input X:\n", X)
print("Hidden Layer Output (A1):\n", hidden_output)
print("Final Prediction (A2):\n", final_prediction)
This function performs the complete forward pass. During training, this output (A2) would be compared against the true label using a loss function, and the difference would guide the backpropagation process to update W1, b1, W2, and b2.
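Because every step is written as a matrix operation, the same function also handles a batch of inputs in a single call, and we can sketch the loss computation mentioned above. The batch values and labels below are invented purely for illustration, and binary cross-entropy is just one common loss choice for a Sigmoid output:
# --- Illustrative only: a small made-up batch of inputs and labels ---
X_batch = np.array([[0.8, 0.2],
                    [0.1, 0.9],
                    [0.5, 0.5]])        # (3x2)
y_batch = np.array([[1], [0], [1]])     # hypothetical true labels (3x1)

_, predictions = forward_propagation(X_batch, W1, b1, W2, b2)  # (3x1)

# Binary cross-entropy, a common loss for a Sigmoid output
eps = 1e-12  # avoid log(0)
bce = -np.mean(y_batch * np.log(predictions + eps)
               + (1 - y_batch) * np.log(1 - predictions + eps))
print("Batch predictions:\n", predictions)
print("Binary cross-entropy loss:", bce)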
You have now successfully implemented the forward propagation mechanism, calculating how a neural network generates a prediction from a given input and set of parameters. This is a fundamental building block for understanding how networks operate and learn.