Having explored the theoretical foundations of the Perceptron, including its structure based on the artificial neuron and its limitation to linearly separable problems, it's time to put theory into practice. This hands-on exercise guides you through implementing a simple Perceptron from scratch using Python and the NumPy library. Building this foundational model will solidify your understanding of how inputs, weights, bias, and the learning rule interact to achieve classification.
We'll tackle a classic linearly separable problem: the logical AND gate. An AND gate outputs 1 only if both of its inputs are 1, otherwise it outputs 0.
The truth table for the AND gate is:
| Input 1 (x1) | Input 2 (x2) | Output (y) |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |
Our goal is to train a Perceptron that takes (x1,x2) as input and correctly predicts y. Because we can draw a straight line to separate the points where y=1 from the points where y=0 in a 2D plot, this problem is linearly separable and solvable by a single Perceptron.
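To see this concretely, here is a quick sanity check (separate from the training code we are about to write) using one hand-picked separating line, x1 + x2 = 1.5. These weight and bias values are purely illustrative choices, not the ones the Perceptron will learn below:

```python
import numpy as np

# One hand-picked separating line for AND: x1 + x2 - 1.5 = 0.
# Illustrative values only; training will find its own weights and bias.
w = np.array([1.0, 1.0])
b = -1.5

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

# Apply the decision rule: output 1 when the weighted sum is >= 0.
preds = (X @ w + b >= 0).astype(int)
print(preds)               # [0 0 0 1]
print((preds == y).all())  # True: this line already separates the classes
```

Training, of course, is about finding such values automatically rather than guessing them.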
Recall the Perceptron learning steps:

1. Initialize the weights and bias, typically to zeros or small random values.
2. For a training sample, compute the weighted sum z = w1x1 + w2x2 + b and apply the step activation to obtain the prediction.
3. Compute the error as the target minus the prediction.
4. Update each weight by learning_rate × error × input, and the bias by learning_rate × error.

We repeat steps 2-4 for all training samples over multiple passes (epochs) until the model converges (makes correct predictions for every sample) or a maximum number of epochs is reached; a short worked example of a single update follows.
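For instance, suppose the current parameters are w = [0.04, 0.10] and b = 0.07 (illustrative round numbers) and we present the sample (0, 0) with target 0. The step function fires on z = 0.07, the error is -1, and only the bias changes:

```python
import numpy as np

# Illustrative round numbers for the current parameters.
w = np.array([0.04, 0.10])
b = 0.07
learning_rate = 0.1

x = np.array([0, 0])  # input sample (x1, x2)
target = 0            # desired output for this sample

z = np.dot(x, w) + b               # step 2: weighted sum = 0.07
prediction = 1 if z >= 0 else 0    # step 2: step function fires, prediction = 1
error = target - prediction        # step 3: 0 - 1 = -1

w = w + learning_rate * error * x  # step 4: weights unchanged (x is all zeros)
b = b + learning_rate * error      # step 4: bias drops from 0.07 to about -0.03

print(w, b)
```

Driving the bias down makes the neuron less likely to fire on inputs that should produce 0.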
Let's implement this. We'll use NumPy for efficient numerical operations.
```python
import numpy as np

# Define the AND gate dataset
# Inputs (X); the bias is handled as a separate scalar parameter below
# rather than as an extra column of 1s.
X = np.array([
    [0, 0],
    [0, 1],
    [1, 0],
    [1, 1]
])

# Outputs (y)
y = np.array([0, 0, 0, 1])

# Initialize weights and bias
# Two inputs, so two weights. Initialize to small random values.
np.random.seed(42)  # for reproducibility
weights = np.random.rand(2) * 0.1  # e.g., [0.037, 0.095]
bias = np.random.rand(1) * 0.1     # e.g., [0.073]

# Define the step activation function
def step_function(z):
    return np.where(z >= 0, 1, 0)

# Set learning parameters
learning_rate = 0.1
epochs = 50  # Maximum number of passes through the entire dataset

print(f"Initial weights: {weights}, Initial bias: {bias[0]:.3f}")
print("-" * 30)

# Training loop
for epoch in range(epochs):
    errors = 0
    for i in range(len(X)):
        # Get current input sample and target
        inputs = X[i]
        target = y[i]

        # 1. Calculate weighted sum (activation)
        z = np.dot(inputs, weights) + bias

        # 2. Make prediction
        prediction = step_function(z)

        # 3. Calculate error
        error = target - prediction

        # 4. Update weights and bias if error is non-zero
        if error != 0:
            errors += 1
            weights += learning_rate * error * inputs
            bias += learning_rate * error

    # Print progress after each epoch
    print(f"Epoch {epoch+1}/{epochs}, Errors: {errors}, Weights: [{weights[0]:.3f}, {weights[1]:.3f}], Bias: {bias[0]:.3f}")

    # Check for convergence (no errors in an epoch)
    if errors == 0 and epoch > 0:
        print(f"\nConvergence reached at epoch {epoch+1}.")
        break

print("-" * 30)
print(f"Final weights: [{weights[0]:.3f}, {weights[1]:.3f}]")
print(f"Final bias: {bias[0]:.3f}")

# Test the trained Perceptron
print("\nTesting the trained Perceptron:")
for i in range(len(X)):
    inputs = X[i]
    target = y[i]
    z = np.dot(inputs, weights) + bias
    prediction = step_function(z)
    print(f"Input: {inputs}, Target: {target}, Prediction: {prediction[0]}")
```
Running the code above should produce the following output. Because the random seed is fixed, the numbers are reproducible:

```
Initial weights: [0.03745401 0.09507143], Initial bias: 0.073
------------------------------
Epoch 1/50, Errors: 3, Weights: [0.137, 0.095], Bias: -0.027
Epoch 2/50, Errors: 3, Weights: [0.137, 0.095], Bias: -0.127
Epoch 3/50, Errors: 2, Weights: [0.137, 0.195], Bias: -0.127
Epoch 4/50, Errors: 1, Weights: [0.137, 0.095], Bias: -0.227
Epoch 5/50, Errors: 0, Weights: [0.137, 0.095], Bias: -0.227

Convergence reached at epoch 5.
------------------------------
Final weights: [0.137, 0.095]
Final bias: -0.227

Testing the trained Perceptron:
Input: [0 0], Target: 0, Prediction: 0
Input: [0 1], Target: 0, Prediction: 0
Input: [1 0], Target: 0, Prediction: 0
Input: [1 1], Target: 1, Prediction: 1
```
You can see the weights and bias adjusting over the epochs. The number of errors decreases until it reaches zero, indicating the Perceptron has learned to correctly classify all input patterns for the AND gate. The final test confirms the predictions match the target outputs.
For a 2D problem like this, we can visualize the decision boundary learned by the Perceptron. The boundary is the line where the weighted sum equals zero: w1x1+w2x2+b=0. We can rewrite this to plot x2 as a function of x1: x2=(−w1x1−b)/w2.
{"layout": {"title": "Perceptron Decision Boundary for AND Gate", "xaxis": {"title": "Input 1 (x1)", "range": [-0.5, 1.5]}, "yaxis": {"title": "Input 2 (x2)", "range": [-0.5, 1.5]}, "legend": {"title": "Output (y)"}}, "data": [{"x": [0, 0, 1, 1], "y": [0, 1, 0, 1], "mode": "markers", "marker": {"color": ["#fa5252", "#fa5252", "#fa5252", "#1c7ed6"], "size": 12, "symbol": "circle"}, "type": "scatter", "name": "Data Points (0 or 1)"}, {"x": [-0.5, 1.5], "y": [(-0.137 * -0.5 - (-0.127)) / 0.095, (-0.137 * 1.5 - (-0.127)) / 0.095], "mode": "lines", "line": {"color": "#37b24d", "width": 2}, "type": "scatter", "name": "Decision Boundary"}]}
The plot shows the four input points for the AND gate. Points colored red represent an output of 0, and the blue point represents an output of 1. The green line is the decision boundary (w1x1+w2x2+b=0) learned by the Perceptron. All points on one side of the line are classified as 0, and points on the other side are classified as 1.
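If you would like to reproduce a similar plot locally, a minimal Matplotlib sketch along these lines should work. It assumes the training code above has already been run, so that weights and bias still hold the learned values:

```python
import matplotlib.pyplot as plt
import numpy as np

# Assumes `weights` and `bias` come from the training run above.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

# Plot the four input points, colored by their target class.
plt.scatter(X[y == 0, 0], X[y == 0, 1], color="red", label="Output 0")
plt.scatter(X[y == 1, 0], X[y == 1, 1], color="blue", label="Output 1")

# Decision boundary: w1*x1 + w2*x2 + b = 0  =>  x2 = (-w1*x1 - b) / w2
x1_vals = np.linspace(-0.5, 1.5, 100)
x2_vals = (-weights[0] * x1_vals - bias[0]) / weights[1]
plt.plot(x1_vals, x2_vals, color="green", label="Decision boundary")

plt.xlim(-0.5, 1.5)
plt.ylim(-0.5, 1.5)
plt.xlabel("Input 1 (x1)")
plt.ylabel("Input 2 (x2)")
plt.title("Perceptron Decision Boundary for AND Gate")
plt.legend()
plt.show()
```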
This practical implementation demonstrates the core mechanism of a Perceptron. While simple, it forms the basis for understanding how weights are adjusted during learning. As we saw earlier, this model has limitations (it cannot solve the XOR problem). This motivates the move towards Multi-Layer Perceptrons (MLPs), which add hidden layers to handle more complex, non-linearly separable patterns, as we will explore in subsequent chapters.