Having explored the theoretical foundations of the Perceptron, including its structure based on the artificial neuron and its limitation to linearly separable problems, it's time to put theory into practice. This hands-on exercise guides you through implementing a simple Perceptron from scratch using Python and the NumPy library. Building this foundational model will solidify your understanding of how inputs, weights, bias, and the learning rule interact to achieve classification.
We'll tackle a classic linearly separable problem: the logical AND gate. An AND gate outputs 1 only if both of its inputs are 1, otherwise it outputs 0.
The truth table for the AND gate is:
| Input 1 (x1) | Input 2 (x2) | Output (y) |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |
Our goal is to train a Perceptron that takes (x1,x2) as input and correctly predicts y. Because we can draw a straight line to separate the points where y=1 from the points where y=0 in a 2D plot, this problem is linearly separable and solvable by a single Perceptron.
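To see this concretely, here is a quick sanity check (separate from the training code we are about to write) using one hand-picked separating line, x1 + x2 = 1.5. These weight and bias values are purely illustrative choices, not the ones the Perceptron will learn below:

```python
import numpy as np

# One hand-picked separating line for AND: x1 + x2 - 1.5 = 0.
# Illustrative values only; training will find its own weights and bias.
w = np.array([1.0, 1.0])
b = -1.5

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

# Apply the decision rule: output 1 when the weighted sum is >= 0.
preds = (X @ w + b >= 0).astype(int)
print(preds)               # [0 0 0 1]
print((preds == y).all())  # True: this line already separates the classes
```

Training, of course, is about finding such values automatically rather than guessing them.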
Recall the Perceptron learning steps:

1. Initialize the weights and bias, typically to zeros or small random values.
2. For a training sample, compute the weighted sum z = w1x1 + w2x2 + b and apply the step activation to obtain the prediction.
3. Compute the error as the target minus the prediction.
4. Update each weight by learning_rate × error × input, and the bias by learning_rate × error.

We repeat steps 2-4 for all training samples over multiple passes (epochs) until the model converges (makes correct predictions for every sample) or a maximum number of epochs is reached; a short worked example of a single update follows.
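For instance, suppose the current parameters are w = [0.04, 0.10] and b = 0.07 (illustrative round numbers) and we present the sample (0, 0) with target 0. The step function fires on z = 0.07, the error is -1, and only the bias changes:

```python
import numpy as np

# Illustrative round numbers for the current parameters.
w = np.array([0.04, 0.10])
b = 0.07
learning_rate = 0.1

x = np.array([0, 0])  # input sample (x1, x2)
target = 0            # desired output for this sample

z = np.dot(x, w) + b               # step 2: weighted sum = 0.07
prediction = 1 if z >= 0 else 0    # step 2: step function fires, prediction = 1
error = target - prediction        # step 3: 0 - 1 = -1

w = w + learning_rate * error * x  # step 4: weights unchanged (x is all zeros)
b = b + learning_rate * error      # step 4: bias drops from 0.07 to about -0.03

print(w, b)
```

Driving the bias down makes the neuron less likely to fire on inputs that should produce 0.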
Let's implement this. We'll use NumPy for efficient numerical operations.
```python
import numpy as np

# Define the AND gate dataset
# Inputs (X); the bias is handled as a separate scalar parameter below
# rather than as an extra column of 1s.
X = np.array([
    [0, 0],
    [0, 1],
    [1, 0],
    [1, 1]
])

# Outputs (y)
y = np.array([0, 0, 0, 1])

# Initialize weights and bias
# Two inputs, so two weights. Initialize to small random values.
np.random.seed(42)  # for reproducibility
weights = np.random.rand(2) * 0.1  # e.g., [0.037, 0.095]
bias = np.random.rand(1) * 0.1     # e.g., [0.073]

# Define the step activation function
def step_function(z):
    return np.where(z >= 0, 1, 0)

# Set learning parameters
learning_rate = 0.1
epochs = 50  # Maximum number of passes through the entire dataset

print(f"Initial weights: {weights}, Initial bias: {bias[0]:.3f}")
print("-" * 30)

# Training loop
for epoch in range(epochs):
    errors = 0
    for i in range(len(X)):
        # Get current input sample and target
        inputs = X[i]
        target = y[i]

        # 1. Calculate weighted sum (activation)
        z = np.dot(inputs, weights) + bias

        # 2. Make prediction
        prediction = step_function(z)

        # 3. Calculate error
        error = target - prediction

        # 4. Update weights and bias if error is non-zero
        if error != 0:
            errors += 1
            weights += learning_rate * error * inputs
            bias += learning_rate * error

    # Print progress after each epoch
    print(f"Epoch {epoch+1}/{epochs}, Errors: {errors}, Weights: [{weights[0]:.3f}, {weights[1]:.3f}], Bias: {bias[0]:.3f}")

    # Check for convergence (no errors in an epoch)
    if errors == 0 and epoch > 0:
        print(f"\nConvergence reached at epoch {epoch+1}.")
        break

print("-" * 30)
print(f"Final weights: [{weights[0]:.3f}, {weights[1]:.3f}]")
print(f"Final bias: {bias[0]:.3f}")

# Test the trained Perceptron
print("\nTesting the trained Perceptron:")
for i in range(len(X)):
    inputs = X[i]
    target = y[i]
    z = np.dot(inputs, weights) + bias
    prediction = step_function(z)
    print(f"Input: {inputs}, Target: {target}, Prediction: {prediction[0]}")
```
Running the code above should produce the following output. Because the random seed is fixed, the numbers are reproducible:

```
Initial weights: [0.03745401 0.09507143], Initial bias: 0.073
------------------------------
Epoch 1/50, Errors: 3, Weights: [0.137, 0.095], Bias: -0.027
Epoch 2/50, Errors: 3, Weights: [0.137, 0.095], Bias: -0.127
Epoch 3/50, Errors: 2, Weights: [0.137, 0.195], Bias: -0.127
Epoch 4/50, Errors: 1, Weights: [0.137, 0.095], Bias: -0.227
Epoch 5/50, Errors: 0, Weights: [0.137, 0.095], Bias: -0.227

Convergence reached at epoch 5.
------------------------------
Final weights: [0.137, 0.095]
Final bias: -0.227

Testing the trained Perceptron:
Input: [0 0], Target: 0, Prediction: 0
Input: [0 1], Target: 0, Prediction: 0
Input: [1 0], Target: 0, Prediction: 0
Input: [1 1], Target: 1, Prediction: 1
```
You can see the weights and bias adjusting over the epochs. The number of errors decreases until it reaches zero, indicating the Perceptron has learned to correctly classify all input patterns for the AND gate. The final test confirms the predictions match the target outputs.
For a 2D problem like this, we can visualize the decision boundary learned by the Perceptron. The boundary is the line where the weighted sum equals zero: w1x1+w2x2+b=0. We can rewrite this to plot x2 as a function of x1: x2=(−w1x1−b)/w2.
{"layout": {"title": "Perceptron Decision Boundary for AND Gate", "xaxis": {"title": "Input 1 (x1)", "range": [-0.5, 1.5]}, "yaxis": {"title": "Input 2 (x2)", "range": [-0.5, 1.5]}, "legend": {"title": "Output (y)"}}, "data": [{"x": [0, 0, 1, 1], "y": [0, 1, 0, 1], "mode": "markers", "marker": {"color": ["#fa5252", "#fa5252", "#fa5252", "#1c7ed6"], "size": 12, "symbol": "circle"}, "type": "scatter", "name": "Data Points (0 or 1)"}, {"x": [-0.5, 1.5], "y": [(-0.137 * -0.5 - (-0.127)) / 0.095, (-0.137 * 1.5 - (-0.127)) / 0.095], "mode": "lines", "line": {"color": "#37b24d", "width": 2}, "type": "scatter", "name": "Decision Boundary"}]}
The plot shows the four input points for the AND gate. Points colored red represent an output of 0, and the blue point represents an output of 1. The green line is the decision boundary (w1x1+w2x2+b=0) learned by the Perceptron. All points on one side of the line are classified as 0, and points on the other side are classified as 1.
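If you would like to reproduce a similar plot locally, a minimal Matplotlib sketch along these lines should work. It assumes the training code above has already been run, so that weights and bias still hold the learned values:

```python
import matplotlib.pyplot as plt
import numpy as np

# Assumes `weights` and `bias` come from the training run above.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

# Plot the four input points, colored by their target class.
plt.scatter(X[y == 0, 0], X[y == 0, 1], color="red", label="Output 0")
plt.scatter(X[y == 1, 0], X[y == 1, 1], color="blue", label="Output 1")

# Decision boundary: w1*x1 + w2*x2 + b = 0  =>  x2 = (-w1*x1 - b) / w2
x1_vals = np.linspace(-0.5, 1.5, 100)
x2_vals = (-weights[0] * x1_vals - bias[0]) / weights[1]
plt.plot(x1_vals, x2_vals, color="green", label="Decision boundary")

plt.xlim(-0.5, 1.5)
plt.ylim(-0.5, 1.5)
plt.xlabel("Input 1 (x1)")
plt.ylabel("Input 2 (x2)")
plt.title("Perceptron Decision Boundary for AND Gate")
plt.legend()
plt.show()
```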
This practical implementation demonstrates the core mechanism of a Perceptron. While simple, it forms the basis for understanding how weights are adjusted during learning. As we saw earlier, this model has limitations (it cannot solve the XOR problem). This motivates the move towards Multi-Layer Perceptrons (MLPs), which add hidden layers to handle more complex, non-linearly separable patterns, as we will explore in subsequent chapters.