Let's put the concepts from this chapter into practice by constructing a simple neural network. We'll build a small feed-forward network designed for a binary classification task. Assume we have input data with two features and want to classify each data point into one of two categories (0 or 1).
Recall that the foundation for any PyTorch model is the torch.nn.Module class. We create our custom network by subclassing nn.Module, defining the layers in the __init__ method and the data flow in the forward method.
We'll create a network with:
- An input layer accepting 2 features.
- One hidden layer with 10 units, followed by a ReLU activation.
- An output layer producing a single value (the binary classification logit).
Here's the Python code defining this architecture:
import torch
import torch.nn as nn
import torch.optim as optim
# Define the network structure
class SimpleNet(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(SimpleNet, self).__init__()  # Initialize the parent class
        self.layer_1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.layer_2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # Define the forward pass
        out = self.layer_1(x)
        out = self.relu(out)
        out = self.layer_2(out)
        # Note: We don't apply Sigmoid here if using BCEWithLogitsLoss later
        return out
# Define network parameters
input_features = 2
hidden_units = 10
output_classes = 1 # Single output for binary classification logit
# Instantiate the network
model = SimpleNet(input_features, hidden_units, output_classes)
# Print the model structure
print(model)
Running this code will print the structure of our newly defined network, showing the layers and their order:
SimpleNet(
  (layer_1): Linear(in_features=2, out_features=10, bias=True)
  (relu): ReLU()
  (layer_2): Linear(in_features=10, out_features=1, bias=True)
)
This output confirms we have a linear layer mapping 2 input features to 10 hidden units, followed by a ReLU activation, and finally another linear layer mapping the 10 hidden units to a single output value.
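If you want to verify the shapes directly rather than reading the printed structure, a quick sanity check is to pass a small random batch through the untrained model. This is a minimal illustrative sketch; the batch size of 3 is arbitrary:

# Sanity check (illustrative): a batch of 3 samples, each with 2 features
sample_batch = torch.randn(3, input_features)
with torch.no_grad():  # no gradient tracking needed for a shape check
    sample_output = model(sample_batch)
print(sample_output.shape)  # expected: torch.Size([3, 1])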
Before we can train (which we'll cover in detail later), we need to pair the model we just instantiated with a loss function and an optimizer. Let's set these up:
# --- Data Preparation (Example Placeholder) ---
# Imagine we have some input data (X) and target labels (y)
# For this example, let's create dummy tensors
# A mini-batch of 5 samples, each with 2 features
dummy_input = torch.randn(5, input_features)
# Corresponding dummy labels (0 or 1) - need float for BCEWithLogitsLoss
dummy_labels = torch.randint(0, 2, (5, 1)).float()
# --- Instantiate Model, Loss, and Optimizer ---
# Model already instantiated above: model = SimpleNet(...)
# Loss function: Binary Cross Entropy with Logits
# This loss is suitable for binary classification and expects raw logits as input
criterion = nn.BCEWithLogitsLoss()
# Optimizer: Adam is a popular choice
# We pass the model's parameters to the optimizer
learning_rate = 0.01
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
print(f"\nUsing loss: {criterion}")
print(f"Using optimizer: {optimizer}")
Now that we have the model, loss function, and optimizer ready, let's simulate a single step of the training process to see how these components interact. This involves:
1. A forward pass to get the model's predictions (logits).
2. Calculating the loss between those predictions and the labels.
3. A backward pass to compute gradients.
4. An optimizer step to update the model's parameters.
# --- Simulate a Single Training Step ---
# 1. Forward Pass: Get model predictions (logits)
outputs = model(dummy_input)
print(f"\nModel outputs (logits) shape: {outputs.shape}")
# print(f"Sample outputs: {outputs.detach().numpy().flatten()}") # Optional: view outputs
# 2. Calculate Loss
loss = criterion(outputs, dummy_labels)
print(f"Calculated loss: {loss.item():.4f}") # .item() gets the scalar value
# 3. Backward Pass: Compute gradients
# First, ensure gradients are zeroed from previous steps (important in a real loop)
optimizer.zero_grad()
loss.backward() # Calculate gradients of loss w.r.t. model parameters
# 4. Optimizer Step: Update model weights
optimizer.step() # Update parameters based on calculated gradients
# --- Inspect Parameters (Optional) ---
# You can inspect gradients after the backward pass (before optimizer.step())
# print("\nGradients for layer_1 weights (sample):")
# print(model.layer_1.weight.grad[0, :]) # Access gradient of a specific parameter
# Or inspect parameter values after the step
# print("\nUpdated layer_1 weights (sample):")
# print(model.layer_1.weight[0, :])
In this step, we performed a forward pass to get the raw outputs (logits) from our SimpleNet. We then used BCEWithLogitsLoss to compute the difference between these outputs and our dummy_labels. Calling loss.backward() triggered Autograd to compute the gradients for all parameters where requires_grad=True (which includes the weights and biases of our nn.Linear layers by default). Finally, optimizer.step() updated the model's parameters using the computed gradients and the Adam optimization algorithm. Remember to call optimizer.zero_grad() before the next backward pass in a real training loop to prevent gradient accumulation.
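To make the roles of zero_grad(), backward(), and step() concrete, here is a minimal sketch of how the same four operations repeat inside a loop. It reuses the dummy tensors purely for illustration; a real loop would iterate over mini-batches from a DataLoader:

# Illustrative training loop over the dummy data (not a real training setup)
for epoch in range(5):
    optimizer.zero_grad()                    # reset gradients from the previous step
    outputs = model(dummy_input)             # 1. forward pass
    loss = criterion(outputs, dummy_labels)  # 2. compute the loss
    loss.backward()                          # 3. backward pass (compute gradients)
    optimizer.step()                         # 4. update the parameters
    print(f"Epoch {epoch + 1}, loss: {loss.item():.4f}")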
You have now successfully built a simple neural network using torch.nn, defined its layers and forward pass, instantiated it, and connected it with a loss function and an optimizer, ready for the next stage: training.
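As a brief aside, once the model is trained you will want class predictions rather than raw logits. A minimal sketch, run here on the untrained model and the dummy input purely to show the mechanics: apply torch.sigmoid to turn logits into probabilities and threshold them at 0.5.

# Illustrative only: convert logits to probabilities, then to class labels 0/1
with torch.no_grad():
    probabilities = torch.sigmoid(model(dummy_input))
    predictions = (probabilities > 0.5).float()
print(predictions.flatten())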