You've prepared your data using Dataset and created a DataLoader to handle batching, shuffling, and potentially parallel loading. Now, let's put it to work inside the training loop. The DataLoader acts as a Python iterable, making it simple to feed data batches to your model systematically during each training epoch.

The standard approach is to use a Python for loop. In each iteration, the DataLoader yields one batch of data, typically containing both the input features and their corresponding target labels.
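Because the DataLoader is an ordinary Python iterable, you can pull a single batch from it directly to inspect its structure before writing the full loop. The following is a minimal sketch; it assumes a train_dataloader like the one configured in the listing below, where each item is an (input, label) pair, so the exact printed shapes depend on your dataset.

features, labels = next(iter(train_dataloader))  # fetch one batch without writing a loop
print(features.shape)  # e.g., torch.Size([64, ...]) when batch_size=64
print(labels.shape)    # e.g., torch.Size([64])

With that structure in mind, here is the skeleton of the full training loop: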
# Assume these are already defined and configured:
# train_dataloader = DataLoader(your_dataset, batch_size=64, shuffle=True)
# model = YourNeuralNetwork()
# loss_fn = torch.nn.CrossEntropyLoss()  # Example loss function
# optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # Example optimizer
# device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# model.to(device)  # Ensure model is on the correct device

num_epochs = 10  # Example number of passes over the dataset

# Outer loop for epochs
for epoch in range(num_epochs):
    print(f"Epoch {epoch+1}\n-------------------------------")

    # Set the model to training mode.
    # This enables features like dropout and batch normalization updates.
    model.train()

    # Inner loop for batches within an epoch:
    # iterate over batches provided by the DataLoader
    for batch_idx, data_batch in enumerate(train_dataloader):
        # 1. Unpack the batch
        # The structure depends on your Dataset's __getitem__ method.
        # For supervised learning, it's commonly (inputs, labels).
        inputs, labels = data_batch

        # 2. Move data to the target device (GPU or CPU)
        # This MUST match the device where the model resides.
        inputs = inputs.to(device)
        labels = labels.to(device)

        # ---> The next steps (forward pass, loss, backprop, optimize) <---
        # ---> using 'inputs' and 'labels' happen here.                <---
        # (These are detailed in the subsequent sections.)
        # Example placeholder for where the subsequent logic goes:
        # predictions = model(inputs)
        # loss = loss_fn(predictions, labels)
        # optimizer.zero_grad()
        # loss.backward()
        # optimizer.step()

        # Optional: print progress periodically
        if batch_idx % 100 == 0:
            current_batch_size = len(inputs)  # Size of the current batch
            current_loss = 0.0  # Replace 0.0 with the actual calculated loss for logging
            print(f"  Batch {batch_idx}: [{current_batch_size} samples] Current Loss: {current_loss:.4f}")  # Example log

    # ---> An evaluation loop on validation data often follows here <---
    # (We'll cover evaluation loops later in this chapter.)

print("Training finished!")
Let's break down the key parts of this inner loop:
- The for epoch in range(num_epochs): loop controls how many times we iterate over the entire dataset.
- model.train() is called at the start of each epoch. This is important because layers like torch.nn.Dropout or torch.nn.BatchNorm2d behave differently during training (e.g., applying dropout, updating running statistics) than during evaluation, and this call ensures they are in the correct mode (a short demonstration appears after this breakdown).
- for batch_idx, data_batch in enumerate(train_dataloader): is the core iteration. enumerate gives us a batch counter (batch_idx), and train_dataloader yields one data_batch at a time.
- We unpack data_batch into inputs and labels. The structure returned by the DataLoader directly corresponds to the structure returned by the __getitem__ method of the Dataset it wraps. For typical supervised tasks, this is a tuple or list containing features and targets.
- inputs.to(device) and labels.to(device) are essential. Neural network computations, especially the forward pass through the model, require that the model's parameters and the input data reside on the same compute device (e.g., all on the CPU, or all on a specific GPU). This step moves the batch data fetched by the DataLoader (which usually resides in CPU RAM) to the device where the model was placed. Forgetting this step is a frequent source of runtime errors. It also ensures that calculations can benefit from GPU acceleration if device is set to a CUDA device.

It's worth noting that if your DataLoader was initialized with drop_last=False (the default), the last batch produced in an epoch may contain fewer samples than the specified batch_size. This happens when the total number of samples in your dataset is not evenly divisible by the batch size. PyTorch operations generally handle variable batch sizes gracefully, but be mindful of this if you perform any calculations that assume a fixed batch size (such as averaging the loss over a fixed number of samples per batch).
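To make the drop_last behavior concrete, here is a small, self-contained sketch. The ten-sample TensorDataset is purely illustrative:

import torch
from torch.utils.data import DataLoader, TensorDataset

# Ten samples with batch_size=4 do not divide evenly: 10 = 4 + 4 + 2
features = torch.arange(10, dtype=torch.float32).unsqueeze(1)  # shape (10, 1)
targets = torch.arange(10)                                     # shape (10,)
dataset = TensorDataset(features, targets)

# drop_last=False (the default): the final, smaller batch is kept
loader = DataLoader(dataset, batch_size=4, shuffle=False)
print([len(batch_targets) for _, batch_targets in loader])  # [4, 4, 2]

# drop_last=True: the incomplete final batch is discarded entirely
loader = DataLoader(dataset, batch_size=4, shuffle=False, drop_last=True)
print([len(batch_targets) for _, batch_targets in loader])  # [4, 4]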
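Similarly, the effect of model.train() mentioned in the breakdown above can be seen on a single Dropout layer in isolation. This is a minimal sketch; the layer and input are arbitrary, chosen only for illustration:

import torch

dropout = torch.nn.Dropout(p=0.5)
x = torch.ones(8)

dropout.train()    # training mode: roughly half the elements are zeroed,
print(dropout(x))  # and the survivors are scaled by 1 / (1 - p) = 2

dropout.eval()     # evaluation mode: dropout is disabled,
print(dropout(x))  # so the input passes through unchanged

The same switch controls whether batch normalization layers update their running statistics, which is why calling model.train() at the start of each training epoch (and model.eval() before validation) matters.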
With the data batch (inputs, labels) successfully loaded onto the target device, you are now fully prepared for the main computational steps within the training iteration loop:

- Passing the inputs through the model to get predictions (the Forward Pass).
- Comparing those predictions against the labels using a loss function (Calculating the Loss).

These steps, performed repeatedly for each batch, form the heart of the model training process and are the focus of the sections that follow.