After the forward pass, your model has generated predictions (often called outputs or logits) based on the input data batch. The next logical step in the training loop is to evaluate how well these predictions align with the actual ground truth labels. This is where the loss function comes into play.
A loss function, also known as a criterion or objective function, mathematically measures the discrepancy between the model's predictions ($\hat{y}$) and the true target values ($y$). The goal of training is typically to minimize this loss value. A smaller loss indicates that the model's predictions are closer to the actual targets for the given batch of data.
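To make this concrete, here is a minimal sketch (the tensors and values are arbitrary, chosen only for illustration) that compares two candidate predictions against the same targets using a mean squared difference as the discrepancy measure. The prediction that sits closer to the targets produces the smaller value:

import torch

# Arbitrary ground-truth targets and two candidate predictions
targets = torch.tensor([1.0, 2.0, 3.0])
close_preds = torch.tensor([1.1, 1.9, 3.2])   # near the targets
far_preds = torch.tensor([3.0, 0.0, 6.0])     # far from the targets

# Mean squared difference between predictions and targets
loss_close = ((close_preds - targets) ** 2).mean()
loss_far = ((far_preds - targets) ** 2).mean()

print(loss_close.item())  # small value, roughly 0.02
print(loss_far.item())    # much larger value, roughly 5.67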
In PyTorch, loss functions are readily available within the torch.nn module, just like model layers and activation functions. You usually instantiate a loss function once, outside the training loop. Common choices include:
- nn.MSELoss (Mean Squared Error): Often used for regression tasks where the goal is to predict continuous values. It calculates the average squared difference between predictions and targets:
  $$L_{MSE} = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$$
  where $N$ is the number of samples in the batch.
- nn.CrossEntropyLoss: A standard choice for multi-class classification problems. This criterion conveniently combines nn.LogSoftmax and nn.NLLLoss (Negative Log Likelihood Loss) in one class. It expects raw, unnormalized scores (logits) directly from the model's final layer as input and target class indices (integers) as labels.
- nn.BCEWithLogitsLoss: Used for binary classification or multi-label classification tasks. Similar to CrossEntropyLoss, it combines a Sigmoid layer and the Binary Cross Entropy loss in one step for better numerical stability. It also expects raw logits as input.
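Each of these criteria expects its predictions and targets in a particular format. The following sketch (with arbitrary batch size and class counts) illustrates the typical shapes and dtypes each one accepts:

import torch
import torch.nn as nn

batch_size, num_classes = 4, 3  # arbitrary example sizes

# nn.MSELoss: float predictions and float targets of the same shape
mse = nn.MSELoss()
preds = torch.randn(batch_size, 1)
targets = torch.randn(batch_size, 1)
mse_loss = mse(preds, targets)

# nn.CrossEntropyLoss: raw logits of shape (batch, num_classes),
# targets are integer class indices of shape (batch,)
ce = nn.CrossEntropyLoss()
logits = torch.randn(batch_size, num_classes)
class_indices = torch.randint(0, num_classes, (batch_size,))
ce_loss = ce(logits, class_indices)

# nn.BCEWithLogitsLoss: raw logits and float targets (0.0 or 1.0) of the same shape
bce = nn.BCEWithLogitsLoss()
binary_logits = torch.randn(batch_size, 1)
binary_targets = torch.randint(0, 2, (batch_size, 1)).float()
bce_loss = bce(binary_logits, binary_targets)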
Once you have instantiated your chosen criterion (e.g., criterion = nn.CrossEntropyLoss()), calculating the loss within the loop is straightforward. You simply pass the model's output tensor and the tensor containing the true labels to the criterion object:
# --- Inside the training loop ---
# Assume:
# model: Your neural network model (e.g., an nn.Module instance)
# criterion: Your chosen loss function (e.g., nn.CrossEntropyLoss())
# inputs: Batch of input data from DataLoader
# labels: Batch of corresponding true labels from DataLoader
# 1. Forward Pass (already done)
outputs = model(inputs)
# 2. Calculate the Loss
loss = criterion(outputs, labels)
# 'loss' now holds the computed loss value for the current batch.
# It's a scalar tensor (a tensor with only one element).
# 3. Next steps: Backpropagation (loss.backward()), Optimizer step...
# --- End of loop iteration snippet ---
It's important to understand what the resulting loss variable represents:

- The loss tensor is still connected to the computation graph that PyTorch built during the forward pass. It knows which operations and which model parameters contributed to its final value.
- Because the model's parameters have requires_grad=True, the loss tensor itself implicitly has requires_grad=True. This allows us to call loss.backward() in the next step to automatically compute the gradients of the loss with respect to all the model's learnable parameters ($\nabla_\theta L$).

This computed loss value serves as the starting point for the backpropagation process, which adjusts the model's weights to hopefully produce a lower loss on subsequent iterations.
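You can verify these properties directly. The sketch below continues from the hypothetical training-loop variables used in the earlier snippet (loss, outputs, labels) and simply inspects the loss tensor before calling backward():

# 'loss' was produced by criterion(outputs, labels) in the snippet above

print(loss.shape)          # torch.Size([]) -- a scalar (zero-dimensional) tensor
print(loss.item())         # the Python float value of the loss for this batch
print(loss.requires_grad)  # True, because the model's parameters require gradients
print(loss.grad_fn)        # the operation that produced it, e.g. <NllLossBackward0 ...>

# Calling backward() traverses the graph and populates .grad for every learnable parameter
loss.backward()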