Initiating the training process involves binding the model architecture, tokenized datasets, and defined hyperparameters into an automated sequence of forward passes, loss calculations, and weight updates. The Hugging Face Trainer class manages this process by abstracting away much of the boilerplate PyTorch code while maintaining fine-grained control over the execution state.
Before committing to a full training run that might take hours or days, it is standard practice to test the pipeline on a small subset of your data. This smoke test confirms that the run fits within your memory budget, avoiding out-of-memory errors, and that the loss decreases as expected.
# Select a small subset of examples for the practice run
small_train_dataset = tokenized_dataset["train"].select(range(500))
small_eval_dataset = tokenized_dataset["test"].select(range(100))
Running on a restricted dataset allows you to quickly validate your data collator and ensure the inputs are properly padded to the maximum length of each batch.
To execute the loop, instantiate the Trainer class. You must pass the base model with its attached LoRA adapters, the training arguments defined in the previous section, the split datasets, and the data collator. The data collator prepares the raw tokenized sequences for the model by organizing them into uniform tensors.
from transformers import Trainer, DataCollatorForLanguageModeling
# Create a data collator for causal language modeling; it copies the
# inputs into labels and masks padding tokens so the loss ignores them
data_collator = DataCollatorForLanguageModeling(
tokenizer=tokenizer,
mlm=False
)
trainer = Trainer(
model=peft_model,
args=training_args,
train_dataset=small_train_dataset,
eval_dataset=small_eval_dataset,
data_collator=data_collator,
)
By setting mlm=False, you instruct the collator to process the data for causal language modeling, which is the standard approach for generative tasks where the model predicts the next token in a sequence.
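To make the collator's labeling rule concrete, the sketch below reproduces its core behavior in plain Python. The pad token id of 0 is an assumption for illustration; real tokenizers define their own: every token becomes its own label, except padding positions, which are set to -100 so the cross-entropy loss skips them.

```python
PAD_ID = 0  # hypothetical pad token id, for illustration only

def causal_lm_labels(input_ids, pad_id=PAD_ID):
    """Mirror the mlm=False collator rule: labels are a copy of the
    inputs, with padding positions replaced by -100 so they are
    ignored by the loss function."""
    return [tok if tok != pad_id else -100 for tok in input_ids]

# A padded sequence: three real tokens followed by two pad tokens
batch_row = [101, 2054, 2003, 0, 0]
print(causal_lm_labels(batch_row))  # [101, 2054, 2003, -100, -100]
```

During the forward pass, the model internally shifts these labels by one position, so each token is trained to predict its successor.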
Initiate the process by calling the train method on the instantiated object.
# Start the training loop
training_results = trainer.train()
When you execute this command, the automated loop begins. The engine fetches batches of data, passes them through the model, and computes the cross-entropy loss between the model's predicted token distributions and the actual target tokens.
The sequence of operations executed during a single training iteration.
Because you are using Parameter-Efficient Fine-Tuning, the backward pass only computes gradients for the injected LoRA matrices. The frozen base model weights remain untouched, saving massive amounts of computational resources.
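Those savings are easy to quantify. For a single square weight matrix of size d × d, a rank-r LoRA adapter adds only 2·r·d trainable parameters. The numbers below (hidden size 4096, rank 8) are illustrative assumptions, not values taken from this chapter's configuration:

```python
d = 4096  # hidden dimension of one projection matrix (assumed)
r = 8     # LoRA rank (assumed)

base_params = d * d       # frozen weights in the original matrix
lora_params = 2 * r * d   # trainable weights in the A and B matrices

print(base_params)                # 16777216
print(lora_params)                # 65536
print(lora_params / base_params)  # ~0.0039, under 0.4% of the original
```

Because gradients and optimizer state are only kept for that small fraction of weights, the memory footprint of the backward pass shrinks accordingly.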
The cross-entropy loss for a single sequence of length $T$ is calculated as:

$$L = -\frac{1}{T} \sum_{t=1}^{T} \log P(x_t \mid x_{<t})$$

Here, $P(x_t \mid x_{<t})$ is the probability the model assigns to the correct next token $x_t$ given all previous context tokens. As the training loop runs, the optimizer adjusts the adapter weights to increase this probability, causing the overall loss value to decrease.
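As a worked example, suppose the model assigns probabilities of 0.5, 0.25, and 0.8 to the three correct next tokens in a short sequence (made-up numbers for illustration). The loss is the average negative log probability:

```python
import math

# Hypothetical probabilities the model assigns to each correct next token
probs = [0.5, 0.25, 0.8]

# Average negative log-likelihood across the sequence
loss = -sum(math.log(p) for p in probs) / len(probs)
print(round(loss, 4))  # 0.7675
```

If every probability were 1.0 the loss would be exactly 0, so lower loss directly reflects more confident, correct next-token predictions.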
As the loop iterates, you will see output logs in your terminal at the intervals you specified in your training arguments. You should monitor two primary metrics: the training loss and the evaluation loss.
Training loss compared to evaluation loss over 300 steps. The divergence at the end indicates early signs of overfitting.
A successful run exhibits a steady decline in both training and evaluation loss. If the training loss continues to decrease but the evaluation loss begins to climb, your model has started to memorize the training subset. This means it is losing its ability to generalize to new data. If you observe this behavior, you can stop the training loop early or adjust your learning rate and weight decay parameters.
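A simple way to automate that decision is to stop once the evaluation loss has failed to improve for a few consecutive evaluations. The standalone helper below sketches that patience logic; the Trainer offers comparable behavior through its EarlyStoppingCallback, but this function is an illustrative stand-in, not the library API.

```python
def should_stop(eval_losses, patience=3):
    """Return True if the most recent `patience` evaluation losses
    all failed to improve on the best loss seen before them."""
    if len(eval_losses) <= patience:
        return False
    best_before = min(eval_losses[:-patience])
    return all(loss >= best_before for loss in eval_losses[-patience:])

# Eval loss bottoms out at 1.55, then climbs for three straight checks
history = [2.1, 1.8, 1.6, 1.55, 1.58, 1.61, 1.64]
print(should_stop(history))  # True
```

Calling a check like this after each evaluation step lets you halt the run before the model drifts further into memorizing the training subset.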
Once the loop successfully processes the predefined number of epochs, you must save the newly trained adapter weights to disk.
# Save the trained LoRA adapters
trainer.save_model("./custom-slm-lora-adapters")
It is important to remember that because you used LoRA, you are not saving the entire multi-gigabyte language model. You are only saving the newly trained low-rank matrices, which typically take up just a few megabytes of storage. In the upcoming deployment phase, these lightweight adapter files will be loaded on top of the original base model to alter its text generation behavior.