While full parameter fine-tuning offers a direct path to model specialization, its resource demands present a significant barrier. Updating billions of parameters requires substantial GPU memory and compute time, making the process impractical for many development environments. For instance, fully fine-tuning a 7-billion-parameter model can demand over 80 GB of VRAM just to store the model weights, gradients, and optimizer states, a requirement that exceeds the capacity of most commercially available GPUs.
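A rough back-of-the-envelope estimate shows where that memory goes. The sketch below assumes 16-bit weights and gradients with 32-bit Adam optimizer states, a common but by no means universal training configuration, and it ignores activation memory entirely.

```python
# Rough VRAM estimate for full fine-tuning of a 7B-parameter model.
# Assumes fp16 weights and gradients plus fp32 Adam moment estimates;
# activation memory is ignored, so the real footprint is even larger.
params = 7e9

weights_bytes   = params * 2      # fp16: 2 bytes per parameter
gradients_bytes = params * 2      # fp16: 2 bytes per gradient
optimizer_bytes = params * 4 * 2  # fp32 first and second Adam moments

total_gb = (weights_bytes + gradients_bytes + optimizer_bytes) / 1e9
print(f"approx. {total_gb:.0f} GB")  # approx. 84 GB
```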
Parameter-Efficient Fine-Tuning, or PEFT, offers a collection of methods that resolve this computational bottleneck. The central idea behind PEFT is to freeze most of the pre-trained model's parameters and introduce a small, manageable number of new, trainable parameters. These new parameters are designed to effectively steer the model's behavior for a specific task without altering the original knowledge encoded in its weights. This approach reduces the memory and computational footprint of the fine-tuning process.
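To make this concrete, the following PyTorch sketch freezes a stand-in "base" layer and adds a small trainable bottleneck alongside it. The layer sizes are arbitrary, and the design is a generic illustration of the freeze-and-add idea rather than any particular PEFT method.

```python
import torch
import torch.nn as nn

# Toy "base model": a single linear layer standing in for a pre-trained
# transformer block. In practice this would be billions of frozen
# parameters loaded from a checkpoint.
base_layer = nn.Linear(4096, 4096)
for p in base_layer.parameters():
    p.requires_grad = False  # freeze the pre-trained weights

# Small trainable "adapter": a bottleneck added to the frozen layer's
# output. Only these parameters receive gradient updates.
adapter = nn.Sequential(
    nn.Linear(4096, 16),  # down-project to a tiny hidden size
    nn.Linear(16, 4096),  # project back up
)

def forward(x):
    return base_layer(x) + adapter(x)  # frozen path + trainable correction

# The optimizer only ever sees the adapter's parameters.
optimizer = torch.optim.AdamW(adapter.parameters(), lr=1e-4)
```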
The diagram below illustrates the fundamental difference between these two fine-tuning philosophies. In full fine-tuning, every single weight of the base model is a candidate for updates. In PEFT, the massive base model remains untouched, and only lightweight, supplementary components are trained.
A comparison of training approaches. Full fine-tuning modifies all model weights, while PEFT freezes the base model and only trains a small set of adapter parameters.
The motivation for adopting PEFT extends beyond managing resource constraints. This family of techniques provides several significant advantages that make model customization more flexible and scalable.
By training only a small fraction of the total parameters, often less than 1% of the model's size, PEFT dramatically lowers the barrier to entry for fine-tuning. The memory required for storing gradients and optimizer states, which is a primary driver of high VRAM usage in full fine-tuning, is reduced proportionally. This efficiency makes it feasible to fine-tune very large models, such as those with 70 billion parameters or more, on a single, high-end consumer or prosumer GPU. Consequently, training times are also significantly shorter.
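Continuing the toy sketch above, a small helper makes the trainable fraction explicit; the exact percentage below depends entirely on the made-up layer sizes.

```python
import torch.nn as nn

def count_parameters(module: nn.Module) -> None:
    """Print trainable vs. total parameter counts for a module."""
    total = sum(p.numel() for p in module.parameters())
    trainable = sum(p.numel() for p in module.parameters() if p.requires_grad)
    print(f"trainable: {trainable:,} / {total:,} ({100 * trainable / total:.3f}%)")

# Group the frozen base layer and the trainable adapter from the earlier sketch.
model = nn.ModuleDict({"base": base_layer, "adapter": adapter})
count_parameters(model)  # trainable: 135,184 / 16,916,496 (0.799%)
```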
Since full fine-tuning modifies the entire model, saving a fine-tuned version means storing a complete copy of all its weights, which can be tens or even hundreds of gigabytes. With PEFT, you only need to save the small set of trained adapter weights. These checkpoints are typically only a few megabytes in size. This portability is a massive operational benefit. It allows you to maintain a single copy of the base model and apply different, lightweight adapters for various tasks, such as one for summarization, another for code generation, and a third for customer support dialogue. This modular approach simplifies model management and deployment pipelines.
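The same separation makes checkpointing straightforward. In plain PyTorch this amounts to saving only the adapter's state_dict; the filenames below are hypothetical, and libraries such as Hugging Face's peft provide analogous adapter-only saving.

```python
import torch

# Save only the adapter weights; the frozen base model is never duplicated.
torch.save(adapter.state_dict(), "summarization_adapter.pt")

# On a machine that already holds the base model, swap tasks by loading a
# different lightweight adapter into the same frozen base.
adapter.load_state_dict(torch.load("summarization_adapter.pt"))
# adapter.load_state_dict(torch.load("code_generation_adapter.pt"))
```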
When you fine-tune a model on a narrow, task-specific dataset, it risks "forgetting" the general-purpose knowledge it learned during its extensive pre-training. This phenomenon is known as catastrophic forgetting. Because PEFT methods leave the original model weights frozen, they inherently protect against this degradation. The model's core reasoning and language understanding capabilities remain intact, while the small, trainable modules guide its output to align with the new task. This results in a more stable and reliable model that retains its general competence while gaining specialized skills.
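The frozen-base guarantee can be checked mechanically. Continuing the toy sketch, one optimizer step with an arbitrary dummy loss leaves the base layer's weights bit-for-bit unchanged, because gradients are never computed or applied to them.

```python
import torch

x = torch.randn(8, 4096)
before = base_layer.weight.detach().clone()

loss = forward(x).pow(2).mean()  # arbitrary dummy loss for illustration
loss.backward()
optimizer.step()
optimizer.zero_grad()

print(torch.equal(before, base_layer.weight))  # True: frozen weights untouched
print(base_layer.weight.grad is None)          # True: no gradient was stored
```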
PEFT is not a single technique but a family of approaches. In the following sections, we will examine some of the most prominent methods, with a primary focus on Low-Rank Adaptation (LoRA). We will also briefly survey other strategies like Adapter Tuning and Prefix-Tuning to provide a broader view of the available options. Each method introduces trainable parameters in a unique way, but they all share the common goal of achieving high performance with minimal computational overhead.