Pre-trained large language models possess broad knowledge and generative capabilities learned from vast text corpora. However, the pre-training objective, typically next-token prediction, doesn't inherently guarantee that a model will follow specific instructions, converse naturally, or adhere to safety guidelines. Bridging this gap requires alignment: the process of adapting the model to better match human intent and preferences.
This chapter focuses on Supervised Fine-Tuning (SFT), often the initial step in the alignment process. SFT uses curated datasets of prompt-response examples to teach the model desired behaviors, such as instruction following or specific task execution. We will cover the objectives motivating LLM alignment, the detailed mechanics of the SFT process, methods for creating effective instruction datasets, appropriate data formatting, considerations for the training loop and hyperparameters, and techniques for evaluating how well SFT achieves its alignment goals.
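To make this concrete, here is a minimal sketch of what a single SFT training record might look like, assuming a simple prompt/completion JSON schema. The field names and text are illustrative only, not tied to any particular library or dataset; the chapter's later sections cover real formatting conventions in detail.

```python
# A minimal sketch of one SFT training record, assuming a simple
# prompt/completion schema (field names and content are illustrative).
import json

example = {
    "prompt": (
        "Summarize the following in one sentence:\n"
        "Supervised fine-tuning adapts a pre-trained model using "
        "curated prompt-response pairs."
    ),
    "completion": (
        "Supervised fine-tuning teaches a pre-trained model desired "
        "behaviors from curated example pairs."
    ),
}

# During SFT, prompt and completion are concatenated into one token
# sequence; the loss is typically computed only on the completion
# tokens, so the model learns to produce responses rather than to
# reproduce prompts.
print(json.dumps(example, indent=2))
```

A collection of such records, stored one per line in a JSONL file, is a common starting point for the dataset-creation and formatting steps discussed in sections 25.3 and 25.4.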
25.1 Goals of LLM Alignment
25.2 Introduction to Supervised Fine-Tuning (SFT)
25.3 Creating High-Quality Instruction Datasets
25.4 Data Formatting for SFT (Prompts, Completions)
25.5 The SFT Training Process and Hyperparameters
25.6 Evaluating SFT Models on Alignment Goals