Fine-Tuning a Small Language Model

Chapter 1: Principles of Small Language Models

What is a Small Language Model

Fine-Tuning vs Retrieval-Augmented Generation

Supervised Fine-Tuning Mechanics

Hardware Requirements and Memory Constraints

Hands-On Practical: Initializing a Pre-Trained SLM

Chapter 2: Data Preparation and Formatting

Structuring Instruction Datasets

Tokenization and Padding Strategies

Handling Attention Masks

Formatting Prompts for Specific Architectures

Practice: Building a Custom Dataset Pipeline

Chapter 3: Environment and Library Setup

Configuring PyTorch and CUDA

Introduction to the Hugging Face Transformers Library

Managing Datasets with Hugging Face Datasets

Optimizing Memory with Accelerate

Hands-On Practical: Configuring the Training Script

Chapter 4: Parameter-Efficient Fine-Tuning (PEFT)

Understanding Full Fine-Tuning Limitations

Low-Rank Adaptation (LoRA) Principles

Quantized LoRA (QLoRA) and 4-bit Training

Configuring Target Modules and Rank

Hands-On Practical: Implementing a LoRA Configuration

Chapter 5: The Training Process

Defining Training Arguments and Hyperparameters

Learning Rates and Schedulers

Checkpointing and State Management

Monitoring Loss and Training Metrics

Practice: Executing the Training Loop

Chapter 6: Model Evaluation and Benchmarking

Evaluating Text Generation Quality

Quantitative Metrics for NLP Tasks

Testing Prompt Generalization

Identifying Overfitting in Generation

Hands-On Practical: Running Evaluation Scripts

Chapter 7: Model Merging and Deployment

Merging LoRA Adapters with Base Models

Exporting Models to Safetensors

Serving SLMs with vLLM

API Integration Strategies

Practice: Deploying the Custom Model Locally
