Let's focus on the final step in the RAG process: how the Large Language Model (LLM) takes the augmented prompt (your original query combined with the relevant context fetched by the retriever) and produces the final output. This stage is where the "Generation" in Retrieval-Augmented Generation actually happens.
Up to this point, we've retrieved relevant information and carefully structured it within a prompt. Now, this augmented prompt is passed to the LLM. It's important to understand that the LLM's job here isn't simply to copy and paste sections from the retrieved context. Instead, it performs a sophisticated synthesis task.
The LLM integrates several pieces of information: the user's original query, the retrieved context supplied in the prompt, and the language patterns and general knowledge it acquired during pre-training.
The goal is to generate a response that directly addresses the user's query, is factually grounded in the provided context, and is presented in a coherent, natural-sounding way. Think of the retrieved context as specific evidence or supplementary reading material provided to the LLM just before it answers the question.
The LLM processes the combined query and context to generate the final answer.
The way you structure the prompt (as discussed in "Structuring Prompts for RAG") heavily influences this synthesis. By clearly instructing the LLM to base its answer on the provided context, you guide it to prioritize this external information over potentially outdated or less specific knowledge from its training data.
For example, consider a query: "What are the main features of Product X released last month?"
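To make this concrete, here is a minimal sketch of how the augmented prompt for that query might be assembled before it is handed to the model. The retrieved chunks, the prompt template, and the helper function are illustrative placeholders rather than any particular framework's API.

```python
# A minimal sketch of assembling an augmented prompt for the example query.
# The retrieved chunks, the template wording, and build_augmented_prompt()
# are hypothetical placeholders, not a specific library's API.

def build_augmented_prompt(query: str, context_chunks: list[str]) -> str:
    """Combine the user's query with retrieved context into a single prompt."""
    context_block = "\n\n".join(
        f"[Document {i + 1}]\n{chunk}" for i, chunk in enumerate(context_chunks)
    )
    return (
        "Answer the question using the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {query}\n"
        "Answer:"
    )

# Hypothetical chunks a retriever might return for this query.
retrieved_chunks = [
    "Product X 2.4 release notes: adds offline mode and a redesigned dashboard.",
    "Changelog excerpt: Product X now supports single sign-on (SSO) for teams.",
]

prompt = build_augmented_prompt(
    "What are the main features of Product X released last month?",
    retrieved_chunks,
)
print(prompt)
```

The resulting prompt is what the LLM actually sees: the instruction, the evidence, and the question in one piece of text.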
The LLM uses its language capabilities to weave the retrieved facts into a well-formed answer. It might summarize points from multiple chunks, rephrase technical details for clarity, or combine information from the context with its general understanding to provide a comprehensive response.
A significant challenge is ensuring the final output sounds natural and isn't just a disjointed collection of facts from the context. This is where the generative power of the LLM shines. Well-trained LLMs excel at producing fluent text. When guided by a well-structured augmented prompt, they can typically integrate the retrieved information smoothly.
However, the quality of the generation depends on several factors: the relevance and accuracy of the retrieved context, the clarity of the instructions in the prompt, and the capabilities of the underlying LLM.
Sometimes, the retrieved context might contradict the LLM's internal knowledge or information found in other retrieved chunks. While advanced RAG systems employ strategies to handle this, basic approaches often rely on the prompt instructing the LLM to prioritize the provided context. For instance, a prompt might include phrasing like: "Based only on the following documents, answer the question..." This directs the LLM to ground its answer firmly in the retrieved data.
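As a rough sketch, such a grounding instruction can be baked directly into the prompt template. The `generate` function below is a hypothetical stand-in for whichever LLM client or SDK you use.

```python
# A minimal sketch of a grounding instruction that tells the model to answer
# only from the supplied documents. generate() is a hypothetical stand-in
# for a real LLM call, not a specific library's API.

GROUNDED_TEMPLATE = (
    "Based only on the following documents, answer the question. "
    "If the documents do not contain the answer, say you don't know.\n\n"
    "Documents:\n{documents}\n\n"
    "Question: {question}\nAnswer:"
)

def generate(prompt: str) -> str:
    """Placeholder for a real LLM call (e.g. a chat-completion request)."""
    raise NotImplementedError("Plug in your LLM client here.")

def grounded_answer(question: str, documents: list[str]) -> str:
    """Fill the grounded template and pass it to the LLM."""
    prompt = GROUNDED_TEMPLATE.format(
        documents="\n\n".join(documents),
        question=question,
    )
    return generate(prompt)
```

The explicit "based only on" wording, plus an instruction to admit when the documents are silent, nudges the model toward the retrieved data and away from its internal knowledge when the two conflict.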
The generation step concludes the core RAG flow, transforming a query and a set of relevant documents into a contextually grounded, informative answer. The next natural consideration is understanding which specific pieces of context contributed to the final answer, leading us to the topic of source attribution.