With the retriever component ready to fetch relevant information (as discussed in Chapter 2 and prepared in Chapter 3), the next logical step is integrating the "Generation" part of Retrieval-Augmented Generation. This involves connecting a Large Language Model (LLM) that will synthesize the user's query and the retrieved context into a coherent final answer.
The LLM acts as the reasoning engine. It doesn't simply repeat the retrieved text; it uses that text as grounding knowledge to formulate a response that directly addresses the original question while staying anchored in the provided context.
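In practice, this means joining the retrieved chunks into a context block and placing it alongside the question in a single augmented prompt. The helper below is a minimal sketch of that assembly step; the function name and template wording are illustrative placeholders, not a required format.
# Minimal sketch: assembling a query and retrieved chunks into one augmented prompt.
# The function name and template wording are illustrative, not a fixed convention.
def build_augmented_prompt(query: str, retrieved_chunks: list[str]) -> str:
    context = "\n\n".join(retrieved_chunks)
    return (
        "Based on the following context, answer the user's query.\n\n"
        f"Context:\n{context}\n\n"
        f"Query: {query}\n\n"
        "Answer:"
    )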
There are two primary ways to integrate an LLM into your RAG pipeline: calling a hosted model through a provider's API, or running an open-source model locally, for example with Hugging Face transformers or specialized serving frameworks like Ollama or vLLM. Let's look at how to implement these approaches.
Using an API is often the quickest way to get started. The provider manages the model hosting, scaling, and maintenance. Your application sends the augmented prompt (query + context) to the API endpoint and receives the generated text back.
Steps:
1. Choose a provider and model (e.g., gpt-3.5-turbo, claude-3-opus, gemini-pro).
2. Use pip to install the necessary Python client library (e.g., pip install openai, pip install anthropic).
3. Instantiate the client with your API key and send the augmented prompt to the model's generation method (typically named create, complete, or generate).
Example (OpenAI Integration):
# Note: Requires 'openai' library installed and OPENAI_API_KEY environment variable set.
import os

from openai import OpenAI

# 1. Instantiate the client (authenticates using the environment variable)
try:
    client = OpenAI()
    # api_key can also be explicitly passed: OpenAI(api_key="YOUR_API_KEY")
except Exception as e:
    print(f"Error initializing OpenAI client: {e}")
    # Handle error appropriately (e.g., exit, log, raise)
    exit()

# 2. Prepare the augmented prompt (example structure)
user_query = "What were the main findings of the climate report?"
retrieved_context = """
Document Snippet 1: The report highlights a significant increase in global average temperatures...
Document Snippet 2: Key findings include accelerated sea-level rise and more frequent extreme weather events...
"""

augmented_prompt = f"""
Based on the following context, answer the user's query.
Context:
{retrieved_context}
Query: {user_query}
Answer:
"""

# 3. Make the API call
try:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # Or another suitable model
        messages=[
            {"role": "system", "content": "You are a helpful assistant responding based on provided context."},
            {"role": "user", "content": augmented_prompt}
        ],
        temperature=0.7,  # Controls randomness (creativity vs. determinism)
        max_tokens=150    # Limits the length of the generated response
    )

    # 4. Process the response
    if response.choices:
        generated_text = response.choices[0].message.content.strip()
        print("LLM Response:")
        print(generated_text)
    else:
        print("No response generated.")

except Exception as e:
    print(f"Error during OpenAI API call: {e}")
    # Handle API errors (e.g., rate limits, authentication issues)
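The same pattern applies to other providers, with only the client library changing. Below is a rough sketch using Anthropic's Python client; it assumes pip install anthropic, an ANTHROPIC_API_KEY environment variable, and reuses the augmented_prompt built above (the model name is only an example).
# Note: Sketch only. Requires 'anthropic' installed and ANTHROPIC_API_KEY set.
from anthropic import Anthropic

anthropic_client = Anthropic()  # Reads ANTHROPIC_API_KEY from the environment

anthropic_response = anthropic_client.messages.create(
    model="claude-3-opus-20240229",  # Example model; choose one available to your account
    max_tokens=150,
    system="You are a helpful assistant responding based on provided context.",
    messages=[
        {"role": "user", "content": augmented_prompt}  # Same augmented prompt as above
    ],
)

print(anthropic_response.content[0].text.strip())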
Considerations for APIs: you pay per token, every request adds network latency, providers enforce rate limits, and your query plus the retrieved context leaves your infrastructure, which may matter for data privacy or compliance.
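Rate limits and transient network errors are common enough that it helps to wrap API calls in a simple retry loop with exponential backoff. The sketch below shows one way to do this; the retry count and delays are arbitrary illustrative values.
import time

# Sketch: retry a flaky API call with exponential backoff.
# max_retries and base_delay are arbitrary illustrative values.
def call_with_retries(make_request, max_retries=3, base_delay=1.0):
    for attempt in range(max_retries):
        try:
            return make_request()
        except Exception as e:
            if attempt == max_retries - 1:
                raise  # Give up after the final attempt
            delay = base_delay * (2 ** attempt)
            print(f"Request failed ({e}); retrying in {delay:.1f}s...")
            time.sleep(delay)

# Usage with the OpenAI client from the earlier example:
# response = call_with_retries(lambda: client.chat.completions.create(...))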
Running models locally gives you more control over the environment and data privacy, but requires managing the computational resources and model setup. Libraries like Hugging Face's transformers
make loading and running many open-source models relatively straightforward.
Steps:
1. Install transformers and a backend such as PyTorch (torch) or TensorFlow (tensorflow); some models need additional dependencies (pip install transformers torch).
2. Load the model and tokenizer, build the augmented prompt, and produce text with a text-generation pipeline or the model's generate function.
Example (Hugging Face transformers Integration):
# Note: Requires 'transformers' and 'torch' (or 'tensorflow') installed.
# May require significant RAM/VRAM depending on the model.
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM
import torch  # Or import tensorflow as tf

# 1. Choose a model (example: a smaller, manageable model)
model_name = "gpt2"  # Replace with a larger/better model if resources allow, e.g., "mistralai/Mistral-7B-Instruct-v0.1"

# 2. Load Model and Tokenizer (downloads weights on first run)
try:
    # Using the pipeline for a simpler interface (handles tokenization/decoding)
    # Pipeline convention: device=0 for the first GPU (if available and configured), -1 for CPU
    device = 0 if torch.cuda.is_available() else -1
    generator_pipeline = pipeline(
        "text-generation",
        model=model_name,
        device=device
    )
    print(f"Loaded model {model_name} onto device: {'GPU' if device == 0 else 'CPU'}")

    # Alternatively, load manually for more control:
    # tokenizer = AutoTokenizer.from_pretrained(model_name)
    # model = AutoModelForCausalLM.from_pretrained(model_name)
    # model.to('cuda' if torch.cuda.is_available() else 'cpu')  # Move model to device
except Exception as e:
    print(f"Error loading model {model_name}: {e}")
    # Handle errors (e.g., model not found, insufficient memory)
    exit()

# 3. Prepare the augmented prompt
user_query = "What is the capital of France?"
retrieved_context = "France is a country in Western Europe. Paris is its capital and largest city."

# Basic prompt template
augmented_prompt = f"""
Context: {retrieved_context}
Question: {user_query}
Answer: """

# 4. Generate text using the pipeline
try:
    # Pipeline handles tokenization, generation, and decoding
    responses = generator_pipeline(
        augmented_prompt,
        max_new_tokens=50,  # Limit the number of tokens generated *after* the prompt
        num_return_sequences=1,
        eos_token_id=generator_pipeline.tokenizer.eos_token_id  # Stop generation at end-of-sequence token
    )
    generated_text = responses[0]['generated_text']

    # The pipeline output usually includes the prompt; we often want only the answer part.
    # Simple approach: take the text after the end of the prompt.
    answer_part = generated_text[len(augmented_prompt):].strip()
    print("\nLLM Response (Answer Part):")
    print(answer_part)

    # --- Manual generation (if not using the pipeline) ---
    # inputs = tokenizer(augmented_prompt, return_tensors="pt").to(model.device)
    # outputs = model.generate(**inputs, max_new_tokens=50)
    # decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # print("\nLLM Response (Manual):")
    # print(decoded_output)
except Exception as e:
    print(f"Error during text generation: {e}")
    # Handle generation errors
Considerations for Local Models: you need enough RAM/VRAM for the chosen model, downloads can be large, smaller models that fit on modest hardware generally produce weaker answers, and you are responsible for serving and keeping models updated; in return you keep data on your own machines and avoid per-token costs.
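If you would rather not manage models in-process, serving frameworks such as Ollama (mentioned earlier) run the model behind a local HTTP endpoint that your application calls much like a hosted API. The sketch below assumes Ollama is installed, running on its default local port, and that the example model has already been pulled (e.g., ollama pull llama3).
# Sketch: calling a locally served model through Ollama's HTTP API.
# Assumes the Ollama server is running on the default port 11434 and the model is pulled.
import requests

payload = {
    "model": "llama3",           # Example model name; use whichever model you have pulled
    "prompt": augmented_prompt,  # Same query + context prompt as before
    "stream": False,             # Return the full response as a single JSON object
}

resp = requests.post("http://localhost:11434/api/generate", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["response"].strip())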
Frameworks like LangChain and LlamaIndex provide higher-level abstractions that simplify generator integration. You typically configure the LLM you want to use (whether API-based or local) within the framework's objects.
Example (LangChain):
# Note: Example, requires LangChain and provider libraries installed.
# --- Configuration for an API-based LLM ---
# from langchain_openai import ChatOpenAI
# llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.7, openai_api_key="YOUR_API_KEY")
# --- Configuration for a local LLM via Hugging Face ---
# from langchain_community.llms import HuggingFacePipeline
# llm = HuggingFacePipeline.from_model_id(
#     model_id="gpt2",
#     task="text-generation",
#     pipeline_kwargs={"max_new_tokens": 100},
#     device=0  # Use GPU 0 if available
# )
# --- Later in the RAG chain ---
# Assume 'retriever' is configured and 'prompt_template' is defined
# from langchain_core.runnables import RunnablePassthrough
# from langchain_core.output_parsers import StrOutputParser
# rag_chain = (
#     {"context": retriever, "question": RunnablePassthrough()}  # Fetch context based on input question
#     | prompt_template    # Format the prompt
#     | llm                # Pass augmented prompt to the configured LLM
#     | StrOutputParser()  # Parse the LLM output string
# )
# result = rag_chain.invoke("What is the capital of France?")
# print(result)
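The chain above assumes a prompt_template object with context and question variables. One way to define it, sketched here with LangChain's ChatPromptTemplate (the template wording is only an example):
# --- Defining the prompt_template used in the chain above ---
# from langchain_core.prompts import ChatPromptTemplate
# prompt_template = ChatPromptTemplate.from_template(
#     "Based on the following context, answer the user's query.\n\n"
#     "Context:\n{context}\n\n"
#     "Question: {question}\n\n"
#     "Answer:"
# )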
These frameworks handle the boilerplate code for API calls or local model interactions, letting you focus on the pipeline logic. We'll see more of this structure when combining components in the next section.
Integrating the generator is a central step. Whether you opt for the convenience of APIs or the control of local models, the goal remains the same: provide the LLM with both the user's question and the relevant context retrieved from your knowledge base, enabling it to generate an informed and accurate response. Now that we have ways to implement both the retriever and the generator, we are ready to connect them.