Now that we understand what vector embeddings are, the next logical step is to explore how they are created. While you could theoretically train your own embedding model from scratch, the vast majority of applications rely on powerful pre-trained models. These models have been trained on enormous amounts of text data, allowing them to capture intricate semantic relationships between words and sentences; reusing them saves significant time and computational resources.
The goal is to find models that excel at producing embeddings where semantically similar sentences result in vectors that are close together in the vector space (e.g., having a high cosine similarity, cos(θ)). This is precisely what's needed for the retrieval step in RAG.
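To make this concrete, here is a minimal NumPy sketch of cosine similarity between embedding vectors. The three-dimensional vectors are toy values invented for illustration; real embedding models produce vectors with hundreds of dimensions.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors; values near 1.0 mean similar direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 3-dimensional "embeddings" (real models output hundreds of dimensions).
query = np.array([0.2, 0.9, 0.1])
similar_doc = np.array([0.25, 0.85, 0.05])   # points in nearly the same direction
unrelated_doc = np.array([0.9, -0.1, 0.4])   # points elsewhere in the space

print(cosine_similarity(query, similar_doc))    # close to 1.0
print(cosine_similarity(query, unrelated_doc))  # noticeably lower
```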
Many of the most successful and widely used embedding models today are based on the Transformer architecture, which revolutionized Natural Language Processing (NLP). However, using raw outputs from standard Transformer models like BERT directly for sentence similarity tasks often yields subpar results. These models were primarily pre-trained for tasks like masked language modeling, not necessarily for producing comparable sentence-level embeddings out-of-the-box.
To address this, specialized architectures and fine-tuning strategies have been developed. A prominent family of models specifically designed for generating high-quality sentence embeddings is Sentence-BERT (SBERT) and its numerous variants.
SBERT modifies the standard BERT architecture using a Siamese network structure. In this setup, two copies of the same pre-trained Transformer network (sharing identical weights) process two input sentences in parallel. The outputs (typically pooled sentence embeddings) are then compared using a similarity metric. SBERT is fine-tuned on large datasets of sentence pairs labeled for similarity, such as the Semantic Textual Similarity (STS) benchmarks. This training specifically optimizes the model to produce embeddings where similar sentences have high cosine similarity scores.
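The sketch below mimics this setup with the Hugging Face `transformers` library: the same encoder (shared weights, as in a Siamese network) embeds two sentences via mean pooling over token embeddings, and the results are compared with cosine similarity. The checkpoint name and example sentences are illustrative choices, not prescribed by SBERT.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# One shared encoder processes both sentences ("Siamese" means the weights are shared).
model_name = "sentence-transformers/all-MiniLM-L6-v2"  # assumed checkpoint for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoder = AutoModel.from_pretrained(model_name)

def embed(sentence: str) -> torch.Tensor:
    """Mean-pool token embeddings into a single sentence vector."""
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        token_embeddings = encoder(**inputs).last_hidden_state   # (1, seq_len, hidden)
    mask = inputs["attention_mask"].unsqueeze(-1)                # ignore padding positions
    return (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)

emb_a = embed("A man is playing a guitar.")
emb_b = embed("Someone strums an acoustic guitar.")
print(torch.nn.functional.cosine_similarity(emb_a, emb_b).item())  # high for similar meanings
```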
Key advantages of SBERT-based models include:

- Embeddings that can be compared directly with simple metrics such as cosine similarity.
- Independent encoding of each sentence, so document embeddings can be computed once, stored, and indexed ahead of time rather than re-scoring every query-document pair with the full model.
- Strong performance on semantic similarity, clustering, and retrieval tasks compared to pooling raw BERT outputs.
The `sentence-transformers` library, built on top of PyTorch and the Hugging Face Transformers library, provides easy access to a wide range of pre-trained SBERT and other sentence embedding models. Some common examples you'll encounter include:
- `all-MiniLM-L6-v2`: A popular, well-balanced model offering good performance with relatively small size and fast inference speed. It's a great starting point for many general-purpose tasks.
- `multi-qa-mpnet-base-dot-v1`: A model fine-tuned specifically for semantic search, particularly question-answering scenarios where you want to find relevant passages (answers) given a query (question). It often performs well in asymmetric search tasks, where the query and the documents have different forms.
- `paraphrase-multilingual-mpnet-base-v2`: An example of a multilingual model capable of generating comparable embeddings across different languages. This is valuable if your knowledge base contains documents in multiple languages.

These models are readily available through platforms like the Hugging Face Hub and are often integrated directly into RAG frameworks.
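As a quick illustration of how these models can be used, the snippet below loads `all-MiniLM-L6-v2` through the `sentence-transformers` library, embeds a query and a few documents, and ranks the documents by cosine similarity. The example texts are invented for demonstration.

```python
from sentence_transformers import SentenceTransformer, util

# Load a pre-trained model from the Hugging Face Hub (downloaded on first use).
model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "The Eiffel Tower is located in Paris.",
    "Photosynthesis converts sunlight into chemical energy.",
    "Paris is the capital city of France.",
]
query = "Where is the Eiffel Tower?"

doc_embeddings = model.encode(documents, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Cosine similarity between the query and each document.
scores = util.cos_sim(query_embedding, doc_embeddings)[0]
for doc, score in zip(documents, scores):
    print(f"{score.item():.3f}  {doc}")
```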
Selecting the right embedding model is an important step in building your RAG system. Consider these factors:

- Retrieval quality: how well the model ranks relevant passages for the kinds of queries and documents in your domain.
- Model size and inference speed: larger models tend to produce better embeddings but are slower and more expensive to run; compact models like `all-MiniLM-L6-v2` strike a good balance for many use cases.
- Embedding dimensionality: higher-dimensional vectors can capture more nuance but increase storage requirements and similarity-search cost.
- Language and domain coverage: multilingual or domain-adapted models matter when your documents are not general-purpose English text.

Experimentation is often necessary. You might start with a general-purpose model and evaluate its performance on your specific task and data; if retrieval quality isn't sufficient, you can then try more specialized or larger models, as sketched below.
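One way to run such an experiment is sketched below, assuming you have a handful of labeled query-passage pairs from your own data: embed them with each candidate model and check how often the relevant passage is ranked first. The queries, passages, and candidate model names here are placeholders.

```python
from sentence_transformers import SentenceTransformer, util

# A few query -> relevant-passage pairs; replace with examples from your knowledge base.
queries = ["How do I reset my password?", "What is the refund policy?"]
passages = [
    "Visit the account settings page and choose 'Reset password'.",
    "Refunds are issued within 14 days of purchase.",
    "Our offices are closed on public holidays.",
]
relevant = {0: 0, 1: 1}  # query index -> index of its relevant passage

for name in ["all-MiniLM-L6-v2", "multi-qa-mpnet-base-dot-v1"]:
    model = SentenceTransformer(name)
    passage_emb = model.encode(passages, convert_to_tensor=True)
    query_emb = model.encode(queries, convert_to_tensor=True)
    hits = 0
    for qi, scores in enumerate(util.cos_sim(query_emb, passage_emb)):
        if int(scores.argmax()) == relevant[qi]:
            hits += 1
    print(f"{name}: top-1 retrieval accuracy = {hits / len(queries):.2f}")
```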
These pre-trained models provide the foundation for converting your text documents and user queries into the meaningful vector representations needed for the similarity search mechanisms we'll discuss next. They are a cornerstone of effective information retrieval in modern RAG systems.