Large Language Models process information within a finite workspace known as the context window. Think of it as the model's short-term memory for a single interaction. Everything the model "knows" for generating a response, including system instructions, conversation history, retrieved documents, and your latest query, must fit into this space. This window is not measured in words or characters, but in tokens: the chunks of text, often whole words or word fragments, that the model actually processes.
The size of this window is a fixed, architectural limit. For example, a model might have a context window of 8,192 tokens. Sending a prompt that exceeds this limit isn't a suggestion for the model to summarize; it's a hard constraint that will result in an API error and prevent your application from functioning.
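To make this constraint concrete, the sketch below shows one way an application might check a prompt against the limit before calling an API. It uses the count_tokens helper from the kerb.tokenizer module, which is demonstrated at the end of this section; the 8,192-token limit and the fits_in_context function are illustrative assumptions, not part of any provider's SDK.

from kerb.tokenizer import count_tokens, Tokenizer

# Illustrative limit; real context windows vary by model and provider.
CONTEXT_WINDOW = 8192

def fits_in_context(prompt, max_tokens=CONTEXT_WINDOW):
    # Return True if the prompt fits within the model's context window.
    return count_tokens(prompt, tokenizer=Tokenizer.CL100K_BASE) <= max_tokens

prompt = "Summarize the attached report in three bullet points."
if not fits_in_context(prompt):
    raise ValueError("Prompt exceeds the context window; trim it before sending.")

A pre-flight check like this turns a hard API failure into a condition your application can handle gracefully.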
Managing the context window is a foundational skill in building production-grade LLM applications. Inefficient management affects three critical areas: application reliability, operational cost, and response quality.
An application that frequently sends prompts larger than the model's context window is an application that will frequently fail. These failures are not subtle; the API provider will reject the request outright. For applications that build context over time, such as chatbots or agents that maintain a memory of past interactions, the risk of exceeding the limit grows with every turn. Without active management, a long-running conversation is almost guaranteed to eventually hit this ceiling, leading to a poor user experience.
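One common defense is to trim older turns before each request so the conversation always fits within a token budget. The trim_history function and the 3,000-token budget below are illustrative assumptions: a minimal sketch that drops the oldest messages first while always preserving the system prompt.

from kerb.tokenizer import count_tokens, Tokenizer

def message_tokens(message):
    # Count tokens in a single {"role": ..., "content": ...} message.
    return count_tokens(message["content"], tokenizer=Tokenizer.CL100K_BASE)

def trim_history(system_msg, history, budget=3000):
    # Keep the system prompt plus as many recent turns as the budget allows.
    remaining = budget - message_tokens(system_msg)
    kept = []
    for message in reversed(history):  # walk from the newest turn backward
        cost = message_tokens(message)
        if cost > remaining:
            break
        kept.append(message)
        remaining -= cost
    return [system_msg] + list(reversed(kept))

Dropping the oldest turns first is only one policy; later sections cover alternatives such as summarizing older history instead of discarding it.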
API costs for LLMs are directly proportional to the number of tokens you send and receive. As the chapter introduction outlined, the cost is calculated based on both input and output tokens:

Total cost = (input tokens × price per input token) + (output tokens × price per output token)
Every token sent to the model, whether it is relevant to the final answer or not, contributes to the input cost. Consider a scenario where your RAG system retrieves ten documents, but only the top three are truly needed to answer the user's query. If you send all ten, you might be filling the context window with thousands of unnecessary tokens, paying for information the model will ignore. This waste adds up quickly in high-volume applications, turning an efficient system into an expensive one.
For example, imagine a prompt that is 4,000 tokens long when only 1,000 tokens are relevant. At a typical price of $0.50 per million input tokens, processing 1,000 such requests consumes 4,000,000 input tokens and costs $2.00. Trimming each prompt to the 1,000 relevant tokens cuts that to 1,000,000 input tokens, or $0.50.
By simply managing the context efficiently, you could reduce costs by 75% in this scenario.
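The same arithmetic is easy to express in code. The token counts and the $0.50-per-million price below are the illustrative figures from this example, not real pricing.

PRICE_PER_MILLION_INPUT = 0.50   # illustrative price in USD
REQUESTS = 1_000

def input_cost(tokens_per_request, requests=REQUESTS,
               price_per_million=PRICE_PER_MILLION_INPUT):
    # Cost of the input side only, in USD.
    return tokens_per_request * requests / 1_000_000 * price_per_million

unmanaged = input_cost(4_000)                 # $2.00
managed = input_cost(1_000)                   # $0.50
savings = (unmanaged - managed) / unmanaged   # 0.75, a 75% reduction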
The quality of an LLM's output is highly dependent on the quality of its input. The context window is not just a space to be filled; it's the model's entire frame of reference for the task at hand. Research has shown that models often exhibit a U-shaped attention curve, paying more attention to information at the very beginning and very end of the context window. Information "lost in the middle" may be overlooked.
Each component of a prompt competes for the model's limited attention and space within the context window.
If you clutter the context window with irrelevant or redundant information, you create noise that can distract the model from the most important parts of the prompt: key instructions may be overlooked, relevant details can get lost in the middle, and the quality of the response suffers.
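One practical response to the U-shaped attention pattern is to control where each component lands in the prompt: critical instructions at the beginning, the question at the end, and only the most relevant supporting documents in between. The build_prompt helper below is an illustrative sketch of that layout, not a prescribed format.

def build_prompt(instructions, documents, question, max_docs=3):
    # Assemble a prompt that keeps critical content at the high-attention edges.
    # Keep only the most relevant documents to reduce noise in the middle.
    context = "\n\n".join(documents[:max_docs])
    return (
        f"{instructions}\n\n"          # instructions first: high-attention region
        f"Context:\n{context}\n\n"     # supporting material in the middle
        f"Question: {question}"        # the query last: high-attention region
    )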
Effective context management involves curating a clean, dense, and relevant prompt that gives the model exactly what it needs to perform its task, and nothing more. The first step in this process is being able to accurately measure how many tokens your text will consume. The tokenizer module provides the tools for this fundamental task.
from kerb.tokenizer import count_tokens, Tokenizer
text_short = "This is a short sentence."
text_long = "This is a much longer sentence designed to illustrate how token count increases with the length of the text provided."
# Use the tokenizer for GPT-4 and GPT-3.5-turbo
tokens_short = count_tokens(text_short, tokenizer=Tokenizer.CL100K_BASE)
tokens_long = count_tokens(text_long, tokenizer=Tokenizer.CL100K_BASE)
print(f"Short sentence: '{text_short}'")
print(f"Token count: {tokens_short}\n")
print(f"Long sentence: '{text_long}'")
print(f"Token count: {tokens_long}")
# Short sentence: 'This is a short sentence.'
# Token count: 6
#
# Long sentence: 'This is a much longer sentence designed to illustrate how token count increases with the length of the text provided.'
# Token count: 24
As these examples show, token counting gives you an exact measurement rather than a rough estimate. In the following sections, you will learn how to use this capability to build applications that are reliable, cost-effective, and produce high-quality results.