While many applications process and generate text, modern LLMs are increasingly multimodal, capable of understanding inputs beyond language, such as images. Integrating vision capabilities allows you to build applications that can "see" and interpret visual contexts, opening up a new class of powerful use cases.
This section introduces how to process and analyze images using vision-language models (VLMs). You will learn how to prepare images for API calls, send them to different models through a unified interface, and guide the model's analysis with targeted prompts to perform tasks like object recognition, text extraction, and structured data generation.
Vision-language models do not process image files directly. Instead, they require the image data to be encoded into a format that can be transmitted via an API, typically Base64. This process converts the binary data of an image into a string representation. The toolkit handles this encoding automatically, but understanding the process is useful.
The image_to_base64 function provides a straightforward way to perform this conversion.
from kerb.multimodal import image_to_base64
# Assume you have an image file named 'product-diagram.jpg'
image_path = "product-diagram.jpg"
# Encode the image to a Base64 string
encoded_image = image_to_base64(image_path)
print(f"Base64 encoded string (first 60 chars): {encoded_image[:60]}...")
This encoded string, along with your text prompt, forms the multimodal input that gets sent to the vision model.
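For illustration, the snippet below shows roughly what an OpenAI-style multimodal message looks like once the image is embedded as a Base64 data URL. This is a sketch of the provider format, not part of the toolkit's API; the functions introduced next construct the equivalent request for you.
# Rough shape of an OpenAI-style multimodal message (illustrative only).
# The toolkit builds the provider-specific equivalent automatically.
multimodal_message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe the object shown in this image."},
        {
            "type": "image_url",
            # The Base64 string is embedded as a data URL
            "image_url": {"url": f"data:image/jpeg;base64,{encoded_image}"},
        },
    ],
}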
Different providers, such as OpenAI, Anthropic, and Google, have their own specific API formats for handling multimodal inputs. The toolkit simplifies this with a single function, analyze_image_with_vision_model, which abstracts away the provider-specific details. This allows you to write code once and switch between different vision models with minimal changes.
The typical workflow is straightforward: you supply an image path and a text prompt, the function encodes the image to Base64, formats the request according to the selected model's requirements, and returns the model's analysis.
Let's start with a basic task: asking a model to describe an image. We provide the path to an image and a text prompt guiding the analysis.
from kerb.multimodal import analyze_image_with_vision_model, VisionModel
# Assume 'product-diagram.jpg' is an image of a product schematic
image_path = "product-diagram.jpg"
prompt_text = "Describe the object shown in this image."
# Analyze the image using OpenAI's GPT-4o model
analysis_result = analyze_image_with_vision_model(
image_path=image_path,
prompt=prompt_text,
model=VisionModel.GPT4O
)
print(analysis_result.description)
The model parameter, using the VisionModel enum, determines which provider and model to use. You could switch to Anthropic's Claude 3.5 Sonnet by simply changing model=VisionModel.CLAUDE_3_5_SONNET.
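For example, you can run the same prompt against more than one model and compare the outputs, which is a quick way to evaluate which provider handles your images best. The sketch below assumes both enum members shown are available in your installation.
# Run the same prompt against two different vision models and compare the results
for vision_model in [VisionModel.GPT4O, VisionModel.CLAUDE_3_5_SONNET]:
    result = analyze_image_with_vision_model(
        image_path=image_path,
        prompt=prompt_text,
        model=vision_model,
    )
    print(f"--- {vision_model} ---")
    print(result.description)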
The quality of the analysis depends heavily on the text prompt you provide. By being specific, you can guide the model to perform a variety of sophisticated tasks.
Vision models are effective at Optical Character Recognition (OCR), allowing you to extract text from images like scanned documents, screenshots, or photographs.
# Assume 'product-specs.jpg' contains text with product specifications
ocr_prompt = "Extract all text from this image. Transcribe it exactly as it appears."
ocr_result = analyze_image_with_vision_model(
image_path="product-specs.jpg",
prompt=ocr_prompt,
model=VisionModel.GPT4O
)
print("Extracted Text:\n")
print(ocr_result.description)
This is particularly useful for building RAG systems that need to index text from a combination of digital and scanned documents.
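As a sketch of that idea, the loop below runs the same OCR prompt over a folder of scanned pages and collects the extracted text for later indexing. The directory name and the downstream indexing step are placeholders.
from pathlib import Path

from kerb.multimodal import analyze_image_with_vision_model, VisionModel

extracted_pages = []

# Run the OCR prompt over every scanned page in a (hypothetical) folder
for page_path in sorted(Path("scanned_docs").glob("*.jpg")):
    result = analyze_image_with_vision_model(
        image_path=str(page_path),
        prompt="Extract all text from this image. Transcribe it exactly as it appears.",
        model=VisionModel.GPT4O,
    )
    # Keep the source file alongside the text so it can be cited later
    extracted_pages.append({"source": page_path.name, "text": result.description})

# extracted_pages can now be chunked and added to your document index
print(f"Extracted text from {len(extracted_pages)} pages")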
One of the most powerful applications of VLMs is extracting structured data from images. By instructing the model to return its findings in a specific format like JSON, you can create a reliable data processing pipeline.
Imagine you have an image of a business card. You can ask the model to extract the contact information into a JSON object.
# Assume 'business-card.jpg' is an image of a business card
structured_prompt = """
Analyze the business card in the image and extract the following information.
Return the output as a valid JSON object with these keys:
- name
- title
- company
- phone
- email
- website
"""
structured_result = analyze_image_with_vision_model(
image_path="business-card.jpg",
prompt=structured_prompt,
model=VisionModel.GPT4O
)
# Parse the model's output (assumes it returned raw JSON; see the more defensive parser below)
import json
contact_info = json.loads(structured_result.description)
print(json.dumps(contact_info, indent=2))
This technique can be applied to invoices, receipts, forms, or any image containing structured information.
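In practice, some models wrap JSON output in Markdown code fences or add a short preamble, so a small defensive parser is safer than calling json.loads directly. The helper below is a sketch, not part of the toolkit.
import json
import re


def parse_json_response(text: str) -> dict:
    """Parse JSON from a model response, tolerating Markdown code fences."""
    # Strip ```json ... ``` fences if the model added them
    match = re.search(r"```(?:json)?\s*(.*?)\s*```", text, re.DOTALL)
    if match:
        text = match.group(1)
    return json.loads(text)


contact_info = parse_json_response(structured_result.description)
print(json.dumps(contact_info, indent=2))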
You can also provide multiple images in a single call to perform comparative analysis. The function accepts a list of image paths for the image_path parameter. This is useful for tasks like comparing product versions, identifying changes between two diagrams, or finding the odd one out in a set of images.
# Assume you have two images of a chart from different quarters
image_paths = ["sales-q1.jpg", "sales-q2.jpg"]
comparison_prompt = "These two images show sales charts for Q1 and Q2. " \
"Compare them and describe the main trend you observe."
comparison_result = analyze_image_with_vision_model(
image_path=image_paths,
prompt=comparison_prompt,
model=VisionModel.GPT4O
)
print(comparison_result.description)
By providing both images in the same context, the model can directly reference and compare them to provide a more insightful analysis than if it had processed them separately.