A collection of utilities designed for modern LLM applications.
Simple
Advanced LLM techniques made simple. Kerb provides clean, easy-to-use interfaces for complex operations.
Lightweight
Only install what you need. Kerb is modular, so you don't have to carry unnecessary dependencies.
Compatible
Suitable for any LLM project. Kerb is not a framework; it's a toolkit that plays well with others.
Everything you need to build LLM applications
Agent
Agent orchestration and execution patterns for multi-step reasoning.
Cache
Response and embedding caching to reduce costs and latency.
Chunk
Text chunking utilities for optimal context windows and retrieval.
Config
Configuration management for models, providers, and application settings.
Context
Context window management and token budget tracking.
Document
Document loading and processing for PDFs, web pages, and more.
Embedding
Embedding generation and similarity search helpers.
Evaluation
Metrics and benchmarking tools for LLM outputs.
Fine-Tuning
Model fine-tuning utilities and large dataset preparation.
Generation
Unified LLM generation with multi-provider support (OpenAI, Anthropic, Gemini).
Memory
Conversation memory and entity tracking for stateful applications.
Multimodal
Image, audio, and video processing for multimodal models.
Parsing
Output parsing and validation (JSON, structured data, function calls).
Preprocessing
Text cleaning and preprocessing for LLM inputs.
Prompt
Prompt engineering utilities, templates, and chain-of-thought patterns.
Retrieval
RAG and vector search utilities for semantic retrieval.
Safety
Content moderation and safety filters.
Testing
Testing utilities for LLM outputs and evaluation.
Tokenizer
Token counting and text splitting for any model.
# Install just the basics (no dependencies)
pip install kerb
# Or install with the features you need
pip install kerb[generation] # For LLM generation
pip install kerb[embeddings] # For embeddings
pip install kerb[all] # Everything
Generate text with any major LLM provider:
from kerb.generation import generate, ModelName, LLMProvider
# Simple generation
response = generate(
    "Write a haiku about Python programming",
    model=ModelName.GPT_4O_MINI,
    provider=LLMProvider.OPENAI
)
print(response.content)
print(f"Tokens: {response.usage.total_tokens}, Cost: ${response.cost:.6f}")
Split large documents for LLM processing:
from kerb.chunk import overlap_chunker
long_text = """
Large Language Models have revolutionized natural language processing.
They can understand context, generate human-like text, and perform
various tasks from translation to code generation. However, working
with LLMs requires careful consideration of token limits, context windows,
and efficient text processing strategies.
""" # Your long document
chunks = overlap_chunker(
    long_text,
    chunk_size=80,
    overlap_ratio=0.15
)
print(f"Split into {len(chunks)} chunks with overlap")
Generate embeddings and find similar content:
from kerb.embedding import embed, cosine_similarity, EmbeddingModel
# Generate embeddings
query_embedding = embed("machine learning algorithms", model=EmbeddingModel.ALL_MINILM_L6_V2)
doc_embedding = embed("neural networks and deep learning", model=EmbeddingModel.ALL_MINILM_L6_V2)
# Calculate similarity
similarity = cosine_similarity(query_embedding, doc_embedding)
print(f"Similarity: {similarity:.4f}")
Use templates for consistent prompts:
from kerb.prompt import render_template
from kerb.generation import generate, ModelName
template = """You are a {{role}} assistant.
Task: {{task}}
Context: {{context}}"""
prompt = render_template(template, {
    "role": "helpful Python",
    "task": "explain decorators",
    "context": "beginner level"
})
response = generate(prompt, model=ModelName.GPT_4O_MINI)
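The same template can be re-rendered with different variables to keep prompts consistent across tasks:
prompt2 = render_template(template, {
    "role": "patient SQL",
    "task": "explain window functions",
    "context": "intermediate level"
})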
Load documents from various formats:
from kerb.document import load_document
# Auto-detects format (txt, md, json, csv, pdf, etc.)
doc = load_document("data/report.pdf")
print(f"Content: {doc.content[:200]}...")
print(f"Metadata: {doc.metadata}")
Cache repeated calls to reduce cost and latency:
from kerb.cache import create_memory_cache, generate_prompt_key
from kerb.generation import generate, ModelName
cache = create_memory_cache(max_size=1000, default_ttl=3600)
def cached_generate(prompt, model=ModelName.GPT_4O_MINI, temperature=0.7):
    cache_key = generate_prompt_key(
        prompt,
        model=model.value,
        temperature=temperature
    )
    if cached := cache.get(cache_key):
        return cached['response']
    response = generate(prompt, model=model, temperature=temperature)
    cache.set(cache_key, {'response': response, 'cost': response.cost})
    return response
# First call hits the LLM provider
response1 = cached_generate("Explain Python decorators briefly")
# Second call is served from the cache
response2 = cached_generate("Explain Python decorators briefly")
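Both calls return the same response; only the first incurs provider cost:
print(response1.content == response2.content)  # True, the second call never hit the API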
Put it all together:
from kerb.document import load_document
from kerb.chunk import overlap_chunker
from kerb.embedding import embed, embed_batch, cosine_similarity
from kerb.generation import generate, ModelName
from kerb.prompt import render_template
# 1. Load and chunk documents
doc = load_document("knowledge_base.txt")
chunks = overlap_chunker(doc.content, chunk_size=500, overlap_ratio=0.15)
# 2. Create embeddings
chunk_embeddings = embed_batch(chunks)
# 3. Query and retrieve relevant chunks
query = "Why is my chatbot hallucinating and how do I fix it?"
query_embedding = embed(query)
# Find most similar chunks
similarities = [cosine_similarity(query_embedding, emb)
                for emb in chunk_embeddings]
top_indices = sorted(range(len(similarities)),
                     key=lambda i: similarities[i],
                     reverse=True)[:3]
relevant_chunks = [chunks[i] for i in top_indices]
# 4. Generate response with context
prompt = render_template("""Answer this question using the context below.
Context:
{{context}}
Question: {{question}}
Answer:""", {
"context": "\n\n".join(relevant_chunks),
"question": query
})
response = generate(prompt, model=ModelName.GPT_4O_MINI)
print(response.content)
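To sanity-check retrieval before reading the generated answer, print the top-ranked chunks with their similarity scores:
for rank, i in enumerate(top_indices, start=1):
    print(f"{rank}. score={similarities[i]:.4f} :: {chunks[i][:80]}...")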
Create an AI agent that can use tools:
from kerb.agent.patterns import ReActAgent
from kerb.generation import generate, ModelName
def llm_function(prompt: str) -> str:
"""Connect agent to your LLM."""
response = generate(prompt, model=ModelName.GPT_4O_MINI)
return response.content
agent = ReActAgent(
name="ResearchAgent",
llm_func=llm_function,
max_iterations=5
)
result = agent.run("Explain RAG like I'm a backend developer who just discovered AI exists")
print(result.output)