Kerb
A collection of practical utilities for building modern LLM applications.
Simple
Advanced LLM techniques made simple. Kerb provides clean, easy-to-use interfaces for complex operations.
Lightweight
Install only what you need. Kerb is modular, so you don't carry unnecessary dependencies.
Compatible
Works with any LLM project. Kerb is not a framework but a toolkit that plays well with your other tools.
Everything you need to build LLM applications
Agent
Agent orchestration and execution patterns for multi-step reasoning.
Cache
Response and embedding caching to reduce cost and latency.
Chunk
Text chunking utilities for optimizing context windows and retrieval.
Config
Configuration management for models, providers, and application settings.
Context
Context window management and token budget tracking.
Document
Document loading and processing, with support for PDF, web pages, and more.
Embedding
Embedding generation and similarity search helpers.
Evaluation
Metrics and benchmarking tools for LLM outputs.
Fine-Tuning
Model fine-tuning utilities and large-scale dataset preparation.
Generation
Unified LLM generation with multi-provider support (OpenAI, Anthropic, Gemini).
Memory
Conversation memory and entity tracking for stateful applications.
Multimodal
Image, audio, and video handling for multimodal models.
Parsing
Output parsing and validation (JSON, structured data, function calls).
Preprocessing
Text cleaning and preprocessing for LLM inputs.
Prompt
Prompt engineering utilities, templates, and chain-of-thought patterns.
Retrieval
RAG and vector search tools for semantic retrieval.
Safety
Content moderation and safety filters.
Testing
Testing utilities for LLM outputs and evaluations.
Tokenizer
Token counting and text splitting for any model (see the sketch after this list).
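Most modules follow the same import-and-call pattern. As a quick taste, here is a minimal sketch of token counting with the Tokenizer module; the count_tokens name and its model parameter are illustrative assumptions, not confirmed API:
# Hypothetical sketch -- count_tokens and its signature are assumed, not confirmed API
from kerb.tokenizer import count_tokens

prompt = "Explain the difference between chunking and tokenization."
n_tokens = count_tokens(prompt, model="gpt-4o-mini")  # assumed parameter
print(f"Prompt uses {n_tokens} tokens")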
# Install just the basics (no dependencies)
pip install kerb
# Or install with the features you need
pip install kerb[generation] # For LLM generation
pip install kerb[embeddings] # For embeddings
pip install kerb[all] # Everything
Generate text with any major LLM provider:
from kerb.generation import generate, ModelName, LLMProvider

# Simple generation
response = generate(
    "Write a haiku about Python programming",
    model=ModelName.GPT_4O_MINI,
    provider=LLMProvider.OPENAI
)

print(response.content)
print(f"Tokens: {response.usage.total_tokens}, Cost: ${response.cost:.6f}")
Split large documents for LLM processing:
from kerb.chunk import overlap_chunker

long_text = """
Large Language Models have revolutionized natural language processing.
They can understand context, generate human-like text, and perform
various tasks from translation to code generation. However, working
with LLMs requires careful consideration of token limits, context windows,
and efficient text processing strategies.
"""  # Your long document

chunks = overlap_chunker(
    long_text,
    chunk_size=80,
    overlap_ratio=0.15
)

print(f"Split into {len(chunks)} chunks with overlap")
Generate embeddings and find similar content:
from kerb.embedding import embed, cosine_similarity, EmbeddingModel

# Generate embeddings
query_embedding = embed("machine learning algorithms", model=EmbeddingModel.ALL_MINILM_L6_V2)
doc_embedding = embed("neural networks and deep learning", model=EmbeddingModel.ALL_MINILM_L6_V2)

# Calculate similarity
similarity = cosine_similarity(query_embedding, doc_embedding)
print(f"Similarity: {similarity:.4f}")
Keep prompts consistent with templates:
from kerb.prompt import render_template
from kerb.generation import generate, ModelName

template = """You are a {{role}} assistant.
Task: {{task}}
Context: {{context}}"""

prompt = render_template(template, {
    "role": "helpful Python",
    "task": "explain decorators",
    "context": "beginner level"
})

response = generate(prompt, model=ModelName.GPT_4O_MINI)
Load documents from a variety of formats:
from kerb.document import load_document
# Auto-detects format (txt, md, json, csv, pdf, etc.)
doc = load_document("data/report.pdf")
print(f"Content: {doc.content[:200]}...")
print(f"Metadata: {doc.metadata}")
Cache responses to reduce cost and latency:
from kerb.cache import create_memory_cache, generate_prompt_key
from kerb.generation import generate, ModelName

cache = create_memory_cache(max_size=1000, default_ttl=3600)

def cached_generate(prompt, model=ModelName.GPT_4O_MINI, temperature=0.7):
    cache_key = generate_prompt_key(
        prompt,
        model=model.value,
        temperature=temperature
    )
    if cached := cache.get(cache_key):
        return cached['response']
    response = generate(prompt, model=model, temperature=temperature)
    cache.set(cache_key, {'response': response, 'cost': response.cost})
    return response

# First call hits the LLM
response1 = cached_generate("Explain Python decorators briefly")
# Second call is served from the cache
response2 = cached_generate("Explain Python decorators briefly")
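Because each cache entry stores the original response cost, you can estimate what a cache hit saves, using only the structures defined above:
# Rough savings estimate: each cache hit avoids re-paying the original call's cost
entry = cache.get(generate_prompt_key(
    "Explain Python decorators briefly",
    model=ModelName.GPT_4O_MINI.value,
    temperature=0.7
))
if entry:
    print(f"Cost avoided per cache hit: ${entry['cost']:.6f}")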
Putting it all together:
from kerb.document import load_document
from kerb.chunk import overlap_chunker
from kerb.embedding import embed, embed_batch, cosine_similarity
from kerb.generation import generate, ModelName
from kerb.prompt import render_template

# 1. Load and chunk documents
doc = load_document("knowledge_base.txt")
chunks = overlap_chunker(doc.content, chunk_size=500, overlap_ratio=0.15)

# 2. Create embeddings
chunk_embeddings = embed_batch(chunks)

# 3. Query and retrieve relevant chunks
query = "Why is my chatbot hallucinating and how do I fix it?"
query_embedding = embed(query)

# Find most similar chunks
similarities = [cosine_similarity(query_embedding, emb)
                for emb in chunk_embeddings]
top_indices = sorted(range(len(similarities)),
                     key=lambda i: similarities[i],
                     reverse=True)[:3]
relevant_chunks = [chunks[i] for i in top_indices]

# 4. Generate response with context
prompt = render_template("""Answer this question using the context below.
Context:
{{context}}
Question: {{question}}
Answer:""", {
    "context": "\n\n".join(relevant_chunks),
    "question": query
})

response = generate(prompt, model=ModelName.GPT_4O_MINI)
print(response.content)
Create an AI agent that can use tools:
from kerb.agent.patterns import ReActAgent
from kerb.generation import generate, ModelName

def llm_function(prompt: str) -> str:
    """Connect the agent to your LLM."""
    response = generate(prompt, model=ModelName.GPT_4O_MINI)
    return response.content

agent = ReActAgent(
    name="ResearchAgent",
    llm_func=llm_function,
    max_iterations=5
)

result = agent.run("Explain RAG like I'm a backend developer who just discovered AI exists")
print(result.output)
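The example above runs the ReAct loop with no registered tools. How tools attach depends on the ReActAgent API; the sketch below assumes a tools constructor argument that accepts plain callables, which is an assumption for illustration rather than confirmed API:
# Hypothetical sketch -- the tools= argument and callable format are assumed
def search_docs(query: str) -> str:
    """Toy tool: return a canned knowledge snippet."""
    return "RAG retrieves relevant chunks and passes them to the LLM as context."

agent_with_tools = ReActAgent(
    name="ResearchAgent",
    llm_func=llm_function,
    tools=[search_docs],  # assumed parameter
    max_iterations=5
)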