Text embeddings let computers understand and compare document content based on meaning. An embedding is a numerical representation of text, a vector that captures its semantic essence, allowing a system to find document chunks that are thematically related to a user's query even when they share no keywords.
Kerb's embedding module provides a unified interface for generating these embeddings using various models, from local, dependency-free options for testing to high-performance models from providers like OpenAI.
The most direct way to generate an embedding is with the embed function. It takes a string of text and returns a vector, which is represented as a list of floating-point numbers.
Let's generate an embedding for a simple sentence and inspect its properties:
from kerb.embedding import embed, embedding_dimension, vector_magnitude
text = "Machine learning transforms data into insights"
embedding = embed(text)
print(f"Text: '{text}'")
print(f"Embedding dimension: {embedding_dimension(embedding)}")
print(f"Vector magnitude: {vector_magnitude(embedding):.6f}")
print(f"First 5 values: {[round(v, 4) for v in embedding[:5]]}")
This will produce an output similar to the following:
Text: 'Machine learning transforms data into insights'
Embedding dimension: 384
Vector magnitude: 1.000000
First 5 values: [0.034, -0.0121, 0.0589, 0.0076, -0.045]
By default, the embed function uses a local, hash-based method that requires no external dependencies or API calls. This is useful for prototyping and testing because it's fast and deterministic. However, it does not produce semantically meaningful vectors. For genuine semantic search, you'll need to use a more sophisticated model.
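Because the default embedder is deterministic, you can verify this property directly. The check below is a minimal sketch that relies only on the dependency-free local embedder:

from kerb.embedding import embed

# The default hash-based embedder is deterministic:
# the same input text always produces the same vector.
text = "Machine learning transforms data into insights"
assert embed(text) == embed(text)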
The embed function supports various models through its model parameter. You can specify a model using the EmbeddingModel enum, which provides a convenient, type-safe way to select from well-known local and API-based models.
For many applications, running an embedding model locally is a great option. It offers a balance of high-quality embeddings, privacy, and no API costs. The toolkit integrates with the sentence-transformers library to support this.
To use a Sentence Transformers model, you must first install the necessary dependency:
pip install sentence-transformers
Once installed, you can specify a model like EmbeddingModel.ALL_MINILM_L6_V2, which is a popular, well-balanced choice.
from kerb.embedding import embed, EmbeddingModel
# This code requires 'pip install sentence-transformers'
text = "Natural language processing enables AI to understand text"
st_embedding = embed(
    text,
    model=EmbeddingModel.ALL_MINILM_L6_V2
)
print(f"Sentence Transformers model generated an embedding with {len(st_embedding)} dimensions.")
For the highest-quality embeddings, you can use API-based models from providers like OpenAI. This requires an API key and incurs costs per usage, but it often yields the best performance for semantic retrieval.
First, ensure you have the OpenAI library installed:
pip install openai
You'll also need to set your OpenAI API key as an environment variable (OPENAI_API_KEY). Then, you can select an OpenAI model such as EmbeddingModel.TEXT_EMBEDDING_3_SMALL.
from kerb.embedding import embed, EmbeddingModel
# This code requires 'pip install openai' and an API key
text = "Natural language processing enables AI to understand text"
openai_embedding = embed(
    text,
    model=EmbeddingModel.TEXT_EMBEDDING_3_SMALL
)
print(f"OpenAI model generated an embedding with {len(openai_embedding)} dimensions.")
Different models produce embeddings of different dimensions. For instance, ALL_MINILM_L6_V2 creates a 384-dimension vector, while OpenAI's TEXT_EMBEDDING_3_SMALL creates a 1536-dimension vector. While larger vectors can capture more details, they also require more storage and computational resources.
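As a quick way to see this difference, you can embed the same text with the default local model and a Sentence Transformers model and compare the resulting vector lengths. This is a minimal sketch: the second call assumes sentence-transformers is installed, and the printed dimensions depend on the models you select.

from kerb.embedding import embed, embedding_dimension, EmbeddingModel

text = "Embedding dimensionality varies by model"

# Default local, hash-based embedding
local_embedding = embed(text)

# Sentence Transformers model (requires 'pip install sentence-transformers')
st_embedding = embed(text, model=EmbeddingModel.ALL_MINILM_L6_V2)

print(f"Default local model: {embedding_dimension(local_embedding)} dimensions")
print(f"ALL_MINILM_L6_V2: {embedding_dimension(st_embedding)} dimensions")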
In a typical RAG system, you'll need to embed hundreds or thousands of document chunks. Calling embed in a loop for each chunk is inefficient, especially when using a GPU-accelerated local model or making API calls.
The embed_batch function is designed for this exact scenario. It processes a list of texts in a single, optimized call, significantly improving performance.
from kerb.embedding import embed_batch, EmbeddingModel
# A list of document chunks from the previous chapter
document_chunks = [
    "Python is a high-level programming language",
    "Machine learning models learn patterns from data",
    "Natural language processing helps computers understand text",
    "Deep neural networks have multiple layers",
    "Data science combines statistics and programming"
]
# Generate embeddings for all chunks at once
# Note: This requires 'pip install sentence-transformers'
chunk_embeddings = embed_batch(
    document_chunks,
    model=EmbeddingModel.ALL_MINILM_L6_V2
)
print(f"Generated {len(chunk_embeddings)} embeddings.")
print(f"Each embedding has {len(chunk_embeddings[0])} dimensions.")
Using embed_batch is the standard practice for preparing a knowledge base for a RAG system. It ensures that your entire corpus of text chunks is efficiently converted into a collection of vectors, ready for storage and retrieval. With these vectors in hand, you are now prepared to perform mathematical comparisons to find semantically relevant information, which we will cover in the next section.