Pinecone is a popular managed vector database service, designed specifically for high-performance similarity search applications. It abstracts away much of the underlying infrastructure management, allowing developers to focus on integrating vector search capabilities into their systems. Interaction with Pinecone is primarily done through its client libraries, with the Python client being a common choice for machine learning and data science workflows.
To begin working with Pinecone, you'll need an API key, which you can obtain after signing up for a Pinecone account. Pod-based indexes additionally require your project's environment name.
First, install the official Pinecone Python client library:

```shell
pip install pinecone-client
```
Once installed, you can initialize the connection to your Pinecone project. It's best practice to store your API key and environment securely, for example, as environment variables, rather than hardcoding them directly in your script.
```python
import os
from pinecone import Pinecone, ServerlessSpec, PodSpec

# Initialize connection to Pinecone
# Assumes PINECONE_API_KEY (and, for pod-based indexes, PINECONE_ENVIRONMENT) are set
api_key = os.environ.get("PINECONE_API_KEY")
environment = os.environ.get("PINECONE_ENVIRONMENT")  # Only needed for PodSpec, e.g. 'us-west1-gcp'

if not api_key:
    raise ValueError("Please set the PINECONE_API_KEY environment variable.")

# Initialize the Pinecone client
pc = Pinecone(api_key=api_key)
print("Pinecone client initialized.")
# You can verify the connection, e.g., by listing indexes (see below)
```
In Pinecone, your vectors are stored within an 'index'. An index is characterized by its name, the dimensionality of the vectors it will store, and the distance metric used for similarity calculations (`cosine`, `euclidean`, or `dotproduct`). You also need to specify the type of infrastructure (pods or serverless) when creating an index.
Let's create a new index named `semantic-search-demo` to store 768-dimensional vectors using the cosine similarity metric. We'll use a basic serverless configuration for this example.
```python
import time

index_name = "semantic-search-demo"
vector_dimension = 768
similarity_metric = "cosine"

# Check if the index already exists
if index_name not in pc.list_indexes().names():
    print(f"Creating index '{index_name}'...")
    pc.create_index(
        name=index_name,
        dimension=vector_dimension,
        metric=similarity_metric,
        spec=ServerlessSpec(
            cloud='aws',        # Or 'gcp', 'azure'
            region='us-west-2'  # Choose a supported region
        )
        # Or for pod-based indexes:
        # spec=PodSpec(
        #     environment=environment,  # Your project environment
        #     pod_type="p1.x1",         # Example pod type
        #     pods=1
        # )
    )
    # Wait for the index to be ready
    while not pc.describe_index(index_name).status['ready']:
        print("Waiting for index to initialize...")
        time.sleep(5)
    print(f"Index '{index_name}' created successfully.")
else:
    print(f"Index '{index_name}' already exists.")

# Connect to the index
index = pc.Index(index_name)
print(index.describe_index_stats())
```
You can easily manage your indexes:
```python
# List all indexes in your project
index_list = pc.list_indexes()
print("Available indexes:", index_list.names())

# Delete an index (use with caution!)
# index_to_delete = "some-old-index"
# if index_to_delete in index_list.names():
#     print(f"Deleting index '{index_to_delete}'...")
#     pc.delete_index(index_to_delete)
#     print("Deletion initiated. It might take a few moments.")
```
With an index object (`index = pc.Index(index_name)`), you can perform operations on the vectors within that index.
The `upsert` operation adds new vectors or updates existing vectors (identified by their `id`) within the index. You typically provide a list of tuples, where each tuple contains `(id, vector_values, metadata)`. Metadata is an optional dictionary containing additional information associated with the vector, which can be used for filtering.
For efficiency, it's highly recommended to upsert data in batches rather than one vector at a time. Pinecone has limits on the size of each upsert request (e.g., number of vectors or total request size).
```python
import random

# Example data (replace with your actual embeddings)
vectors_to_upsert = []
num_vectors = 100
for i in range(num_vectors):
    vector_id = f"vec_{i}"
    vector_values = [random.random() for _ in range(vector_dimension)]  # 768 dimensions
    metadata = {
        "genre": random.choice(["fiction", "non-fiction"]),
        "year": 2020 + random.randint(0, 3),
        "source_doc": f"doc_{i // 10}"  # Group vectors by source document
    }
    vectors_to_upsert.append((vector_id, vector_values, metadata))

# Upsert in batches (example batch size 50)
batch_size = 50
print(f"Upserting {len(vectors_to_upsert)} vectors in batches of {batch_size}...")
for i in range(0, len(vectors_to_upsert), batch_size):
    batch = vectors_to_upsert[i:i + batch_size]
    upsert_response = index.upsert(vectors=batch)
    print(f"Upserted batch {i // batch_size + 1}, response: {upsert_response}")
print("Upsert operation complete.")

# Verify index size
print(index.describe_index_stats())
```
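The slicing loop above can also be factored into a small reusable helper. This is a generic sketch, independent of the Pinecone client, and the `chunked` name is our own:

```python
def chunked(seq, size):
    """Yield successive slices of seq with at most size elements."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

# Five items split into batches of two: three batches, the last one partial
print(list(chunked(list(range(5)), 2)))  # [[0, 1], [2, 3], [4]]
```

With this helper, the upsert loop becomes `for batch in chunked(vectors_to_upsert, batch_size): index.upsert(vectors=batch)`.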
The core operation is `query`, which finds the `top_k` vectors in the index that are most similar to a given query vector, according to the index's distance metric.
```python
# Generate a sample query vector (must match the index dimension)
query_vector = [random.random() for _ in range(vector_dimension)]

# Perform a similarity search
k = 5  # Number of nearest neighbors to retrieve
print(f"Querying for the top {k} neighbors...")
query_response = index.query(
    vector=query_vector,
    top_k=k,
    include_metadata=True,  # Retrieve metadata along with IDs and scores
    include_values=False    # Optionally retrieve vector values (usually False for performance)
)

print("Query Results:")
if query_response.matches:
    for match in query_response.matches:
        print(f"  ID: {match.id}, Score: {match.score:.4f}, Metadata: {match.metadata}")
else:
    print("  No matches found.")
```
The `score` returned depends on the metric used (e.g., for cosine similarity, higher is better; for Euclidean distance, lower is better).
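To make the two conventions concrete, both metrics can be computed by hand for a pair of toy vectors. This is a plain-Python sketch of the standard formulas, not a call into the Pinecone API:

```python
import math

def cosine_similarity(a, b):
    # Higher is better: 1.0 means identical direction, -1.0 means opposite
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    # Lower is better: 0.0 means the vectors coincide
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

u = [1.0, 0.0]
v = [1.0, 1.0]
print(round(cosine_similarity(u, v), 4))   # 0.7071
print(euclidean_distance(u, v))            # 1.0
```

A cosine score of 0.7071 (a 45-degree angle) is a fairly close match, while the same pair sits at Euclidean distance 1.0; the two numbers are not on comparable scales, so always interpret `match.score` relative to the index's own metric.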
You can refine your search by applying filters based on the metadata associated with your vectors. Pinecone supports various filtering operations (equality, inequality, range queries, set membership).
```python
# Query for vectors similar to query_vector, but only those from the 'fiction' genre published in 2022
print(f"\nQuerying top {k} 'fiction' neighbors from 2022...")
filter_criteria = {
    "genre": {"$eq": "fiction"},
    "year": {"$eq": 2022}
}
filtered_query_response = index.query(
    vector=query_vector,
    top_k=k,
    filter=filter_criteria,
    include_metadata=True
)

print("Filtered Query Results:")
if filtered_query_response.matches:
    for match in filtered_query_response.matches:
        print(f"  ID: {match.id}, Score: {match.score:.4f}, Metadata: {match.metadata}")
else:
    print("  No matches found matching the filter criteria.")
```
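Beyond `$eq`, Pinecone's filter language supports operators such as `$ne`, `$gt`, `$gte`, `$lt`, `$lte`, `$in`, and `$nin`, all using the same dictionary syntax. A sketch combining a range condition and set membership over the metadata fields defined earlier (the query call is commented out because it needs a live index):

```python
# Hypothetical filter: non-fiction, published 2021 or later, from two specific source documents
range_filter = {
    "genre": {"$ne": "fiction"},
    "year": {"$gte": 2021},
    "source_doc": {"$in": ["doc_0", "doc_1"]},
}
# index.query(vector=query_vector, top_k=k, filter=range_filter, include_metadata=True)
print(range_filter["year"])  # {'$gte': 2021}
```

Multiple keys in the filter dictionary are combined with a logical AND, so a match must satisfy every condition.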
This combination of semantic similarity search and structured metadata filtering is a powerful feature of vector databases.
You can also retrieve or delete specific vectors using their unique IDs.
```python
# Fetch vectors by ID
ids_to_fetch = ["vec_10", "vec_25"]
print(f"\nFetching vectors by ID: {ids_to_fetch}")
fetch_response = index.fetch(ids=ids_to_fetch)
# print(fetch_response)  # Uncomment to see the full response structure
for vec_id, vec_data in fetch_response.vectors.items():
    print(f"  Fetched ID: {vec_id}, Metadata: {vec_data.metadata}")  # Values are in vec_data.values

# Delete vectors by ID
ids_to_delete = ["vec_5", "vec_15"]
print(f"\nDeleting vectors by ID: {ids_to_delete}")
delete_response = index.delete(ids=ids_to_delete)
print(f"Deletion response: {delete_response}")  # Response is typically empty on success {}

# Verify the deletion; fetching non-existent IDs simply returns no vectors
time.sleep(2)  # Deletes can take a moment to reflect in stats
print(index.describe_index_stats())
fetch_deleted = index.fetch(ids=ids_to_delete)
if fetch_deleted.vectors:
    print("Deleted vectors still present (stats may lag):", list(fetch_deleted.vectors.keys()))
else:
    print(f"Fetch of deleted IDs {ids_to_delete} returned no vectors, as expected.")
```
Working with the Pinecone client involves these core steps: initializing the connection, selecting or creating an index, and then using methods like `upsert`, `query`, `fetch`, and `delete` to manage and search your vector data. As a managed service, Pinecone simplifies deployment and scaling, making it a strong option when you prefer not to manage the database infrastructure yourself. The hands-on practical later in this chapter will guide you through integrating this client into a more complete application.
© 2025 ApX Machine Learning