While LangSmith provides invaluable, purpose-built tools for tracing and debugging LangChain applications, production environments often demand integration with broader, pre-existing observability platforms. Many organizations have standardized on systems like Datadog, Grafana/Prometheus, Splunk, Jaeger, or Honeycomb to gain a unified view across their entire technology stack. Integrating LangChain application monitoring into these platforms allows you to correlate LLM application behavior with infrastructure performance, other microservices, and business metrics, leveraging existing alerting and incident management workflows.
This section describes how to channel the operational data from your LangChain applications (logs, metrics, and traces) into these third-party systems.
Effective observability typically relies on three data types: logs, metrics, and traces.
For logs, LangChain relies on Python's standard logging library, which makes integration relatively straightforward: you configure logging handlers to forward records to whichever destinations your chosen platform supports.
Common approaches include:
- Dedicated handlers or client libraries: send logs directly to the platform (e.g., via datadog_api_client.v2.logs, libraries for Splunk HEC, or standard handlers like logging.handlers.SysLogHandler or logging.FileHandler monitored by agents like Fluentd or Logstash).
- Structured (JSON) logging: emitting logs as JSON lets the platform parse fields automatically; a library such as python-json-logger can help.

# Example: Basic configuration for JSON logging
import logging
import sys
from pythonjsonlogger import jsonlogger
# Get the root logger
logger = logging.getLogger()
logger.setLevel(logging.INFO)
# Use a stream handler to output to stdout (can be collected by agents)
logHandler = logging.StreamHandler(sys.stdout)
# Use the JSON formatter
formatter = jsonlogger.JsonFormatter('%(asctime)s %(name)s %(levelname)s %(message)s')
logHandler.setFormatter(formatter)
# Add the handler
logger.addHandler(logHandler)
# Now, logs from LangChain (and your app) using the standard logger will be in JSON
logging.info("Application started.")
# Example LangChain component logging
# (Assuming LangChain components use the standard logging internally)
# try:
# result = my_chain.invoke({"input": "some query"})
# except Exception as e:
# logging.error("Chain execution failed", exc_info=True)
Ensure logs sent to third parties are scrubbed of sensitive information (PII, API keys) unless the platform has specific, secure handling mechanisms approved for such data.
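One in-process safeguard is a logging.Filter that masks known secret patterns before records reach any handler. The sketch below is an illustration only: RedactingFilter and its regular expressions are placeholder assumptions, not a vetted scrubbing solution, and most teams also enforce scrubbing at the collector or platform level.
# Sketch: masking API-key-like tokens and email addresses in log messages
import logging
import re

class RedactingFilter(logging.Filter):
    """Rewrites log records so obvious secrets never leave the process."""

    PATTERNS = [
        (re.compile(r"sk-[A-Za-z0-9]{20,}"), "[REDACTED_API_KEY]"),     # OpenAI-style keys (illustrative pattern)
        (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[REDACTED_EMAIL]"),  # email addresses
    ]

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # fully formatted message
        for pattern, replacement in self.PATTERNS:
            message = pattern.sub(replacement, message)
        record.msg = message
        record.args = None  # args were already applied via getMessage()
        return True

# Attach to the handler configured above:
# logHandler.addFilter(RedactingFilter())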
Metrics provide quantifiable insights into performance and resource consumption. Integrating LangChain metrics involves instrumenting your application to collect relevant data points and exporting them.
- Callback handlers: by implementing a custom BaseCallbackHandler, you can separate your measurement logic from your application logic. This handler can intercept events such as on_llm_start and on_llm_end to calculate latency and extract token usage without modifying the structure of your chains.
- Metrics client libraries: use a client library (e.g., prometheus_client, datadog, statsd) within your callback handler to send metrics. These libraries typically allow you to define metric types (Counters, Gauges, Histograms) and push data to the platform or expose an endpoint for scraping.

# Example: Instrumenting LLM call latency with Prometheus using Callbacks
import time
from typing import Any, Dict, List
from prometheus_client import Histogram, Counter, start_http_server
from langchain_core.callbacks import BaseCallbackHandler
from langchain_core.outputs import LLMResult
from langchain_openai import ChatOpenAI
# Define Prometheus metrics
LLM_LATENCY = Histogram(
    'langchain_llm_latency_seconds',
    'Latency of LLM calls in seconds',
    ['model']
)
TOKEN_USAGE = Counter(
    'langchain_token_usage_total',
    'Token usage count',
    ['model', 'type']
)

class PrometheusMetricsHandler(BaseCallbackHandler):
    """Callback Handler that captures metrics for Prometheus."""

    def __init__(self):
        # Keyed by run_id so concurrent calls don't overwrite each other
        self.start_times: Dict[Any, float] = {}

    def on_llm_start(
        self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any
    ) -> Any:
        """Run when LLM starts running."""
        # Use run_id to handle concurrent calls safely
        run_id = kwargs.get("run_id")
        self.start_times[run_id] = time.time()

    def on_llm_end(self, response: LLMResult, **kwargs: Any) -> Any:
        """Run when LLM ends running."""
        run_id = kwargs.get("run_id")
        start_time = self.start_times.pop(run_id, None)
        # Determine model name from response metadata, guarding against a missing llm_output
        llm_output = response.llm_output or {}
        model = llm_output.get("model_name", "unknown")
        if start_time:
            latency = time.time() - start_time
            LLM_LATENCY.labels(model=model).observe(latency)
        # Record token usage if available
        if "token_usage" in llm_output:
            usage = llm_output["token_usage"]
            TOKEN_USAGE.labels(model=model, type="prompt").inc(usage.get("prompt_tokens", 0))
            TOKEN_USAGE.labels(model=model, type="completion").inc(usage.get("completion_tokens", 0))
# Start Prometheus client HTTP server (typically done once at app startup)
# start_http_server(8000)
# Initialize LLM with the callback handler
llm = ChatOpenAI(
    model="gpt-3.5-turbo",
    callbacks=[PrometheusMetricsHandler()]
)
# Calls to llm will now automatically record metrics
# result = llm.invoke("Explain observability")
Important metrics to consider exporting:
- Latency of LLM calls, chains, and retrievers (averages and high percentiles).
- Token usage (prompt and completion tokens) per model, the main driver of cost.
- Error counts and rates, ideally labeled by error type, as sketched after this list.
- Request throughput, to track load and plan capacity.
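Error counts are not captured by the latency example above, but the same callback mechanism exposes an on_llm_error hook. Below is a minimal sketch of counting failures by exception type, assuming the Prometheus setup from the earlier example; ERROR_COUNT and ErrorMetricsHandler are illustrative names, not part of LangChain.
# Sketch: counting failed LLM calls, labeled by exception class
from typing import Any
from prometheus_client import Counter
from langchain_core.callbacks import BaseCallbackHandler

ERROR_COUNT = Counter(
    'langchain_llm_errors_total',
    'Number of failed LLM calls',
    ['error_type']
)

class ErrorMetricsHandler(BaseCallbackHandler):
    """Increments a Prometheus counter whenever an LLM call raises."""

    def on_llm_error(self, error: BaseException, **kwargs: Any) -> Any:
        # Label by exception class so dashboards can separate rate limits,
        # timeouts, and authentication failures
        ERROR_COUNT.labels(error_type=type(error).__name__).inc()

# Register alongside the latency handler, e.g.:
# llm = ChatOpenAI(model="gpt-3.5-turbo",
#                  callbacks=[PrometheusMetricsHandler(), ErrorMetricsHandler()])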
Modern tracing often relies on OpenTelemetry (OTel), an open standard for generating and collecting telemetry data. LangChain integrates well with the OpenTelemetry ecosystem, often via auto-instrumentation packages.
Integrating with platforms like Jaeger, Tempo, Honeycomb, or Datadog APM typically involves:
Installing OTel Packages: Add the necessary OpenTelemetry API, SDK, exporter, and instrumentation packages to your project.
pip install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp opentelemetry-instrumentation-langchain
Configuring an Exporter: Configure the OTel SDK to export trace data to your chosen backend. This usually involves setting environment variables or configuring the SDK programmatically to point to the backend's OTel endpoint (often an OTel Collector or the platform's direct ingestion endpoint).
Key environment variables include OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_SERVICE_NAME, etc.
Enabling Auto-Instrumentation: Use an instrumentation library such as opentelemetry-instrumentation-langchain (often provided by third parties like Traceloop or OpenInference). These libraries automatically patch LangChain's internal execution methods to generate spans for chains, LLMs, and retrievers without manual tracer code. A programmatic configuration sketch follows below.
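As a concrete illustration of the exporter and auto-instrumentation steps, the sketch below configures the OpenTelemetry SDK programmatically instead of via environment variables. It assumes an OTLP/gRPC endpoint (such as an OpenTelemetry Collector) listening at localhost:4317 and the Traceloop-published opentelemetry-instrumentation-langchain package; the LangchainInstrumentor class name may differ if you use another provider's instrumentation.
# Sketch: programmatic OTel setup plus LangChain auto-instrumentation
# (endpoint, service name, and instrumentor package are assumptions)
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.langchain import LangchainInstrumentor

# Identify this service in the tracing backend
resource = Resource.create({"service.name": "langchain-app"})

# Export spans in batches to the OTLP endpoint
provider = TracerProvider(resource=resource)
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:4317"))
)
trace.set_tracer_provider(provider)

# Patch LangChain so chains, LLMs, and retrievers emit spans automatically
LangchainInstrumentor().instrument()
Run once at application startup, this makes LangChain spans part of the same traces emitted by your other instrumented services, so a single trace_id can be followed end to end.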
The primary benefit here is distributed tracing: seeing a single request's path not just within the LangChain application but also across other services it interacts with (e.g., an initial API gateway, subsequent microservices called by tools).
Diagram: flow of observability data from a LangChain application through an optional collector to specialized backend platforms. Direct integration from the application to backends is also possible.
The choice of observability platform often depends on existing tooling within your organization. However, consider:
- Correlation: ensure logs, metrics, and traces for a single request can be tied together (for example, by propagating a shared identifier such as the trace_id).

Integrating LangChain application monitoring into your organization's standard observability stack provides a comprehensive understanding of its behavior in the context of the larger system. It builds on existing investments in tooling and expertise, enabling faster troubleshooting, performance optimization, and more reliable operations for your production LLM applications.
Further reading: the Python logging module documentation, for understanding how to configure loggers, handlers, formatters, and filters for effective application logging.