Documentation Index
Fetch the complete documentation index at: https://docs.langsight.dev/llms.txt
Use this file to discover all available pages before exploring further.
Using LangChain, Langflow, LangGraph, or LangServe? See the dedicated LangChain integration —
LangSightLangChainCallback covers all four without wrapping individual MCP clients.
Installation
LangSight is already installed — the SDK is part of the main package.
The simplest integration (v0.13.0)
Two lines inside a session() context, and LLM calls, MCP calls, and agent handoffs are all traced automatically.
No wrap(), no wrap_llm(), no create_handoff().
What auto_patch() covers
| What | How | Since |
|---|---|---|
| LLM generation calls | Patches OpenAI, Anthropic, google.genai, google.generativeai at the class level | v0.11.0 |
| MCP tool calls | Patches mcp.ClientSession.call_tool — compatible with langchain_mcp_adapters, progress_callback, and any other kwargs | v0.12.0 |
| Agent handoffs | Auto-emits handoff spans when LLM selects call_*, delegate_*, invoke_*, transfer_to_*, run_*, dispatch_* tools | v0.12.0 |
Environment variables
auto_patch() reads these automatically. It returns None if LANGSIGHT_URL is not set — safe to call unconditionally.
Deferred init (v0.14.11):
auto_patch() installs patches immediately, but defers
LangSight client creation until the first span is emitted. This means you can call
auto_patch() before load_dotenv() — as long as env vars are set by the time the
first tool call fires, traces will be captured correctly. Loading .env before
auto_patch() is still recommended when possible.
Multi-agent without any boilerplate
Before v0.12.0, instrumenting a multi-agent system required 15+ lines: manual session IDs, explicit wraps, and handoff spans for every agent. Now no create_handoff() is needed, because the tool name call_analyst triggers auto-detection.
Handoff auto-detection rules
LangSight inspects every llm_intent span (a tool selected by the LLM). If the tool name matches one of these patterns:
| Pattern | Example | Target agent |
|---|---|---|
| call_<agent> | call_analyst | analyst |
| delegate_<agent> | delegate_billing | billing |
| invoke_<agent> | invoke_researcher | researcher |
| transfer_to_<agent> | transfer_to_support | support |
| run_<agent> | run_summarizer | summarizer |
| dispatch_<agent> | dispatch_validator | validator |
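The table above amounts to a prefix match on the tool name. A minimal sketch of that matching logic (the function name and the regex are illustrations, not the SDK's internals):

```python
import re

# Prefixes from the handoff auto-detection table above; the regex itself
# is an illustration, not LangSight's actual implementation.
_HANDOFF_RE = re.compile(r"^(call|delegate|invoke|transfer_to|run|dispatch)_(?P<agent>.+)$")

def detect_handoff_target(tool_name: str):
    """Return the target agent name if the tool name matches a handoff pattern."""
    m = _HANDOFF_RE.match(tool_name)
    return m.group("agent") if m else None
```

A tool named `search_web` matches no prefix and is treated as an ordinary tool call.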
Context inheritance
Inside a session() block, the context variables _agent_ctx, _session_ctx, and _trace_ctx are set for the current asyncio task. Both wrap() and wrap_llm() read these as fallbacks when their parameters are not explicitly provided:
Shared proxies and bridges (v0.13.1 fix)
MCPClientProxy.call_tool() reads the active session() context at call time, not at proxy creation time. This means a single proxy or bridge object can be safely passed to multiple sub-agents — each sub-agent’s call_tool() invocations are attributed to whichever session() block is active when the call fires, not to the agent that created the proxy.
Capturing user input and agent output
session() accepts input= to record the initial human prompt and returns a SessionContext object with methods to capture the agent’s final response and any mid-session human messages. These map directly to Langfuse’s trace input/output concepts — if you’re familiar with Langfuse, the mental model is the same.
SessionContext subclasses str, so existing code that assigns it as session_id = sess continues to work without changes.
Level 1: Single-turn — capture question and answer
input= is stored as llm_input on the root agent span. set_output() stores the value as llm_output on the same span. Both are visible in the session detail panel in the dashboard and in langsight sessions --id <id>.
v0.14.1 — prompt flushed at open time. The
input= value is written to ClickHouse immediately when the session() block opens, not just at close. If the agent crashes or set_output() is never called, the prompt is still visible in the Sessions page. Sessions that complete normally are unaffected — the close-time span carrying both input and output is still emitted when the block exits.
Level 2: Human-in-the-loop — mid-session human input
record_user_message(text) creates a user_message span with span_type="user_message" and llm_input=text. The span appears as a human-icon entry in the session timeline between the surrounding agent spans. Use this whenever the agent pauses for human input — approval gates, clarification questions, or any HITL checkpoint.
Level 3: Multi-turn conversations — link sessions with trace_id
Use a shared trace_id to link multiple sessions into a single conversation thread:
The dashboard groups all sessions with the same trace_id into a single conversation view, showing the full turn-by-turn history with inputs, outputs, and latencies.
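Conceptually, the conversation view is a group-by over sessions that share a trace_id. A sketch of that grouping (the record field names trace_id and started_at are assumptions for illustration):

```python
from collections import defaultdict

def conversation_threads(sessions):
    """Group session records by trace_id and order each thread by start time."""
    threads = defaultdict(list)
    for sess in sessions:
        threads[sess["trace_id"]].append(sess)
    # Each thread reads as turn-by-turn history when sorted chronologically.
    return {tid: sorted(turns, key=lambda s: s["started_at"])
            for tid, turns in threads.items()}
```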
SessionContext reference
| Method / attribute | Description |
|---|---|
| sess.set_output(value) | Stores value as llm_output on the root session span. Call once at the end of the session. |
| sess.record_user_message(text) | Creates a user_message span in the timeline with llm_input=text. Call each time the agent receives human input mid-session. |
| str(sess) / sess as string | Returns the session ID — backward-compatible with code that used it as session_id. |
No breaking changes
SessionContext subclasses str. All of the following existing patterns continue to work:
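The backward-compatibility guarantee rests on subclassing str. A minimal sketch of the idea (the real SessionContext has more methods and stores output on the root span, not locally):

```python
class SessionContextSketch(str):
    """str subclass: usable anywhere a plain session-ID string is expected."""

    def __new__(cls, session_id: str):
        obj = super().__new__(cls, session_id)
        obj._output = None  # kept locally here; the SDK sends it to the backend
        return obj

    def set_output(self, value: str) -> None:
        # Stand-in for storing llm_output on the root session span.
        self._output = value
```

Because the instance *is* a string, comparisons, f-string interpolation, and `session_id = sess` assignments all keep working.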
Coexistence with Langfuse
LangSight and Langfuse observe different layers and coexist without conflict.
Upgrading from 0.11.x
Breaking behaviour change in wrap() and wrap_llm() — not a signature change, but a behavioural one.
In 0.11.x, calling ls.wrap(mcp) outside a session() block generated a new random session_id on every call, so MCP spans were disconnected from the LLM spans in the same run.
In 0.12.0, both wrap() and wrap_llm() inherit agent_name, session_id, and trace_id from the active session() context when those params are not explicitly provided. You no longer need to thread them manually:
- session_id = str(uuid.uuid4()) → removed; session() generates it
- session_id=session_id param on every wrap()/wrap_llm() call → removed
- agent_name=... param on wrap() → removed (inherited from session())
Quick start with init() + explicit wrapping
When you want explicit control over which sessions and agents are traced:
init() returns None if LANGSIGHT_URL is not set — safe to call unconditionally.
Manual MCP wrapping
For cases where you want explicit control (e.g. per-session circuit breakers or different redact policies per server).
Full example
The recommended v0.12.0 pattern — zero manual instrumentation.
How it works
The wrap() method returns MCPClientProxy — a transparent proxy that intercepts two methods:
call_tool():
- Calls the original call_tool() method
- Records start time, end time, and latency
- Sends a ToolCallSpan to POST /api/traces/spans asynchronously (fire-and-forget)
- Returns the original result unchanged
list_tools():
- Calls the original list_tools() method
- Posts the returned tool names, descriptions, and input schemas to PUT /api/servers/{server_name}/tools asynchronously (fire-and-forget)
- Returns the original result unchanged
This is how the dashboard can list tools with 0 calls along with their descriptions.
Fail-open: if LangSight is unreachable, both call_tool() and list_tools() still succeed and the error is logged. Your agents are never blocked by monitoring.
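The interception described above is a standard transparent-proxy pattern. A self-contained sketch, where the emit callback is a synchronous stand-in for the fire-and-forget POST (class and field names are illustrative, not the SDK's):

```python
import time

class TimingProxy:
    """Wraps a client: times call_tool(), forwards everything else unchanged."""

    def __init__(self, inner, emit):
        self._inner = inner
        self._emit = emit  # stand-in for the async POST to the spans endpoint

    def __getattr__(self, name):
        # Transparent passthrough for any method we don't intercept.
        return getattr(self._inner, name)

    def call_tool(self, name, **kwargs):
        start = time.monotonic()
        result = self._inner.call_tool(name, **kwargs)
        self._emit({"tool": name,
                    "latency_ms": (time.monotonic() - start) * 1000})
        return result  # original result, unchanged
```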
Span metadata
PII redaction
Set redact_payloads=True on the client to omit input_args and output_result from every span. Use this when tool arguments or results may contain PII. The same setting can also be configured in .langsight.yaml:
LLM token and cost fields
When wrapping a session that calls an LLM, attach token counts and the model ID to spans for cost tracking (Phase 7). The relevant attributes are gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, and gen_ai.request.model. If you are sending OTLP traces, LangSight extracts these attributes automatically — no SDK changes needed.
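For OTLP ingestion, extraction amounts to reading those attribute keys off the span. A sketch (the function name and return shape are assumptions, only the attribute keys come from the text above):

```python
def extract_llm_usage(attributes: dict) -> dict:
    """Pull the gen_ai token/model attributes from an OTLP span's attributes."""
    return {
        "input_tokens": attributes.get("gen_ai.usage.input_tokens"),
        "output_tokens": attributes.get("gen_ai.usage.output_tokens"),
        "model": attributes.get("gen_ai.request.model"),
    }
```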
Anthropic prompt cache tokens (v0.14.7)
ToolCallSpan carries two additional nullable fields for Anthropic prompt caching:
| Field | Type | Source | Dashboard label |
|---|---|---|---|
| cache_read_tokens | int \| None | usage.cache_read_input_tokens | Cache↗ (green) |
| cache_creation_tokens | int \| None | usage.cache_creation_input_tokens | Cache+ |
Both fields are None when prompt caching is not active. The auto-patch for the Anthropic SDK populates these automatically — no manual instrumentation needed. When sending spans manually, pass the values from the Anthropic response:
latency_ms auto-computation
latency_ms on ToolCallSpan is now optional. If you omit it, LangSight auto-computes it from ended_at - started_at via a Pydantic model_validator. You can still pass an explicit value if you prefer — the validator only runs when latency_ms is None.
This means manual spans and OTLP ingestion no longer need to calculate latency separately.
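The validator's rule can be sketched without Pydantic: a dataclass __post_init__ doing the same ended_at - started_at arithmetic. This illustrates the logic only; the SDK's actual model is described above as a Pydantic model_validator:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class SpanSketch:
    started_at: datetime
    ended_at: datetime
    latency_ms: Optional[float] = None

    def __post_init__(self):
        # Mirror of the described rule: compute only when latency_ms is omitted.
        if self.latency_ms is None:
            delta = self.ended_at - self.started_at
            self.latency_ms = delta.total_seconds() * 1000
```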
Manual spans
Record spans without wrapping a client.
Multi-agent tracing
Zero-boilerplate (v0.12.0 — recommended)
If your agents use tool names matching the handoff pattern (call_<agent>, delegate_<agent>, etc.), no explicit handoff code is needed:
The tool name call_research_agent triggers auto-detection.
With explicit handoff helpers
When you need precise control over handoff timing and parent linking, use create_handoff() + wrap_child_agent():
Advanced: manual pattern (full control, no session() context manager)
Use this pattern only when you cannot use the session() context manager (e.g. non-async code, or when the session lifecycle is managed externally). Session IDs should come from a caller-controlled source, not from inline uuid4 generation:
How parent_span_id works
parent_span_id uses the same model as OpenTelemetry distributed tracing. Each span has a unique span_id. When a child agent sets parent_span_id, LangSight can reconstruct the full call tree by following parent-child relationships from the flat span storage. No separate tree structure is required — tree reconstruction is a recursive query at read time.
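Tree reconstruction from flat storage is a straightforward recursive grouping. A sketch over a list of span dicts (the field names follow the text; the read-time query in the backend is SQL, not Python):

```python
from collections import defaultdict

def build_tree(spans):
    """Rebuild the call tree from flat spans via parent_span_id links."""
    children = defaultdict(list)
    for span in spans:
        children[span.get("parent_span_id")].append(span)

    def attach(span):
        # Recursively attach children by following span_id -> parent_span_id.
        return {**span, "children": [attach(c) for c in children[span["span_id"]]]}

    # Roots are spans with no parent.
    return [attach(s) for s in children[None]]
```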
Auto parent linking (LangChain/LangGraph)
When using the LangChain callback in auto-detect mode, parent linking happens automatically. The callback maintains a thread-local tool stack so that when a supervisor tool callsainvoke() on a sub-agent, the sub-agent’s spans are linked to the parent tool call without any manual parent_span_id wiring. See the LangChain integration for details.
Silent MCP error detection
The MCP SDK returns errors as JSON-RPC responses (isError=True on the result object) instead of raising Python exceptions. Without LangSight, these errors pass silently through your agent code — the agent sees a “successful” call that actually returned an error.
LangSight’s wrap() proxy detects result.isError and marks the span as status=error:
- Silent MCP errors show up as red in the dashboard trace view
- Session health tags correctly reflect failures (e.g., tool_failure)
- Alert rules fire on error rate thresholds
- Circuit breakers count silent errors toward their failure threshold
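The check itself is small: a JSON-RPC error does not raise, it sets a flag on the result object. A sketch of the status mapping (the function is illustrative; only the isError attribute comes from the MCP result shape described above):

```python
def span_status(result) -> str:
    """Map an MCP call result to a span status: silent errors don't raise."""
    return "error" if getattr(result, "isError", False) else "ok"
```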
Direct LLM SDK tracing (wrap_llm)
After auto_patch(), LLM clients are traced without wrap_llm() — you only need wrap_llm() when using init() without auto_patch(), or when you want per-client control over agent_name/session_id outside a session() context.
wrap_llm() accepts agent_name, session_id, and trace_id — all optional when inside a session() context (inherited from contextvars). See the Direct SDK integration page for full examples.
Prevention Guardrails
LangSight v0.3 adds a prevention layer that stops runaway agents before they waste tokens or cascade failures.
Loop Detection
Detects when an agent calls the same tool with the same arguments repeatedly, and raises LoopDetectedError when a loop is detected (with action="terminate").
Detection patterns: repetition (same call N times), ping-pong (A→B→A→B→A), retry-without-progress (same error repeated).
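The simplest of the three patterns, repetition, can be sketched as a consecutive-call counter over (tool, args) pairs. The class names echo the text, but the threshold and implementation are assumptions, not the SDK's detector:

```python
class LoopDetectedError(RuntimeError):
    """Raised when a tool-call loop is detected (sketch of the SDK's error)."""

class RepetitionDetector:
    """Raise after the same (tool, args) call occurs `threshold` times in a row."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self._last = None
        self._count = 0

    def record(self, tool: str, args: tuple) -> None:
        key = (tool, args)
        # Consecutive identical calls increment the counter; anything else resets it.
        self._count = self._count + 1 if key == self._last else 1
        self._last = key
        if self._count >= self.threshold:
            raise LoopDetectedError(f"{tool} repeated {self._count} times")
```

Ping-pong and retry-without-progress need a longer window of history, but follow the same record-and-check shape.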
Budget Guardrails
Prevents sessions from exceeding cost, step-count, or time budgets, and raises BudgetExceededError when a limit is hit.
Circuit Breaker
Automatically disables a failing MCP server after N consecutive failures, and raises CircuitBreakerOpenError when a tool call is blocked.
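The consecutive-failure rule can be sketched in a few lines (class names echo the text; the counting logic is an illustration, not the SDK's breaker):

```python
class CircuitBreakerOpenError(RuntimeError):
    """Raised when a call is blocked by an open breaker (sketch)."""

class CircuitBreaker:
    """Open after N consecutive failures; any success resets the count."""

    def __init__(self, failure_threshold: int = 5):
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    def before_call(self) -> None:
        # Block the call once the threshold has been reached.
        if self.consecutive_failures >= self.failure_threshold:
            raise CircuitBreakerOpenError("server disabled by circuit breaker")

    def after_call(self, ok: bool) -> None:
        # A single success resets the streak; a failure extends it.
        self.consecutive_failures = 0 if ok else self.consecutive_failures + 1
```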
Per-server configuration in .langsight.yaml overrides SDK defaults:
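A hypothetical shape for such an override — every key name below is an assumption for illustration; consult the configuration reference for the real schema:

```yaml
# .langsight.yaml — illustrative only; key names are assumptions
servers:
  filesystem:
    circuit_breaker:
      failure_threshold: 5
```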
Handling Prevention Errors
Prevented calls are recorded with status="prevented" — visible in the dashboard trace view.
Server-Managed Configuration
Constructor params are offline fallbacks. When a LangSight backend is configured, thresholds are managed from the dashboard and applied automatically on each wrap() call.
wrap() returns immediately — the remote config fetch runs as a background task and takes effect before the first tool call. If the API is unreachable, constructor params remain active.
Configure thresholds at Settings → Prevention in the dashboard, or via the API:
Set agent_name to "*" for a project-level default that applies to all agents without a specific config entry.