Documentation Index
Fetch the complete documentation index at: https://docs.langsight.dev/llms.txt
Use this file to discover all available pages before exploring further.
Using LangChain, Langflow, LangGraph, or LangServe? See the dedicated LangChain integration —
LangSightLangChainCallback covers all four without wrapping individual MCP clients.
Installation
LangSight is already installed — the SDK is part of the main package.
The simplest integration (v0.13.0)
Two lines inside a session() context, and LLM calls, MCP calls, and agent handoffs are all traced automatically.
No wrap(), no wrap_llm(), no create_handoff().
What auto_patch() covers
| What | How | Since |
|---|---|---|
| LLM generation calls | Patches OpenAI, Anthropic, google.genai, google.generativeai at the class level | v0.11.0 |
| MCP tool calls | Patches mcp.ClientSession.call_tool — compatible with langchain_mcp_adapters, progress_callback, and any other kwargs | v0.12.0 |
| Agent handoffs | Auto-emits handoff spans when LLM selects call_*, delegate_*, invoke_*, transfer_to_*, run_*, dispatch_* tools | v0.12.0 |
Environment variables
auto_patch() reads these automatically. It returns None if LANGSIGHT_URL is not set — safe to call unconditionally.
Deferred init (v0.14.11):
auto_patch() installs patches immediately, but defers
LangSight client creation until the first span is emitted. This means you can call
auto_patch() before load_dotenv() — as long as env vars are set by the time the
first tool call fires, traces will be captured correctly. Loading .env before
auto_patch() is still recommended when possible.
Multi-agent without any boilerplate
Before v0.12.0, instrumenting a multi-agent system required 15+ lines: manual session IDs, explicit wraps, and handoff spans for every agent. Now no create_handoff() is needed, because the tool name call_analyst triggers auto-detection.
Handoff auto-detection rules
LangSight inspects every llm_intent span (a tool selected by the LLM). If the tool name matches one of these patterns:
| Pattern | Example | Target agent |
|---|---|---|
| call_<agent> | call_analyst | analyst |
| delegate_<agent> | delegate_billing | billing |
| invoke_<agent> | invoke_researcher | researcher |
| transfer_to_<agent> | transfer_to_support | support |
| run_<agent> | run_summarizer | summarizer |
| dispatch_<agent> | dispatch_validator | validator |
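The table above amounts to a prefix match on the tool name. A minimal sketch of that matching logic (the function name and the regex are illustrations, not the SDK's internals):

```python
import re

# Prefixes from the handoff auto-detection table above; the regex itself
# is an illustration, not LangSight's actual implementation.
_HANDOFF_RE = re.compile(r"^(call|delegate|invoke|transfer_to|run|dispatch)_(?P<agent>.+)$")

def detect_handoff_target(tool_name: str):
    """Return the target agent name if the tool name matches a handoff pattern."""
    m = _HANDOFF_RE.match(tool_name)
    return m.group("agent") if m else None
```

A tool named `search_web` matches no prefix and is treated as an ordinary tool call.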
Context inheritance
Inside a session() block, the context variables _agent_ctx, _session_ctx, and _trace_ctx are set for the current asyncio task. Both wrap() and wrap_llm() read these as fallbacks when their parameters are not explicitly provided:
Shared proxies and bridges (v0.13.1 fix)
MCPClientProxy.call_tool() reads the active session() context at call time, not at proxy creation time. This means a single proxy or bridge object can be safely passed to multiple sub-agents — each sub-agent’s call_tool() invocations are attributed to whichever session() block is active when the call fires, not to the agent that created the proxy.
Capturing user input and agent output
session() accepts input= to record the initial human prompt and returns a SessionContext object with methods to capture the agent’s final response and any mid-session human messages. These map directly to Langfuse’s trace input/output concepts — if you’re familiar with Langfuse, the mental model is the same.
SessionContext subclasses str, so existing code that assigns it as session_id = sess continues to work without changes.
Level 1: Single-turn — capture question and answer
input= is stored as llm_input on the root agent span. set_output() stores the value as llm_output on the same span. Both are visible in the session detail panel in the dashboard and in langsight sessions --id <id>.
v0.14.1 — prompt flushed at open time. The
input= value is written to ClickHouse immediately when the session() block opens, not just at close. If the agent crashes or set_output() is never called, the prompt is still visible in the Sessions page. Sessions that complete normally are unaffected — the close-time span carrying both input and output is still emitted when the block exits.
Level 2: Human-in-the-loop — mid-session human input
record_user_message(text) creates a user_message span with span_type="user_message" and llm_input=text. The span appears as a human-icon entry in the session timeline between the surrounding agent spans. Use this whenever the agent pauses for human input — approval gates, clarification questions, or any HITL checkpoint.
Level 3: Multi-turn conversations — link sessions with trace_id
Use a shared trace_id to link multiple sessions into a single conversation thread:
The dashboard groups all sessions with the same trace_id into a single conversation view, showing the full turn-by-turn history with inputs, outputs, and latencies.
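Conceptually, the conversation view is a group-by over sessions that share a trace_id. A sketch of that grouping (the record field names trace_id and started_at are assumptions for illustration):

```python
from collections import defaultdict

def conversation_threads(sessions):
    """Group session records by trace_id and order each thread by start time."""
    threads = defaultdict(list)
    for sess in sessions:
        threads[sess["trace_id"]].append(sess)
    # Each thread reads as turn-by-turn history when sorted chronologically.
    return {tid: sorted(turns, key=lambda s: s["started_at"])
            for tid, turns in threads.items()}
```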
SessionContext reference
| Method / attribute | Description |
|---|---|
| sess.set_output(value) | Stores value as llm_output on the root session span. Call once at the end of the session. |
| sess.record_user_message(text) | Creates a user_message span in the timeline with llm_input=text. Call each time the agent receives human input mid-session. |
| str(sess) / sess as string | Returns the session ID — backward-compatible with code that used it as session_id. |
No breaking changes
SessionContext subclasses str. All of the following existing patterns continue to work:
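The backward-compatibility guarantee rests on subclassing str. A minimal sketch of the idea (the real SessionContext has more methods and stores output on the root span, not locally):

```python
class SessionContextSketch(str):
    """str subclass: usable anywhere a plain session-ID string is expected."""

    def __new__(cls, session_id: str):
        obj = super().__new__(cls, session_id)
        obj._output = None  # kept locally here; the SDK sends it to the backend
        return obj

    def set_output(self, value: str) -> None:
        # Stand-in for storing llm_output on the root session span.
        self._output = value
```

Because the instance *is* a string, comparisons, f-string interpolation, and `session_id = sess` assignments all keep working.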
Coexistence with Langfuse
LangSight and Langfuse observe different layers and coexist without conflict.
Upgrading from 0.11.x
Breaking behaviour change in wrap() and wrap_llm() — not a signature change, but a behavioural one.
In 0.11.x, calling ls.wrap(mcp) outside a session() block generated a new random session_id on every call, so MCP spans were disconnected from the LLM spans in the same run.
In 0.12.0, both wrap() and wrap_llm() inherit agent_name, session_id, and trace_id from the active session() context when those params are not explicitly provided. You no longer need to thread them manually:
- session_id = str(uuid.uuid4()) → removed; session() generates it
- session_id=session_id param on every wrap()/wrap_llm() call → removed
- agent_name=... param on wrap() → removed (inherited from session())
Quick start with init() + explicit wrapping
When you want explicit control over which sessions and agents are traced:
init() returns None if LANGSIGHT_URL is not set — safe to call unconditionally.
Manual MCP wrapping
For cases where you want explicit control (e.g. per-session circuit breakers or different redact policies per server).
Full example
The recommended v0.12.0 pattern — zero manual instrumentation.
How it works
The wrap() method returns MCPClientProxy — a transparent proxy that intercepts two methods:
call_tool():
- Calls the original call_tool() method
- Records start time, end time, and latency
- Sends a ToolCallSpan to POST /api/traces/spans asynchronously (fire-and-forget)
- Returns the original result unchanged
list_tools():
- Calls the original list_tools() method
- Posts the returned tool names, descriptions, and input schemas to PUT /api/servers/{server_name}/tools asynchronously (fire-and-forget)
- Returns the original result unchanged
This is how the dashboard can list tools with 0 calls along with their descriptions.
Fail-open: if LangSight is unreachable, both call_tool() and list_tools() still succeed and the error is logged. Your agents are never blocked by monitoring.
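The interception described above is a standard transparent-proxy pattern. A self-contained sketch, where the emit callback is a synchronous stand-in for the fire-and-forget POST (class and field names are illustrative, not the SDK's):

```python
import time

class TimingProxy:
    """Wraps a client: times call_tool(), forwards everything else unchanged."""

    def __init__(self, inner, emit):
        self._inner = inner
        self._emit = emit  # stand-in for the async POST to the spans endpoint

    def __getattr__(self, name):
        # Transparent passthrough for any method we don't intercept.
        return getattr(self._inner, name)

    def call_tool(self, name, **kwargs):
        start = time.monotonic()
        result = self._inner.call_tool(name, **kwargs)
        self._emit({"tool": name,
                    "latency_ms": (time.monotonic() - start) * 1000})
        return result  # original result, unchanged
```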
Span metadata
PII redaction
Set redact_payloads=True on the client to omit input_args and output_result from every span. Use this when tool arguments or results may contain PII. The same setting can also be configured in .langsight.yaml:
LLM token and cost fields
When wrapping a session that calls an LLM, attach token counts and the model ID to spans for cost tracking (Phase 7). The relevant attributes are gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, and gen_ai.request.model. If you are sending OTLP traces, LangSight extracts these attributes automatically — no SDK changes needed.
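For OTLP ingestion, extraction amounts to reading those attribute keys off the span. A sketch (the function name and return shape are assumptions, only the attribute keys come from the text above):

```python
def extract_llm_usage(attributes: dict) -> dict:
    """Pull the gen_ai token/model attributes from an OTLP span's attributes."""
    return {
        "input_tokens": attributes.get("gen_ai.usage.input_tokens"),
        "output_tokens": attributes.get("gen_ai.usage.output_tokens"),
        "model": attributes.get("gen_ai.request.model"),
    }
```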
Anthropic prompt cache tokens (v0.14.7)
ToolCallSpan carries two additional nullable fields for Anthropic prompt caching:
| Field | Type | Source | Dashboard label |
|---|---|---|---|
| cache_read_tokens | int \| None | usage.cache_read_input_tokens | Cache↗ (green) |
| cache_creation_tokens | int \| None | usage.cache_creation_input_tokens | Cache+ |
Both fields are None when prompt caching is not active. The auto-patch for the Anthropic SDK populates these automatically — no manual instrumentation needed. When sending spans manually, pass the values from the Anthropic response:
latency_ms auto-computation
latency_ms on ToolCallSpan is now optional. If you omit it, LangSight auto-computes it from ended_at - started_at via a Pydantic model_validator. You can still pass an explicit value if you prefer — the validator only runs when latency_ms is None.
This means manual spans and OTLP ingestion no longer need to calculate latency separately.
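The validator's rule can be sketched without Pydantic: a dataclass __post_init__ doing the same ended_at - started_at arithmetic. This illustrates the logic only; the SDK's actual model is described above as a Pydantic model_validator:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class SpanSketch:
    started_at: datetime
    ended_at: datetime
    latency_ms: Optional[float] = None

    def __post_init__(self):
        # Mirror of the described rule: compute only when latency_ms is omitted.
        if self.latency_ms is None:
            delta = self.ended_at - self.started_at
            self.latency_ms = delta.total_seconds() * 1000
```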
Manual spans
Record spans without wrapping a client.
Multi-agent tracing
Zero-boilerplate (v0.12.0 — recommended)
If your agents use tool names matching the handoff pattern (call_<agent>, delegate_<agent>, etc.), no explicit handoff code is needed:
The tool name call_research_agent triggers auto-detection.
With explicit handoff helpers
When you need precise control over handoff timing and parent linking, use create_handoff() + wrap_child_agent():
Advanced: manual pattern (full control, no session() context manager)
Use this pattern only when you cannot use the session() context manager (e.g. non-async code, or when the session lifecycle is managed externally). Session IDs should come from a caller-controlled source, not from inline uuid4 generation:
How parent_span_id works
parent_span_id uses the same model as OpenTelemetry distributed tracing. Each span has a unique span_id. When a child agent sets parent_span_id, LangSight can reconstruct the full call tree by following parent-child relationships from the flat span storage. No separate tree structure is required — tree reconstruction is a recursive query at read time.
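Tree reconstruction from flat storage is a straightforward recursive grouping. A sketch over a list of span dicts (the field names follow the text; the read-time query in the backend is SQL, not Python):

```python
from collections import defaultdict

def build_tree(spans):
    """Rebuild the call tree from flat spans via parent_span_id links."""
    children = defaultdict(list)
    for span in spans:
        children[span.get("parent_span_id")].append(span)

    def attach(span):
        # Recursively attach children by following span_id -> parent_span_id.
        return {**span, "children": [attach(c) for c in children[span["span_id"]]]}

    # Roots are spans with no parent.
    return [attach(s) for s in children[None]]
```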
Auto parent linking (LangChain/LangGraph)
When using the LangChain callback in auto-detect mode, parent linking happens automatically. The callback maintains a thread-local tool stack so that when a supervisor tool callsainvoke() on a sub-agent, the sub-agent’s spans are linked to the parent tool call without any manual parent_span_id wiring. See the LangChain integration for details.
Silent MCP error detection
The MCP SDK returns errors as JSON-RPC responses (isError=True on the result object) instead of raising Python exceptions. Without LangSight, these errors pass silently through your agent code — the agent sees a “successful” call that actually returned an error.
LangSight’s wrap() proxy detects result.isError and marks the span as status=error:
- Silent MCP errors show up as red in the dashboard trace view
- Session health tags correctly reflect failures (e.g., tool_failure)
- Alert rules fire on error rate thresholds
- Circuit breakers count silent errors toward their failure threshold
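The check itself is small: a JSON-RPC error does not raise, it sets a flag on the result object. A sketch of the status mapping (the function is illustrative; only the isError attribute comes from the MCP result shape described above):

```python
def span_status(result) -> str:
    """Map an MCP call result to a span status: silent errors don't raise."""
    return "error" if getattr(result, "isError", False) else "ok"
```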
Direct LLM SDK tracing (wrap_llm)
After auto_patch(), LLM clients are traced without wrap_llm() — you only need wrap_llm() when using init() without auto_patch(), or when you want per-client control over agent_name/session_id outside a session() context.
wrap_llm() accepts agent_name, session_id, and trace_id — all optional when inside a session() context (inherited from contextvars). See the Direct SDK integration page for full examples.
Prevention Guardrails
LangSight v0.3 adds a prevention layer that stops runaway agents before they waste tokens or cascade failures.
Loop Detection
Detects when an agent calls the same tool with the same arguments repeatedly, and raises LoopDetectedError when a loop is detected (with action="terminate").
Detection patterns: repetition (same call N times), ping-pong (A→B→A→B→A), retry-without-progress (same error repeated).
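The simplest of the three patterns, repetition, can be sketched as a consecutive-call counter over (tool, args) pairs. The class names echo the text, but the threshold and implementation are assumptions, not the SDK's detector:

```python
class LoopDetectedError(RuntimeError):
    """Raised when a tool-call loop is detected (sketch of the SDK's error)."""

class RepetitionDetector:
    """Raise after the same (tool, args) call occurs `threshold` times in a row."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self._last = None
        self._count = 0

    def record(self, tool: str, args: tuple) -> None:
        key = (tool, args)
        # Consecutive identical calls increment the counter; anything else resets it.
        self._count = self._count + 1 if key == self._last else 1
        self._last = key
        if self._count >= self.threshold:
            raise LoopDetectedError(f"{tool} repeated {self._count} times")
```

Ping-pong and retry-without-progress need a longer window of history, but follow the same record-and-check shape.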
Budget Guardrails
Prevents sessions from exceeding cost, step-count, or time budgets, and raises BudgetExceededError when a limit is hit.
Circuit Breaker
Automatically disables a failing MCP server after N consecutive failures, and raises CircuitBreakerOpenError when a tool call is blocked.
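The consecutive-failure rule can be sketched in a few lines (class names echo the text; the counting logic is an illustration, not the SDK's breaker):

```python
class CircuitBreakerOpenError(RuntimeError):
    """Raised when a call is blocked by an open breaker (sketch)."""

class CircuitBreaker:
    """Open after N consecutive failures; any success resets the count."""

    def __init__(self, failure_threshold: int = 5):
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    def before_call(self) -> None:
        # Block the call once the threshold has been reached.
        if self.consecutive_failures >= self.failure_threshold:
            raise CircuitBreakerOpenError("server disabled by circuit breaker")

    def after_call(self, ok: bool) -> None:
        # A single success resets the streak; a failure extends it.
        self.consecutive_failures = 0 if ok else self.consecutive_failures + 1
```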
Per-server configuration in .langsight.yaml overrides SDK defaults:
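A hypothetical shape for such an override — every key name below is an assumption for illustration; consult the configuration reference for the real schema:

```yaml
# .langsight.yaml — illustrative only; key names are assumptions
servers:
  filesystem:
    circuit_breaker:
      failure_threshold: 5
```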
Handling Prevention Errors
Prevented calls are recorded with status="prevented" — visible in the dashboard trace view.
Server-Managed Configuration
Constructor params are offline fallbacks. When a LangSight backend is configured, thresholds are managed from the dashboard and applied automatically on each wrap() call.
wrap() returns immediately — the remote config fetch runs as a background task and takes effect before the first tool call. If the API is unreachable, constructor params remain active.
Configure thresholds at Settings → Prevention in the dashboard, or via the API:
Set agent_name to "*" for a project-level default that applies to all agents without a specific config entry.