Overview
The Agents page at /agents is a catalog of every agent name that has appeared in at least one span. It uses a three-state adaptive layout that expands from a summary table into a full topology view.
Three-state layout
State 1 — No agent selected
A full-width sortable table. Columns:
| Column | Description |
|---|---|
| Agent | Agent name from the agent_name span attribute |
| Status | Reliability status dot — see Status thresholds below |
| Health | Session health score — see Health score below |
| Sessions | Total session count in the selected time window |
| Errors | Failed tool calls across all sessions |
| Avg duration | Mean session wall-clock time |
| Tokens/Session | Average total tokens consumed per session — see Token Efficiency below |
| Loops | Count of loop detection events — see Loop Detection Count below |
| Last active | Time since the most recent span arrived |
Click any row to open the agent detail panel (State 2).
State 2 — Agent selected
A 280px left sidebar lists all agents with their status dot and health score. The main panel shows the selected agent’s details in five tabs: About, Overview, Sessions, Topology, and SLOs.
State 3 — Topology full-width
Click Topology in the detail tabs to expand into full-width mode, which shows the agent’s call graph across all sessions: which MCP servers it connects to, call volumes per edge, and error counts per server.
Health score
The health score is the single most important reliability number per agent. It answers: “In this time window, what fraction of sessions did this agent complete successfully?”
Calculation
```
                 sessions with health tag "success" or "success_with_fallback"
health_score  =  ─────────────────────────────────────────────────────────────
                              total sessions in the time window
```
Only success and success_with_fallback count as healthy. All other health tags — tool_failure, timeout, loop_detected, budget_exceeded, circuit_breaker_open, schema_drift, incomplete — count as unhealthy.
For details on what each health tag means, see Session Health.
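As a reference, the calculation can be sketched in a few lines of Python (the session records here are hypothetical; in LangSight the aggregation happens server-side):

```python
HEALTHY_TAGS = {"success", "success_with_fallback"}

def health_score(session_tags):
    """Fraction of sessions tagged healthy, as a percentage.

    session_tags: one health-tag string per session in the
    selected time window.
    """
    if not session_tags:
        return None  # no sessions in window: grey dot, no score shown
    healthy = sum(1 for tag in session_tags if tag in HEALTHY_TAGS)
    return 100.0 * healthy / len(session_tags)

# A retried-but-recovered session (success_with_fallback) still counts as healthy:
tags = ["success", "success_with_fallback", "tool_failure", "success"]
print(health_score(tags))  # 75.0
```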
Display
The Health column shows a percentage with a colour-coded progress bar and a sub-label showing X/Y sessions (healthy sessions / total sessions in the window):
- Green bar — health score >= 90%
- Amber bar — health score 70–89%
- Red bar — health score < 70%
The column is sortable — click the column header to rank agents by health score.
Why health score instead of error rate?
Error rate (failed tool calls / total tool calls) is a noisy signal. A single session where the agent retried a tool call and recovered inflates the error rate even though the mission succeeded. Health score measures at the session level: did the agent complete its objective or not?
| Scenario | Error rate | Health score |
|---|---|---|
| 1 retry, agent recovered and succeeded | > 0% | 100% |
| Consistent single-attempt success | 0% | 100% |
| Agent crashed halfway through | 100% for that session | Lower (< 100%) |
Use error rate (on the Dashboard Tools tab) to investigate which specific tools are unreliable. Use health score to answer the business question: “Is this agent working?”
Status thresholds
The status dot next to each agent name reflects the current health score:
| Dot | Label | Health score |
|---|---|---|
| Green | healthy | >= 90% |
| Amber | degraded | 70–89% |
| Red | failing | < 70% |
The status dot updates whenever the health score changes — either because new sessions arrive or the selected time window changes.
An agent with zero sessions in the current time window shows a grey dot with no label. This means the agent has not run recently — it does not indicate a failure.
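The threshold mapping is simple enough to sketch directly (the tuple return shape is illustrative, not the actual implementation):

```python
def status_dot(score):
    """Map a health score (0-100, or None when the agent has no
    sessions in the window) to its (colour, label) status dot."""
    if score is None:
        return ("grey", None)  # agent has not run recently, not a failure
    if score >= 90:
        return ("green", "healthy")
    if score >= 70:
        return ("amber", "degraded")
    return ("red", "failing")

print(status_dot(92))    # ('green', 'healthy')
print(status_dot(75))    # ('amber', 'degraded')
print(status_dot(None))  # ('grey', None)
```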
SLO badges
Each agent row in the table shows a small badge next to the agent name indicating whether that agent is currently meeting its defined SLOs.
| Badge | Colour | Meaning |
|---|---|---|
| SLO ✓ | Green | All SLOs for this agent are passing |
| SLO ✗ | Red | At least one SLO for this agent is breached |
| (no badge) | — | No SLOs have been defined for this agent yet |
Status resolution when multiple SLOs exist
When an agent has more than one SLO (for example, both a success_rate SLO and a latency_p99 SLO), the badge reflects the worst status across all of them. Precedence: breached > no_data > ok. A single breach turns the badge red regardless of how many other SLOs are passing.
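The worst-of resolution can be sketched as follows (status strings as shown on the SLO cards; the helper name is illustrative):

```python
# Precedence: breached > no_data > ok -- the worst status wins.
PRECEDENCE = {"breached": 2, "no_data": 1, "ok": 0}

def badge_status(slo_statuses):
    """Worst status across all of an agent's SLOs; None if no SLOs defined."""
    if not slo_statuses:
        return None  # no badge is rendered
    return max(slo_statuses, key=PRECEDENCE.__getitem__)

print(badge_status(["ok", "ok", "breached"]))  # breached
print(badge_status(["ok", "no_data"]))         # no_data
```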
Refresh interval
The SLO badge refreshes every 2 minutes. It does not update in real time as spans arrive — it reflects the last completed SLO evaluation cycle.
How to use the badge
The SLO badge is the fastest way to know that a reliability target is being missed without opening the SLOs page. Scan the Agents table: any red SLO ✗ badge means that agent is out of spec right now.
Clicking the agent row opens the detail panel. Switch to the SLOs tab to see exactly which SLO is breached, by how much, and to add or remove SLOs for that agent.
For background on SLO concepts, measurement windows, and the API, see Agent SLOs.
Configuring SLOs
SLOs for an agent are created and managed directly inside the agent detail panel. Click any agent row to open the panel, then select the SLOs tab.
SLOs tab layout
The tab has two parts: the list of existing SLOs for this agent, and the + Add SLO button.
Existing SLO cards
Each defined SLO is shown as a card with:
| Field | Description |
|---|---|
| Metric | Success Rate or Latency p99 |
| Status | Passing (green), Breached (red), or No data (grey) |
| Target | The threshold you set — e.g. 95% or 2000ms |
| Current | The agent’s actual metric value right now |
| Window | The evaluation period — e.g. 24h |
| Delete | Removes the SLO immediately |
When no SLOs are defined, the tab shows a dashed empty state with a prompt to add the first one.
+ Add SLO form
Clicking + Add SLO opens an inline form in the tab. Fields:
| Field | Options | Notes |
|---|---|---|
| Metric | Success Rate / Latency p99 | Toggle — select one |
| Target | Number | Percentage (0–100) for Success Rate; milliseconds for Latency p99 |
| Window | 1h, 6h, 24h, 7d | Evaluation window for this SLO |
After you click Create SLO, the tab refreshes immediately and the new card appears with its evaluated status.
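For programmatic creation, the form fields translate naturally into a JSON body. The validation rules and wire format below are assumptions based on the form fields; see the Agent SLOs API reference for the documented schema:

```python
import json

# Valid values taken from the + Add SLO form
METRICS = {"success_rate", "latency_p99"}
WINDOWS = {"1h", "6h", "24h", "7d"}

def slo_payload(metric, target, window):
    """Validate form fields and build a JSON body for creating an SLO.

    NOTE: the field names mirror the + Add SLO form; the exact wire
    format is an assumption, not the documented API schema.
    """
    if metric not in METRICS:
        raise ValueError(f"unknown metric: {metric}")
    if window not in WINDOWS:
        raise ValueError(f"unknown window: {window}")
    if metric == "success_rate" and not 0 <= target <= 100:
        raise ValueError("success_rate target must be a percentage 0-100")
    return json.dumps({"metric": metric, "target": target, "window": window})

print(slo_payload("success_rate", 95, "24h"))
```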
Metric definitions
| Metric | What it measures | Good starting target |
|---|---|---|
| Success Rate | % of sessions in the window tagged success or success_with_fallback | 95% for production agents |
| Latency p99 | 99th percentile session duration in milliseconds | 5000ms for interactive agents |
The success_rate metric used in SLOs is based on session health tags (success / success_with_fallback), not raw tool call error rate. A session where the agent retried a tool and recovered is still counted as successful. This matches the Health score calculation.
Choosing a good target
Success Rate
Start at 90% and tighten as the agent matures. 95%+ is production-grade. Below 80% means the agent is failing more than 1 in 5 sessions — investigate before raising the target.
Latency p99
Think about the user-facing timeout. If your users expect a response in 10 seconds, set p99 ≤ 8000ms to give yourself headroom before the user experience degrades. Set it tighter (≤ 3000ms) for low-latency interactive use cases.
Window
- 24h — the right default for operational alerting. Catches today’s problems.
- 7d — better for trend and capacity planning. Smooths over one-off incidents.
- 1h / 6h — useful during active incidents to track recovery in near-real-time.
Why SLOs live in the agent detail panel
SLOs are reliability contracts for a specific agent. Placing them in the detail panel means you see the contract, the current performance, and the full session history in one place — no context switching to a separate configuration page.
The SLO ✓ / SLO ✗ badge in the agent table reflects the worst SLO status across all SLOs for that agent. Clicking the agent's row and opening the SLOs tab shows exactly which SLO is breached and by how much.
For the full API reference (creating SLOs programmatically, bulk status checks), see Agent SLOs.
Time window
The health score and session count both respect the time window selected in the top-right corner of the page: 1h, 6h, 24h, or 7d. Changing the window immediately recalculates health scores and re-sorts the table.
For SLO tracking across a fixed measurement period, see Agent SLOs.
Detail panel tabs
Overview tab
Summary cards for the selected agent:
- Total sessions, healthy sessions, and health score in the current window
- Avg/p99 session duration
- Total token usage and estimated cost
- Most-used MCP servers (top 5 by call count)
- Most-called tools (top 5 by call count)
Sessions tab
Filterable list of sessions for this agent in the current time window. Shows health tag, call count, error count, duration, and cost per session. Click any row to open the full session detail page with lineage graph and span trace.
Tools tab
Per-tool reliability breakdown for this agent: call count, error rate, p99 latency, and calls-per-session. See Dashboard — Calls per Session for how calls-per-session is calculated and interpreted.
Topology tab
Full-width call graph showing every MCP server this agent has connected to. Edges are weighted by call volume. Click an edge to see the per-tool breakdown. Click a server node to jump to that server’s entry in the MCP Servers catalog.
Servers tab
Shows every MCP server (and sub-agent) this agent called in the selected time window. Added in v0.8.6.
Layout: Each server is displayed as a card in a list. Click any card to expand it and see the individual tools the agent called on that server.
Per-server card (collapsed):
| Field | Description |
|---|---|
| Server name | Name as it appears in health check config or traces |
| Badge | MCP Server (indigo) or Sub-agent (grey) — see badge types below |
| Tool count | Number of distinct tools called on this server in the time window |
| Total calls | Total tool invocations from this agent to this server |
| Errors | Failed calls |
| Health status | Current health check status (up, degraded, down) — only shown for MCP Servers |
Per-tool breakdown (expanded):
| Column | Description |
|---|---|
| Tool name | Tool as called by this agent |
| Call count | Times this agent called this tool |
| Error count | Failed calls |
| Avg latency | Mean latency for this agent’s calls to this tool |
| Success rate | Visual progress bar showing the success percentage |
Badge types:
| Badge | Colour | Meaning |
|---|---|---|
| MCP Server | Indigo | This server appears in health check config — it is actively monitored |
| Sub-agent | Grey | This server appears only in traces, not in health check config |
A server with a Sub-agent badge is still tracked for call volume and error counts, but LangSight does not run health checks against it. To start monitoring it, add it to .langsight.yaml with langsight add.
Matching across names: An agent may call a server as catalog in traces while the health check config registers it as catalog-mcp. LangSight matches these by stripping the -mcp suffix. Both may appear in the Servers tab separately — catalog as Sub-agent and catalog-mcp as MCP Server — until you reconcile the names.
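The suffix-stripping rule is trivial to reproduce if you need to reconcile names in your own tooling (the helper name is illustrative):

```python
def canonical_server_name(name):
    """Normalize a server name by stripping a trailing -mcp suffix,
    mirroring how LangSight matches trace names against health check
    config names (catalog vs catalog-mcp)."""
    return name[:-4] if name.endswith("-mcp") else name

print(canonical_server_name("catalog-mcp"))  # catalog
print(canonical_server_name("catalog"))      # catalog
```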
SLOs tab
Shows all SLOs defined for this agent and lets you create or delete them without leaving the page. See Configuring SLOs for the full walkthrough.
Token Efficiency
The Tokens/Session column in the agents table shows the average total tokens (input + output, across all LLM calls) consumed per session for each agent.
Display
Large numbers are abbreviated: 2,400 → 2.4k, 18,000 → 18k. A sub-label avg/session appears below the number to distinguish it from a session total.
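The abbreviation rule is easy to mirror in your own dashboards (a sketch, not the product's actual formatting code):

```python
def abbreviate(n):
    """Abbreviate token counts for display: 2400 -> '2.4k', 18000 -> '18k'."""
    if n < 1000:
        return str(n)
    return f"{n / 1000:.1f}k".replace(".0k", "k")

print(abbreviate(2400))   # 2.4k
print(abbreviate(18000))  # 18k
```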
Why it matters
Tokens/Session is the per-session driver behind the Cost column. Where Cost tells you what you spent, Tokens/Session tells you why.
Use it to:
- Identify expensive agents — which agent is consuming the most tokens per run, and why?
- Detect prompt bloat — if an agent’s Tokens/Session is growing over time, its system prompt or accumulated context is growing. This is a leading indicator for eventual MAX_TOKENS failures (see Context Window Pressure).
- Compare agents directly — orchestrator: 2,400 tokens/session vs analyst: 890 tokens/session means the orchestrator costs roughly 3× more per run. Is that justified by its task complexity?
Relationship with Cost column
Cost is estimated from model pricing and depends on both token count and model. Two agents with similar Tokens/Session can have very different costs if they use different models. Tokens/Session isolates the usage volume from the pricing effect.
Loop Detection Count
The Loops column in the agents table shows a red badge with the count of loop detection events for each agent in the selected time window.
A loop detection event occurs when LangSight’s SDK detects a repeating tool-call pattern and prevents the call from being made. The prevented call is recorded with status=prevented and the session is tagged loop_detected.
What counts as a loop
LangSight detects three patterns before each tool call:
| Pattern | Description |
|---|---|
| Repetition | Same tool called with identical arguments N times in a row (default threshold: 3) |
| Ping-pong | Alternating between two tool+argument pairs repeatedly (A→B→A→B→A) |
| Retry without progress | Same tool failing with the same error repeatedly |
When any of these patterns is detected, the call is blocked and the Loops counter increments.
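To make the mechanics concrete, here is a minimal sketch of the Repetition pattern only; the class name and API are illustrative, and the real SDK also tracks ping-pong and retry-without-progress:

```python
from collections import deque

class RepetitionGuard:
    """Illustrative sketch: block a tool call when the same tool is about
    to be invoked with identical arguments `threshold` times in a row."""

    def __init__(self, threshold=3):  # default threshold matches the docs
        self.recent = deque(maxlen=threshold - 1)

    def allow(self, tool, args):
        call = (tool, tuple(sorted(args.items())))
        if len(self.recent) == self.recent.maxlen and all(c == call for c in self.recent):
            return False  # would be the Nth identical call in a row: prevented
        self.recent.append(call)
        return True

g = RepetitionGuard(threshold=3)
print(g.allow("search", {"q": "foo"}))  # True
print(g.allow("search", {"q": "foo"}))  # True
print(g.allow("search", {"q": "foo"}))  # False (3rd identical call blocked)
```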
Display
- Red badge with count — one or more loop events in the time window
- “—” — no loops detected (clean)
The badge is always red. A loop is always a reliability problem worth investigating — even if the agent ultimately recovered (tagged success_with_fallback), each prevented loop call represents wasted tool calls and unnecessary cost.
What to do when you see loops
- Click the agent row to open the detail panel
- Go to the Sessions tab and filter by the loop_detected health tag
- Click any affected session to open the full trace
- In the trace, look for the span with status=prevented — the span detail shows which tool was looping and the exact arguments that were repeated
- Fix the agent logic: add a guard condition, change the arguments passed on retry, or add a fallback path that breaks the cycle
For the full definition of the loop_detected health tag and how it composes with other tags, see Session Health.
API reference
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/agents | List all known agent names with session counts and health scores |
| GET | /api/agents/{name}/sessions | Sessions for a specific agent (supports window, health_tag, limit query params) |
| GET | /api/agents/{name}/health | Health score and status for an agent in a given window |
| GET | /api/agents/{name}/servers | MCP servers and sub-agents called by this agent (supports window query param) |
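A minimal helper for building requests against these endpoints from Python; the base URL and port are assumptions to adjust for your deployment:

```python
import urllib.parse

BASE = "http://localhost:8000"  # assumption: point at your LangSight instance

def agent_url(path, **params):
    """Build a URL for the read-only agent endpoints listed above."""
    qs = f"?{urllib.parse.urlencode(params)}" if params else ""
    return f"{BASE}{path}{qs}"

# Fetch with urllib.request.urlopen(...) or any HTTP client, e.g. an agent's
# loop-affected sessions in the last 24h:
print(agent_url("/api/agents/orchestrator/sessions",
                window="24h", health_tag="loop_detected", limit=50))
```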