Overview
The Agents page at /agents is a catalog of every agent name that has appeared in at least one span. It uses a three-state adaptive layout that expands from a summary table into a full topology view.
Three-state layout
State 1 — No agent selected
A full-width sortable table. Columns:
| Column | Description |
|---|---|
| Agent | Agent name from the agent_name span attribute |
| Status | Reliability status dot — see Status thresholds below |
| Health | Session health score — see Health score below |
| Sessions | Total session count in the selected time window |
| Errors | Failed tool calls across all sessions |
| Avg duration | Mean session wall-clock time |
| Tokens/Session | Average total tokens consumed per session — see Token Efficiency below |
| Loops | Count of loop detection events — see Loop Detection Count below |
| Last active | Time since the most recent span arrived |
Click any row to open the agent detail panel (State 2).
State 2 — Agent selected
A 280px left sidebar lists all agents with their status dot and health score. The main panel shows the selected agent’s details in five tabs: About, Overview, Sessions, Topology, and SLOs.
State 3 — Topology full-width
Click Topology in the detail tabs to expand into full-width mode, which shows the agent’s call graph across all sessions: which MCP servers it connects to, call volumes per edge, and error counts per server.
Health score
The health score is the single most important reliability number per agent. It answers: “In this time window, what fraction of sessions did this agent complete successfully?”
Calculation
```
                 sessions with health tag "success" or "success_with_fallback"
health_score  =  ─────────────────────────────────────────────────────────────
                              total sessions in the time window
```
Only success and success_with_fallback count as healthy. All other health tags — tool_failure, timeout, loop_detected, budget_exceeded, circuit_breaker_open, schema_drift, incomplete — count as unhealthy.
For details on what each health tag means, see Session Health.
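As a reference, the calculation can be sketched in a few lines of Python (the session records here are hypothetical; in LangSight the aggregation happens server-side):

```python
HEALTHY_TAGS = {"success", "success_with_fallback"}

def health_score(session_tags):
    """Fraction of sessions tagged healthy, as a percentage.

    session_tags: one health-tag string per session in the
    selected time window.
    """
    if not session_tags:
        return None  # no sessions in window: grey dot, no score shown
    healthy = sum(1 for tag in session_tags if tag in HEALTHY_TAGS)
    return 100.0 * healthy / len(session_tags)

# A retried-but-recovered session (success_with_fallback) still counts as healthy:
tags = ["success", "success_with_fallback", "tool_failure", "success"]
print(health_score(tags))  # 75.0
```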
Display
The Health column shows a percentage with a colour-coded progress bar and a sub-label showing X/Y sessions (healthy sessions / total sessions in the window):
- Green bar — health score >= 90%
- Amber bar — health score 70–89%
- Red bar — health score < 70%
The column is sortable — click the column header to rank agents by health score.
Why health score instead of error rate?
Error rate (failed tool calls / total tool calls) is a noisy signal. A single session where the agent retried a tool call and recovered inflates the error rate even though the mission succeeded. Health score measures at the session level: did the agent complete its objective or not?
| Scenario | Error rate | Health score |
|---|---|---|
| 1 retry, agent recovered and succeeded | > 0% | 100% |
| Consistent single-attempt success | 0% | 100% |
| Agent crashed halfway through | 100% for that session | Lower (< 100%) |
Use error rate (on the Dashboard Tools tab) to investigate which specific tools are unreliable. Use health score to answer the business question: “Is this agent working?”
Status thresholds
The status dot next to each agent name reflects the current health score:
| Dot | Label | Health score |
|---|---|---|
| Green | healthy | >= 90% |
| Amber | degraded | 70–89% |
| Red | failing | < 70% |
The status dot updates whenever the health score changes — either because new sessions arrive or the selected time window changes.
An agent with zero sessions in the current time window shows a grey dot with no label. This means the agent has not run recently — it does not indicate a failure.
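The threshold mapping is simple enough to sketch directly (the tuple return shape is illustrative, not the actual implementation):

```python
def status_dot(score):
    """Map a health score (0-100, or None when the agent has no
    sessions in the window) to its (colour, label) status dot."""
    if score is None:
        return ("grey", None)  # agent has not run recently, not a failure
    if score >= 90:
        return ("green", "healthy")
    if score >= 70:
        return ("amber", "degraded")
    return ("red", "failing")

print(status_dot(92))    # ('green', 'healthy')
print(status_dot(75))    # ('amber', 'degraded')
print(status_dot(None))  # ('grey', None)
```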
SLO badges
Each agent row in the table shows a small badge next to the agent name indicating whether that agent is currently meeting its defined SLOs.
| Badge | Colour | Meaning |
|---|---|---|
| SLO ✓ | Green | All SLOs for this agent are passing |
| SLO ✗ | Red | At least one SLO for this agent is breached |
| (no badge) | — | No SLOs have been defined for this agent yet |
Status resolution when multiple SLOs exist
When an agent has more than one SLO (for example, both a success_rate SLO and a latency_p99 SLO), the badge reflects the worst status across all of them. Precedence: breached > no_data > ok. A single breach turns the badge red regardless of how many other SLOs are passing.
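The worst-of resolution can be sketched as follows (status strings as shown on the SLO cards; the helper name is illustrative):

```python
# Precedence: breached > no_data > ok -- the worst status wins.
PRECEDENCE = {"breached": 2, "no_data": 1, "ok": 0}

def badge_status(slo_statuses):
    """Worst status across all of an agent's SLOs; None if no SLOs defined."""
    if not slo_statuses:
        return None  # no badge is rendered
    return max(slo_statuses, key=PRECEDENCE.__getitem__)

print(badge_status(["ok", "ok", "breached"]))  # breached
print(badge_status(["ok", "no_data"]))         # no_data
```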
Refresh interval
The SLO badge refreshes every 2 minutes. It does not update in real time as spans arrive — it reflects the last completed SLO evaluation cycle.
How to use the badge
The SLO badge is the fastest way to know that a reliability target is being missed without opening the SLOs page. Scan the Agents table: any red SLO ✗ badge means that agent is out of spec right now.
Clicking the agent row opens the detail panel. Switch to the SLOs tab to see exactly which SLO is breached, by how much, and to add or remove SLOs for that agent.
For background on SLO concepts, measurement windows, and the API, see Agent SLOs.
Configuring SLOs
SLOs for an agent are created and managed directly inside the agent detail panel. Click any agent row to open the panel, then select the SLOs tab.
SLOs tab layout
The tab has two parts: the list of existing SLOs for this agent, and the + Add SLO button.
Existing SLO cards
Each defined SLO is shown as a card with:
| Field | Description |
|---|---|
| Metric | Success Rate or Latency p99 |
| Status | Passing (green), Breached (red), or No data (grey) |
| Target | The threshold you set — e.g. 95% or 2000ms |
| Current | The agent’s actual metric value right now |
| Window | The evaluation period — e.g. 24h |
| Delete | Removes the SLO immediately |
When no SLOs are defined, the tab shows a dashed empty state with a prompt to add the first one.
+ Add SLO form
Clicking + Add SLO opens an inline form in the tab. Fields:
| Field | Options | Notes |
|---|---|---|
| Metric | Success Rate / Latency p99 | Toggle — select one |
| Target | Number | Percentage (0–100) for Success Rate; milliseconds for Latency p99 |
| Window | 1h, 6h, 24h, 7d | Evaluation window for this SLO |
After you click Create SLO, the tab refreshes immediately and the new card appears with its evaluated status.
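For programmatic creation, the form fields translate naturally into a JSON body. The validation rules and wire format below are assumptions based on the form fields; see the Agent SLOs API reference for the documented schema:

```python
import json

# Valid values taken from the + Add SLO form
METRICS = {"success_rate", "latency_p99"}
WINDOWS = {"1h", "6h", "24h", "7d"}

def slo_payload(metric, target, window):
    """Validate form fields and build a JSON body for creating an SLO.

    NOTE: the field names mirror the + Add SLO form; the exact wire
    format is an assumption, not the documented API schema.
    """
    if metric not in METRICS:
        raise ValueError(f"unknown metric: {metric}")
    if window not in WINDOWS:
        raise ValueError(f"unknown window: {window}")
    if metric == "success_rate" and not 0 <= target <= 100:
        raise ValueError("success_rate target must be a percentage 0-100")
    return json.dumps({"metric": metric, "target": target, "window": window})

print(slo_payload("success_rate", 95, "24h"))
```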
Metric definitions
| Metric | What it measures | Good starting target |
|---|---|---|
| Success Rate | % of sessions in the window tagged success or success_with_fallback | 95% for production agents |
| Latency p99 | 99th percentile session duration in milliseconds | 5000ms for interactive agents |
The success_rate metric used in SLOs is based on session health tags (success / success_with_fallback), not raw tool call error rate. A session where the agent retried a tool and recovered is still counted as successful. This matches the Health score calculation.
Choosing a good target
Success Rate
Start at 90% and tighten as the agent matures. 95%+ is production-grade. Below 80% means the agent is failing more than 1 in 5 sessions — investigate before raising the target.
Latency p99
Think about the user-facing timeout. If your users expect a response in 10 seconds, set p99 ≤ 8000ms to give yourself headroom before the user experience degrades. Set it tighter (≤ 3000ms) for low-latency interactive use cases.
Window
- 24h — the right default for operational alerting. Catches today’s problems.
- 7d — better for trend and capacity planning. Smooths over one-off incidents.
- 1h / 6h — useful during active incidents to track recovery in near-real-time.
Why SLOs live in the agent detail panel
SLOs are reliability contracts for a specific agent. Placing them in the detail panel means you see the contract, the current performance, and the full session history in one place — no context switching to a separate configuration page.
The SLO ✓ / SLO ✗ badge in the agent table reflects the worst SLO status across all SLOs for that agent. Clicking the agent's row and opening the SLOs tab shows exactly which SLO is breached and by how much.
For the full API reference (creating SLOs programmatically, bulk status checks), see Agent SLOs.
Time window
The health score and session count both respect the time window selected in the top-right corner of the page: 1h, 6h, 24h, or 7d. Changing the window immediately recalculates health scores and re-sorts the table.
For SLO tracking across a fixed measurement period, see Agent SLOs.
Detail panel tabs
Overview tab
Summary cards for the selected agent:
- Total sessions, healthy sessions, and health score in the current window
- Avg/p99 session duration
- Total token usage and estimated cost
- Most-used MCP servers (top 5 by call count)
- Most-called tools (top 5 by call count)
Sessions tab
Filterable list of sessions for this agent in the current time window. Shows health tag, call count, error count, duration, and cost per session. Click any row to open the full session detail page with lineage graph and span trace.
Tools tab
Per-tool reliability breakdown for this agent: call count, error rate, p99 latency, and calls-per-session. See Dashboard — Calls per Session for how calls-per-session is calculated and interpreted.
Topology tab
Full-width call graph showing every MCP server this agent has connected to. Edges are weighted by call volume. Click an edge to see the per-tool breakdown. Click a server node to jump to that server’s entry in the MCP Servers catalog.
Servers tab
Shows every MCP server (and sub-agent) this agent called in the selected time window. Added in v0.8.6.
Layout: Each server is displayed as a card in a list. Click any card to expand it and see the individual tools the agent called on that server.
Per-server card (collapsed):
| Field | Description |
|---|---|
| Server name | Name as it appears in health check config or traces |
| Badge | MCP Server (indigo) or Sub-agent (grey) — see badge types below |
| Tool count | Number of distinct tools called on this server in the time window |
| Total calls | Total tool invocations from this agent to this server |
| Errors | Failed calls |
| Health status | Current health check status (up, degraded, down) — only shown for MCP Servers |
Per-tool breakdown (expanded):
| Column | Description |
|---|---|
| Tool name | Tool as called by this agent |
| Call count | Times this agent called this tool |
| Error count | Failed calls |
| Avg latency | Mean latency for this agent’s calls to this tool |
| Success rate | Visual progress bar showing the success percentage |
Badge types:
| Badge | Colour | Meaning |
|---|---|---|
| MCP Server | Indigo | This server appears in health check config — it is actively monitored |
| Sub-agent | Grey | This server appears only in traces, not in health check config |
A server with a Sub-agent badge is still tracked for call volume and error counts, but LangSight does not run health checks against it. To start monitoring it, add it to .langsight.yaml with langsight add.
Matching across names: An agent may call a server as catalog in traces while the health check config registers it as catalog-mcp. LangSight matches these by stripping the -mcp suffix. Both may appear in the Servers tab separately — catalog as Sub-agent and catalog-mcp as MCP Server — until you reconcile the names.
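The suffix-stripping rule is trivial to reproduce if you need to reconcile names in your own tooling (the helper name is illustrative):

```python
def canonical_server_name(name):
    """Normalize a server name by stripping a trailing -mcp suffix,
    mirroring how LangSight matches trace names against health check
    config names (catalog vs catalog-mcp)."""
    return name[:-4] if name.endswith("-mcp") else name

print(canonical_server_name("catalog-mcp"))  # catalog
print(canonical_server_name("catalog"))      # catalog
```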
SLOs tab
Shows all SLOs defined for this agent and lets you create or delete them without leaving the page. See Configuring SLOs for the full walkthrough.
Token Efficiency
The Tokens/Session column in the agents table shows the average total tokens (input + output, across all LLM calls) consumed per session for each agent.
Display
Large numbers are abbreviated: 2,400 → 2.4k, 18,000 → 18k. A sub-label avg/session appears below the number to distinguish it from a session total.
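The abbreviation rule is easy to mirror in your own dashboards (a sketch, not the product's actual formatting code):

```python
def abbreviate(n):
    """Abbreviate token counts for display: 2400 -> '2.4k', 18000 -> '18k'."""
    if n < 1000:
        return str(n)
    return f"{n / 1000:.1f}k".replace(".0k", "k")

print(abbreviate(2400))   # 2.4k
print(abbreviate(18000))  # 18k
```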
Why it matters
Tokens/Session is the per-session driver behind the Cost column. Where Cost tells you what you spent, Tokens/Session tells you why.
Use it to:
- Identify expensive agents — which agent is consuming the most tokens per run, and why?
- Detect prompt bloat — if an agent’s Tokens/Session is growing over time, its system prompt or accumulated context is growing. This is a leading indicator for eventual MAX_TOKENS failures (see Context Window Pressure).
- Compare agents directly — orchestrator: 2,400 tokens/session vs analyst: 890 tokens/session means the orchestrator costs roughly 3× more per run. Is that justified by its task complexity?
Relationship with Cost column
Cost is estimated from model pricing and depends on both token count and model. Two agents with similar Tokens/Session can have very different costs if they use different models. Tokens/Session isolates the usage volume from the pricing effect.
Loop Detection Count
The Loops column in the agents table shows a red badge with the count of loop detection events for each agent in the selected time window.
A loop detection event occurs when LangSight’s SDK detects a repeating tool-call pattern and prevents the call from being made. The prevented call is recorded with status=prevented and the session is tagged loop_detected.
What counts as a loop
LangSight detects three patterns before each tool call:
| Pattern | Description |
|---|---|
| Repetition | Same tool called with identical arguments N times in a row (default threshold: 3) |
| Ping-pong | Alternating between two tool+argument pairs repeatedly (A→B→A→B→A) |
| Retry without progress | Same tool failing with the same error repeatedly |
When any of these patterns is detected, the call is blocked and the Loops counter increments.
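To make the mechanics concrete, here is a minimal sketch of the Repetition pattern only; the class name and API are illustrative, and the real SDK also tracks ping-pong and retry-without-progress:

```python
from collections import deque

class RepetitionGuard:
    """Illustrative sketch: block a tool call when the same tool is about
    to be invoked with identical arguments `threshold` times in a row."""

    def __init__(self, threshold=3):  # default threshold matches the docs
        self.recent = deque(maxlen=threshold - 1)

    def allow(self, tool, args):
        call = (tool, tuple(sorted(args.items())))
        if len(self.recent) == self.recent.maxlen and all(c == call for c in self.recent):
            return False  # would be the Nth identical call in a row: prevented
        self.recent.append(call)
        return True

g = RepetitionGuard(threshold=3)
print(g.allow("search", {"q": "foo"}))  # True
print(g.allow("search", {"q": "foo"}))  # True
print(g.allow("search", {"q": "foo"}))  # False (3rd identical call blocked)
```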
Display
- Red badge with count — one or more loop events in the time window
- “—” — no loops detected (clean)
The badge is always red. A loop is always a reliability problem worth investigating — even if the agent ultimately recovered (tagged success_with_fallback), each prevented loop call represents wasted tool calls and unnecessary cost.
What to do when you see loops
- Click the agent row to open the detail panel
- Go to the Sessions tab and filter by the loop_detected health tag
- Click any affected session to open the full trace
- In the trace, look for the span with status=prevented — the span detail shows which tool was looping and the exact arguments that were repeated
- Fix the agent logic: add a guard condition, change the arguments passed on retry, or add a fallback path that breaks the cycle
For the full definition of the loop_detected health tag and how it composes with other tags, see Session Health.
API reference
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/agents | List all known agent names with session counts and health scores |
| GET | /api/agents/{name}/sessions | Sessions for a specific agent (supports window, health_tag, limit query params) |
| GET | /api/agents/{name}/health | Health score and status for an agent in a given window |
| GET | /api/agents/{name}/servers | MCP servers and sub-agents called by this agent (supports window query param) |
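A minimal helper for building requests against these endpoints from Python; the base URL and port are assumptions to adjust for your deployment:

```python
import urllib.parse

BASE = "http://localhost:8000"  # assumption: point at your LangSight instance

def agent_url(path, **params):
    """Build a URL for the read-only agent endpoints listed above."""
    qs = f"?{urllib.parse.urlencode(params)}" if params else ""
    return f"{BASE}{path}{qs}"

# Fetch with urllib.request.urlopen(...) or any HTTP client, e.g. an agent's
# loop-affected sessions in the last 24h:
print(agent_url("/api/agents/orchestrator/sessions",
                window="24h", health_tag="loop_detected", limit=50))
```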