Base URL
Start the API with langsight serve or docker compose up -d api. The examples on this page assume the API is reachable at http://localhost:8000.
Authentication
Two authentication methods are supported. Use whichever matches your client.
Method 1: X-API-Key (SDK and CLI)
Direct API access for the LangSight SDK, CLI, and any external client.
curl http://localhost:8000/api/health/servers \
-H "X-API-Key: ls_your_api_key_here"
API keys are created in the dashboard under Settings → API Keys.
Method 2: Dashboard session (Next.js proxy)
The dashboard never calls FastAPI directly. All requests go through the Next.js proxy at /api/proxy/*, which reads the NextAuth session server-side and injects X-User-Id and X-User-Role headers.
FastAPI trusts X-User-Id / X-User-Role only from the configured proxy CIDRs in LANGSIGHT_TRUSTED_PROXY_CIDRS. By default this is loopback; Docker deployments typically also include internal container networks such as 172.16.0.0/12 and 10.0.0.0/8.
Do not set X-User-Id or X-User-Role in your own API clients. These are internal proxy headers. Use X-API-Key for all SDK and CLI access.
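The trusted-proxy rule can be illustrated with a small IPv4 CIDR check. This is a minimal sketch, not LangSight's implementation: the helper names (ipv4ToInt, inCidr, trustsIdentityHeaders) are hypothetical, and the loopback default is assumed to be written as 127.0.0.0/8.

```javascript
// Convert a dotted IPv4 string to an unsigned 32-bit integer.
function ipv4ToInt(ip) {
  const parts = ip.split(".").map(Number);
  if (parts.length !== 4 || parts.some((p) => !Number.isInteger(p) || p < 0 || p > 255)) {
    throw new Error(`invalid IPv4 address: ${ip}`);
  }
  return ((parts[0] << 24) | (parts[1] << 16) | (parts[2] << 8) | parts[3]) >>> 0;
}

// True when `ip` falls inside `cidr`, for example "172.16.0.0/12".
function inCidr(ip, cidr) {
  const [base, bitsText] = cidr.split("/");
  const bits = Number(bitsText);
  if (!Number.isInteger(bits) || bits < 0 || bits > 32) {
    throw new Error(`invalid CIDR: ${cidr}`);
  }
  // /0 is handled separately because shifting by 32 is a no-op in JavaScript.
  const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
  return ((ipv4ToInt(ip) & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
}

// Hypothetical helper: identity headers are honoured only for trusted peers.
function trustsIdentityHeaders(peerIp, trustedCidrs) {
  return trustedCidrs.some((cidr) => inCidr(peerIp, cidr));
}
```

With the list ["127.0.0.0/8", "172.16.0.0/12", "10.0.0.0/8"], a request from 172.20.1.5 would be treated as coming from the proxy, while one from a public address would have its X-User-Id and X-User-Role headers ignored.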
Auth disabled mode
Authentication is disabled only when running locally with no API keys configured (development mode). In all other cases, unauthenticated requests return 401.
Roles
| Role | Access |
|---|---|
| admin | Full read/write on all endpoints |
| viewer | Read-only; POST/PATCH/DELETE return 403 |
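The role table can be read as a simple method check. The sketch below is illustrative only: canPerform is a hypothetical name, and treating PUT like the other write methods and HEAD/OPTIONS as reads is an assumption not stated above.

```javascript
// Methods treated as read-only (HEAD and OPTIONS are an assumption).
const READ_METHODS = new Set(["GET", "HEAD", "OPTIONS"]);

// Returns true when the role may call the method; a false result for a
// viewer corresponds to the 403 described in the table.
function canPerform(role, method) {
  if (role === "admin") return true;
  if (role === "viewer") return READ_METHODS.has(method.toUpperCase());
  return false; // unknown roles get no access in this sketch
}
```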
Rate limits
All rate limiting is enforced by a single shared Limiter instance (defined in src/langsight/api/rate_limit.py). Per-route overrides take precedence over the global default.
| Route | Limit |
|---|---|
| POST /api/traces/spans | 2000 requests/min (high-frequency SDK ingestion) |
| POST /api/traces/otlp | 60 requests/min (OTEL collector batches) |
| POST /api/users/accept-invite | 5 requests/min (invite brute-force protection) |
| POST /api/users/verify | 10 requests/min (login brute-force protection) |
| All other routes | 200 requests/min (global default) |
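A client that wants to stay under these limits could mirror the table locally. This is a sketch under the assumption that limits are keyed by method and exact path; limitPerMinute and minIntervalMs are hypothetical helpers, not part of the SDK.

```javascript
// Per-route overrides copied from the table above.
const ROUTE_LIMITS = {
  "POST /api/traces/spans": 2000,
  "POST /api/traces/otlp": 60,
  "POST /api/users/accept-invite": 5,
  "POST /api/users/verify": 10,
};
const GLOBAL_DEFAULT = 200;

// A per-route override wins; everything else falls back to the global default.
function limitPerMinute(method, path) {
  return ROUTE_LIMITS[`${method.toUpperCase()} ${path}`] ?? GLOBAL_DEFAULT;
}

// Minimum spacing between evenly paced requests, in milliseconds.
function minIntervalMs(method, path) {
  return Math.ceil(60000 / limitPerMinute(method, path));
}
```

Pacing span ingestion at one request every 30 ms, or batching several spans into one request, keeps a busy SDK client below the 2000 requests/min ceiling.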
Interactive docs
Once the API is running:
Endpoints
| Method | Path | Description |
|---|---|---|
| GET | /api/liveness | Instant liveness probe: no I/O, always fast |
| GET | /api/readiness | Readiness probe that verifies storage connectivity |
| GET | /api/status | API status (deprecated; use /api/liveness) |
| GET | /metrics | Prometheus scrape endpoint; no auth required (see Monitoring) |
Health
| Method | Path | Description |
|---|---|---|
| GET | /api/health/servers | Latest health for all configured servers |
| GET | /api/health/servers/{name} | Latest health for one server |
| GET | /api/health/servers/{name}/history | Health history (newest first) |
| POST | /api/health/check | Trigger on-demand health check |
Security
| Method | Path | Description |
|---|---|---|
| POST | /api/security/scan | Run security scan on all servers |
Traces
| Method | Path | Description |
|---|---|---|
| POST | /api/traces/spans | Ingest ToolCallSpan JSON from the SDK |
| POST | /api/traces/otlp | Ingest standard OTLP/JSON traces |
| GET | /api/traces/spans | Query spans by server, tool, session, time range |
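For clients that post spans without the SDK, the payload can be assembled by hand. The sketch below uses only the field names that appear in the Quick test example on this page; makeSpan and spanIngestRequest are hypothetical helpers, and the full ToolCallSpan schema may include more fields.

```javascript
// Build one span object, deriving latency_ms from the two timestamps.
function makeSpan({ serverName, toolName, startedAt, endedAt, status, projectId }) {
  const start = new Date(startedAt);
  const end = new Date(endedAt);
  if (Number.isNaN(start.getTime()) || Number.isNaN(end.getTime()) || end < start) {
    throw new Error("startedAt and endedAt must be valid, ordered timestamps");
  }
  return {
    server_name: serverName,
    tool_name: toolName,
    started_at: start.toISOString(),
    ended_at: end.toISOString(),
    latency_ms: end - start,
    status,
    project_id: projectId,
  };
}

// Options object suitable for fetch(); the body is a JSON array of spans.
function spanIngestRequest(apiKey, spans) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json", "X-API-Key": apiKey },
    body: JSON.stringify(spans),
  };
}
```

A caller would pass the result to fetch, for example fetch("http://localhost:8000/api/traces/spans", spanIngestRequest(key, [span])).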
Agent Sessions
| Method | Path | Description |
|---|---|---|
| GET | /api/agents/sessions | List agent sessions with aggregated cost, call count, failure count |
| GET | /api/agents/sessions/{session_id} | Full span tree for one session with parent_span_id hierarchy |
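If a client receives the spans of a session as a flat list, the parent_span_id hierarchy can be rebuilt as follows. This is a sketch: the span_id field name is an assumption (only parent_span_id is named above), and the endpoint may already return a nested structure.

```javascript
// Turn a flat list of spans into a forest keyed by parent_span_id.
function buildSpanTree(spans) {
  const nodes = new Map();
  for (const span of spans) {
    nodes.set(span.span_id, { ...span, children: [] });
  }
  const roots = [];
  for (const node of nodes.values()) {
    const parent = node.parent_span_id ? nodes.get(node.parent_span_id) : undefined;
    if (parent) {
      parent.children.push(node);
    } else {
      roots.push(node); // no parent, or the parent is not part of this list
    }
  }
  return roots;
}
```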
Session object:
{
"session_id": "sess-f2a9b1",
"agent_name": "orchestrator",
"first_call_at": "2026-03-17T14:02:31Z",
"last_call_at": "2026-03-17T14:02:32Z",
"duration_ms": 1482,
"tool_calls": 5,
"failed_calls": 1,
"servers_used": ["postgres-mcp", "slack-mcp"],
"health_tag": "tool_failure"
}
health_tag is assigned automatically by the v0.3 prevention layer when the session ends. The 8 possible values are:
| Value | Meaning |
|---|---|
| success | All tool calls completed without issue |
| success_with_fallback | Completed, but at least one circuit-breaker fallback was used |
| loop_detected | Session was terminated because a loop pattern was detected |
| budget_exceeded | Session was stopped because a cost, step, or time limit was hit |
| tool_failure | One or more tool calls failed (but no loop or budget event) |
| circuit_breaker_open | A tool call was blocked because its server's circuit breaker was open |
| timeout | The session exceeded the configured max_wall_time_s limit |
| schema_drift | A tool schema changed mid-session, triggering a drift alert |
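One way to use health_tag on the client side is to tally a list of session objects by tag. The sketch below is a hypothetical helper, not an API feature; it rejects any value outside the eight listed above.

```javascript
// The eight health_tag values documented above.
const HEALTH_TAGS = [
  "success",
  "success_with_fallback",
  "loop_detected",
  "budget_exceeded",
  "tool_failure",
  "circuit_breaker_open",
  "timeout",
  "schema_drift",
];

// Count sessions per tag; every known tag appears in the result, even at 0.
function countByHealthTag(sessions) {
  const counts = new Map(HEALTH_TAGS.map((tag) => [tag, 0]));
  for (const session of sessions) {
    if (!counts.has(session.health_tag)) {
      throw new Error(`unknown health_tag: ${session.health_tag}`);
    }
    counts.set(session.health_tag, counts.get(session.health_tag) + 1);
  }
  return Object.fromEntries(counts);
}
```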
Prevention Config
Per-agent and per-project prevention thresholds are managed from the dashboard or the API. Constructor parameters in the SDK are offline fallbacks; the values stored here take precedence whenever the SDK can reach the backend.
| Method | Path | Description |
|---|---|---|
| GET | /api/agents/{name}/prevention-config | Read prevention config for one agent |
| PUT | /api/agents/{name}/prevention-config | Create or update prevention config for one agent |
| DELETE | /api/agents/{name}/prevention-config | Remove per-agent config (project default applies) |
| GET | /api/projects/{project_id}/prevention-config | Read project-level default config (agent_name="*") |
| PUT | /api/projects/{project_id}/prevention-config | Create or update project-level default config |
| DELETE | /api/projects/{project_id}/prevention-config | Remove project-level default |
Prevention config object:
{
"id": "pvc-abc123",
"project_id": "my-project",
"agent_name": "orchestrator",
"loop_enabled": true,
"loop_threshold": 5,
"loop_action": "terminate",
"max_steps": 30,
"max_cost_usd": 1.00,
"max_wall_time_s": 120,
"budget_soft_alert": 0.80,
"cb_enabled": true,
"cb_failure_threshold": 5,
"cb_cooldown_seconds": 60,
"cb_half_open_max_calls": 2
}
Use agent_name: "*" for the project-level default that applies to all agents without a specific entry. All fields are optional on PUT — only supplied fields are updated.
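The fallback and partial-update rules can be sketched as two small functions. This assumes the per-agent entry replaces the project default as a whole rather than being merged field by field, which the text above does not specify; both function names are hypothetical.

```javascript
// Pick the agent-specific config if one exists, else the "*" project default.
function resolvePreventionConfig(configs, agentName) {
  return (
    configs.find((c) => c.agent_name === agentName) ??
    configs.find((c) => c.agent_name === "*") ??
    null
  );
}

// Mimic PUT semantics: only fields present in the patch are changed.
function applyPartialUpdate(existing, patch) {
  const supplied = Object.fromEntries(
    Object.entries(patch).filter(([, value]) => value !== undefined)
  );
  return { ...existing, ...supplied };
}
```

For example, sending only {"max_steps": 50} in a PUT body would leave loop_threshold, max_cost_usd, and the circuit-breaker fields at their stored values.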
Agent Topology
| Method | Path | Description |
|---|---|---|
| GET | /api/agents/lineage | Aggregated lineage graph of agents, MCP servers, and handoffs |
Agent Metadata
| Method | Path | Description |
|---|---|---|
| GET | /api/agents/metadata | List agent catalog metadata for the active project |
| GET | /api/agents/metadata/{agent_name} | Read metadata for one agent |
| PUT | /api/agents/metadata/{agent_name} | Create or update metadata for one agent |
| DELETE | /api/agents/metadata/{agent_name} | Delete metadata for one agent |
Auto-Discovery
| Method | Path | Description |
|---|---|---|
| POST | /api/agents/discover | Scan ClickHouse for distinct agent_name values and auto-register any missing from the catalog (admin only) |
| POST | /api/servers/discover | Scan ClickHouse for distinct server_name values and auto-register any missing from the catalog (admin only) |
Discovery response:
{
"discovered": 3,
"agents": ["supervisor", "analyst", "billing-agent"]
}
Auto-discovery also happens on every POST /api/traces/spans call — unseen agent_name and server_name values are registered automatically. The /discover endpoints are for batch backfill of existing trace data. See Auto-Discovery for details.
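Conceptually, discovery is a set difference between the names seen in trace data and the names already in the catalog. The sketch below illustrates that idea in memory; the real endpoints query ClickHouse, and discoverAgents is a hypothetical name.

```javascript
// Return agent names present in spans but missing from the catalog,
// in the same shape as the discovery response shown above.
function discoverAgents(spans, catalog) {
  const known = new Set(catalog);
  const found = new Set();
  for (const span of spans) {
    if (span.agent_name && !known.has(span.agent_name)) {
      found.add(span.agent_name);
    }
  }
  const agents = [...found];
  return { discovered: agents.length, agents };
}
```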
Projects
| Method | Path | Description |
|---|---|---|
| GET | /api/projects | List projects visible to the current user |
| POST | /api/projects | Create a new project (admin only) |
| GET | /api/projects/{project_id} | Get project details |
| PATCH | /api/projects/{project_id} | Update project name/settings (admin only) |
| DELETE | /api/projects/{project_id} | Delete project (admin only) |
Users
| Method | Path | Description |
|---|---|---|
| GET | /api/users | List users (admin only) |
| POST | /api/users/invite | Send an invitation link (admin only) |
| PATCH | /api/users/{user_id}/role | Change user role (admin only) |
| DELETE | /api/users/{user_id} | Deactivate user (admin only) |
SLOs
| Method | Path | Description |
|---|---|---|
| GET | /api/slos | List all defined SLOs |
| POST | /api/slos | Create a new SLO |
| GET | /api/slos/status | Current pass/fail status for all SLOs |
| PATCH | /api/slos/{slo_id} | Update an SLO definition |
| DELETE | /api/slos/{slo_id} | Delete an SLO |
SLO object:
{
"slo_id": "slo-abc123",
"agent_name": "support-agent",
"project_id": "my-project",
"metric": "success_rate",
"threshold": 0.99,
"window_hours": 24
}
Supported metric values: success_rate, latency_p99.
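To make the pass/fail idea concrete, here is a sketch of how an SLO could be evaluated over spans already restricted to the window. The comparison directions (success_rate at or above the threshold, latency_p99 at or below it, in milliseconds) and the nearest-rank percentile are assumptions; the server's actual computation may differ.

```javascript
// Nearest-rank percentile: p is a fraction such as 0.99.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil(p * sorted.length));
  return sorted[rank - 1];
}

// Evaluate one SLO against spans that already fall inside window_hours.
function evaluateSlo(slo, spans) {
  if (spans.length === 0) return { value: null, passed: null }; // nothing to judge
  if (slo.metric === "success_rate") {
    const value = spans.filter((s) => s.status === "success").length / spans.length;
    return { value, passed: value >= slo.threshold };
  }
  if (slo.metric === "latency_p99") {
    const value = percentile(spans.map((s) => s.latency_ms), 0.99);
    return { value, passed: value <= slo.threshold };
  }
  throw new Error(`unsupported metric: ${slo.metric}`);
}
```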
Costs
| Method | Path | Description |
|---|---|---|
| GET | /api/costs/breakdown | Cost breakdown by model and tool call (project-scoped) |
| GET | /api/costs/models | List model pricing records |
| POST | /api/costs/models | Add a custom model pricing record (admin only) |
| PATCH | /api/costs/models/{model_id} | Update model pricing (admin only) |
| DELETE | /api/costs/models/{model_id} | Remove a model pricing record (admin only) |
Query params for /api/costs/breakdown: project_id, window (24h, 7d, 30d).
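A small helper can build the breakdown URL and reject unsupported windows before a request is sent. costBreakdownUrl is a hypothetical name; the base URL is whatever address your API is served from.

```javascript
// Window values listed above.
const COST_WINDOWS = new Set(["24h", "7d", "30d"]);

// Build the query URL for GET /api/costs/breakdown.
function costBreakdownUrl(baseUrl, projectId, windowName) {
  if (!COST_WINDOWS.has(windowName)) {
    throw new Error(`window must be one of: ${[...COST_WINDOWS].join(", ")}`);
  }
  const url = new URL("/api/costs/breakdown", baseUrl);
  url.searchParams.set("project_id", projectId);
  url.searchParams.set("window", windowName);
  return url.toString();
}
```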
Reliability
| Method | Path | Description |
|---|---|---|
| GET | /api/reliability/anomalies | Anomalies detected via z-score vs 7-day baseline |
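A z-score measures how many standard deviations the current value sits from the baseline mean. The sketch below shows the arithmetic with a population standard deviation; whether the server uses population or sample variance, and which z-score cutoff it applies, is not stated here.

```javascript
// z = (current - mean(baseline)) / std(baseline), population standard deviation.
function zScore(current, baseline) {
  if (baseline.length === 0) return null;
  const mean = baseline.reduce((sum, x) => sum + x, 0) / baseline.length;
  const variance =
    baseline.reduce((sum, x) => sum + (x - mean) ** 2, 0) / baseline.length;
  const std = Math.sqrt(variance);
  if (std === 0) return null; // flat baseline: the z-score is undefined
  return (current - mean) / std;
}
```

With a baseline of [2, 4, 4, 4, 5, 5, 7, 9] (mean 5, standard deviation 2), a current value of 11 gives a z-score of 3.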
Alerts
| Method | Path | Description |
|---|---|---|
| GET | /api/alerts/config | Read current Slack webhook URL and per-type alert preferences |
| POST | /api/alerts/config | Save Slack webhook URL and alert type preferences |
| POST | /api/alerts/test | Send a test Slack Block Kit message to the configured webhook |
Alert config object (request and response body for POST /api/alerts/config):
{
"slack_webhook": "https://hooks.slack.com/services/T.../B.../xxx",
"alert_types": {
"mcp_down": true,
"mcp_recovered": true,
"agent_failure": true,
"slo_breached": true,
"anomaly_critical": true,
"security_critical": true,
"loop_detected": true,
"budget_exceeded": true,
"circuit_breaker_open": true
}
}
POST /api/alerts/test returns {"ok": true} on success or {"ok": false, "error": "<message>"} if the webhook call fails.
Instance Settings
| Method | Path | Auth required | Description |
|---|---|---|---|
| GET | /api/settings | API key | Read global instance settings (e.g., redact_payloads) |
| PUT | /api/settings | API key + admin role | Update global instance settings |
Both /api/settings endpoints require authentication (added v0.6.2). GET requires a valid API key. PUT requires a valid API key and admin role. Unauthenticated requests return 401.
Settings object:
{
"redact_payloads": false
}
When redact_payloads is true, the server strips input_args, output_result, llm_input, and llm_output from all incoming spans before storage — overriding individual SDK settings. See Configuration for details.
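The effect of redact_payloads on a span can be pictured as dropping four fields before storage. This is an illustrative sketch with a hypothetical redactSpan helper, not the server's code.

```javascript
// Payload fields named above that are stripped when redaction is on.
const PAYLOAD_FIELDS = ["input_args", "output_result", "llm_input", "llm_output"];

// Return a copy of the span without payload fields when redaction is enabled.
function redactSpan(span, settings) {
  if (!settings.redact_payloads) return span;
  const copy = { ...span };
  for (const field of PAYLOAD_FIELDS) {
    delete copy[field];
  }
  return copy;
}
```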
Audit
| Method | Path | Description |
|---|---|---|
| GET | /api/audit/logs | List recent audit log events (auth and RBAC actions) |
Query params: limit (default 50, max 200), offset (default 0).
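A client can apply the same defaults and ceiling before calling the endpoint. The sketch below clamps values locally; how the server reacts to an out-of-range limit (clamping or rejecting) is not documented here, and auditLogQuery is a hypothetical helper.

```javascript
// Build the audit log query string with limit in [1, 200] and offset >= 0.
function auditLogQuery({ limit = 50, offset = 0 } = {}) {
  const safeLimit = Math.min(Math.max(1, Math.trunc(limit)), 200);
  const safeOffset = Math.max(0, Math.trunc(offset));
  return `/api/audit/logs?limit=${safeLimit}&offset=${safeOffset}`;
}
```

Paging through results then means calling auditLogQuery({ limit: 200, offset: 0 }), auditLogQuery({ limit: 200, offset: 200 }), and so on until a page comes back with fewer than 200 events.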
Audit event object:
{
"event_id": "evt-abc123",
"timestamp": "2026-03-19T10:42:00Z",
"actor": "a***@example.com",
"action": "api_key.created",
"resource": "key:ls_xxxxxxxx",
"result": "success"
}
Captured actions: user.login, user.login_failed, api_key.created, api_key.revoked, user.role_changed, user.invited, user.deactivated, project.created, project.deleted, alerts.config_saved, model_pricing.updated.
The audit log is persisted to the audit_logs PostgreSQL table. Writes are asynchronous (via asyncio.create_task) and never block the request path. Email addresses are masked (e.g., a***@example.com) before storage.
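The masking format in the example (a***@example.com) keeps the first character of the local part and the full domain. A minimal sketch of that transformation, assuming this is the whole rule, could look like this; maskEmail is a hypothetical name.

```javascript
// Keep the first character and the domain; replace the rest with "***".
function maskEmail(email) {
  const at = email.lastIndexOf("@");
  if (at < 1) return email; // not email-shaped; leave untouched
  return `${email[0]}***${email.slice(at)}`;
}
```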
Live Events (SSE)
| Method | Path | Description |
|---|---|---|
| GET | /api/live/events | Server-Sent Events stream for real-time dashboard updates |
The /api/live/events endpoint requires authentication (same as all other API routes). It streams events as they occur:
- span:new is fired when a new tool call span is ingested. The payload includes project_id; clients should filter events by project_id to avoid processing spans from other projects on the same instance.
- health:check is fired when a health check completes.
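The stream sends keepalive comments (: keepalive) every 15 seconds to prevent proxy timeouts. Connect with any SSE-compatible client. The native browser EventSource constructor accepts only a withCredentials option and ignores custom headers, so sending X-API-Key as in the example below requires an SSE client that supports request headers.
// Assumes an EventSource implementation that accepts a headers option;
// the native browser EventSource does not.
const source = new EventSource("/api/live/events", {
headers: { "X-API-Key": "ls_your_key" }
});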
source.addEventListener("span:new", (e) => {
const span = JSON.parse(e.data);
console.log(`New span: ${span.tool_name} — ${span.status}`);
});
source.addEventListener("health:check", (e) => {
const check = JSON.parse(e.data);
console.log(`Health: ${check.server_name} — ${check.status}`);
});
The SSE broadcaster supports up to 200 concurrent clients with a 50-event buffer per client. If a client falls behind, the oldest events are dropped. For production monitoring dashboards, see Monitoring for Prometheus-based metrics instead.
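The drop-oldest behaviour can be pictured as a bounded queue per client. The class below is a sketch of that policy only; BoundedEventBuffer is a hypothetical name and says nothing about how the broadcaster is actually implemented.

```javascript
// Fixed-capacity queue that discards the oldest event when full.
class BoundedEventBuffer {
  constructor(capacity = 50) {
    this.capacity = capacity;
    this.events = [];
    this.dropped = 0;
  }

  // Append an event; evict the oldest one if the buffer overflows.
  push(event) {
    this.events.push(event);
    if (this.events.length > this.capacity) {
      this.events.shift();
      this.dropped += 1;
    }
  }

  // Hand over everything buffered so far and start empty again.
  drain() {
    const out = this.events;
    this.events = [];
    return out;
  }
}
```

A slow consumer that misses events sees a gap rather than a delay, which is why clients that need complete data should query the REST endpoints instead of relying on the stream.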
Prometheus Metrics
| Method | Path | Description |
|---|---|---|
| GET | /metrics | Prometheus text exposition format; no authentication required |
Available metrics:
| Metric | Type | Labels |
|---|---|---|
| langsight_http_requests_total | Counter | method, path, status |
| langsight_http_request_duration_seconds | Histogram | method, path |
| langsight_spans_ingested_total | Counter | (none) |
| langsight_active_sse_connections | Gauge | (none) |
| langsight_health_checks_total | Counter | server, status |
See Monitoring for Prometheus scrape configuration and Grafana dashboard setup.
Anomaly object (as returned by GET /api/reliability/anomalies):
{
"server_name": "jira-mcp",
"tool_name": "get_issue",
"metric": "latency_p99",
"z_score": 4.2,
"current_value": 1842.0,
"baseline_value": 142.0,
"detected_at": "2026-03-19T08:14:00Z"
}
Quick test
# Liveness
curl http://localhost:8000/api/liveness
# Readiness (checks storage)
curl http://localhost:8000/api/readiness
# Latest server health (authenticated)
curl http://localhost:8000/api/health/servers \
-H "X-API-Key: ls_your_key" | jq .
# Send a span
curl -X POST http://localhost:8000/api/traces/spans \
-H "Content-Type: application/json" \
-H "X-API-Key: ls_your_key" \
-d '[{"server_name":"pg","tool_name":"query","started_at":"2026-03-17T12:00:00Z","ended_at":"2026-03-17T12:00:00.042Z","latency_ms":42,"status":"success","project_id":"default"}]'