Anomaly Detection
LangSight detects statistically unusual tool behaviour by comparing current metrics against a 7-day rolling baseline using z-score analysis. This catches problems that threshold-based alerts miss: a tool that’s normally noisy won’t alert, while a reliable tool that suddenly spikes will.
How it works
For each tool, LangSight computes a baseline (mean and standard deviation) from the last 7 days of hourly data. Current metrics are compared against that baseline, and the resulting z-score determines the severity:
- Warning: |z| >= 2.0
- Critical: |z| >= 3.0
Metrics analysed:
- error_rate: fraction of failed calls
- avg_latency_ms: mean call latency
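The z-score check described above can be sketched in a few lines of Python. This is an illustration only, not LangSight's implementation: the function names `z_score` and `classify` are hypothetical, the use of the population standard deviation is an assumption, and so is the handling of a perfectly flat baseline.

```python
import math
from statistics import mean, pstdev
from typing import Optional, Sequence


def z_score(current: float, baseline: Sequence[float]) -> float:
    """Return how many standard deviations `current` lies from the baseline mean."""
    mu = mean(baseline)
    sigma = pstdev(baseline)  # assumption: population standard deviation
    if sigma == 0:
        # Assumption: with a flat baseline, any change counts as an infinite deviation.
        if current == mu:
            return 0.0
        return math.copysign(math.inf, current - mu)
    return (current - mu) / sigma


def classify(z: float) -> Optional[str]:
    """Map a z-score to the severities listed above (None means no anomaly)."""
    if abs(z) >= 3.0:
        return "critical"
    if abs(z) >= 2.0:
        return "warning"
    return None


# Hourly avg_latency_ms samples (made-up values for illustration).
steady = [100, 110, 90, 105, 95]
noisy = [50, 300, 20, 400, 80]

print(classify(z_score(125, steady)))  # critical: a reliable tool that spikes
print(classify(z_score(400, noisy)))   # None: a normally noisy tool stays quiet
```

The two sample series show why a z-score behaves differently from a fixed threshold: a latency of 400 ms is unremarkable for the noisy tool, while 125 ms is already far outside the steady tool's usual range.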
API
| Parameter | Default | Description |
|---|---|---|
| current_hours | 1 | Time window for current metrics |
| baseline_hours | 168 | Baseline window (default 7 days) |
| z_threshold | 2.0 | Z-score threshold to fire an anomaly (1.0–5.0) |
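To make the parameters concrete, here is a minimal Python sketch of how their defaults and the documented z_threshold range could be modelled. The `AnomalyParams` class and its methods are hypothetical and are not part of LangSight; the requirement that both windows be at least one hour is also an assumption, as is measuring both windows back from the same point in time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AnomalyParams:
    """Hypothetical container mirroring the parameter table above."""

    current_hours: int = 1
    baseline_hours: int = 168  # 7 days of hourly data
    z_threshold: float = 2.0

    def __post_init__(self) -> None:
        # The table documents 1.0-5.0 as the allowed z_threshold range.
        if not 1.0 <= self.z_threshold <= 5.0:
            raise ValueError("z_threshold must be between 1.0 and 5.0")
        # Assumption: windows shorter than one hour are rejected.
        if self.current_hours < 1 or self.baseline_hours < 1:
            raise ValueError("window sizes must be at least 1 hour")

    def current_start(self, now: datetime) -> datetime:
        """Start of the window used for current metrics."""
        return now - timedelta(hours=self.current_hours)

    def baseline_start(self, now: datetime) -> datetime:
        """Start of the baseline window."""
        return now - timedelta(hours=self.baseline_hours)


params = AnomalyParams()
now = datetime(2024, 1, 8, 12, 0, tzinfo=timezone.utc)
print(params.current_start(now))   # one hour before `now`
print(params.baseline_start(now))  # seven days before `now`
```

Raising z_threshold makes detection stricter; for example, `AnomalyParams(z_threshold=3.0)` would only report deviations that the default settings classify as critical.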
Dashboard
The Overview page shows an “Anomalies Detected” card that polls every 60 seconds. It shows the count of current anomalies with a critical/warning breakdown.
Requires ClickHouse
Anomaly detection uses the mv_tool_reliability materialized view in ClickHouse. It requires storage.mode: clickhouse or storage.mode: dual (the default). When running with storage.mode: postgres only, anomaly detection returns an empty list.