Troubleshooting

ClickHouse unreachable

Symptom: `langsight serve` or the API returns errors like `ClickHouse connection refused`, or the readiness probe at `/api/readiness` shows `{"status": "not_ready", "storage": {"clickhouse": "error: ..."}}`.

Causes and fixes:

Cause	Fix
ClickHouse container not running	`docker compose up -d clickhouse` — wait 10 seconds then retry
Wrong `LANGSIGHT_CLICKHOUSE_URL`	Default is `http://clickhouse:8123` inside Docker Compose; `http://localhost:8123` outside
Wrong credentials	Check `CLICKHOUSE_USER` and `CLICKHOUSE_PASSWORD` match `clickhouse-users.xml`
Port 8123 not exposed	Check `docker compose ps` — the clickhouse service must map port 8123
ClickHouse starting up slowly	Wait 30 seconds after `docker compose up` before calling the API

Check ClickHouse logs:

docker compose logs clickhouse --tail=50
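
To see which backend the readiness probe is reporting as failed, the response can be inspected with a few lines of Python. This is a minimal sketch, not part of LangSight: it assumes the payload shape shown in the symptom above and that a failing backend is reported as a string starting with `error`. The helper names `failing_backends` and `fetch_readiness` are hypothetical, and the base URL is the SDK default.

```python
import json
import urllib.error
import urllib.request


def failing_backends(payload: dict) -> dict:
    """Return the storage backends whose status string starts with 'error'."""
    storage = payload.get("storage", {})
    return {
        name: status
        for name, status in storage.items()
        if isinstance(status, str) and status.startswith("error")
    }


def fetch_readiness(base_url: str = "http://localhost:8000") -> dict:
    """Fetch /api/readiness; an error response may still carry a JSON body."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/readiness", timeout=5) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as exc:
        # Assumption: the probe answers with JSON even on a non-2xx status.
        return json.load(exc)
```

Calling `failing_backends(fetch_readiness())` returns an empty dict when nothing is reported as failed. If the API itself is down, `fetch_readiness` raises `urllib.error.URLError`, which points at the API container rather than ClickHouse.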

PostgreSQL connection failed

Symptom: The API exits at startup with `asyncpg.InvalidCatalogNameError` or `Connection refused` on port 5432.

Causes and fixes:

Cause	Fix
Postgres container not running	`docker compose up -d postgres`
Wrong `LANGSIGHT_POSTGRES_URL`	Default: `postgresql://langsight:langsight@postgres:5432/langsight` inside Docker Compose
Database not created	Run `docker compose exec postgres createdb -U langsight langsight`
Migrations not run	`docker compose exec api uv run alembic upgrade head`

Check Postgres logs:

docker compose logs postgres --tail=50
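
When it is unclear whether the URL or the network is at fault, it can help to split `LANGSIGHT_POSTGRES_URL` into its parts and test the port directly. The following is a sketch using only the Python standard library; the function names are hypothetical and nothing here is part of LangSight.

```python
import socket
from urllib.parse import urlsplit


def describe_postgres_url(url: str) -> dict:
    """Split a postgresql:// URL into the parts that usually go wrong."""
    parts = urlsplit(url)
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port or 5432,  # 5432 is the PostgreSQL default port
        "database": parts.path.lstrip("/"),
    }


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If `port_is_open(info["host"], info["port"])` is false, look at the container and hostname rows of the table. If it is true but startup still fails with `InvalidCatalogNameError`, the `database` part most likely names a database that does not exist yet.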

Dashboard shows “Auth error” or NextAuth error page

Symptom: After login, the dashboard redirects to an error page or shows “Configuration” or “Authentication” errors.

Cause: `NEXTAUTH_SECRET` (or `LANGSIGHT_AUTH_SECRET`) is not set, or does not match between the Next.js dashboard and the FastAPI API.

Fix:

Generate a secret:

python -c "import secrets; print(secrets.token_hex(32))"

Set it in your .env file for both services:

NEXTAUTH_SECRET=<generated-value>
LANGSIGHT_AUTH_SECRET=<same-generated-value>

Restart both containers:
```
docker compose restart api dashboard
```

The secret must be identical on both sides — the Next.js proxy validates the session signature using the same key that signed it.
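
A quick way to confirm that both values really are identical is to read them back from the `.env` file and compare them. This is a minimal sketch: it assumes a plain `KEY=VALUE` file without multi-line values, and the helper names are hypothetical.

```python
import hmac


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines; skips blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def secrets_match(env: dict) -> bool:
    """True only if both secrets are set and byte-for-byte identical."""
    dashboard = env.get("NEXTAUTH_SECRET", "")
    api = env.get("LANGSIGHT_AUTH_SECRET", "")
    return bool(dashboard) and hmac.compare_digest(dashboard.encode(), api.encode())
```

For example, `secrets_match(parse_env(open(".env").read()))` should return `True` before you restart the containers.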

API returns 401 on all requests

Symptom: Every API call returns `{"detail": "Not authenticated"}` or `{"detail": "Invalid API key"}`.

Causes and fixes:

Cause	Fix
`LANGSIGHT_API_KEYS` not set	Add your key: `LANGSIGHT_API_KEYS=ls_yourkey`
Wrong header name	Use `X-API-Key: ls_yourkey`, not `Authorization: Bearer ...`
Key was revoked	Happens after the user changes their password; generate a new key in Settings → API Keys
Key belongs to deactivated user	Re-activate the user in Settings → Users
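
To rule out the header-name problem, it can help to build the request by hand and look at the status code. The sketch below uses only the standard library; the function names are hypothetical, and the URL you pass in should be an authenticated endpoint from your own deployment (none is assumed here).

```python
import urllib.error
import urllib.request


def build_request(url: str, api_key: str) -> urllib.request.Request:
    """Build a GET request that authenticates with the X-API-Key header."""
    # urllib stores the name as "X-api-key"; HTTP header names are case-insensitive.
    return urllib.request.Request(url, headers={"X-API-Key": api_key})


def status_for(url: str, api_key: str) -> int:
    """Send the request and return the HTTP status code (401 means rejected)."""
    try:
        with urllib.request.urlopen(build_request(url, api_key), timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code
```

If `status_for(...)` still returns 401 with the correct header, the remaining rows of the table (key not configured, revoked, or owned by a deactivated user) are the more likely causes.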

Spans not appearing in the dashboard

Symptom: Agent code is running but no sessions or spans appear in the dashboard.

Causes and fixes:

Cause	Fix
`LANGSIGHT_URL` not set or wrong	SDK reads this env var; default is `http://localhost:8000`
`LANGSIGHT_TEST_MODE=1` is set	Remove this env var — test mode silently drops all spans
`LANGSIGHT_PROJECT_ID` wrong	Spans are stored under the project ID — check the dashboard project selector
API key missing	Set `LANGSIGHT_API_KEY` if the API requires authentication
Wrong API key	The key must match one of the values in `LANGSIGHT_API_KEYS`
Firewall blocking outbound to port 8000	Agent must be able to reach the API; check network policies

Quick check: confirm the API is reachable from the machine running the agent:

curl -s http://localhost:8000/api/liveness
# Should return: {"status": "alive"}
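
The environment-related rows of the table can also be checked in one pass from inside the agent process. This is a sketch under stated assumptions: the function `diagnose_sdk_env` is hypothetical, it only inspects the variables named above, and it cannot tell whether `LANGSIGHT_PROJECT_ID` or the key value is correct.

```python
def diagnose_sdk_env(env: dict) -> list:
    """Return human-readable warnings for the SDK variables listed above."""
    problems = []
    if not env.get("LANGSIGHT_URL"):
        problems.append(
            "LANGSIGHT_URL is not set; the SDK falls back to http://localhost:8000"
        )
    if env.get("LANGSIGHT_TEST_MODE") == "1":
        problems.append("LANGSIGHT_TEST_MODE=1 is set; spans are silently dropped")
    if not env.get("LANGSIGHT_API_KEY"):
        problems.append(
            "LANGSIGHT_API_KEY is not set; needed if the API requires authentication"
        )
    return problems
```

Call it as `diagnose_sdk_env(dict(os.environ))` (after `import os`) in the same environment the agent runs in, since a shell and a container often see different variables.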

/metrics returns 503

Symptom: Prometheus targets show DOWN and scraping returns HTTP 503.

Cause: `LANGSIGHT_METRICS_TOKEN` is not set. When the token is not configured, the `/metrics` endpoint intentionally returns 503 rather than exposing metrics to any caller.

Fix:

# Generate a token
python -c "import secrets; print(secrets.token_hex(32))"

# Add to your environment
LANGSIGHT_METRICS_TOKEN=<generated-value>

# Restart the API
docker compose restart api

Update your Prometheus scrape config to include the token:

scrape_configs:
  - job_name: langsight
    bearer_token: <generated-value>
    static_configs:
      - targets: ["localhost:8000"]
    metrics_path: /metrics

If the token is correct but scraping still fails, verify the token does not contain trailing whitespace.
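
The whitespace check and a manual scrape can be sketched in Python as follows. Prometheus sends a `bearer_token` as an `Authorization: Bearer <token>` header, so the request below mimics that; the function names are hypothetical and the base URL matches the scrape target above.

```python
import urllib.request


def token_has_stray_whitespace(token: str) -> bool:
    """True if the token starts or ends with whitespace (e.g. a trailing newline)."""
    return token != token.strip()


def build_scrape_request(
    token: str, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Build the same request Prometheus would send when given bearer_token."""
    return urllib.request.Request(
        f"{base_url}/metrics",
        headers={"Authorization": f"Bearer {token.strip()}"},
    )
```

Passing the result to `urllib.request.urlopen` should return the metrics text. An `HTTPError` with code 503 means the API process still does not see `LANGSIGHT_METRICS_TOKEN`.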

CORS errors in the browser

Symptom: The browser console shows `CORS policy: No 'Access-Control-Allow-Origin' header`.

Cause: `LANGSIGHT_CORS_ORIGINS` does not include the dashboard origin.

Fix:

# Allow the dashboard origin (adjust port/domain for your setup)
LANGSIGHT_CORS_ORIGINS=https://langsight.example.com,http://localhost:3003

Set this to a comma-separated list of all origins that need to reach the API. The default is http://localhost:3003.

Do not set LANGSIGHT_CORS_ORIGINS=* in production. This allows any website to make authenticated requests to your API using your users’ sessions.
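
To check whether a given browser origin is covered by the configured value, the comma-separated list can be parsed and compared directly. This is an illustrative sketch, not LangSight's own parsing: it assumes entries are trimmed of surrounding whitespace and matched exactly (scheme, host, and port), and the function names are hypothetical.

```python
def parse_origins(raw: str) -> list:
    """Split the comma-separated value into trimmed, non-empty origins."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def check_origin(origin: str, raw: str) -> str:
    """Classify an origin as 'allowed', 'missing', or covered by a 'wildcard'."""
    origins = parse_origins(raw)
    if "*" in origins:
        return "wildcard"  # works, but unsafe in production (see the warning above)
    return "allowed" if origin in origins else "missing"
```

Use the origin exactly as the browser reports it in the console error. `http://localhost:3000` and `http://localhost:3003` are different origins, as are `http` and `https` variants of the same host.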

Redis connection refused

Symptom: The API fails to start with `redis.exceptions.ConnectionError: Error connecting to redis://redis:6379`.

Causes and fixes:

Cause	Fix
Redis container not running	`docker compose --profile redis up -d redis`
Wrong password	Check `REDIS_PASSWORD` in `.env` matches the password in `LANGSIGHT_REDIS_URL`
Wrong hostname	Inside Docker Compose: use `redis` (service name). Outside: use `localhost`
Forgot `--profile redis`	Redis is an optional profile — start with `docker compose --profile redis up -d`

Test the connection:

docker compose exec redis redis-cli -a "${REDIS_PASSWORD}" ping
# Expected: PONG
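
The password and hostname rows of the table can be checked by taking `LANGSIGHT_REDIS_URL` apart. The following is a sketch using the standard library only; `redis_url_problems` is a hypothetical helper, and the expected hostnames come from the table above.

```python
from urllib.parse import unquote, urlsplit


def redis_url_problems(url: str, redis_password: str, inside_compose: bool) -> list:
    """Compare the URL's password and host with what the table above expects."""
    parts = urlsplit(url)
    problems = []
    # Passwords with special characters are percent-encoded inside URLs.
    url_password = unquote(parts.password) if parts.password is not None else ""
    if url_password != redis_password:
        problems.append("password in LANGSIGHT_REDIS_URL differs from REDIS_PASSWORD")
    expected_host = "redis" if inside_compose else "localhost"
    if parts.hostname != expected_host:
        problems.append(f"host is {parts.hostname!r}, expected {expected_host!r}")
    return problems
```

An empty list means the URL is consistent with `REDIS_PASSWORD` and the expected hostname, which leaves the container and profile rows as the likely cause.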

For full Redis setup, monitoring, HA, and performance tuning, see Redis.

Redis timeout errors

Symptom: Intermittent `redis.exceptions.TimeoutError` in API logs.

Causes and fixes:

Cause	Fix
Redis overloaded	Check `redis-cli info stats` — look at `instantaneous_ops_per_sec`
Network latency	Redis should be on the same host or Docker network as the API
Connection pool exhaustion	Reduce `LANGSIGHT_WORKERS` or check for connection leaks
Slow Lua scripts	Check `redis-cli slowlog get 10` for circuit breaker CAS operations
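
If you want to track `instantaneous_ops_per_sec` over time rather than read it by eye, the `info` output is easy to parse: it consists of `key:value` lines grouped under `# Section` headers. A minimal sketch, with hypothetical function names:

```python
def parse_redis_info(text: str) -> dict:
    """Parse `redis-cli info` output into a dict of string values."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and section headers
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key] = value
    return info


def ops_per_sec(text: str) -> int:
    """Extract instantaneous_ops_per_sec, or 0 if the field is absent."""
    return int(parse_redis_info(text).get("instantaneous_ops_per_sec", 0))
```

Feed it the captured output of `redis-cli info stats`, sampled a few times during a period when the timeouts occur. What counts as overloaded depends on your hardware, so compare against a quiet period rather than a fixed threshold.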

Workers > 1 startup error

Symptom: API fails to start with:

RuntimeError: LANGSIGHT_WORKERS=4 is not safe with the current in-memory rate limiter.
Set LANGSIGHT_REDIS_URL to enable shared state across workers.

Cause: `LANGSIGHT_WORKERS` is set to more than 1 but `LANGSIGHT_REDIS_URL` is not configured.

Fix: Either set `LANGSIGHT_WORKERS=1` (the default) or configure Redis:

# Option 1: Single worker (no Redis needed)
LANGSIGHT_WORKERS=1

# Option 2: Multi-worker with Redis
LANGSIGHT_REDIS_URL=redis://:${REDIS_PASSWORD}@redis:6379
LANGSIGHT_WORKERS=4

The in-memory rate limiter gives each worker an independent counter. Under N workers, the effective rate limit becomes limit×N, which breaks brute-force protection. This is enforced at startup to prevent silent misconfiguration. See Redis for setup instructions and Production Hardening for the full multi-worker guide.
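
The arithmetic and the startup guard described above can be illustrated with a short sketch. This is not LangSight's actual implementation; the function names are hypothetical, and only the environment variable names and the error text are taken from this page.

```python
def effective_limit(limit: int, workers: int, shared_state: bool) -> int:
    """Requests actually allowed per window across all workers."""
    # Without shared state each worker counts independently, so limits add up.
    return limit if shared_state else limit * workers


def check_worker_config(env: dict) -> None:
    """Refuse multi-worker startup unless LANGSIGHT_REDIS_URL is configured."""
    workers = int(env.get("LANGSIGHT_WORKERS", "1"))
    if workers > 1 and not env.get("LANGSIGHT_REDIS_URL"):
        raise RuntimeError(
            f"LANGSIGHT_WORKERS={workers} is not safe with the current "
            "in-memory rate limiter. Set LANGSIGHT_REDIS_URL to enable "
            "shared state across workers."
        )
```

For example, a limit of 5 login attempts per window with 4 workers and no Redis gives `effective_limit(5, 4, shared_state=False) == 20`, four times what the configuration appears to promise.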

Admin bootstrap does not run

Symptom: The first run completes but no admin user is created, so you cannot log in.

Causes and fixes:

Cause	Fix
`LANGSIGHT_ADMIN_EMAIL` not set	Set both `LANGSIGHT_ADMIN_EMAIL` and `LANGSIGHT_ADMIN_PASSWORD` before the first run
`LANGSIGHT_ADMIN_PASSWORD` too short	Must be at least 12 characters
Admin already exists	Bootstrap only runs when the user table is empty; existing installs are not affected
Migrations not run	Bootstrap runs after Alembic migrations; run `alembic upgrade head` first
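
The two environment-variable rows can be checked before the first run with a small helper. This is a sketch based only on the rules in the table (both variables set, password at least 12 characters); `bootstrap_problems` is a hypothetical name, and LangSight may apply further validation that is not covered here.

```python
def bootstrap_problems(env: dict) -> list:
    """Return reasons the admin bootstrap would be skipped, per the table above."""
    problems = []
    if not env.get("LANGSIGHT_ADMIN_EMAIL"):
        problems.append("LANGSIGHT_ADMIN_EMAIL is not set")
    password = env.get("LANGSIGHT_ADMIN_PASSWORD", "")
    if not password:
        problems.append("LANGSIGHT_ADMIN_PASSWORD is not set")
    elif len(password) < 12:
        problems.append("LANGSIGHT_ADMIN_PASSWORD is shorter than 12 characters")
    return problems
```

An empty result means the variables satisfy the documented rules, so the remaining causes are an existing admin user or migrations that have not run.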

If the user table already has data and you need to reset it:

# Remove all users (destructive)
docker compose exec postgres psql -U langsight -c "TRUNCATE users, invite_tokens CASCADE;"
# Restart to trigger bootstrap
docker compose restart api

Related

Configuration Reference: all environment variables
Redis: Redis setup, HA, monitoring, and troubleshooting
Production Hardening: reverse proxy, firewall, TLS
Backup & Restore: recover from data loss
Docker Compose: full stack setup
