ClickHouse unreachable

Symptom: langsight serve or the API returns errors like ClickHouse connection refused, or the readiness probe at /api/readiness shows {"status": "not_ready", "storage": {"clickhouse": "error: ..."}}.
Causes and fixes:
| Cause | Fix |
| --- | --- |
| ClickHouse container not running | Run docker compose up -d clickhouse; wait 10 seconds, then retry |
| Wrong LANGSIGHT_CLICKHOUSE_URL | Default is http://clickhouse:8123 inside Docker Compose; http://localhost:8123 outside |
| Wrong credentials | Check that CLICKHOUSE_USER and CLICKHOUSE_PASSWORD match clickhouse-users.xml |
| Port 8123 not exposed | Check docker compose ps; the clickhouse service must map port 8123 |
| ClickHouse starting up slowly | Wait 30 seconds after docker compose up before calling the API |
Check ClickHouse logs:
docker compose logs clickhouse --tail=50
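When several storage backends are configured, it can help to extract the failing ones from the readiness response programmatically. The sketch below parses a body shaped like the symptom above; the function name `failing_backends` is hypothetical, and it assumes that an unhealthy backend always reports a string starting with "error", which the source shows only for ClickHouse.

```python
import json


def failing_backends(readiness_body: str) -> dict[str, str]:
    """Return {backend: message} for each storage backend reporting an error."""
    payload = json.loads(readiness_body)
    storage = payload.get("storage", {})
    return {
        name: state
        for name, state in storage.items()
        # Assumption: unhealthy backends report a string beginning with "error".
        if isinstance(state, str) and state.startswith("error")
    }


# Sample body modelled on the symptom above; the error text is illustrative.
sample = '{"status": "not_ready", "storage": {"clickhouse": "error: connection refused"}}'
print(failing_backends(sample))
```

To use it against a live instance, pass in the body returned by a GET on /api/readiness.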

PostgreSQL connection failed

Symptom: API exits at startup with asyncpg.InvalidCatalogNameError or Connection refused on port 5432.
Causes and fixes:
| Cause | Fix |
| --- | --- |
| Postgres container not running | Run docker compose up -d postgres |
| Wrong LANGSIGHT_POSTGRES_URL | Default: postgresql://langsight:langsight@postgres:5432/langsight inside Docker Compose |
| Database not created | Run docker compose exec postgres createdb -U langsight langsight |
| Migrations not run | Run docker compose exec api uv run alembic upgrade head |
Check Postgres logs:
docker compose logs postgres --tail=50
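A mistyped connection URL is easier to spot once it is split into its parts. The following is a minimal sketch using only the standard library; `describe_postgres_url` is a hypothetical helper, not part of LangSight, and the fallback to port 5432 is the PostgreSQL default rather than documented LangSight behaviour.

```python
from urllib.parse import urlparse


def describe_postgres_url(url: str) -> dict[str, object]:
    """Split a PostgreSQL URL into the fields worth checking by eye."""
    parts = urlparse(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port or 5432,  # PostgreSQL's default port
        "database": parts.path.lstrip("/"),
    }


info = describe_postgres_url("postgresql://langsight:langsight@postgres:5432/langsight")
for field, value in info.items():
    print(f"{field}: {value}")
```

The host postgres is the Compose service name, so it resolves only from containers on the same Docker network. An empty database field would be consistent with the InvalidCatalogNameError symptom.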

Dashboard shows “Auth error” or NextAuth error page

Symptom: After login, the dashboard redirects to an error page or shows “Configuration” or “Authentication” errors.
Cause: NEXTAUTH_SECRET (or LANGSIGHT_AUTH_SECRET) is not set or does not match between the Next.js dashboard and the FastAPI API.
Fix:
  1. Generate a secret:
    python -c "import secrets; print(secrets.token_hex(32))"
    
  2. Set it in your .env file for both services:
    NEXTAUTH_SECRET=<generated-value>
    LANGSIGHT_AUTH_SECRET=<same-generated-value>
    
  3. Restart both containers:
    docker compose restart api dashboard
    
The secret must be identical on both sides — the Next.js proxy validates the session signature using the same key that signed it.
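The "identical on both sides" requirement can be checked before restarting anything. This sketch assumes you have loaded the values from your .env file into a dictionary; `auth_secret_problems` is a hypothetical helper and nothing here is LangSight's own validation code.

```python
import hmac


def auth_secret_problems(env: dict[str, str]) -> list[str]:
    """List problems with the two auth secrets described above."""
    problems = []
    dashboard_secret = env.get("NEXTAUTH_SECRET", "")
    api_secret = env.get("LANGSIGHT_AUTH_SECRET", "")
    if not dashboard_secret:
        problems.append("NEXTAUTH_SECRET is not set")
    if not api_secret:
        problems.append("LANGSIGHT_AUTH_SECRET is not set")
    # Constant-time comparison avoids leaking how many characters match.
    if dashboard_secret and api_secret and not hmac.compare_digest(
        dashboard_secret.encode(), api_secret.encode()
    ):
        problems.append("NEXTAUTH_SECRET and LANGSIGHT_AUTH_SECRET differ")
    return problems


# Illustrative values only; real secrets come from secrets.token_hex(32).
print(auth_secret_problems({"NEXTAUTH_SECRET": "abc", "LANGSIGHT_AUTH_SECRET": "abd"}))
```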

API returns 401 on all requests

Symptom: Every API call returns {"detail": "Not authenticated"} or {"detail": "Invalid API key"}.
Causes and fixes:
| Cause | Fix |
| --- | --- |
| LANGSIGHT_API_KEYS not set | Add your key: LANGSIGHT_API_KEYS=ls_yourkey |
| Wrong header name | Use X-API-Key: ls_yourkey, not Authorization: Bearer ... |
| Key was revoked | This happens when the user changes their password; generate a new key in Settings → API Keys |
| Key belongs to deactivated user | Re-activate the user in Settings → Users |
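The header-name mistake is the easiest of these to reproduce in a script. The sketch below builds a request carrying the key in X-API-Key, as the table requires; `build_request` is a hypothetical helper, and the URL is only a placeholder, since the source does not say which endpoints require a key.

```python
import urllib.request


def build_request(url: str, api_key: str) -> urllib.request.Request:
    """Attach the key in the X-API-Key header (not Authorization: Bearer)."""
    return urllib.request.Request(url, headers={"X-API-Key": api_key})


# Placeholder URL and key; substitute an endpoint your deployment exposes.
request = build_request("http://localhost:8000/api/readiness", "ls_yourkey")

# Inspect what would be sent, with header names lowercased for comparison.
sent = {name.lower(): value for name, value in request.header_items()}
print(sent)

# To actually send it: urllib.request.urlopen(request, timeout=5)
```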

Spans not appearing in the dashboard

Symptom: Agent code is running but no sessions or spans appear in the dashboard.
Causes and fixes:
| Cause | Fix |
| --- | --- |
| LANGSIGHT_URL not set or wrong | SDK reads this env var; default is http://localhost:8000 |
| LANGSIGHT_TEST_MODE=1 is set | Remove this env var; test mode silently drops all spans |
| LANGSIGHT_PROJECT_ID wrong | Spans are stored under the project ID; check the dashboard project selector |
| API key missing | Set LANGSIGHT_API_KEY if the API requires authentication |
| Wrong API key | The key must match one of the values in LANGSIGHT_API_KEYS |
| Firewall blocking outbound to port 8000 | Agent must be able to reach the API; check network policies |
Quick check: confirm the API is reachable from the machine running the agent:
curl -s http://localhost:8000/api/liveness
# Should return: {"status": "alive"}
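Most of the causes in this section come down to environment variables in the agent's process. As a rough aid, the sketch below inspects a dictionary of variables and reports anything that matches the table; `sdk_env_findings` is a hypothetical helper, not an SDK function, and it cannot tell whether a key or project ID is correct, only whether it is present.

```python
def sdk_env_findings(env: dict[str, str]) -> list[str]:
    """Collect configuration findings that could explain missing spans."""
    findings = []
    if env.get("LANGSIGHT_TEST_MODE") == "1":
        findings.append("LANGSIGHT_TEST_MODE=1 is set: test mode drops all spans")
    if "LANGSIGHT_URL" not in env:
        findings.append("LANGSIGHT_URL is unset: the default http://localhost:8000 is used")
    if not env.get("LANGSIGHT_API_KEY"):
        findings.append("LANGSIGHT_API_KEY is unset: required if the API needs authentication")
    if not env.get("LANGSIGHT_PROJECT_ID"):
        findings.append("LANGSIGHT_PROJECT_ID is unset: confirm which project the spans belong to")
    return findings


# Example environment; in practice pass dict(os.environ) from the agent process.
example_env = {"LANGSIGHT_URL": "http://localhost:8000", "LANGSIGHT_TEST_MODE": "1"}
for finding in sdk_env_findings(example_env):
    print(finding)
```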

/metrics returns 503

Symptom: Prometheus targets show DOWN and scraping returns HTTP 503.
Cause: LANGSIGHT_METRICS_TOKEN is not set. When the token is not configured, the /metrics endpoint intentionally returns 503 rather than exposing metrics to any caller.
Fix:
# Generate a token
python -c "import secrets; print(secrets.token_hex(32))"

# Add to your environment
LANGSIGHT_METRICS_TOKEN=<generated-value>

# Restart the API
docker compose restart api
Update your Prometheus scrape config to include the token:
scrape_configs:
  - job_name: langsight
    bearer_token: <generated-value>
    static_configs:
      - targets: ["localhost:8000"]
    metrics_path: /metrics
If the token is correct but scraping still fails, verify the token does not contain trailing whitespace.
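The whitespace check can be done mechanically, since a trailing newline is easy to pick up when a token is copied from a file. This sketch uses hypothetical helper names and assumes the API expects the `Authorization: Bearer` header, which is what Prometheus sends when `bearer_token` is configured.

```python
def metrics_token_problems(token: str) -> list[str]:
    """Flag the two token issues described in this section."""
    problems = []
    if not token:
        problems.append("LANGSIGHT_METRICS_TOKEN is empty: /metrics returns 503")
    elif token != token.strip():
        problems.append("token has leading or trailing whitespace")
    return problems


def bearer_header(token: str) -> dict[str, str]:
    """Header Prometheus sends for a scrape job with bearer_token set."""
    return {"Authorization": f"Bearer {token}"}


# A token read from a file often carries a trailing newline.
print(metrics_token_problems("0123abcd\n"))
print(bearer_header("0123abcd"))
```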

CORS errors in the browser

Symptom: Browser console shows CORS policy: No 'Access-Control-Allow-Origin' header.
Cause: LANGSIGHT_CORS_ORIGINS does not include the dashboard origin.
Fix:
# Allow the dashboard origin (adjust port/domain for your setup)
LANGSIGHT_CORS_ORIGINS=https://langsight.example.com,http://localhost:3003
Set this to a comma-separated list of all origins that need to reach the API. The default is http://localhost:3003.
Do not set LANGSIGHT_CORS_ORIGINS=* in production. This allows any website to make authenticated requests to your API using your users’ sessions.
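Origins are compared as exact strings (scheme, host, and port), so a different port or a stray trailing slash is enough to cause the error. The sketch below illustrates that matching rule on the comma-separated format; it is not the API's actual parser, and both function names are hypothetical.

```python
def parse_origins(raw: str) -> list[str]:
    """Split a comma-separated origin list, dropping blanks and padding."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def origin_allowed(origin: str, raw: str) -> bool:
    """True if the browser's Origin value appears verbatim in the list."""
    return origin in parse_origins(raw)


configured = "https://langsight.example.com,http://localhost:3003"
print(origin_allowed("http://localhost:3003", configured))
print(origin_allowed("http://localhost:3000", configured))

# A wildcard entry is worth flagging explicitly before deploying.
if "*" in parse_origins(configured):
    print("warning: wildcard origin configured")
```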

Redis connection refused

Symptom: API fails to start with redis.exceptions.ConnectionError: Error connecting to redis://redis:6379.
Causes and fixes:
| Cause | Fix |
| --- | --- |
| Redis container not running | Run docker compose --profile redis up -d redis |
| Wrong password | Check that REDIS_PASSWORD in .env matches the password in LANGSIGHT_REDIS_URL |
| Wrong hostname | Inside Docker Compose: use redis (service name). Outside: use localhost |
| Forgot --profile redis | Redis is an optional profile; start with docker compose --profile redis up -d |
Test the connection:
docker compose exec redis redis-cli -a "${REDIS_PASSWORD}" ping
# Expected: PONG
For full Redis setup, monitoring, HA, and performance tuning, see Redis.
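The password and hostname rows can be checked together by parsing the URL. This is a minimal sketch with a hypothetical helper name; it assumes the password in LANGSIGHT_REDIS_URL may be percent-encoded, which is standard for URLs but not stated in the source.

```python
from urllib.parse import unquote, urlparse


def redis_url_problems(redis_url: str, redis_password: str, inside_compose: bool) -> list[str]:
    """Compare LANGSIGHT_REDIS_URL against REDIS_PASSWORD and the expected host."""
    parts = urlparse(redis_url)
    problems = []
    if unquote(parts.password or "") != redis_password:
        problems.append("password in URL does not match REDIS_PASSWORD")
    expected_host = "redis" if inside_compose else "localhost"
    if parts.hostname != expected_host:
        problems.append(f"host is {parts.hostname!r}, expected {expected_host!r}")
    return problems


# Illustrative values; the URL shape follows redis://:<password>@<host>:6379.
print(redis_url_problems("redis://:s3cret@localhost:6379", "s3cret", inside_compose=True))
```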

Redis timeout errors

Symptom: Intermittent redis.exceptions.TimeoutError in API logs.
| Cause | Fix |
| --- | --- |
| Redis overloaded | Check redis-cli info stats and look at instantaneous_ops_per_sec |
| Network latency | Redis should be on the same host or Docker network as the API |
| Connection pool exhaustion | Reduce LANGSIGHT_WORKERS or check for connection leaks |
| Slow Lua scripts | Check redis-cli slowlog get 10 for circuit breaker CAS operations |

Workers > 1 startup error

Symptom: API fails to start with:
RuntimeError: LANGSIGHT_WORKERS=4 is not safe with the current in-memory rate limiter.
Set LANGSIGHT_REDIS_URL to enable shared state across workers.
Cause: LANGSIGHT_WORKERS is set to more than 1 but LANGSIGHT_REDIS_URL is not configured.
Fix: Either set LANGSIGHT_WORKERS=1 (the default) or configure Redis:
# Option 1: Single worker (no Redis needed)
LANGSIGHT_WORKERS=1

# Option 2: Multi-worker with Redis
LANGSIGHT_REDIS_URL=redis://:${REDIS_PASSWORD}@redis:6379
LANGSIGHT_WORKERS=4
The in-memory rate limiter gives each worker an independent counter. Under N workers, the effective rate limit becomes limit×N, which breaks brute-force protection. This is enforced at startup to prevent silent misconfiguration. See Redis for setup instructions and Production Hardening for the full multi-worker guide.
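The limit×N effect is easy to see in a small simulation. The sketch below is not LangSight's rate limiter; it models each worker as an independent counter within a single window and assumes requests are spread round-robin across workers, which is a simplification of real load balancing.

```python
from itertools import cycle


def accepted_requests(limit: int, workers: int, attempts: int) -> int:
    """Count requests accepted when each worker keeps its own counter."""
    counters = [0] * workers
    accepted = 0
    # Assumption: attempts are distributed round-robin across workers.
    for _, worker in zip(range(attempts), cycle(range(workers))):
        if counters[worker] < limit:
            counters[worker] += 1
            accepted += 1
    return accepted


print(accepted_requests(limit=5, workers=1, attempts=100))  # prints 5
print(accepted_requests(limit=5, workers=4, attempts=100))  # prints 20
```

With a shared counter in Redis, all workers increment the same value, so the second case would also stop at 5.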

Admin bootstrap does not run

Symptom: First run completes but no admin user is created; cannot log in.
Causes and fixes:
| Cause | Fix |
| --- | --- |
| LANGSIGHT_ADMIN_EMAIL not set | Set both LANGSIGHT_ADMIN_EMAIL and LANGSIGHT_ADMIN_PASSWORD before the first run |
| LANGSIGHT_ADMIN_PASSWORD too short | Must be at least 12 characters |
| Admin already exists | Bootstrap only runs when the user table is empty; existing installs are not affected |
| Migrations not run | Bootstrap runs after Alembic migrations; run alembic upgrade head first |
If the user table already has data and you need to reset it:
# Remove all users (destructive)
docker compose exec postgres psql -U langsight -c "TRUNCATE users, invite_tokens CASCADE;"
# Restart to trigger bootstrap
docker compose restart api
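Before triggering bootstrap again, it may be worth confirming that the two admin variables meet the requirements listed above. This sketch encodes only the rules stated in this section (both variables set, password of at least 12 characters); `admin_bootstrap_problems` is a hypothetical helper and the API may apply further checks.

```python
MIN_PASSWORD_LENGTH = 12  # minimum stated in the table above


def admin_bootstrap_problems(env: dict[str, str]) -> list[str]:
    """List reasons the admin bootstrap would be skipped or rejected."""
    problems = []
    if not env.get("LANGSIGHT_ADMIN_EMAIL"):
        problems.append("LANGSIGHT_ADMIN_EMAIL is not set")
    password = env.get("LANGSIGHT_ADMIN_PASSWORD", "")
    if not password:
        problems.append("LANGSIGHT_ADMIN_PASSWORD is not set")
    elif len(password) < MIN_PASSWORD_LENGTH:
        problems.append(
            f"LANGSIGHT_ADMIN_PASSWORD has {len(password)} characters, "
            f"needs at least {MIN_PASSWORD_LENGTH}"
        )
    return problems


# Illustrative values only.
example = {"LANGSIGHT_ADMIN_EMAIL": "admin@example.com", "LANGSIGHT_ADMIN_PASSWORD": "short"}
print(admin_bootstrap_problems(example))
```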