What is the scorecard?
The scorecard gives each MCP server a single letter grade (A+ through F) that summarizes its overall reliability. The grade is computed from a weighted score across five dimensions, with hard veto caps that can override the numeric result when a server has a critical problem. Use it to answer: “Which of my MCP servers should I be worried about right now?”
Quick look
Single server
The five dimensions
Availability (30%)
Uptime percentage over a rolling 7-day window.
| Score | Condition |
|---|---|
| 100% | Zero downtime in 7 days |
| Prorated | Each health check failure reduces the score proportionally |
| 0% | All checks failed (server never came up) |
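The proration above can be sketched as the share of health checks that passed during the window. This is a minimal illustration, not the product's implementation: the function name `availability_score` and the assumption that every check carries equal weight are hypothetical.

```python
def availability_score(passed_checks: int, total_checks: int) -> float:
    """Return an availability score in percent for one 7-day window.

    Assumption: each health check counts equally, so every failed check
    lowers the score by the same amount.
    """
    if total_checks <= 0:
        raise ValueError("no health checks recorded in the window")
    if not 0 <= passed_checks <= total_checks:
        raise ValueError("passed_checks must be between 0 and total_checks")
    return 100.0 * passed_checks / total_checks


if __name__ == "__main__":
    # 5 failed checks out of 100 in the window
    print(availability_score(95, 100))
```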
Security (25%)
Derived from the most recent langsight security-scan result for this server.
| Finding | Point deduction |
|---|---|
| Critical CVE | −25 (maximum deduction; see hard veto) |
| High CVE / OWASP HIGH finding | −15 |
| Medium OWASP finding | −8 |
| Low OWASP finding | −3 |
| No authentication configured | −10 (see hard veto) |
| Confirmed tool poisoning | −25 (maximum deduction; see hard veto) |
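As a rough sketch of how the deductions might combine, the example below subtracts each finding's deduction from the dimension's 25-point share and floors the result at zero. Both of those choices are assumptions read from the "maximum deduction" wording in the table, and the finding keys and function name are hypothetical labels, not scanner output.

```python
# Deductions from the table above. The keys are hypothetical labels.
DEDUCTIONS = {
    "critical_cve": 25,
    "high": 15,          # High CVE or OWASP HIGH finding
    "medium": 8,
    "low": 3,
    "no_auth": 10,
    "tool_poisoning": 25,
}

SECURITY_MAX_POINTS = 25  # assumption: the dimension's 25% weight as points


def security_points(findings: list[str]) -> int:
    """Return the security points left after subtracting all deductions.

    Assumption: deductions are additive and the result never goes below 0.
    """
    total_deduction = sum(DEDUCTIONS[finding] for finding in findings)
    return max(0, SECURITY_MAX_POINTS - total_deduction)


if __name__ == "__main__":
    print(security_points(["medium", "low"]))  # 25 - 8 - 3 = 14
```

Under this reading, a single critical CVE or a confirmed poisoning removes the whole security share on its own, before any hard veto applies.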
Reliability (20%)
Error rate and latency variance over the 7-day window.
| Score | Condition |
|---|---|
| 100% | 0% error rate and low p99 variance |
| Reduced | Each percentage point of error rate reduces the score; high variance reduces it further |
| 0% | >50% error rate |
Schema Stability (15%)
Frequency and severity of drift events over the 7-day window.
| Score | Condition |
|---|---|
| 100% | No drift events |
| −10 per event | COMPATIBLE drift |
| −20 per event | WARNING drift (description change) |
| −30 per event | BREAKING drift |
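The per-event penalties can be illustrated with a short sketch that starts at 100 and subtracts the penalty for each drift event in the window. The floor at 0 and the function name are assumptions; the table only gives the penalties.

```python
# Penalty per drift event, taken from the table above.
DRIFT_PENALTY = {
    "COMPATIBLE": 10,
    "WARNING": 20,
    "BREAKING": 30,
}


def schema_stability_score(drift_events: list[str]) -> int:
    """Return the schema stability score for one 7-day window.

    Assumption: penalties are additive and the score is floored at 0.
    """
    penalty = sum(DRIFT_PENALTY[event] for event in drift_events)
    return max(0, 100 - penalty)


if __name__ == "__main__":
    # One description change and one breaking change: 100 - 20 - 30 = 50
    print(schema_stability_score(["WARNING", "BREAKING"]))
```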
Performance (10%)
p99 latency compared to the 30-day baseline.
| Score | Condition |
|---|---|
| 100% | p99 within 20% of 30-day baseline |
| Prorated | Each 10% above baseline reduces score proportionally |
| 0% | p99 >3x the 30-day baseline |
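One way to read the table is as a linear ramp: full score up to 1.2x the baseline, zero above 3x, and a straight line in between. The sketch below implements that reading. The linear interpolation and the function name are assumptions, since the table does not state the exact proration formula.

```python
def performance_score(p99_ms: float, baseline_p99_ms: float) -> float:
    """Return a performance score in percent from current and baseline p99.

    Assumption: the score falls linearly from 100 at 1.2x the baseline
    to 0 at 3x the baseline.
    """
    if baseline_p99_ms <= 0:
        raise ValueError("baseline must be positive")
    ratio = p99_ms / baseline_p99_ms
    if ratio <= 1.2:
        return 100.0
    if ratio > 3.0:
        return 0.0
    return 100.0 * (3.0 - ratio) / (3.0 - 1.2)


if __name__ == "__main__":
    # Current p99 of 210 ms against a 100 ms baseline (2.1x)
    print(round(performance_score(210, 100), 1))
```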
Grade thresholds
| Grade | Score |
|---|---|
| A+ | 96–100 (exceptional) |
| A | 90–95 |
| B | 80–89 |
| C | 65–79 |
| D | 50–64 |
| F | < 50 |
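To make the arithmetic concrete, the sketch below combines five dimension scores using the weights listed above and maps the result to a letter with the thresholds in this table. It assumes each dimension score is normalized to 0–100 and that fractional scores fall into the band whose lower bound they meet (so 95.5 is an A). Neither detail is stated in the table, and the function names are illustrative.

```python
# Dimension weights in percent, from the section headings above.
WEIGHTS = {
    "availability": 30,
    "security": 25,
    "reliability": 20,
    "schema_stability": 15,
    "performance": 10,
}

# (lower bound, grade), checked from best to worst.
THRESHOLDS = [(96, "A+"), (90, "A"), (80, "B"), (65, "C"), (50, "D")]


def weighted_score(dimension_scores: dict[str, float]) -> float:
    """Combine per-dimension scores (each assumed 0-100) into one 0-100 score."""
    return sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS) / 100


def grade_for(score: float) -> str:
    """Map a numeric score to a letter grade, before any veto cap is applied."""
    for lower_bound, grade in THRESHOLDS:
        if score >= lower_bound:
            return grade
    return "F"


if __name__ == "__main__":
    scores = {
        "availability": 99.0,
        "security": 60.0,
        "reliability": 95.0,
        "schema_stability": 80.0,
        "performance": 100.0,
    }
    total = weighted_score(scores)
    print(round(total, 1), grade_for(total))
```

In this example a weak security score pulls an otherwise healthy server into the B range, which matches the weight that dimension carries.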
Hard veto caps
Certain conditions override the numeric score and cap the grade at the listed value, regardless of points:
| Condition | Grade cap | Reason |
|---|---|---|
| 10+ consecutive failures | F | Server is effectively unreachable |
| Active critical CVE | F | Unacceptable security risk |
| Confirmed tool poisoning | F | Active exploit; do not use |
| Uptime < 90% over 7 days | D | Unreliable for production use |
| No authentication configured | C | Auth is a baseline security requirement |
| Critical or high security finding | B | Must fix before A-range |
| p99 > 5,000ms | B | Latency is too high for interactive agents |
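The cap logic can be sketched as "collect every cap that applies, then take the worst of the numeric grade and those caps". The `ServerState` fields and function names below are hypothetical and exist only for this illustration. The rule that the strictest cap wins when several apply is also an assumption.

```python
from dataclasses import dataclass

GRADES = ["A+", "A", "B", "C", "D", "F"]  # ordered best to worst


@dataclass
class ServerState:
    """Hypothetical snapshot of the inputs the caps depend on."""
    consecutive_failures: int = 0
    active_critical_cve: bool = False
    tool_poisoning: bool = False
    uptime_7d_pct: float = 100.0
    auth_configured: bool = True
    critical_or_high_finding: bool = False
    p99_ms: float = 0.0


def active_caps(state: ServerState) -> list[str]:
    """Return the grade cap for every condition in the table that applies."""
    caps = []
    if state.consecutive_failures >= 10:
        caps.append("F")
    if state.active_critical_cve:
        caps.append("F")
    if state.tool_poisoning:
        caps.append("F")
    if state.uptime_7d_pct < 90:
        caps.append("D")
    if not state.auth_configured:
        caps.append("C")
    if state.critical_or_high_finding:
        caps.append("B")
    if state.p99_ms > 5000:
        caps.append("B")
    return caps


def apply_caps(numeric_grade: str, caps: list[str]) -> str:
    """Return the worst grade among the numeric grade and all active caps."""
    return max([numeric_grade, *caps], key=GRADES.index)


if __name__ == "__main__":
    state = ServerState(auth_configured=False, p99_ms=6000)
    print(apply_caps("A", active_caps(state)))  # capped at C
```

A cap only ever lowers a grade: a server that already scores a D numerically stays at D even when only a B cap is active.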
REST API
Dashboard
The MCP Servers page shows a Grade column alongside the health status. The detail panel has a Scorecard tab showing:
- The current grade (large, color-coded letter)
- A dimension breakdown bar chart
- Any active cap with a description and remediation hint
- Grade history over the past 30 days (trend chart)
Sharing a scorecard
Use the --json flag to export for reporting or CI gating:
Improving a scorecard
| Grade | Most likely cause | Fix |
|---|---|---|
| F (consecutive failures) | Server is down | Restart the server process |
| F (critical CVE) | Known vulnerability | Upgrade the server package |
| F (poisoning) | Tool description hijacked | Roll back to last known-good version |
| C (no auth) | Authentication not configured | Add API key or mTLS |
| D (uptime < 90%) | Recurring crashes | Check process manager / k8s restarts |
| B (p99 > 5s) | Slow upstream or DB | Profile and optimize server queries |
Related
- Health Monitoring — availability and performance data
- Schema Drift — schema stability data
- Security Scan — security dimension data
- Scorecard API — REST reference