Scoring
The scoring model defines how repository health is measured. Each non-advisory dimension contributes to a weighted score, and the overall result is a verdict of Pass, Warn, or Fail.
The core idea: make scoring functions concrete. The score system is deterministic and explainable — you should always be able to understand exactly why the score is what it is. There is no hidden heuristic. lexicon verify shows you precisely which dimensions contributed what.
Scoring supports weighted dimensions across areas like:
- correctness
- contract pass rate and coverage
- conformance coverage
- behavior pass rate
- lint quality
- documentation completeness
- panic safety
The scoring model is stored at specs/scoring/model.toml.
Score Model Structure
A score model contains:
- dimensions: A list of scoring dimensions, each with a weight and category
- thresholds: Pass and warn thresholds (values between 0.0 and 1.0)
Dimensions
Each dimension has:
| Field | Type | Description |
|---|---|---|
| id | string | Unique identifier (e.g., correctness) |
| label | string | Human-readable label |
| weight | u32 | Weight in total score calculation |
| category | enum | required, scored, or advisory |
| source | enum | Where the value comes from: gate, test_suite, coverage, or manual |
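The fields in the table could be modeled roughly as follows. This is a minimal Python sketch for illustration, not lexicon's actual types: the names `Dimension`, `Category`, `Source`, and `parse_dimension` are hypothetical, and only the field names and allowed enum values are taken from the table.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    REQUIRED = "required"
    SCORED = "scored"
    ADVISORY = "advisory"


class Source(Enum):
    GATE = "gate"
    TEST_SUITE = "test_suite"
    COVERAGE = "coverage"
    MANUAL = "manual"


@dataclass(frozen=True)
class Dimension:
    id: str
    label: str
    weight: int  # documented as u32
    category: Category
    source: Source

    def __post_init__(self):
        # Assumption: mirror the u32 range (0 .. 2**32 - 1).
        if not 0 <= self.weight < 2**32:
            raise ValueError(f"weight out of u32 range: {self.weight}")


def parse_dimension(raw: dict) -> Dimension:
    """Build a Dimension from a plain mapping, rejecting unknown enum values."""
    return Dimension(
        id=raw["id"],
        label=raw["label"],
        weight=raw["weight"],
        category=Category(raw["category"]),  # raises ValueError if unknown
        source=Source(raw["source"]),
    )
```

With this sketch, a mapping whose category is anything other than required, scored, or advisory is rejected at parse time rather than silently accepted.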
Categories
- Required: Must pass. If any required dimension fails, the overall verdict is Fail regardless of the numeric score.
- Scored: Contributes to the weighted numeric score.
- Advisory: Informational only. Does not affect pass/fail or the numeric score. Advisory dimensions are excluded from the weighted calculation.
Thresholds
| Threshold | Default | Meaning |
|---|---|---|
| pass | 0.8 | Score at or above this value is Pass |
| warn | 0.6 | Score at or above this but below pass is Warn |
Below the warn threshold, the verdict is Fail.
Verdict Logic
The verdict is determined in order:
- If any required dimension failed, verdict is Fail
- If total score >= pass threshold, verdict is Pass
- If total score >= warn threshold, verdict is Warn
- Otherwise, verdict is Fail
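The ordered checks above can be sketched as a small function. This is an illustrative Python sketch, not lexicon's implementation: the function name `verdict` and its parameters are hypothetical, and it assumes each required dimension has already been reduced to a passed/failed boolean.

```python
def verdict(total, required_passed, pass_threshold=0.8, warn_threshold=0.6):
    """Return "Pass", "Warn", or "Fail" following the documented order.

    total            -- weighted score in [0.0, 1.0]
    required_passed  -- iterable of booleans, one per required dimension
    """
    # 1. Any failed required dimension forces Fail, whatever the score is.
    if not all(required_passed):
        return "Fail"
    # 2. At or above the pass threshold.
    if total >= pass_threshold:
        return "Pass"
    # 3. At or above the warn threshold (and below pass).
    if total >= warn_threshold:
        return "Warn"
    # 4. Everything else.
    return "Fail"


print(verdict(0.95, [True, False]))  # Fail: a required dimension failed
print(verdict(0.80, [True]))         # Pass: thresholds are inclusive
print(verdict(0.79, [True]))         # Warn
print(verdict(0.59, [True]))         # Fail
```

Checking the required dimensions first is what makes a high numeric score unable to mask a failed required gate.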
Score Computation
The total score is a weighted average of all non-advisory dimensions:

    total = sum(dimension.value * dimension.weight) / sum(weights)

Advisory dimensions are excluded from both the numerator and denominator.
Each dimension value is between 0.0 and 1.0. For gate-sourced dimensions, a passing gate scores 1.0 and a failing gate scores 0.0.
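The weighted average can be written out in a few lines. The following Python sketch is illustrative only: `gate_value` and `total_score` are hypothetical names, dimensions are represented as plain `(value, weight, category)` tuples, and raising an error when no non-advisory weight exists is an assumption, since that case is not covered here.

```python
def gate_value(passed):
    """Gate-sourced dimensions score 1.0 when passing and 0.0 when failing."""
    return 1.0 if passed else 0.0


def total_score(dimensions):
    """Weighted average over non-advisory dimensions.

    dimensions -- iterable of (value, weight, category) tuples,
                  with value in [0.0, 1.0]
    """
    counted = [(v, w) for v, w, c in dimensions if c != "advisory"]
    weight_sum = sum(w for _, w in counted)
    if weight_sum == 0:
        # Assumption: treat "nothing to average" as an error.
        raise ValueError("no non-advisory weight to average over")
    return sum(v * w for v, w in counted) / weight_sum


dims = [
    (gate_value(True), 30, "required"),
    (0.5, 10, "scored"),
    (0.0, 10, "advisory"),  # ignored in numerator and denominator
]
print(total_score(dims))  # (1.0*30 + 0.5*10) / (30 + 10) = 0.875
```

Note that the advisory dimension's value of 0.0 has no effect on the result: only the weights 30 and 10 enter the denominator.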
Safety Against Gaming
The scoring system is designed to resist gaming by both humans and AI. Silently loosening score thresholds, weakening assertions, or rewriting scoring dimensions without acknowledgment are policy violations that lexicon detects.
Score changes are tracked in audit records with before and after values, so any drift in scoring policy is visible and attributable.
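To make the idea of before and after values concrete, here is a hypothetical Python sketch of comparing two sets of thresholds. It does not reflect lexicon's audit record format or detection logic, which are not described here; the function name `threshold_changes` and the tuple layout are assumptions made for illustration.

```python
def threshold_changes(before, after):
    """List (name, old, new, loosened) for every threshold that changed.

    before, after -- mappings such as {"pass": 0.8, "warn": 0.6}
    A threshold counts as loosened when its new value is lower than the old.
    """
    changes = []
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            loosened = old is not None and new is not None and new < old
            changes.append((name, old, new, loosened))
    return changes


old_model = {"pass": 0.8, "warn": 0.6}
new_model = {"pass": 0.7, "warn": 0.6}
print(threshold_changes(old_model, new_model))
# [('pass', 0.8, 0.7, True)]
```

Recording both values, rather than only the new one, is what lets a reviewer see that the pass bar moved and in which direction.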
Default Model
The default model created by lexicon init includes six dimensions:

    schema_version = "1.0"

    [[dimensions]]
    id = "correctness"
    label = "Correctness"
    weight = 30
    category = "required"
    source = "gate"

    [[dimensions]]
    id = "conformance-coverage"
    label = "Conformance Coverage"
    weight = 25
    category = "scored"
    source = "test_suite"

    [[dimensions]]
    id = "behavior-pass-rate"
    label = "Behavior Pass Rate"
    weight = 15
    category = "scored"
    source = "test_suite"

    [[dimensions]]
    id = "lint-quality"
    label = "Lint Quality"
    weight = 10
    category = "scored"
    source = "gate"

    [[dimensions]]
    id = "doc-completeness"
    label = "Documentation Completeness"
    weight = 10
    category = "advisory"
    source = "manual"

    [[dimensions]]
    id = "panic-safety"
    label = "Panic Safety"
    weight = 10
    category = "scored"
    source = "gate"

    [thresholds]
    pass = 0.8
    warn = 0.6

Weights sum to 100 across all dimensions, but only non-advisory weights (90) are used in the denominator.
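As a worked example, the arithmetic for the default model can be checked in Python. The ids, weights, and categories below are copied from the default model; the measured values in `EXAMPLE_VALUES` are made up for illustration, and the variable names are hypothetical.

```python
# (id, weight, category) for the six dimensions of the default model.
DEFAULT_DIMENSIONS = [
    ("correctness", 30, "required"),
    ("conformance-coverage", 25, "scored"),
    ("behavior-pass-rate", 15, "scored"),
    ("lint-quality", 10, "scored"),
    ("doc-completeness", 10, "advisory"),
    ("panic-safety", 10, "scored"),
]

# Hypothetical measured values; gates are 1.0 (pass) or 0.0 (fail).
EXAMPLE_VALUES = {
    "correctness": 1.0,
    "conformance-coverage": 0.8,
    "behavior-pass-rate": 0.6,
    "lint-quality": 1.0,
    "doc-completeness": 0.2,  # advisory, so it is ignored below
    "panic-safety": 1.0,
}

all_weights = sum(w for _, w, _ in DEFAULT_DIMENSIONS)
counted = [(dim_id, w) for dim_id, w, c in DEFAULT_DIMENSIONS if c != "advisory"]
denominator = sum(w for _, w in counted)
numerator = sum(EXAMPLE_VALUES[dim_id] * w for dim_id, w in counted)
total = numerator / denominator

print(all_weights, denominator)  # 100 90
print(round(total, 3))           # 79 / 90, about 0.878
```

With these example values the total is 79/90, which is above the default pass threshold of 0.8, and the only required dimension (correctness) passes, so the verdict would be Pass.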
The scoring model is created automatically by lexicon init. Score results are displayed as part of lexicon verify output; for a detailed breakdown, run lexicon verify and review the score section.