API Extraction
API extraction is the process of scanning Rust source files to build a structured snapshot of the public API surface. This snapshot captures every public struct, enum, trait, function, constant, type alias, and module — along with their signatures, visibility, doc summaries, and source locations.
The purpose is twofold: detect unintended API drift, and classify changes as breaking, dangerous, or additive before they ship.
How It Works
The extraction engine uses the syn crate to parse Rust source files into an AST, then walks the tree with a visitor that collects public items. For each item it records:
| Field | Description |
|---|---|
| kind | Struct, Enum, Trait, Function, Method, Module, Constant, TypeAlias, or Impl |
| name | The item identifier |
| module_path | Nested module path (e.g., ["outer", "inner"]) |
| signature | The full type or function signature |
| visibility | Public, Crate, Restricted, or Private |
| doc_summary | First line of the doc comment, if present |
| span_file | Source file path |
| span_line | Line number in the source file |
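As a rough illustration, the record described in the table could be modeled as below. This is a minimal sketch using only the standard library: the type names `ApiItem`, `ItemKind`, and `Visibility`, the concrete field types, and the `qualified_name` helper are assumptions made for this example, not the tool's actual definitions.

```rust
/// Item kinds listed in the table (type name is hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ItemKind {
    Struct,
    Enum,
    Trait,
    Function,
    Method,
    Module,
    Constant,
    TypeAlias,
    Impl,
}

/// Visibility levels listed in the table (type name is hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Visibility {
    Public,
    Crate,
    Restricted,
    Private,
}

/// One extracted item. Field names follow the table; field types are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ApiItem {
    pub kind: ItemKind,
    pub name: String,
    pub module_path: Vec<String>,
    pub signature: String,
    pub visibility: Visibility,
    pub doc_summary: Option<String>,
    pub span_file: String,
    pub span_line: usize,
}

impl ApiItem {
    /// Hypothetical helper: joins the module path and the name with `::`.
    pub fn qualified_name(&self) -> String {
        let mut parts = self.module_path.clone();
        parts.push(self.name.clone());
        parts.join("::")
    }
}
```

Keeping `module_path` as a list of segments rather than a single joined string makes it straightforward to use as part of a lookup key later on.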
Extraction can target a single file, a single source string, or an entire directory (recursively walking all .rs files). Files that fail to parse are silently skipped so that a single syntax error does not block the entire scan.
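The directory mode can be pictured roughly as follows. This sketch uses only the standard library, and the function names `collect_rust_files` and `scan_dir` are hypothetical. The `extract` parameter stands in for the real syn-based extractor, so the example shows only the recursive walk and the skip-on-failure behavior, not actual parsing.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect every file with an `.rs` extension under `dir`.
pub fn collect_rust_files(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            collect_rust_files(&path, out)?;
        } else if path.extension().and_then(|ext| ext.to_str()) == Some("rs") {
            out.push(path);
        }
    }
    Ok(())
}

/// Run `extract` on each `.rs` file under `dir`.
/// Files that cannot be read, or that `extract` rejects, are skipped
/// so that one bad file does not abort the whole scan.
pub fn scan_dir<T, E>(
    dir: &Path,
    extract: impl Fn(&str) -> Result<Vec<T>, E>,
) -> io::Result<Vec<T>> {
    let mut files = Vec::new();
    collect_rust_files(dir, &mut files)?;
    files.sort(); // deterministic order across platforms

    let mut items = Vec::new();
    for file in files {
        let source = match fs::read_to_string(&file) {
            Ok(text) => text,
            Err(_) => continue, // unreadable file: skip
        };
        if let Ok(found) = extract(&source) {
            items.extend(found);
        }
        // An Err from `extract` (e.g. a syntax error) is dropped silently.
    }
    Ok(items)
}
```

One consequence of skipping silently is that a file with a syntax error contributes no items, so its public items would look removed in a later diff.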
The result is an ApiSnapshot — a JSON-serializable structure containing the crate name, optional version, all extracted items, and a timestamp.
Baseline Management
A baseline is a saved API snapshot that represents the “known good” state of your public API. Baselines are stored as JSON files in .lexicon/api/.
The workflow:
- Scan: Extract the current API and save it as current.json
- Baseline: Promote the current scan to baseline.json
- Diff: Compare a new scan against the saved baseline
Baselines let you answer the question: “has the public API changed since the last time I explicitly approved it?”
API Drift Detection
The diff engine compares two snapshots item by item, keyed on (kind, name, module_path):
- Added items — present in current but not in baseline
- Removed items — present in baseline but not in current
- Changed items — present in both but with different signatures or visibility
For changed items, the diff records the specific fields that changed (signature, visibility, or both).
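The keyed comparison described above can be sketched as follows. The types here (`ApiItem`, `ApiDiff`, `ChangedItem`) and the `diff` function are simplified, hypothetical stand-ins that carry only the fields the comparison needs. The real engine's types and internals may differ.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum ItemKind { Struct, Enum, Trait, Function, Method, Module, Constant, TypeAlias, Impl }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Visibility { Public, Crate, Restricted, Private }

/// Trimmed-down item: only the fields used for keying and comparison.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ApiItem {
    pub kind: ItemKind,
    pub name: String,
    pub module_path: Vec<String>,
    pub signature: String,
    pub visibility: Visibility,
}

#[derive(Debug, PartialEq, Eq)]
pub struct ChangedItem {
    pub before: ApiItem,
    pub after: ApiItem,
    pub signature_changed: bool,
    pub visibility_changed: bool,
}

#[derive(Debug, Default, PartialEq, Eq)]
pub struct ApiDiff {
    pub added: Vec<ApiItem>,
    pub removed: Vec<ApiItem>,
    pub changed: Vec<ChangedItem>,
}

/// Identity of an item across snapshots: (kind, name, module_path).
type Key = (ItemKind, String, Vec<String>);

fn key(item: &ApiItem) -> Key {
    (item.kind, item.name.clone(), item.module_path.clone())
}

pub fn diff(baseline: &[ApiItem], current: &[ApiItem]) -> ApiDiff {
    let old: BTreeMap<Key, &ApiItem> = baseline.iter().map(|i| (key(i), i)).collect();
    let new: BTreeMap<Key, &ApiItem> = current.iter().map(|i| (key(i), i)).collect();
    let mut out = ApiDiff::default();

    for (k, before) in &old {
        match new.get(k) {
            // In baseline but not in current: removed.
            None => out.removed.push((*before).clone()),
            // In both: record which fields differ, if any.
            Some(after) => {
                let signature_changed = before.signature != after.signature;
                let visibility_changed = before.visibility != after.visibility;
                if signature_changed || visibility_changed {
                    out.changed.push(ChangedItem {
                        before: (*before).clone(),
                        after: (*after).clone(),
                        signature_changed,
                        visibility_changed,
                    });
                }
            }
        }
    }
    // In current but not in baseline: added.
    for (k, after) in &new {
        if !old.contains_key(k) {
            out.added.push((*after).clone());
        }
    }
    out
}
```

Because the key includes the name and module path, renaming or moving an item shows up as one removal plus one addition rather than as a change.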
Breaking Change Classification
Every change is classified into one of four levels:
| Level | Meaning | Examples |
|---|---|---|
| Breaking | Downstream code will fail to compile | Removing a public item, narrowing visibility (e.g., pub to pub(crate)) |
| Dangerous | May break downstream code depending on usage | Changing a function signature |
| Additive | Safe for downstream code | Adding a new public item, widening visibility |
| Unchanged | No difference | Item is identical in both snapshots |
Removed items are always classified as breaking. Visibility changes are breaking if the new visibility is more restrictive than the old one, and additive otherwise. Signature changes are classified as dangerous because they may or may not break callers depending on the specific change.
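The rules above could be expressed roughly as the function below. The names `Change`, `BreakingLevel`, and `classify` are hypothetical. Two points are assumptions not stated in the text: the ordering of visibility from narrowest to widest (Private, Restricted, Crate, Public), and the choice to report the more severe level when signature and visibility change together.

```rust
/// Ordered from narrowest to widest (assumed ordering), so `<` means "more restrictive".
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Visibility { Private, Restricted, Crate, Public }

/// Ordered from least to most severe, so `max` picks the worse level.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum BreakingLevel { Unchanged, Additive, Dangerous, Breaking }

/// A single entry from a diff (hypothetical shape).
pub enum Change<'a> {
    Added,
    Removed,
    Modified {
        old_sig: &'a str,
        new_sig: &'a str,
        old_vis: Visibility,
        new_vis: Visibility,
    },
}

pub fn classify(change: &Change<'_>) -> BreakingLevel {
    match change {
        // New public items are safe for downstream code.
        Change::Added => BreakingLevel::Additive,
        // Removed items are always breaking.
        Change::Removed => BreakingLevel::Breaking,
        Change::Modified { old_sig, new_sig, old_vis, new_vis } => {
            // Narrowing visibility is breaking; widening is additive.
            let vis_level = if new_vis < old_vis {
                BreakingLevel::Breaking
            } else if new_vis > old_vis {
                BreakingLevel::Additive
            } else {
                BreakingLevel::Unchanged
            };
            // Any signature change is treated as dangerous.
            let sig_level = if old_sig != new_sig {
                BreakingLevel::Dangerous
            } else {
                BreakingLevel::Unchanged
            };
            // Assumption: when both change, report the more severe level.
            vis_level.max(sig_level)
        }
    }
}
```

Under these assumptions, a change that widens visibility while also altering the signature is reported as dangerous rather than additive.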
Diff Reports
The diff engine produces both human-readable and JSON reports. The human-readable report groups changes by breaking level:
    API Diff Summary: 1 added, 1 removed, 1 changed
    ============================================================

    [BREAKING] Removed items:
      - function old_helper (pub)

    [DANGEROUS] Changed items:
      ~ function process
        signature: fn process(x: i32) -> fn process(x: i32, y: i32)

    [ADDITIVE] Added items:
      + struct NewConfig (pub)

The JSON report contains the full structured diff, suitable for programmatic consumption in CI pipelines.
Integration with Verify
API extraction fits into the broader lexicon verification pipeline. API drift is automatically checked during lexicon verify. Use the API_SCAN and API_BASELINE actions in lexicon chat to scan your API and save baselines interactively.