AI Agents

Lexicon enables safe AI-assisted development. It generates structured context so AI agents understand system constraints, and enforces boundaries so agents evolve software without violating system law.

The core idea: make AI useful, but bounded and auditable. The architecture assumes AI can make “locally clever but globally bad” choices and defends against that. AI is never required for core verification — the deterministic pipeline (contracts, conformance, scoring, gates) is the source of truth.

What AI Agents Learn

Lexicon generates context so that an AI agent can immediately understand:

what the repo does
what behavior is stable
what can and cannot change
how to add tests
how to interpret conformance
how the score is computed
what gates are mandatory
what changes require spec updates
how to work safely in the repo

This context is assembled from contracts, scoring models, gates, and policy into a structured summary. The SYNC_CLAUDE action in lexicon chat writes this into CLAUDE.md as managed blocks that stay current with the repo state.

What AI Agents Must Not Do

AI agents must respect system law. They must not:

weaken contracts or remove invariants
delete tests to satisfy gates
violate architecture rules
silently loosen score thresholds
weaken required gates without policy approval
rewrite contract semantics without updating status/history
game performance baselines
hide failures behind skipped tests
bypass ecosystem governance for local convenience
introduce forbidden dependency edges
move a repo across architectural layers silently
redefine shared contracts without history/policy updates
modify interfaces with many downstream consumers without surfacing impact

The system builds in architectural protections against each of these.

Edit Policy

The manifest’s PolicyConfig controls what AI may edit at the file level:

Policy Level	Default Patterns	Behavior
`ai_may_edit`	`src/**/*.rs`, `tests/**/*.rs`	AI may freely edit these files
`ai_requires_review`	`specs/**/*.toml`, `CLAUDE.md`	AI changes require manual review
`ai_protected`	`.lexicon/manifest.toml`, `specs/gates.toml`	AI must never edit these

Files not matching any pattern default to RequiresReview. The system checks patterns in order: protected first, then requires-review, then allowed.
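
The lookup order described above can be sketched as follows. This is a minimal illustration, not Lexicon's actual code: the `EditPolicy` variant names `MayEdit` and `Protected`, the `classify` method, and the `wildcard_match` helper are assumptions, and the matcher is deliberately simplified (`*` matches any run of characters, including `/`) rather than a full glob implementation. The patterns in `example_policy` are illustrative, not the shipped defaults.

```rust
/// Edit-policy levels. `RequiresReview` is named in the text above;
/// `MayEdit` and `Protected` are assumed variant names.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EditPolicy {
    MayEdit,
    RequiresReview,
    Protected,
}

/// Hypothetical shape of the policy section; field names follow the table.
pub struct PolicyConfig {
    pub ai_may_edit: Vec<String>,
    pub ai_requires_review: Vec<String>,
    pub ai_protected: Vec<String>,
}

/// Simplified wildcard match: `*` matches any run of characters, including `/`.
/// A real implementation would more likely use a proper glob library.
fn wildcard_match(pattern: &[u8], path: &[u8]) -> bool {
    if pattern.is_empty() {
        return path.is_empty();
    }
    if pattern[0] == b'*' {
        // Try every possible length for the run consumed by `*`.
        return (0..=path.len()).any(|i| wildcard_match(&pattern[1..], &path[i..]));
    }
    !path.is_empty() && pattern[0] == path[0] && wildcard_match(&pattern[1..], &path[1..])
}

fn any_match(patterns: &[String], path: &str) -> bool {
    patterns
        .iter()
        .any(|p| wildcard_match(p.as_bytes(), path.as_bytes()))
}

impl PolicyConfig {
    /// Checks protected first, then requires-review, then allowed.
    /// Paths that match nothing fall back to `RequiresReview`.
    pub fn classify(&self, path: &str) -> EditPolicy {
        if any_match(&self.ai_protected, path) {
            EditPolicy::Protected
        } else if any_match(&self.ai_requires_review, path) {
            EditPolicy::RequiresReview
        } else if any_match(&self.ai_may_edit, path) {
            EditPolicy::MayEdit
        } else {
            EditPolicy::RequiresReview
        }
    }
}

/// Illustrative patterns written for the simplified matcher above.
pub fn example_policy() -> PolicyConfig {
    PolicyConfig {
        ai_may_edit: vec!["src/*.rs".to_string(), "tests/*.rs".to_string()],
        ai_requires_review: vec!["specs/*.toml".to_string(), "CLAUDE.md".to_string()],
        ai_protected: vec![
            ".lexicon/manifest.toml".to_string(),
            "specs/gates.toml".to_string(),
        ],
    }
}
```

Because protected patterns are checked first, `specs/gates.toml` stays protected even though it also matches the broader requires-review pattern for spec files.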

Safety Mechanisms

Gate Weakening Detection

Changes that lower a gate’s category or make a non-skippable gate skippable are flagged as weakening. When the gate_weakening_requires_approval policy is enabled (default: true), such changes require explicit approval.
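
A rough sketch of the weakening check, under stated assumptions: the `Gate` struct, the `GateCategory` ordering (Advisory below Required), and the function names are hypothetical and only mirror the two conditions described above.

```rust
/// Assumed ordering: a later variant is stricter, so `Advisory < Required`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum GateCategory {
    Advisory,
    Required,
}

/// Hypothetical, trimmed-down view of a gate definition.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Gate {
    pub id: String,
    pub category: GateCategory,
    pub skippable: bool,
}

/// A change weakens a gate if it lowers the category or makes a
/// previously non-skippable gate skippable.
pub fn is_weakening(before: &Gate, after: &Gate) -> bool {
    let category_lowered = after.category < before.category;
    let became_skippable = !before.skippable && after.skippable;
    category_lowered || became_skippable
}

/// With the approval policy enabled, a weakening change passes only
/// when it has been explicitly approved.
pub fn change_allowed(before: &Gate, after: &Gate, requires_approval: bool, approved: bool) -> bool {
    !is_weakening(before, after) || !requires_approval || approved
}
```

Changes that strengthen a gate or leave it unchanged pass regardless of the policy setting.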

Test Deletion Approval

The test_deletion_requires_approval policy (default: true) prevents AI from silently removing tests.

Audit Trail

Every AI-driven change is recorded as an audit record with:

The action type (AiImprove or AiImproveRejected)
Actor set to Ai
Content hashes before and after
Score impact (before and after)
Whether gates still pass
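
As a sketch of what such a record might look like in Rust: the `AiImprove`, `AiImproveRejected`, and `Ai` names come from the list above, while the struct layout, field names, field types, the `Human` variant, and the helper methods are assumptions made for illustration.

```rust
/// Action types named in the list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuditAction {
    AiImprove,
    AiImproveRejected,
}

/// `Ai` is named in the docs; `Human` is a hypothetical counterpart.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Actor {
    Ai,
    Human,
}

/// Hypothetical layout for one audit entry.
#[derive(Debug, Clone, PartialEq)]
pub struct AuditRecord {
    pub action: AuditAction,
    pub actor: Actor,
    pub hash_before: String,
    pub hash_after: String,
    pub score_before: f64,
    pub score_after: f64,
    pub gates_pass: bool,
}

impl AuditRecord {
    /// Positive when the change raised the score.
    pub fn score_delta(&self) -> f64 {
        self.score_after - self.score_before
    }

    /// True when the content hash differs after the change.
    pub fn content_changed(&self) -> bool {
        self.hash_before != self.hash_after
    }
}
```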

Conversation memory is not hidden magic — it is inspectable. Every session is local to the repo and reviewable.

The AiProvider Trait

AI integration is abstracted behind a trait:

pub trait AiProvider {
    fn enhance_proposal(&self, prompt: &str, context: &str) -> AiResult<String>;
    fn suggest_improvement(&self, context: &str, failure: &str) -> AiResult<String>;
}

Two implementations are available:

NoOpProvider — the default when no AI is configured. Returns AiError::NotAvailable for all calls. This ensures lexicon works completely without any AI service.
ClaudeClient — a live client that calls the Claude Messages API using OAuth credentials stored by lexicon auth login. Supports configurable model selection and 120-second request timeouts.

AI is an enhancement layer, not a dependency.
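
To make the "enhancement layer" idea concrete, here is a minimal sketch of a no-op provider and a caller that degrades gracefully. The trait is copied from above; the exact shape of `AiError` and `AiResult` (a single-variant enum and a `Result` alias) and the `proposal_or_original` helper are assumptions, not Lexicon's real definitions.

```rust
/// Assumed error type; only `NotAvailable` is named in the docs.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AiError {
    NotAvailable,
}

/// Assumed alias for provider results.
pub type AiResult<T> = Result<T, AiError>;

pub trait AiProvider {
    fn enhance_proposal(&self, prompt: &str, context: &str) -> AiResult<String>;
    fn suggest_improvement(&self, context: &str, failure: &str) -> AiResult<String>;
}

/// Default provider when no AI is configured: every call reports NotAvailable.
pub struct NoOpProvider;

impl AiProvider for NoOpProvider {
    fn enhance_proposal(&self, _prompt: &str, _context: &str) -> AiResult<String> {
        Err(AiError::NotAvailable)
    }

    fn suggest_improvement(&self, _context: &str, _failure: &str) -> AiResult<String> {
        Err(AiError::NotAvailable)
    }
}

/// Hypothetical caller: fall back to the unmodified proposal when
/// no AI provider is available, so the workflow never blocks on AI.
pub fn proposal_or_original(provider: &dyn AiProvider, proposal: &str, context: &str) -> String {
    match provider.enhance_proposal(proposal, context) {
        Ok(enhanced) => enhanced,
        Err(AiError::NotAvailable) => proposal.to_string(),
    }
}
```

Callers written this way behave identically with or without an AI service; the provider only ever adds to the result.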

Authentication

AI features require browser-based OAuth authentication:

lexicon auth login     # opens browser, stores credentials
lexicon auth status    # check auth status
lexicon auth refresh   # refresh expired tokens
lexicon auth logout    # remove stored credentials

Credentials are stored per-provider in .lexicon/auth/ with restrictive file permissions (0600 on Unix). The auth system supports both Claude and OpenAI providers.
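
One way to create a file with owner-only permissions in Rust is shown below. This is a sketch using the standard library, not Lexicon's actual storage code; the `write_credentials` function name is hypothetical, and the file name and format under .lexicon/auth/ are not specified here.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

#[cfg(unix)]
use std::os::unix::fs::OpenOptionsExt;

/// Writes `contents` to `path`, creating the file with mode 0600 on Unix.
/// Note: the mode applies only when the file is newly created; an existing
/// file keeps whatever permissions it already has.
pub fn write_credentials(path: &Path, contents: &[u8]) -> io::Result<()> {
    let mut options = OpenOptions::new();
    options.write(true).create(true).truncate(true);

    #[cfg(unix)]
    {
        // Owner read/write only; no access for group or others.
        options.mode(0o600);
    }

    let mut file = options.open(path)?;
    file.write_all(contents)
}
```

Setting the mode at creation time avoids a window in which the file exists with broader permissions before being restricted.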

Intent-Driven Generation

Within a lexicon chat session, describe what you want to build in natural language. The AI generates artifacts using action directives:

you> I need an async key-value store with TTL
▸ Creating contract...
✓ Created contract: kv-store (specs/contracts/kv-store.toml)

The generation pipeline:

Assembles repo context (contracts, scoring, gates, manifest)
Builds a specialized prompt with artifact templates
Calls the AI provider
Writes artifacts to disk as they’re created
Records every action in the audit trail

All generated artifacts follow the same schemas as manually-created ones. The AI can create contracts, conformance tests, behavior scenarios, property tests, fuzz targets, edge case tests, and implementation prompts.

Context Assembly

The assemble_context function creates a structured text summary from repository state:

# Project: my-lib
Domain: key-value store
Type: Library

## Contracts
- **KV Store** (kv): Basic KV [status: Active, stability: Stable]
  Invariants:
  - inv-001: Keys set must be retrievable

## Scoring
Pass threshold: 80%, Warn threshold: 60%
- Correctness (weight: 30, Required)

## Gates
- Format Check (Required): `cargo fmt -- --check`
- Clippy Lints (Required): `cargo clippy -- -D warnings`

At ecosystem scale, context also includes repo role, architecture layer, allowed and forbidden dependencies, shared contracts referenced, cross-repo responsibilities, and compatibility constraints.
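
The rendering step can be sketched as a plain function from repository state to text. The real assemble_context signature and input types are not shown in this document, so `RepoSummary`, `GateSummary`, and `assemble_context_sketch` below are hypothetical stand-ins that reproduce only part of the sample output (project header, scoring thresholds, and gates).

```rust
/// Hypothetical, trimmed-down gate description.
pub struct GateSummary {
    pub name: String,
    pub required: bool,
    pub command: String,
}

/// Hypothetical subset of the repository state used for context.
pub struct RepoSummary {
    pub project: String,
    pub domain: String,
    pub pass_threshold: u32,
    pub warn_threshold: u32,
    pub gates: Vec<GateSummary>,
}

/// Renders a structured text summary in the style of the sample above.
pub fn assemble_context_sketch(repo: &RepoSummary) -> String {
    let mut out = String::new();
    out.push_str(&format!("# Project: {}\n", repo.project));
    out.push_str(&format!("Domain: {}\n\n", repo.domain));

    out.push_str("## Scoring\n");
    out.push_str(&format!(
        "Pass threshold: {}%, Warn threshold: {}%\n\n",
        repo.pass_threshold, repo.warn_threshold
    ));

    out.push_str("## Gates\n");
    for gate in &repo.gates {
        let level = if gate.required { "Required" } else { "Advisory" };
        out.push_str(&format!("- {} ({}): `{}`\n", gate.name, level, gate.command));
    }
    out
}
```

Keeping the function pure (state in, string out) means the same repository state always yields the same context, which makes the summary easy to diff and review.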

Claude Code Sync

The SYNC_CLAUDE action in lexicon chat generates and maintains a CLAUDE.md file with explicit managed blocks containing:

active contracts and their invariants
scoring dimensions and thresholds
required and advisory gates
safe edit zones from the edit policy
stability boundaries

The managed blocks are updated repeatably while preserving any user-authored content outside the managed sections.
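
A repeatable update of this kind can be sketched as a marker-delimited replacement. The marker strings and the `sync_managed_block` function are assumptions for illustration; the actual delimiters Lexicon writes into CLAUDE.md are not specified here.

```rust
// Hypothetical markers; the real delimiters may differ.
const BEGIN: &str = "<!-- lexicon:begin -->";
const END: &str = "<!-- lexicon:end -->";

/// Replaces the managed block in `existing` with `generated`, leaving all
/// text outside the markers untouched. If no block exists yet, one is
/// appended after the user-authored content.
pub fn sync_managed_block(existing: &str, generated: &str) -> String {
    let block = format!("{}\n{}\n{}", BEGIN, generated.trim_end(), END);

    if let Some(start) = existing.find(BEGIN) {
        if let Some(rel_end) = existing[start..].find(END) {
            let end = start + rel_end + END.len();
            // Keep everything before and after the old block as-is.
            return format!("{}{}{}", &existing[..start], block, &existing[end..]);
        }
    }

    if existing.trim().is_empty() {
        format!("{}\n", block)
    } else {
        format!("{}\n\n{}\n", existing.trim_end(), block)
    }
}
```

Running the function twice with the same generated text yields the same file, which is what makes the sync safe to repeat.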