Capabilities as code for AI-native review

Review what changed in behavior, not only what changed in code.

CapabilityKit gives developers a repo-native way to read capability diffs, verify that the implementation covers the stated intent, and see which downstream behavior a small change may affect.

Plans often diverge from implementation. Code diffs rarely explain intent.

AI agents make implementation cheaper, but they also make it easier for important product decisions to disappear into generated code. CapabilityKit keeps the durable part of the plan: what the system is supposed to do, where it is implemented, how deeply it is verified, and what depends on it.

Developer process

A review loop for capability changes.

CapabilityKit is designed for the moment when a developer or reviewer needs to understand an AI-assisted change without reconstructing requirements from prompts, stale plans, and implementation details.

1. Read the capability diff

See added, changed, and removed capability intent, acceptance, verification, references, and review policy.

2. Assess implementation coverage

Compare every acceptance criterion with evidence from the referenced source, test, and documentation files.

3. Inspect dependency impact

Traverse direct and transitive dependents to find related capabilities that may need checks or review.

4. Grow verification

Add tests, manual review evidence, agent review results, or explicit accepted gaps before confidence decays.

Why this matters

Planning documents are not enough after an agent writes the code.

  • Plans often record code decisions, not the lasting capability contract.
  • Generated implementation can drift from the plan before review begins.
  • Reviewers need to know whether new and existing capabilities are actually verified.
  • A simple capability edit can change downstream agent prompts, CLI behavior, compiled artifacts, or docs.

Capability diff

What behavior changed?

`capabilitykit diff` summarizes intent, acceptance, verification, implementation reference, and ignore policy changes against a Git base. It excludes noisy saved review evidence by default.

capabilitykit diff HEAD
 intent changed
 acceptance +2/-0
 verification +1/-0
 Impact: 3 direct, 7 transitive
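
The added, changed, and removed summary can be pictured as a comparison of two parsed capability maps keyed by capability ID. The sketch below is a minimal Python illustration, not CapabilityKit's implementation: `diff_capabilities`, `REVIEWED_FIELDS`, the dict shape, and the `planned` status value are all assumptions made for the example.

```python
from typing import Any

# Hypothetical subset of fields treated as meaningful; the real tool may differ.
REVIEWED_FIELDS = ("status", "intent", "acceptance", "verification")

Capability = dict[str, Any]


def diff_capabilities(
    base: dict[str, Capability], current: dict[str, Capability]
) -> dict[str, Any]:
    """Compare two parsed capability maps keyed by capability ID."""
    added = sorted(current.keys() - base.keys())
    removed = sorted(base.keys() - current.keys())
    changed: dict[str, list[str]] = {}
    for cap_id in sorted(base.keys() & current.keys()):
        # Compare parsed values field by field instead of raw YAML text.
        fields = [
            name
            for name in REVIEWED_FIELDS
            if base[cap_id].get(name) != current[cap_id].get(name)
        ]
        if fields:
            changed[cap_id] = fields
    return {"added": added, "removed": removed, "changed": changed}


# Illustrative data only; "planned" is an assumed status value.
base = {
    "core/graph/diff-capabilities": {
        "status": "planned",
        "acceptance": ["Reports added capabilities."],
    }
}
current = {
    "core/graph/diff-capabilities": {
        "status": "implemented",
        "acceptance": ["Reports added capabilities.", "Reports removed capabilities."],
    },
    "core/graph/compile-capabilities": {"status": "planned"},
}
result = diff_capabilities(base, current)
```

Comparing parsed values rather than text means that reordering keys or reflowing a long YAML string would not register as a change in this sketch.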

Verification depth

How strong is the evidence?

`capabilitykit assess` reads declared implementation references and places each acceptance criterion beside concrete evidence. Uncertain findings stay visible until semantic review, tests, or accepted gaps resolve them.

covered: status summary exists
uncertain: impact evidence found
uncovered: no semantic review saved
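
One way to read the three labels above is as a rule that maps the evidence attached to an acceptance criterion to a status. The rule below is an assumption for illustration only, since this page does not define how CapabilityKit decides. `Evidence`, its `kind` values, `classify`, and the file paths are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    kind: str  # hypothetical kinds: "test", "semantic_review", "accepted_gap", "source"
    path: str  # hypothetical file path, for display only


# Assumed rule: these kinds resolve a criterion; other evidence leaves it uncertain.
RESOLVING_KINDS = {"test", "semantic_review", "accepted_gap"}


def classify(evidence: list[Evidence]) -> str:
    """Return "covered", "uncertain", or "uncovered" for one acceptance criterion."""
    if not evidence:
        return "uncovered"
    if any(item.kind in RESOLVING_KINDS for item in evidence):
        return "covered"
    return "uncertain"


report = {
    "Reports added capabilities": classify([Evidence("test", "tests/diff.test.ts")]),
    "Includes downstream impact": classify([Evidence("source", "src/impact.ts")]),
    "Avoids raw JSON by default": classify([]),
}
```

In this sketch a source reference alone yields `uncertain`, which mirrors the idea that uncertain findings stay visible until a test, a semantic review, or an accepted gap resolves them.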

Impact graph

What else may break?

`capabilitykit impact` follows explicit `agent.depends_on` relationships and collects suggested automated checks, manual review steps, and known verification gaps across the impacted set.

Direct dependents: 5
Transitive dependents: 9
Suggested checks: npm test, compile
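
The traversal behind an impact report can be sketched as a breadth-first walk over inverted `depends_on` edges. This is a minimal Python illustration under stated assumptions, not the tool's code: `find_dependents` is a hypothetical name, the edges in the example are invented, and "transitive" here means capabilities reached only indirectly, which may differ from how the CLI counts them.

```python
from collections import deque


def find_dependents(
    depends_on: dict[str, list[str]], changed: str
) -> tuple[set[str], set[str]]:
    """Return (direct, transitive-only) dependents of the changed capability."""
    # Invert the edges: capability -> capabilities that depend on it.
    dependents: dict[str, set[str]] = {}
    for cap_id, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(cap_id)

    direct = set(dependents.get(changed, set()))
    seen = set(direct)
    queue = deque(direct)
    while queue:
        node = queue.popleft()
        for parent in dependents.get(node, set()):
            # Skip already visited nodes and the changed capability itself (cycles).
            if parent not in seen and parent != changed:
                seen.add(parent)
                queue.append(parent)
    return direct, seen - direct


# Invented edges for illustration; the IDs reuse file names shown on this page.
depends_on = {
    "core/graph/compile-capabilities": ["core/model/define-capability-format"],
    "core/validation/validate-capability-files": ["core/model/define-capability-format"],
    "core/graph/diff-capabilities": ["core/graph/compile-capabilities"],
    "core/graph/analyze-capability-impact": ["core/graph/compile-capabilities"],
}
direct, transitive = find_dependents(depends_on, "core/model/define-capability-format")
```

With these example edges, a change to the capability format reaches two direct dependents and two more through `compile-capabilities`.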

Repo-native structure

The capability map is hierarchical before it is a graph.

Capability files live in `.capabilities/` using folders that mirror ownership, product areas, or platform layers. The hierarchy gives reviewers a readable map before they inspect dependency edges.

.capabilities/
  capabilitykit.yaml
  core/
    model/
      define-capability-format.capability.yaml
    validation/
      validate-capability-files.capability.yaml
      detect-verification-gaps.capability.yaml
    graph/
      compile-capabilities.capability.yaml
      diff-capabilities.capability.yaml
      analyze-capability-impact.capability.yaml
    assessment/
      assess-implementation-coverage.capability.yaml
  developer-experience/
    cli/
    skills/
  docs/
    project/
    reference/
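
The example ID `core/graph/diff-capabilities` used on this page matches the folder path of `diff-capabilities.capability.yaml`. Assuming IDs follow the path (an inference from that example, not a stated rule), the mapping could be sketched as below. `capability_id` and `SUFFIX` are hypothetical names.

```python
from pathlib import PurePosixPath

SUFFIX = ".capability.yaml"


def capability_id(path: str, root: str = ".capabilities") -> str:
    """Derive an ID such as "core/graph/diff-capabilities" from a file path."""
    relative = PurePosixPath(path).relative_to(root)
    if not relative.name.endswith(SUFFIX):
        # Files such as capabilitykit.yaml are not capability definitions.
        raise ValueError(f"not a capability file: {path}")
    return str(relative.parent / relative.name[: -len(SUFFIX)])


example_id = capability_id(".capabilities/core/graph/diff-capabilities.capability.yaml")
```

Under this assumption, moving a file between folders would change its ID, so any `agent.depends_on` entries that point at it would need to be updated as well.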

Capability anatomy

The human-facing fields describe the contract.

The top-level capability fields capture identity, current state, purpose, reviewer guidance, and acceptance criteria. Agent-maintained references and verification can live below these fields, but the first thing a reviewer should see is the product behavior being claimed.

id: core/graph/diff-capabilities
title: Diff capabilities
status: implemented
area: core
summary: Compare current capability files with a Git base and summarize added,
  changed, and removed capabilities.
intent: Help developers understand product and agent-facing intent changes
  without reading raw YAML diffs.
acceptance:
  - Compares current capabilities against a configurable Git base ref.
  - Reports added, changed, and removed capabilities by capability ID.
  - Summarizes meaningful field changes such as status, intent, acceptance,
    dependencies, implementation references, verification, and ignore policy.
  - Includes downstream impact context for changed capabilities.
guidance:
  - Compare normalized parsed capabilities, not raw YAML text.
  - Avoid raw JSON in the default human output.
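
A reviewer-facing check on these fields might look like the sketch below. It assumes the capability file has already been parsed into a dict and that the fields shown in the example above are required. Both are assumptions for illustration, and `REQUIRED_FIELDS` and `missing_fields` are hypothetical names rather than part of CapabilityKit.

```python
from typing import Any

# Assumed required set, taken from the fields shown in the example above.
REQUIRED_FIELDS = ("id", "title", "status", "area", "summary", "intent", "acceptance")


def missing_fields(capability: dict[str, Any]) -> list[str]:
    """List top-level contract fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not capability.get(name)]


draft = {
    "id": "core/graph/diff-capabilities",
    "title": "Diff capabilities",
    "status": "implemented",
    "area": "core",
    "acceptance": [],  # an empty list counts as missing in this sketch
}
gaps = missing_fields(draft)
```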

Capability dependency graph

Small changes can have wide capability impact.

Folders help teams navigate ownership, but explicit dependencies tell reviewers what behavior relies on a capability. That graph turns a local change into an impact report with checks and manual review guidance.

[Figure: capability dependency graph with nodes Capability format, Compile capabilities, Diff review, Agent task bundles, and CLI workflow]

Open source

Start reviewing capabilities as code.

Add a `.capabilities/` folder, validate the map, diff capability changes, assess implementation coverage, and use the dependency graph to review impact.

npm install -g @capabilitykit/cli
capabilitykit init
capabilitykit status