cclaw

Install once, ship every time. cclaw is an installer-first workflow runtime that gives your AI coding agent one inspectable path from idea to shipped PR — plus an automatic closeout chain that turns every ship into reusable knowledge:

brainstorm → scope → design → spec → plan → tdd → review → ship
↳ auto-closeout: retro → compound → archive (resumable, never manual)

Every stage has real gates the agent cannot skip, every decision leaves a file-backed audit trail, and the same slash commands work across Claude Code, Cursor, OpenCode, and OpenAI Codex.

You install cclaw once from the terminal, then everything happens inside your harness — no hidden control plane, no background daemon, no operational knobs to memorise.

Who this is for

Solo builders who want shipped outcomes instead of endless chat.
Engineering teams that need a single, repeatable path for AI-assisted changes across multiple harnesses and languages.
Staff engineers and tech leads who want enforceable discipline: locked-in decisions, no placeholders, mandatory TDD, traceable plans.
Maintainers of AI agents/skills who want measurable prompt engineering via the built-in eval harness.

How it works

  ┌─────────┐   ┌──────┐   ┌────────┐   ┌──────┐   ┌──────┐
  │  Idea   │ → │ /cc  │ → │Classify│ → │Track │ → │Stages│
  └─────────┘   └──────┘   └────────┘   └──────┘   └──┬───┘
                                                      │
      ┌───────────────────────────────────────────────┘
      ▼
  brainstorm → scope → design → spec → plan → tdd → review → ship
      │          │        │       │      │      │      │       │
      ▼          ▼        ▼       ▼      ▼      ▼      ▼       ▼
   01.md      02.md    03.md   04.md  05.md  06.md  07.md   08.md
      │                                                      │
      └──── gates + subagents + knowledge capture             │
            happen at every step                              │
                                                              ▼
                                              ┌─────────────────────────┐
                                              │  automatic closeout     │
                                              │  (resumable state       │
                                              │   machine, never manual)│
                                              └────────────┬────────────┘
                                                           ▼
                                        retro ──► compound ──► archive
                                        09.md     (knowledge  (runs/
                                                   promoted)   <slug>/)

Every stage reads and writes real files under .cclaw/. flow-state.json holds the single source of truth for "where are we" — including closeout.shipSubstate so a ship → retro → compound → archive chain resumes at the exact step after any interruption. knowledge.jsonl accumulates reusable lessons throughout the flow, not only at the end; stage artifacts live under .cclaw/artifacts/ until archive rolls them into .cclaw/runs/<date-slug>/.

You ──► /cc <idea>
        │
        ▼
   harness loads stage contract + HARD-GATE
        │
        ▼
   cclaw reads state + knowledge, guides execution
        │
        ▼
   artifacts written, checkpoint saved
        │
        ▼
   next stage is explicit in flow-state.json

30-second install

npx cclaw-cli

Interactive setup asks which harnesses to install into. For CI or scripted installs:

npx cclaw-cli init --harnesses=claude,cursor --no-interactive

That's the entire CLI interaction. Everything after install happens inside your harness (Claude Code, Cursor, OpenCode, or Codex).

What gets generated

.cclaw/
├── commands/           # stage + utility command contracts (markdown)
├── skills/             # stage + utility skills loaded by the harness
├── contexts/           # cross-cutting context modes (research, debugging, …)
├── templates/          # artifact skeletons for each stage
├── rules/              # lint-style rules surfaced to the agent
├── adapters/           # per-harness translation notes
├── agents/             # subagent definitions (planner, reviewer, …)
├── hooks/              # harness-agnostic hook scripts
├── worktrees/          # isolated feature worktrees (power-user, via /cc-ops)
├── artifacts/          # active feature artifacts (00-idea.md → 09-retro.md)
├── runs/               # archived feature snapshots: YYYY-MM-DD-slug/
├── references/         # (optional) pinned copies of reference frameworks
├── evals/              # eval corpus, rubrics, baselines, reports
├── custom-skills/      # user-authored skills (never overwritten)
├── state/              # flow-state.json + delegation-log.json + activity
└── knowledge.jsonl     # append-only, strict-schema lessons + patterns

Plus harness-specific shims:

.claude/commands/cc*.md + .claude/hooks/hooks.json
.cursor/commands/cc*.md + .cursor/hooks.json + .cursor/rules/cclaw-workflow.mdc
.opencode/commands/cc*.md + .opencode/plugins/cclaw-plugin.mjs
.agents/skills/cc*/SKILL.md + .codex/hooks.json (Codex; skills are activated via /use cc or description-based auto-matching. Hooks require Codex CLI ≥ v0.114 and [features] codex_hooks = true in ~/.codex/config.toml; cclaw init --codex offers to patch that flag for you. .codex/commands/ and the legacy .agents/skills/cclaw-cc*/ folders are auto-cleaned on sync.)
AGENTS.md with a managed routing block (includes a Codex-specific note)

`.cclaw/config.yaml` — the minimal surface

cclaw init writes five keys, on purpose:

version: 0.46.0
flowVersion: 1.0.0
harnesses:
  - codex
strictness: advisory     # advisory | strict — one knob for prompt-guard + TDD
gitHookGuards: false     # opt in to managed .git/hooks/pre-commit + pre-push

If cclaw detects a Node / Python / Go project at init time, a sixth languageRulePacks line appears (auto-populated from package.json, pyproject.toml / requirements.txt, go.mod). That is the full default surface — a new user sees nothing they need to understand yet.
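The detection step described above can be pictured as a lookup from manifest file names to rule-pack labels. This is a minimal sketch, not cclaw's implementation: the function name `detectLanguageRulePacks` and the labels `node`, `python`, and `go` are assumptions made for illustration; only the manifest file names come from the text.

```typescript
// Hypothetical sketch: map manifest files found at the project root to
// language rule-pack labels. The labels and function name are assumptions.
type RulePack = "node" | "python" | "go";

const MANIFESTS = new Map<string, RulePack>([
  ["package.json", "node"],
  ["pyproject.toml", "python"],
  ["requirements.txt", "python"],
  ["go.mod", "go"],
]);

function detectLanguageRulePacks(rootFiles: string[]): RulePack[] {
  const found = new Set<RulePack>();
  for (const file of rootFiles) {
    const pack = MANIFESTS.get(file);
    if (pack !== undefined) found.add(pack);
  }
  // Sort so the resulting config line is stable across runs.
  return Array.from(found).sort();
}
```

A project with both `pyproject.toml` and `requirements.txt` yields a single `python` entry, because the set removes duplicates.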

Advanced knobs (promptGuardMode / tddEnforcement per-axis overrides, tddTestGlobs, defaultTrack, trackHeuristics, sliceReview) are opt-in: add them by hand when you need them. cclaw upgrade preserves exactly what you wrote — it never silently reintroduces defaults you removed.

Full key-by-key reference: docs/config.md.

The four commands you actually use

All four appear as slash commands in every supported harness. This is the top-level user surface — everything else is either automatic or happens inside /cc-ops subcommands.

| Command | What it does |
| --- | --- |
| `/cc <idea>` | Classify the task, discover origin docs (`docs/prd/**`, ADRs, root `PRD.md`, …), sniff the stack, recommend a track, then start the first stage of that track. `/cc` without arguments resumes the current flow. |
| `/cc-next` | The one progression primitive. Reads `flow-state.json`, checks gates + mandatory subagent delegations, and either resumes the current stage or advances to the next. `/cc-next` in a new session is how you resume. |
| `/cc-ideate` | Repository improvement discovery. Scans for TODOs, flaky tests, oversized modules, docs drift, and recurring knowledge-store lessons, persists the ranked backlog to `.cclaw/artifacts/ideation-<date>-<slug>.md`, and ends with a concrete handoff: launch `/cc` on the selected candidate in the same session, save-and-close, or discard. Resume check on next run reuses any ideation artifact younger than 30 days. Never mutates `flow-state.json`. |
| `/cc-view` | Read-only flow visibility. `/cc-view status` (default) shows stage progress, mandatory delegations with their fulfillment mode (isolated / generic-dispatch / role-switch), the ship closeout substate (retro → compound → archive), and the active harness parity row. `/cc-view tree` renders the same picture as a tree with a closeout sub-branch under ship and a per-harness playbook summary. `/cc-view diff` shows stage/gate/closeout/delegation deltas since the last run. Never mutates state (except diff's snapshot baseline). |

Power-user surface: /cc-ops is an operational router for manual overrides (rewind a stale stage, manage parallel features, re-run a compound pass). /cc-learn is the strict-schema knowledge writer — agents call it automatically from completion protocols; you rarely invoke it by hand.

Example first-run

> /cc Add rate limiting to the public /api/v1/search endpoint

cclaw:  Classifying task…
        Class: software-medium
        Discovered context: docs/rfcs/rate-limit-strategy.md (rate-limit policy draft)
        Stack: node 20.10.0 (pnpm), fastify 4.26, redis 7
        Recommended track: medium (matched triggers: "add endpoint")
        Override? (A) keep medium  (B) switch track  (C) cancel
> A
cclaw:  Persisting flow-state.json, seeding 00-idea.md, entering brainstorm…

After this, flow-state.json contains:

{
  "currentStage": "brainstorm",
  "track": "medium",
  "skippedStages": ["scope", "design"],
  "stageGateCatalog": { "brainstorm": { "passed": [], "blocked": [] } },
  "completedStages": []
}

And 00-idea.md starts with:

Class: software-medium
Track: medium (matched: "add endpoint")
Stack: node 20.10.0, fastify 4.26, redis 7

## Discovered context

- docs/rfcs/rate-limit-strategy.md — rate-limit policy draft (Q2 2026)

## User prompt

Add rate limiting to the public /api/v1/search endpoint

No magic. No ambiguity about where you are.
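Because the state is plain JSON, "where are we" and "what comes next" can be derived from it directly. The sketch below is an illustration under assumptions: the field names mirror the `flow-state.json` excerpt shown earlier, but the `FlowState` type and the `nextStage` helper are hypothetical and not part of cclaw's API.

```typescript
// Stage names come from the README; the helper is a hypothetical sketch.
const STAGES = [
  "brainstorm", "scope", "design", "spec", "plan", "tdd", "review", "ship",
] as const;
type Stage = (typeof STAGES)[number];

interface FlowState {
  currentStage: Stage;
  track: "quick" | "medium" | "standard";
  skippedStages: Stage[];
  stageGateCatalog: Record<string, { passed: string[]; blocked: string[] }>;
  completedStages: Stage[];
}

// Return the first stage after `currentStage` that the track does not skip,
// or null when `currentStage` is already the last stage on the path.
function nextStage(state: FlowState): Stage | null {
  const start = STAGES.indexOf(state.currentStage);
  for (let i = start + 1; i < STAGES.length; i++) {
    if (state.skippedStages.indexOf(STAGES[i]) === -1) return STAGES[i];
  }
  return null;
}
```

For the state in the example, `nextStage` returns `"spec"`, since `scope` and `design` are skipped on the medium track.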

The eight stages, the three tracks, and auto-closeout

cclaw has eight critical-path stages plus an automatic three-step closeout chain (retro → compound → archive). A single prompt rarely needs all eight critical-path stages, so /cc picks a track up front to match the flow to the task.

| Track | Critical path | Typical trigger |
| --- | --- | --- |
| quick | `spec → tdd → review → ship` | `bug`, `hotfix`, `typo`, `rename`, `bump`, `docs only`, one-liners |
| medium | `brainstorm → spec → plan → tdd → review → ship` | `add endpoint`, `add field`, `extend existing`, `wire integration` |
| standard (default) | all 8 stages | `new feature`, `refactor`, `migration`, `platform`, `schema`, `architecture` |

Every track ends with the same auto-closeout chain. Once ship completes, /cc-next automatically drives retro → compound → archive — with closeout.shipSubstate carrying the exact step across sessions, so a crashed or backgrounded run resumes without re-drafting retros or re-asking structured questions. See Ship and closeout.

Each critical-path stage produces a numbered artifact under .cclaw/artifacts/: 00-idea.md (seed), 01-brainstorm.md through 08-ship.md. Closeout adds 09-retro.md; archive then rolls the whole bundle into .cclaw/runs/<YYYY-MM-DD-slug>/ and resets the active flow for the next feature.
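The track table maps directly onto data: each track is a subset of the eight stages, and the skipped stages are whatever the subset leaves out. A minimal sketch, assuming nothing beyond the table itself; the names `CRITICAL_PATH` and `skippedStagesFor` are illustrative, not cclaw identifiers.

```typescript
// Stage and track names are taken from the README tables.
const STAGES = [
  "brainstorm", "scope", "design", "spec", "plan", "tdd", "review", "ship",
] as const;
type Stage = (typeof STAGES)[number];
type Track = "quick" | "medium" | "standard";

const CRITICAL_PATH: Record<Track, readonly Stage[]> = {
  quick: ["spec", "tdd", "review", "ship"],
  medium: ["brainstorm", "spec", "plan", "tdd", "review", "ship"],
  standard: STAGES,
};

// Stages a track leaves out, in canonical stage order.
function skippedStagesFor(track: Track): Stage[] {
  const path = CRITICAL_PATH[track];
  return STAGES.filter((stage) => path.indexOf(stage) === -1);
}
```

For the medium track this gives `scope` and `design`, which matches the `skippedStages` value in the first-run example.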

Track heuristics are configurable (advisory)

Every team has its own vocabulary. Override the built-in trigger lists in .cclaw/config.yaml:

trackHeuristics:
  fallback: standard
  tracks:
    quick:
      triggers: [hotfix, rollback, prod-incident]
      veto: [schema, migration]   # never route quick even if a trigger hits
    standard:
      triggers: [epic, platform-team, core-infra]

Honest caveat: this config is advisory. cclaw surfaces these lists in the /cc skill and contract prose so the LLM applies them during classification — there is no Node-level router that mechanically enforces the outcome. That is why the knobs are deliberately minimal: per-track triggers + veto on top of defaults, plus fallback. Evaluation order is fixed (standard -> medium -> quick, narrow-to-broad); regex patterns and a priority override were removed in v0.38.0 because nothing in runtime consumed them.
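To make the intended reading of triggers, veto, and fallback concrete, here is how the rules could be applied if they were evaluated mechanically. This is purely illustrative: as stated above, cclaw has no such router and the LLM applies the lists during classification. The function `suggestTrack`, the case-insensitive substring matching, and the choice to ignore the built-in default trigger lists are all assumptions of this sketch.

```typescript
type Track = "quick" | "medium" | "standard";

interface TrackRule {
  triggers: string[];
  veto?: string[];
}

interface TrackHeuristics {
  fallback: Track;
  tracks: Partial<Record<Track, TrackRule>>;
}

// Fixed evaluation order as stated in the README.
const ORDER: Track[] = ["standard", "medium", "quick"];

// Hypothetical helper: the first track with a trigger hit and no veto hit
// wins; otherwise the configured fallback applies.
function suggestTrack(prompt: string, heuristics: TrackHeuristics): Track {
  const text = prompt.toLowerCase();
  const hits = (words: string[] = []): boolean =>
    words.some((word) => text.indexOf(word.toLowerCase()) !== -1);

  for (const track of ORDER) {
    const rule = heuristics.tracks[track];
    if (rule && hits(rule.triggers) && !hits(rule.veto)) return track;
  }
  return heuristics.fallback;
}
```

With the config shown above, a prompt mentioning both "hotfix" and "schema" is not routed to quick: the veto blocks it and the fallback (`standard`) applies.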

Mid-flow reclassification

If you seed a task as quick and evidence in spec shows it actually needs a schema migration, cclaw stops and asks instead of quietly advancing. Reclassification is append-only: the old decision stays in history.

Guardrails that ship in the box

These are the things that make cclaw "enterprise-strong" without turning it into ceremony:

Locked decisions (D-XX IDs). Scope decisions are numbered and must reappear in plan + TDD artifacts. The artifact linter catches any silent drift.
No placeholders. "TBD", "TODO", "similar to task", and "static for now"-style scope reductions are flagged before a stage completes.
Stale-stage detection. If an upstream artifact changes after a downstream stage is already complete, cclaw marks the downstream stage stale and refuses to advance until you re-run it (or explicitly acknowledge via a manual override).
Mandatory subagent delegation at TDD, with a documented per-harness fallback where isolated subagents are unavailable.
Turn Announce Discipline. Every stage entry/exit emits a visible line so users can see what the agent is doing, not just what it says.
Extracted protocols. Decision, Completion, and Ethos protocols live in a single place (.cclaw/contexts/), so every skill speaks the same dialect.
Knowledge capture throughout the flow. Every stage completion protocol emits typed entries (rule / pattern / lesson) to .cclaw/knowledge.jsonl as the flow progresses — not only at retro. Retro itself adds a compound entry, and the automatic compound pass after ship promotes recurring entries (≥ 3) into first-class rules/protocols/skills so the next run is easier. Strict JSONL schema keeps the whole thing machine-queryable.
Automatic integrity checks. Runtime health is verified on every stage transition — no command you need to remember to run.
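Two of the guardrails above, locked decisions and the placeholder ban, amount to simple text checks over artifacts. The sketch below shows the idea only; it is not cclaw's artifact linter. The ID pattern (`D-` followed by digits), the naive substring matching, and the function names are assumptions.

```typescript
// Assumed ID shape: "D-" followed by digits, e.g. D-01.
const DECISION_ID = /\bD-\d+\b/g;

// Placeholder phrases named in the README; matching here is a naive,
// case-insensitive substring test and will over-match in real prose.
const PLACEHOLDERS = ["TBD", "TODO", "similar to task", "static for now"];

// Decision IDs declared in the scope artifact but absent downstream.
function missingDecisions(scope: string, downstream: string): string[] {
  const declared = Array.from(new Set(scope.match(DECISION_ID) ?? []));
  const present = new Set(downstream.match(DECISION_ID) ?? []);
  return declared.filter((id) => !present.has(id));
}

// Placeholder phrases that appear anywhere in an artifact.
function findPlaceholders(artifact: string): string[] {
  const text = artifact.toLowerCase();
  return PLACEHOLDERS.filter((p) => text.indexOf(p.toLowerCase()) !== -1);
}
```

A plan that cites only `D-01` when scope declared `D-01` and `D-02` would report `D-02` as silent drift.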

TDD that actually runs

The tdd stage is not prose guidance. It requires:

an explicit RED test run (logged to .cclaw/state/stage-activity.jsonl)
a mandatory test-author subagent dispatch (logged to .cclaw/state/delegation-log.json)
a GREEN full-suite run before exit
optional REFACTOR pass with coverage preservation

/cc-next will not advance past tdd until the delegation log shows the subagent as completed (or, on Codex / OpenCode, role-switched with evidenceRefs — see Harness support).

Ship and closeout — automatic, resumable

Shipping writes 08-ship.md. /cc-next then automatically walks the feature through a deterministic three-step closeout without extra commands from you:

Retro (09-retro.md). cclaw drafts a retrospective from your stage artifacts, the delegation log, and the knowledge entries recorded during the run. It then asks exactly one structured question:
- accept (default) — keep the draft, record one compound knowledge entry, advance.
- edit — you edit 09-retro.md in place, then /cc-next again.
- skip — record a one-line reason, continue (archive will surface the skip in the run manifest).
Compound pass. If the knowledge store has clusters recurring 3+ times, cclaw proposes concrete lifts into rules/protocols/skills and asks once: apply-all / apply-selected / skip. An empty pass advances silently.
Archive. Moves artifacts into .cclaw/runs/YYYY-MM-DD-<slug>/, snapshots state/, writes a manifest, and resets flow-state.json to the track's initial stage.
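The compound pass's "recurring 3+ times" rule can be sketched as grouping knowledge entries and keeping the groups that reach the threshold. Only the entry types (rule / pattern / lesson) and the threshold come from this README; the `topic` and `text` fields, the clustering key, and both function names are hypothetical, since the real `knowledge.jsonl` schema is not described here.

```typescript
type EntryType = "rule" | "pattern" | "lesson";

interface KnowledgeEntry {
  type: EntryType;
  topic: string; // hypothetical clustering key
  text: string;  // hypothetical free-text body
}

// Parse JSON Lines: one JSON object per non-empty line.
function parseJsonl(source: string): KnowledgeEntry[] {
  return source
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as KnowledgeEntry);
}

// Keep only the topics that recur at least `threshold` times.
function recurringClusters(
  entries: KnowledgeEntry[],
  threshold = 3,
): Map<string, KnowledgeEntry[]> {
  const byTopic = new Map<string, KnowledgeEntry[]>();
  for (const entry of entries) {
    const bucket = byTopic.get(entry.topic) ?? [];
    bucket.push(entry);
    byTopic.set(entry.topic, bucket);
  }
  const recurring = new Map<string, KnowledgeEntry[]>();
  byTopic.forEach((bucket, topic) => {
    if (bucket.length >= threshold) recurring.set(topic, bucket);
  });
  return recurring;
}
```

An empty result corresponds to the "empty pass advances silently" case.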

The chain is driven by closeout.shipSubstate inside flow-state.json (retro_review → compound_review → ready_to_archive → archived). If your session dies mid-closeout, a new /cc-next resumes at the exact step — retro drafts are not regenerated and no structured ask is repeated silently.
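The substate chain behaves like a small linear state machine. The sketch below uses the four substate names given above; `advanceCloseout` and `resumeStep` are hypothetical helpers, and starting at `retro_review` when no substate has been persisted is an assumption.

```typescript
// Substate names as listed in the README, in order.
const CLOSEOUT_STEPS = [
  "retro_review",
  "compound_review",
  "ready_to_archive",
  "archived",
] as const;
type ShipSubstate = (typeof CLOSEOUT_STEPS)[number];

// Move one step forward; `archived` is terminal and maps to itself.
function advanceCloseout(current: ShipSubstate): ShipSubstate {
  const index = CLOSEOUT_STEPS.indexOf(current);
  return CLOSEOUT_STEPS[Math.min(index + 1, CLOSEOUT_STEPS.length - 1)];
}

// Resume from whatever was persisted; assumed to start at retro_review
// when nothing has been recorded yet.
function resumeStep(persisted: ShipSubstate | undefined): ShipSubstate {
  return persisted ?? "retro_review";
}
```

Because the persisted value is the only input, a new session that reads `compound_review` picks up at the compound pass without repeating the retro step.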

You can still invoke each step manually (/cc-ops retro, /cc-ops compound, /cc-ops archive), but for the default path you do not need to: /cc-next is the only command.

Harness support

cclaw is honest about what each harness can and cannot do, and it closes every real gap with a documented fallback — not a silent waiver.

| Harness | Dispatch | Fallback | Hook surface | Structured ask | Playbook |
| --- | --- | --- | --- | --- | --- |
| Claude Code | full (named subagents) | `native` | full | `AskUserQuestion` | `claude-playbook.md` |
| Cursor | generic Task dispatcher | `generic-dispatch` | full | `AskQuestion` | `cursor-playbook.md` |
| OpenCode | plugin / in-session | `role-switch` | plugin | `question` (permission-gated; `permission.question: "allow"`) | `opencode-playbook.md` |
| OpenAI Codex | in-session only | `role-switch` (evidenceRefs required) | limited (Bash-only `PreToolUse`/`PostToolUse`; requires `codex_hooks` feature flag) | `request_user_input` (experimental; Plan / Collaboration mode) | `codex-playbook.md` |

What the fallbacks mean:

native — Claude runs mandatory delegations in isolated subagent workers; cclaw records them with fulfillmentMode: "isolated".
generic-dispatch — Cursor has a real Task tool with a fixed vocabulary of subagent_types (explore, generalPurpose, …). cclaw maps each named agent (planner / reviewer / test-author / security-reviewer / doc-updater) onto the generic dispatcher with a structured role prompt. Per-agent mapping lives in the Cursor playbook.
role-switch — OpenCode and Codex lack an isolated worker primitive. The agent announces the role in-session, performs the work, and records a delegation row with fulfillmentMode: "role-switch" and at least one evidenceRef pointing at the artifact section that captures the output. Under role-switch, a completed row without evidenceRefs is classified as missingEvidence by cclaw doctor and blocks stage completion.
waiver — reserved. Only fires auto-waivers if every installed harness declares it. Currently unused — v0.33 removed the old Codex-only auto-waiver path.
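The evidence rule for role-switch can be expressed as a small check over a delegation row. This is a sketch under assumptions, not the logic of cclaw doctor: `fulfillmentMode`, `evidenceRefs`, and the `missingEvidence` label come from the text above, while the `agent` and `status` fields, the `Verdict` type, and `checkDelegation` are hypothetical.

```typescript
type FulfillmentMode = "isolated" | "generic-dispatch" | "role-switch";

interface DelegationRow {
  agent: string;                   // hypothetical field, e.g. "test-author"
  status: "pending" | "completed"; // hypothetical field
  fulfillmentMode: FulfillmentMode;
  evidenceRefs?: string[];
}

type Verdict = "satisfied" | "pending" | "missingEvidence";

// A completed role-switch row only counts when it cites at least one
// evidence reference; other modes need no evidence in this sketch.
function checkDelegation(row: DelegationRow): Verdict {
  if (row.status !== "completed") return "pending";
  const needsEvidence = row.fulfillmentMode === "role-switch";
  const hasEvidence = (row.evidenceRefs ?? []).length > 0;
  return needsEvidence && !hasEvidence ? "missingEvidence" : "satisfied";
}
```

Under this reading, a stage gate would treat anything other than `satisfied` as blocking.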

Codex note (v0.40+). Codex CLI deprecated custom prompts in v0.89 (Jan 2026), but Codex ≥ v0.114 (Mar 2026) grew an experimental lifecycle hooks API. cclaw installs Codex entry points as native skills under .agents/skills/cc*/SKILL.md (invoke with /use cc, /use cc-next, /use cc-view, /use cc-ops, /use cc-ideate, or by typing /cc … in plain text — Codex auto-matches from the skill description) and writes .codex/hooks.json so session-start rehydration, stop-checkpoint, prompt-guard, workflow-guard, and context-monitor fire automatically — as long as you enable the codex_hooks feature flag in ~/.codex/config.toml. cclaw init --codex asks for consent before patching that file. Codex's PreToolUse/PostToolUse are Bash-only; the stage skills compensate for Write/Edit/MCP tool calls with explicit in-turn checks. Run cclaw doctor to see the current state of hooks, the feature flag, and any legacy layout to clean up.

The full capability matrix lives in docs/harnesses.md. Per-harness playbooks are generated into .cclaw/references/harnesses/ on every install and upgrade; stage skills cite them by path.

Runtime state:

.cclaw/state/harness-gaps.json (schema v2) — per-harness list of missing capabilities, missing hook events, the declared fallback, the playbook path, and a remediation[] list you can act on.
cclaw doctor — asserts every installed harness has its playbook on disk and surfaces the expected fulfillment mode inside the delegation:mandatory:current_stage check.

Eval-driven prompt engineering

cclaw ships with cclaw-cli eval — a three-tier regression harness for the skills and contracts the runtime generates. Use it when you change a stage skill, tweak a prompt, or swap a model.

Works with any OpenAI-compatible endpoint — Zhipu AI GLM, OpenAI, Together, self-hosted vLLM — via three environment variables:

CCLAW_EVAL_API_KEY=...
CCLAW_EVAL_BASE_URL=https://api.z.ai/api/coding/paas/v4   # default
CCLAW_EVAL_MODEL=glm-5.1                                  # default
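Resolving these three variables with their documented defaults might look like the following. The variable names and default values are taken from the lines above; the `EvalConfig` shape and the `resolveEvalConfig` function are assumptions for illustration, not the eval harness's actual code.

```typescript
interface EvalConfig {
  apiKey: string | undefined; // no default; left undefined when unset
  baseUrl: string;
  model: string;
}

// Defaults as documented in the README.
const DEFAULT_BASE_URL = "https://api.z.ai/api/coding/paas/v4";
const DEFAULT_MODEL = "glm-5.1";

// Takes the environment as a plain object so it can be tested without
// touching the real process environment.
function resolveEvalConfig(env: Record<string, string | undefined>): EvalConfig {
  return {
    apiKey: env.CCLAW_EVAL_API_KEY,
    baseUrl: env.CCLAW_EVAL_BASE_URL ?? DEFAULT_BASE_URL,
    model: env.CCLAW_EVAL_MODEL ?? DEFAULT_MODEL,
  };
}
```

In a Node script the call would typically be `resolveEvalConfig(process.env)`.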

Full details, corpus format, and the eval contract live in docs/evals.md.

CLI reference

The CLI is deliberately small. Everything operational happens inside your harness.

npx cclaw-cli                   # launches interactive setup (or prints
                                # a one-line status hint if already installed)
npx cclaw-cli upgrade           # refresh generated files; preserves .cclaw/config.yaml
npx cclaw-cli uninstall         # remove .cclaw + generated harness shims
npx cclaw-cli eval …            # maintainer surface (see docs/evals.md)
npx cclaw-cli --version

For CI or scripted installs, cclaw-cli init --harnesses=<list> --no-interactive is the non-interactive form. All other tunables (prompt-guard strictness, TDD enforcement, language rule packs, track heuristics) are set by editing .cclaw/config.yaml directly — see docs/config.md for the full key reference.

License

MIT
