Package Exports

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (altimate-receipts) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

🧾 receipts

Agent-work verification — not code review.

A deterministic, cross-agent Report Card of what your coding agent actually did — read straight from the agent's own transcript, on your machine.

# 30-second start — no install, no account:
npx altimate-receipts          # Report Card for your most recent agent session

Adding it to a repo's PRs? receipts init scaffolds the CI check, or see docs/onboarding-internal.md — install, the pre-push hook, and the one-block PR check.

  ╔══════════════════════════════════════════════════════════════════════════╗
  ║ 🧾  RECEIPTS — Agent Report Card                        proof, not vibes ║
  ╚══════════════════════════════════════════════════════════════════════════╝

  Session  Add rate limiting to the billing service
  Agent    claude-code · claude-opus-4-8
  Scope    1h 4m · 412 msgs · 318 tools · 6M tok · $24.18

  ┌─ VERDICT ────────────────────────────────────────────────────────────────┐
  │   F     ⛔ DO NOT MERGE WITHOUT REVIEW                                   │
  │ 2 critical · 3 high · 1 medium                                           │
  └──────────────────────────────────────────────────────────────────────────┘

  CRITICAL
   ⛔ Destructive op: rm -rf ./dist ./build ×2
      data-loss risk
   ⛔ Modified the grader/harness: score.py
      gamed the eval · score.py
  HIGH
   ⚠️  Force-pushed over remote history
      history overwrite
   ⚠️  Weakened the checker config: tsconfig.json
      checker defanged · tsconfig.json
   ⚠️  Stuck loop wasted $3.10 / 7m
      $3.10 · 7m
  MEDIUM
   🔍 6 turns drove 57% of the $24.18 spend
      $13.78

  EVIDENCE
   58 files changed · 146 edits · 92 commands · tests ran ✓ · 4 destructive ops · cache 0%

  ──────────────────────────────────────────────────────────────────────────
  ✅ Verified by Receipts  ·  deterministic  ·  0 model calls  ·  evidence, not judgement
  what it did — not whether it's correct. your tests are the oracle for success.

Illustrative output. Run receipts on your own Claude Code sessions to see the real card.

The problem

Agents now write most of the code. It is no longer humanly possible to read every line of the agent's output — the diff — and a long session hides far more than a diff shows: commands run, files touched then reverted, tests that "passed," destructive ops, quietly weakened checkers.

So review the agent's work, not its output. Instead of re-reading every line, you read a faithful, deterministic account of what the agent actually did — and spend your attention where the account flags something.

This is a new category: agent-work verification. It is not code review and never grades code quality. Your tests remain the oracle for correctness; Receipts is the oracle for what happened.

What a Receipt is

receipts reads the agent's own local transcript and prints a deterministic, file:line-cited account of what the agent did:

What changed — files, edits, the actual bodies.
What ran — commands, and whether the tests actually ran.
What it touched that it shouldn't have — destructive ops, history rewrites, test/eval tampering, and other research-backed reward-hacking patterns.
Claims vs. evidence — what the agent said it did, checked against the transcript.

Every finding is file:line-cited or it doesn't ship. Nothing leaves your machine.

→ What problems Receipts solves — the before/after for each capability, shipped and planned.

Why you can trust it

The whole pitch collapses if the account itself is wrong, so trust is engineered in:

Deterministic — zero model calls. Findings come from regex/heuristic rules over the transcript. Same input → same output. There is no LLM in the product path to hallucinate.
Near-zero false positives, now measured. A single false alarm and you go back to reading diffs, so this is treated as existential and tested, not asserted: precision 100% on a 70-session labeled corpus, and a 1% flag rate over 1,200 real local sessions in the out-of-sample field scan — every flag manually adjudicated as a true positive on genuine work, zero confirmed false positives. See docs/eval.md.
Evidence, not judgement. Receipts reports what an agent did, never whether the code is good.
Local-first. The CLI runs entirely on your machine — no account, no upload, no telemetry. (The optional CI Action can emit anonymous, aggregate counts only; off by default in releases. See telemetry.)

How it works

the agent's own local transcript (raw JSONL / SQLite)
  → adapter (per agent)   normalize to one session model, tool calls passed raw
  → spans                 derive edits / commands / reads / cost / destructive ops
  → findings              deterministic detectors → file:line-cited findings
  → Report Card           human-readable, severity-grouped (default output)
  → Receipt (--json)      portable, schema-validated object; optionally signed

Receipts reads the transcript your agent already writes to disk — it doesn't sit between you and the model, doesn't need an API key, and doesn't phone home. One Report Card spans Claude Code, Codex, Cursor, and OpenClaw; detectors are agent-agnostic.

Invariants

These are binding constraints, not aspirations (full set in SPEC-0000):

Deterministic — zero model calls in the product path. (R1)
Evidence, not judgement — never grades code quality. (R2)
Near-zero false positives, measured — see docs/eval.md. (R3)
Local-first — no upload; the opt-in CI Action emits only anonymous aggregate telemetry. (R4)

Usage

receipts                  # Report Card for your most recent session (any agent)
receipts --list           # list recent sessions across all agents
receipts --agent codex    # limit to one agent: claude-code | codex | cursor | openclaw
receipts 3                # the 3rd session from --list
receipts "billing"        # first session whose title contains "billing"
receipts --json           # emit the Receipt object (in-toto Statement)
receipts --json --compact # canonical (sorted, minified) JSON
receipts --share          # redacted, paste-ready Markdown summary
receipts guardrails --last 5  # prevention rules for AGENTS.md from recent sessions
receipts trends           # cross-session digest: grades, recurring findings, cost
receipts trends --last 20 # span a wider window (default 10)
receipts pr               # write THIS branch's receipt (branch-scoped) to .receipts/
receipts verify <bundle> --transcript <t>   # prove the receipt is faithful (L1)
receipts diff <a> <b>     # what changed between two receipts (deltas; --json)
receipts log              # list the committed receipts in .receipts/ (--last N)
receipts stats            # dogfooding scoreboard: how often it ran + what it caught
receipts eval             # flag-rate of the detectors over your real local sessions
receipts badge [receipt]  # shields.io endpoint JSON for a README/PR badge
receipts sarif [receipt]  # SARIF 2.1.0 for GitHub code-scanning (inline + Security tab)
receipts init             # scaffold the PR-check workflow into this repo (1-command adopt)
receipts rederive <t>     # reproduce the canonical receipt from a transcript
receipts mcp              # start the MCP server (stdio) for IDEs/agents
receipts --no-color       # plain text (also honors NO_COLOR)

Most people only ever need the Report Card. The Receipt (receipts --json) is there when you want a portable, vendor-neutral record to feed tooling — an in-toto Statement carrying the same deterministic evidence + findings (schema).

Status

🚧 Early. Working today across Claude Code, Codex, Cursor, and OpenClaw:

the deterministic Report Card (receipts) — the core,
prevention rules you can paste into AGENTS.md (receipts guardrails),
a cross-session trends digest — what your agent does wrong over time (receipts trends),
a portable Receipt (receipts --json) + a redacted --share summary,
an MCP server (receipts mcp) for IDEs/agents (docs).

The findings engine is measured for fidelity — precision 100% on a 70-session labeled corpus, 1% flag rate over 1,200 real local sessions (every flag adjudicated as a true positive; zero confirmed false positives) (docs/eval.md). See the roadmap in specs/.

For teams (optional)

Need to attach the Receipt to a PR or prove it in CI? A Receipt can be Sigstore-signed and posted as a "Verified-by: Receipts" check (docs), and re-derived from its transcript to prove it wasn't hand-edited (receipts verify --transcript, trust model). This is opt-in — the value is the Report Card; signing is there for when your org's compliance or settlement needs make it concrete.

Configure it locally (one prompt)

Your repo already has the Verified by Receipts check, but Receipts is local-first — the receipt is generated from your agent's transcript and committed to the branch; CI can't make it. So each contributor configures it once. Paste this prompt into your coding agent (Claude Code) from inside the repo, and it will set up local generation and verify the whole flow end to end — including that the receipt actually posts on the PR:

Set up and verify altimate-receipts in this repo, end to end.

1. Make the `receipts` CLI available. Try `receipts --version`. If missing, try
   `npx --yes altimate-receipts --version`. If that also fails (private / not on npm),
   clone + build it and use it via node:
     git clone --depth 1 https://github.com/AltimateAI/altimate-receipts /tmp/altimate-receipts
     (cd /tmp/altimate-receipts && npm ci && npm run build)
   Then treat `receipts` as `node /tmp/altimate-receipts/dist/cli.js` everywhere below.

2. Configure it locally so every future push auto-attaches my receipt: create a
   `.githooks/pre-push` hook that runs `receipts pr`, and if it changes `.receipts/`,
   `git add .receipts/` + commit, then asks me to push again. Make it executable and run
   `git config core.hooksPath .githooks`. (Never let the hook fail the push.)

3. Generate THIS branch's receipt now from my local Claude Code transcript:
   `receipts pr --base <this repo's default branch>`  → writes `.receipts/<branch>.json`.

4. Verify it is authentic and re-derivable: `receipts verify .receipts/<branch>.json` must pass.

5. Commit it: `git add .receipts/<branch>.json && git commit -m "chore: attach agent receipt"`.

6. Confirm uploading works end to end: push the branch and open/refresh the PR, then check that
   the "Verified by Receipts" CI check runs AND a Receipts comment appears
   (`gh pr checks` and `gh pr view --comments`). Report back whether the receipt is showing on
   the PR, and paste the comment's headline line.

If step 6 shows the check + comment, uploading works. If the comment is missing, the agent will report why (no receipt committed, workflow not present, or a permissions issue) — the same classification the fleet check (script/dogfood-health.mjs) uses.

Dogfooding

Receipts audits its own development: open this repo in your coding agent and receipts pr attaches a branch-scoped Receipt to each PR (a pre-push hook and AGENTS.md automate it).

How we build: spec-driven

Every change starts from a spec in specs/ (use specs/TEMPLATE.md). The product vision is SPEC-0000. See CONTRIBUTING.md.

altimate-receipts