Package Exports
This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (claudecompress) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.
Readme
ClaudeCompress
Shrink Claude Code sessions so cold /resume costs less.
bunx claudecompress # bun
npx claudecompress # npmNo install required — bunx/npx fetches the latest release from npm on demand. To keep it around:
bun add -g claudecompress # or: npm i -g claudecompress
claudecompressInteractive CLI: picks a project, shows each session's size + cache staleness + estimated cold-resume cost in USD, lets you trim it. Your source .jsonl is never modified — a new session file is written alongside it with a fresh UUID.
When to run it
| Situation | Use it? |
|---|---|
Back after a break (5+ min), big session, about to /resume |
✅ yes |
Right after you type /clear |
✅ yes — cache is about to go cold anyway |
| Actively mid-session, cache warm | ❌ no — you'd invalidate the live cache |
| Small session (< 100k tokens) | ⚪ skip — not worth it |
The CLI flags cache state per-session (warm, cold, very-cold) from JSONL mtime and warns before trimming a warm one.
What it does
Six modes, pick at runtime:
| Mode | Weight | Behavior |
|---|---|---|
| Redact (default) | medium | drop all tool_result bodies, keep full structure |
| Recency N | medium | keep last N turns verbatim (tool_results and all), redact older |
| Focus N | medium–heavy | keep last N turns verbatim + dialog-only trail for everything before |
| Smart | light | per-tool rules — head/tail for Read/Bash, keep Edit/TodoWrite, redact WebFetch / MCP Playwright |
| Ultra | heavy | user + assistant text turns only; tool calls, results, thinking all dropped |
| Truncate N | manual | keep first N chars of every tool_result |
Redact is the safe default — keeps structure and tool names intact, only drops bulky result bodies. Recency is best for "I want to continue working" — full continuation state for the last N turns. Focus is the sweet spot between Ultra and Recency — you keep a dialog-only trail of the whole conversation plus the last N turns fully intact. Great when Recency is still too heavy and Ultra loses too much.
Drop-thinking toggle: any non-Ultra mode can additionally drop thinking blocks. Often 200k+ tokens saved on a long session and thinking is never replayed meaningfully on resume.
Real example on a 35 MB / 760k-token Opus session ($11.41 cold):
| Mode | Size | Tokens | Cost | Saved |
|---|---|---|---|---|
| Redact | 23 MB | 504k | $7.56 | $3.85 |
| Recency 15 | 23 MB | 501k | $7.52 | $3.89 |
| Focus 500 | 2.8 MB | 217k | $3.25 | $8.16 |
| Focus 100 | 1.5 MB | 136k | $2.04 | $9.37 |
| Ultra | 1.2 MB | 125k | $1.88 | $9.53 |
Stack fit
| Tool | Layer | When |
|---|---|---|
| rtk | Bash output compression at ingress | During session |
| context-mode | MCP sandbox + SQLite-backed tool output | During session |
| ClaudeCompress | Retrospective surgery on JSONL | Before cold /resume |
Complementary, not competing. rtk/context-mode prevent new bloat; ClaudeCompress fixes what's already there — including the thinking blocks, Claude's native Read/Grep output, non-ctx MCP responses, and all your pre-existing long sessions.
Install
bunx claudecompressOr from source:
git clone https://github.com/bihanikeshav/ClaudeCompress
cd ClaudeCompress && bun install && bun run src/index.tsAfter trimming, /resume in Claude Code and pick the [TRIMMED by claudecompress] … entry. Send any message (e.g. hi) — /context recomputes and you'll see the drop (typically 50–80%).
History
Every trim is logged to ~/.claude/claudecompress/history.jsonl. See cumulative savings:
bunx claudecompress historyShows recent trims, per-trim savings, and a lifetime total. The intro of the interactive flow also surfaces a one-liner like Lifetime: 7 trims · saved ≈ $42.18.
Pricing per model
Cost estimate at runtime for Opus 4.7 / Opus 4.6 / Sonnet 4.6 / Haiku 4.5 using each model's actual input rate. Token count is a character-based approximation — Claude's tokenizer isn't public, so estimates are within ~10% of real.
License
MIT