🚀 OpenCode UltraPress

Token Compression Plugin for OpenCode AI

UltraPress saves context-window tokens through four compression layers that run automatically in the background: CLI output filtering, semantic compression, dynamic context pruning, and auto-cleanup. Your LLM stays smart, tokens stay lean.

📑 Table of Contents

⚡ Installation & Setup
🛠 4-Layer Architecture
⚙️ Configuration
- Full documentation →
⌨️ /up Slash Command
- Sub-command List
- Example Output
❷ MLM & NLP Support
🏗 Code Architecture
🧪 Testing
📊 Benchmark
🚀 Local Development
❓ FAQ & Troubleshooting
🗺 Roadmap
🤝 Contributing
📝 Changelog
📄 License

⚡ Installation & Setup

System Requirements

Dependency	Minimum Version	Notes
Node.js	`>= 18`	Node 22 LTS recommended
OpenCode AI	Latest	Uses `@opencode-ai/plugin ^1.14`
Git	Any	Required for GitHub install
Bun	Latest	For development/testing only
@huggingface/transformers	Auto-install	Only used when `mlm` mode is active

Compatibility Matrix (Runtime)

Environment	NLP (default)	MLM	LLM	Notes
macOS (Intel/Apple Silicon)	✅	✅	✅	Best overall support
Linux	✅	✅	✅	Good for server/workstation
WSL2	✅	✅	✅	Prefer enough RAM for MLM/LLM
Windows	✅	✅	✅	Works via Node runtime
Termux / low-resource mobile shell	✅	⚠️	⚠️	Prefer NLP mode to avoid model load overhead

Legend: ✅ recommended · ⚠️ possible but resource-sensitive

Cross-Platform Notes

UltraPress is pure TypeScript and works across all platforms OpenCode supports:

macOS
Linux
WSL / Termux
Windows

What to expect:

mlm / llm modes may download model assets on first use (larger disk/RAM footprint).
If your environment is resource-limited (e.g., small VPS/Termux), use default Balanced (NLP) or force semantic.mode: "nlp" for zero-model operation.

1. Install the Plugin

# ✅ RECOMMENDED — auto-registers with OpenCode
opencode plugin add @rahadiana/opencode-ultrapress@latest --global

⚠️ Important: OpenCode caches the plugin at ~/.cache/opencode/packages/. @latest is resolved once on first install — subsequent versions won't auto-update. To upgrade:
rm -rf ~/.cache/opencode/packages/@rahadiana/opencode-ultrapress@latest
opencode plugin add @rahadiana/opencode-ultrapress@latest --global
The plugin will warn you at startup if a newer version is available.

Alternative install methods (not recommended for end users):

# Via npm — requires manual registration (step 2), same cache caveat applies
npm install -g @rahadiana/opencode-ultrapress

# GitHub latest — for testing pre-release changes
npm install -g github:rahadiana/opencode-ultrapress

For plugin development:

git clone https://github.com/rahadiana/opencode-ultrapress.git
cd opencode-ultrapress
npm install
npm run build

# Link globally so OpenCode can find it
npm link

# Then in OpenCode's config (~/.config/opencode/config.json), add:
# { "plugins": ["@rahadiana/opencode-ultrapress"] }
#
# After making code changes, re-run:
# npm run build
#
# No need to re-link — OpenCode loads from the linked directory.
# If using opencode plugin add, remove the cached version first:
# rm -rf ~/.cache/opencode/packages/@rahadiana/opencode-ultrapress@latest

2. Register with OpenCode (npm install only)

If you installed via opencode plugin, skip this step — registration is automatic.

Add the plugin to OpenCode's config at ~/.config/opencode/config.json:

{
  "plugins": ["@rahadiana/opencode-ultrapress"]
}

3. (Optional) Create Personal Configuration

UltraPress works out-of-the-box with Balanced defaults. For customization:

# If you cloned from GitHub: copy the template from the repo root
cp ultrapress.plugin.jsonc.example ~/.config/opencode/ultrapress.plugin.json

# Or from a global npm install (same template file, shipped inside the package)
cp "$(npm root -g)/@rahadiana/opencode-ultrapress/ultrapress.plugin.jsonc.example" ~/.config/opencode/ultrapress.plugin.json

Then edit ~/.config/opencode/ultrapress.plugin.json as needed. If the file is not found, UltraPress will automatically create it with default values on first run.

Quick-Start Profiles by Device

Use this as a practical starting point:

Device / Environment	Recommended Mode	Why
Low-RAM machine / Termux / tiny VPS	`semantic.mode: "nlp"`	Zero model download, lowest RAM/CPU overhead
Typical laptop/dev machine	Balanced defaults	Best trade-off: token savings + context safety
High-RAM workstation	`mlm` or `llm` (optional)	Higher semantic quality, more resource usage

Minimal override examples:

// Low-resource profile
{
  "semantic": { "mode": "nlp" }
}

// Higher-quality semantic profile (requires more RAM)
{
  "semantic": {
    "mode": "mlm",
    "model": "Xenova/all-MiniLM-L6-v2"
  }
}

Verify Installation

Restart OpenCode, then type in chat:

/up stats

If a statistics dashboard appears, UltraPress is active. If not:

- Installed via npm install? Make sure the plugin is in config.json → "plugins": ["@rahadiana/opencode-ultrapress"]
- Installed via opencode plugin? Run opencode plugin list to confirm it's registered.
- Check the OpenCode logs for errors.
- Ensure Node.js >= 18 is installed (node --version).

Uninstall

# 1. Remove from OpenCode plugin list
#    Edit ~/.config/opencode/config.json — remove "@rahadiana/opencode-ultrapress" from plugins array

# 2. Purge the cached version
rm -rf ~/.cache/opencode/packages/@rahadiana/opencode-ultrapress@latest

# 3. If installed via npm (global)
npm uninstall -g @rahadiana/opencode-ultrapress

# 4. If using npm link
npm unlink -g @rahadiana/opencode-ultrapress

# 5. Clean up config
rm ~/.config/opencode/ultrapress.plugin.json

🛠 4-Layer Architecture

UltraPress intercepts OpenCode message flow at 4 different points, each with a specific compression strategy.

Pipeline Flow

flowchart LR
    A([Tool Output]) -->|"tool.execute.after"| L1[Layer 1\nOutput Filter]
    B([Chat Message]) -->|"chat.message"| PRUNE{Prune\nPending?}
    PRUNE -->|yes| REMOVE[Remove msgs\n& inject summary]
    PRUNE -->|no| L2[Layer 2\nGSC Semantic]
    REMOVE --> L2
    L2 --> L3[Layer 3\nNudge Monitor]
    L3 -->|"nudge injected"| LLM{LLM}
    LLM -->|"calls tool"| DCP_TOOL[ultrapress_compress]
    DCP_TOOL -.->|"block stored"| PRUNE
    D([Session Compact]) -->|"session.compacting"| L4[Layer 4\nCleanup]

    L1 & L2 & L4 --> CTX[(Context\nWindow)]

Layer 1 — Smart Output Filter

Hook: tool.execute.after · File: layer1-output-filter.ts · Filters: src/filters/

Intercepts CLI tool output before it enters the context window. The most aggressive layer — directly cuts unnecessary logs.

Core Strategies:

Strategy	Description
Domain Routing	Each CLI tool is routed to a specific filter: `git`, `npm/node`, `pytest/jest`, and filesystem. Unknown tools go to the generic filter.
Middle-out Truncation	Truncates logs from the middle, preserving the beginning (context) and the end (error/result). Loses less useful information than head-only or tail-only truncation.
Deduplication	Removes identical repeated log lines in real-time. Very effective for build & test logs.
Tee Save	If output is truncated, the original log is saved to a temporary `.log` file so it can still be accessed if needed.
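
As a rough illustration of the deduplication and middle-out strategies above, here is a minimal sketch. The function names, the `(xN)` repeat marker, and the choice to collapse only consecutive duplicates are assumptions made for this example; they are not taken from the UltraPress source.

```typescript
// Hypothetical helpers illustrating two Layer 1 strategies; not the plugin's actual code.

/** Collapse runs of identical consecutive lines into one line plus a repeat count. */
function dedupeLines(lines: string[]): string[] {
  const out: string[] = [];
  let prev: string | undefined;
  let count = 0;
  const flush = (): void => {
    if (prev !== undefined) out.push(count > 1 ? `${prev} (x${count})` : prev);
  };
  for (const line of lines) {
    if (line === prev) {
      count++;
      continue;
    }
    flush();
    prev = line;
    count = 1;
  }
  flush();
  return out;
}

/** Keep the first `head` and last `tail` lines and replace the middle with a marker. */
function middleOutTruncate(lines: string[], head: number, tail: number): string[] {
  if (lines.length <= head + tail) return lines;
  const omitted = lines.length - head - tail;
  return [
    ...lines.slice(0, head),
    `... [${omitted} lines omitted] ...`,
    ...lines.slice(lines.length - tail),
  ];
}
```

Running deduplication first usually makes sense, since collapsing repeated lines can bring the log under the limit before anything has to be cut.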

Built-in Filters:

Filter	File	Trigger Tools
Git	`filters/git.ts`	`git diff`, `git log`, `git show` — remove redundant diff hunks, keep summary
Test	`filters/test.ts`	`pytest`, `jest`, `vitest`, `mocha` — summarize failure output, remove passing tests
Bash	`filters/bash.ts`	Generic shell output — dedup lines, middle-out truncation
Filesystem	`filters/fs.ts`	`ls`, `cat`, `find` — limit file count, truncate long content
Generic	`filters/generic.ts`	Fallback for all other tools — middle-out truncation + dedup
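
Domain routing can be pictured as a small lookup from the command line to a filter name. The sketch below is an assumption-laden illustration: the `ROUTES` table, the regular expressions, and the rule that unmatched `bash` calls go to the Bash filter are all hypothetical, derived only from the trigger tools listed in the table above.

```typescript
type FilterName = "git" | "test" | "fs" | "bash" | "generic";

// Hypothetical routing table based on the trigger tools listed above.
const ROUTES: Array<{ filter: FilterName; pattern: RegExp }> = [
  { filter: "git", pattern: /^git\s+(diff|log|show)\b/ },
  { filter: "test", pattern: /^(npx\s+)?(pytest|jest|vitest|mocha)\b/ },
  { filter: "fs", pattern: /^(ls|cat|find)\b/ },
];

/** Pick a filter for a tool call; anything unrecognized falls through to a fallback. */
function routeFilter(toolName: string, command: string): FilterName {
  const cmd = command.trim();
  for (const { filter, pattern } of ROUTES) {
    if (pattern.test(cmd)) return filter;
  }
  // Assumption: shell calls without a specific match use the Bash filter,
  // every other tool uses the generic fallback.
  return toolName === "bash" ? "bash" : "generic";
}
```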

Layer 2 — GSC Semantic Compression

Hook: chat.message · File: layer2-caveman.ts · Engine: src/caveman/

Compresses message text semantically — removes unimportant words without changing meaning. Layer 2 and Layer 3 do not compress each other — no double compression.

Compression Rules:

Conjunctions and other filler words (that, and, will, which) → removed
Excessive pronouns → condensed
Redundancy ("I think I will" → "I will") → removed
Double spaces, unnecessary whitespace → normalized
Code blocks (inside ```) → NEVER touched
Error messages & stack traces → fully protected
Messages < 200 characters → skipped
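
The rules above can be sketched as a small rule-based pass. This is only an illustration under stated assumptions: the function name, the regular expressions, and the 200-character constant mirror the list above rather than the real `src/caveman/` engine, and the redundancy and stack-trace rules are left out for brevity.

```typescript
// Hypothetical sketch of rule-based stripping; not the plugin's actual NLP engine.
const FILLER = /\b(that|and|will|which)\b\s*/gi; // word list copied from the rules above
const MIN_LENGTH = 200; // short messages are skipped

function compressMessage(text: string): string {
  if (text.length < MIN_LENGTH) return text;
  // Splitting with a capture group keeps the fenced blocks in the result:
  // even indices are prose, odd indices are code fences.
  const parts = text.split(/(```[\s\S]*?```)/);
  return parts
    .map((part, i) =>
      i % 2 === 1
        ? part // code blocks are never touched
        : part.replace(FILLER, "").replace(/[ \t]{2,}/g, " "),
    )
    .join("");
}
```

Splitting on fences before applying any rule is the simplest way to guarantee that no later rule can reach into a code block.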

Operating Modes:

See ❷ MLM & NLP Support for detailed NLP vs MLM comparison.

Layer 3 — Dynamic Context Pruning (DCP)

Hook: chat.message (pruning) + tool.execute.after (compress tool) · File: layer3-dcp.ts · Engine: src/dcp/

The most advanced layer: it gives the LLM autonomy to manage its own memory. Unlike Layer 2, which only compresses text, Layer 3 actually removes old messages from the context window and replaces them with summaries.

Mechanism:

1. Context Monitor → detect tokens approaching maxContextLimit
2. Autonomous Nudge → inject prompt into user message: "context window nearly full, call ultrapress_compress"
3. LLM calls → ultrapress_compress(mode="range", from=<id>, to=<id>)
4. Compression Block → stored in memory (compress-state.ts)
5. chat.message hook → check pending blocks → remove messages in range → inject summary as synthetic message
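
Steps 3 to 5 describe a deferred design: the tool call only records a request, and the next `chat.message` hook applies it. A minimal sketch of that idea follows. The types, the numeric message IDs, the `[summary]` prefix, and the use of the `assistant` role for the synthetic message are assumptions for illustration, not the contents of `compress-state.ts` or `prune.ts`.

```typescript
interface Message { id: number; role: "user" | "assistant" | "tool"; text: string }
interface CompressionBlock { fromId: number; toId: number; summary: string }

// In-memory queue standing in for the pending-block state (hypothetical shape).
const pendingBlocks: CompressionBlock[] = [];

/** Called when the LLM invokes the compress tool: only records the request. */
function queueCompression(block: CompressionBlock): void {
  pendingBlocks.push(block);
}

/** Called on the next chat message: applies and clears every pending block. */
function applyPending(messages: Message[]): Message[] {
  let result = messages;
  for (const block of pendingBlocks.splice(0)) {
    const inRange = (m: Message): boolean => m.id >= block.fromId && m.id <= block.toId;
    const start = result.findIndex(inRange);
    if (start === -1) continue; // nothing left to prune for this block
    const summary: Message = {
      id: block.fromId,
      role: "assistant",
      text: `[summary] ${block.summary}`,
    };
    result = [
      ...result.slice(0, start),
      summary,
      ...result.filter((m, i) => i > start && !inRange(m)),
    ];
  }
  return result;
}
```

Deferring the removal means the message array is never mutated while the turn that requested the compression is still in flight.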

Key Features:

Feature	Description
Block-based Pruning	When `ultrapress_compress` is called, LLM determines the message range to summarize. Block is stored, then executed on the next chat (not current — avoids race condition).
Prune via `chat.message` Hook	Each new message → plugin checks pending blocks → removes messages in range from context array → injects summary.
Protected Content	Critical tool output (`task`, `skill`, `todowrite`, `todoread`, `write`, `edit`, `ultrapress_compress`) is protected from pruning.
Marker Protection	Messages containing `TODO`, `FIXME`, `HACK`, `ACTION ITEM`, `ROOT CAUSE`, `RCA`, `DECISION`, `BLOCKER` are preserved from pruning.
Nesting Support	Compression can be done on top of previous compression. Nested summaries are auto-merged.
`preserveLastN`	Protects the last N messages from pruning — keeps recent conversation context intact. Default: `4`. Set to `0` to disable.
Multi-Signal Scoring	In addition to `preserveLastN`, each message is scored from 5 signals (recency, role, tool type, keyword, content size). High-scoring messages are preserved even in old blocks. Default: `0.45` (balanced).
Reversible Compression	`ultrapress_expand` tool — LLM can "expand" previously summarized blocks to see the original content. Original content is stored in plugin memory (not in LLM context).
Nudge @70%	Nudge is sent when context reaches 70% limit (not 100%), giving LLM time to compress before context is truly full.
summaryBuffer	After pruning, provides breathing room (no immediate re-nudge).
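
To make the protection rules and the 70% nudge concrete, here is a minimal sketch. The tool names and markers are copied from the table above; the function names, the message shape, and the word-boundary marker matching are assumptions, and multi-signal scoring is deliberately left out.

```typescript
// Hypothetical sketch of pruning eligibility; not the plugin's actual logic.
const PROTECTED_TOOLS = new Set([
  "task", "skill", "todowrite", "todoread", "write", "edit", "ultrapress_compress",
]);
const MARKERS = /\b(TODO|FIXME|HACK|ACTION ITEM|ROOT CAUSE|RCA|DECISION|BLOCKER)\b/;

interface Msg { id: number; tool?: string; text: string }

/** IDs that may be pruned: not among the last N, not protected tool output, no marker. */
function prunableIds(messages: Msg[], preserveLastN = 4): number[] {
  const cutoff = messages.length - preserveLastN;
  return messages
    .filter((m, i) => {
      const protectedTool = m.tool !== undefined && PROTECTED_TOOLS.has(m.tool);
      return i < cutoff && !protectedTool && !MARKERS.test(m.text);
    })
    .map((m) => m.id);
}

/** Nudge once usage reaches a fraction of the limit (0.7 per the table above). */
function shouldNudge(currentTokens: number, maxContextLimit: number, ratio = 0.7): boolean {
  return currentTokens >= maxContextLimit * ratio;
}
```

With `preserveLastN` set to `0` the cutoff equals the array length, so only the tool and marker protections remain in effect.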

Two Pruning Modes:

Mode	Description
`range`	Range-based compression: choose from_id and to_id. All messages in between are summarized into one.
`message`	Surgical compression: choose one or more specific message IDs to summarize.

Layer 4 — Session Auto-Cleanup

Hook: tool.execute.after + session.compacting · File: layer4-cleanup.ts

Automatically cleans "garbage" from the context window.

Features:

Feature	Description
Error Purging	Removes error/failed tool messages after N chat turns (default: 4 turns). Stale errors only waste tokens.
Tool-Call Dedup	Prevents LLM from repeating identical tool calls (same tool + args) in the same session.
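
Both cleanup features amount to simple bookkeeping. The class below is a hypothetical sketch, assuming that a duplicate is identified by tool name plus JSON-serialized arguments and that error age is counted in chat turns; the real `layer4-cleanup.ts` may track this differently.

```typescript
// Hypothetical sketch of Layer 4 bookkeeping; names and shapes are assumptions.
class CleanupTracker {
  private readonly seenCalls = new Set<string>();
  private readonly errors = new Map<string, number>(); // message id -> turn it was recorded
  private readonly purgeAfterTurns: number;
  private turn = 0;

  constructor(purgeAfterTurns = 4) {
    this.purgeAfterTurns = purgeAfterTurns;
  }

  /** True when the same tool + args was already seen in this session. */
  isDuplicate(tool: string, args: unknown): boolean {
    // Note: JSON.stringify is sensitive to key order, so {a, b} and {b, a} differ here.
    const key = `${tool}:${JSON.stringify(args)}`;
    if (this.seenCalls.has(key)) return true;
    this.seenCalls.add(key);
    return false;
  }

  registerError(messageId: string): void {
    this.errors.set(messageId, this.turn);
  }

  /** Advance one chat turn and return the IDs of errors that are now stale. */
  nextTurn(): string[] {
    this.turn++;
    const stale: string[] = [];
    for (const [id, recordedAt] of this.errors) {
      if (this.turn - recordedAt >= this.purgeAfterTurns) {
        stale.push(id);
        this.errors.delete(id);
      }
    }
    return stale;
  }
}
```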

⚙️ Configuration

📖 Full configuration documentation (all keys, types, defaults, examples, presets, custom filters, troubleshooting) at: docs/konfigurasi-lengkap.md

Basic Structure

File: ~/.config/opencode/ultrapress.plugin.json

{
  "enabled": true,           // Master switch
  "notification": "minimal", // "off" | "minimal" | "detailed"

  "outputFilter": {},
  "semantic": {},
  "summarization": {},
  "cleanup": {}
}

Layer	Key	Function
L1	`outputFilter`	Limit CLI output length, filter repetitive lines
L2	`semantic`	NLP/MLM text compression without destroying meaning
L3	`summarization`	Remove old messages, replace with summary, `preserveLastN` protection
L4	`cleanup`	Dedup tool calls, auto-purge stale errors

🔒 Safety guard: task is always enforced in outputFilter.skipTools and semantic.skipTools even if removed in user config. This prevents accidental sub-agent context loss.
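
One way to picture the safety guard is as a final step after merging user config over defaults. The sketch below is illustrative only: the type names, the default values, and the `resolveConfig` helper are assumptions; only the `skipTools` and `enabled` keys and the always-present `task` entry come from this README.

```typescript
// Hypothetical config shapes; the real schema lives in src/config/schema.ts.
interface LayerConfig { enabled: boolean; skipTools: string[] }
interface ConfigSketch { enabled: boolean; outputFilter: LayerConfig; semantic: LayerConfig }
interface UserConfig {
  enabled?: boolean;
  outputFilter?: Partial<LayerConfig>;
  semantic?: Partial<LayerConfig>;
}

const DEFAULTS: ConfigSketch = {
  enabled: true,
  outputFilter: { enabled: true, skipTools: ["task"] },
  semantic: { enabled: true, skipTools: ["task"] },
};

/** Re-add "task" if the user's skipTools list dropped it. */
function withGuard(layer: LayerConfig): LayerConfig {
  return layer.skipTools.includes("task")
    ? layer
    : { ...layer, skipTools: [...layer.skipTools, "task"] };
}

/** Merge user values over defaults, then enforce the guard on both layers. */
function resolveConfig(user: UserConfig): ConfigSketch {
  return {
    enabled: user.enabled ?? DEFAULTS.enabled,
    outputFilter: withGuard({ ...DEFAULTS.outputFilter, ...user.outputFilter }),
    semantic: withGuard({ ...DEFAULTS.semantic, ...user.semantic }),
  };
}
```

Applying the guard after the merge, rather than in the defaults alone, is what makes it hold even when a user supplies their own `skipTools` list.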

👉 Open full documentation → covers all keys, types, defaults, custom filters, presets (Balanced / Aggressive / Conservative / NLP-only), and troubleshooting.

⌨️ `/up` Slash Command

All interaction with UltraPress goes through a single command: /up.

Sub-command List

Command	Alias	Description
`/up stats`	`s`, `stat`	Current session token savings dashboard
`/up context`	`c`, `ctx`	Context window status: capacity, limit, remaining
`/up compress`	`comp`	Show layer status + compression guide
`/up help`	`h`, `?`	Command help

Note: Sub-commands are case-insensitive and support partial fuzzy matching.
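
A sketch of how alias resolution might work is shown below. The alias table is copied from the list above; the function name is hypothetical, and a unique-prefix match is used here as a simple stand-in for whatever fuzzy matching the plugin actually implements.

```typescript
// Alias table from the sub-command list; resolution logic is an assumption.
const COMMANDS: Record<string, string[]> = {
  stats: ["s", "stat"],
  context: ["c", "ctx"],
  compress: ["comp"],
  help: ["h", "?"],
};

function resolveSubcommand(input: string): string | undefined {
  const word = input.trim().toLowerCase();
  if (word === "") return undefined;
  // Exact name or alias wins first.
  for (const [name, aliases] of Object.entries(COMMANDS)) {
    if (word === name || aliases.includes(word)) return name;
  }
  // Otherwise accept a prefix only when it identifies exactly one command.
  const matches = Object.keys(COMMANDS).filter((name) => name.startsWith(word));
  return matches.length === 1 ? matches[0] : undefined;
}
```

In this sketch `co` stays unresolved because it is a prefix of both `context` and `compress`, while `c` resolves to `context` through its exact alias.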

Example Output

/up stats:

📊 ULTRAPRESS STATS
──────────────────────────────────────────
  Raw tokens       : 127,450
  Compressed tokens: 89,215
  Tokens saved     : 38,235 (30.0%)

  By Layer:
  L1 Output Filter : 18,400
  L2 Semantic      : 12,100
  L3 Summarization :  5,835
  L4 Cleanup       :  1,900

  Activity:
  Compressions : 3
  Deduplications: 12
  Errors purged: 2

  Session: 2h 15m
──────────────────────────────────────────

/up context:

🧠 CONTEXT STATUS
──────────────────────────────────────────
  Current tokens     : ~52,000
  Max context limit  : 70,000
  Available          : ~18,000 (25.7%)
  Nudge threshold    : 40,000
  Status             : 🟡 Nearing limit (nudge will fire soon)
  Next nudge in      : 3 turns
──────────────────────────────────────────

❷ MLM & NLP Support

NLP Mode (Default)

Rule-based grammar stripping using linguistic rules. Zero latency, no external model required.

How it works:

Detect sentence structure (subject, predicate, object)
Remove conjunctions, excessive pronouns, filler words
Condense redundancy without changing meaning
Protect code blocks & error messages

MLM Mode (Experimental)

Uses Masked Language Model via @huggingface/transformers (Transformers.js) for more accurate tokenization.

Activation:

{
  "semantic": {
    "mode": "mlm",
    "model": "Xenova/distilbert-base-uncased"
  }
}

Important Notes:

⚠️ Model auto-downloaded on first run (~70MB for distilbert-base)
⚠️ First-run latency 5-15 seconds for model loading
⚠️ RAM usage increases ~200MB when model is active
⚠️ Compatibility: CPU-only (no GPU required)
🌐 For Indonesian: use Xenova/bert-base-multilingual-uncased

Mode Comparison

Aspect	NLP	MLM	LLM
Latency	< 1ms	50-200ms	1-5s
RAM	0 MB	~70 MB (q8)	~300 MB (q8)
Accuracy	~85%	~95%	~99%
Language	Indonesian + English	100+ languages	All
Internet Connection	❌ Not needed	❌ Initial download only	❌ Initial download only
Stable	✅	⚠️ Experimental	⚠️ Experimental
Model	—	`all-MiniLM-L6-v2`	`t5-small` (summarization)

🏗 Code Architecture

Directory Structure

opencode-ultrapress/
├── src/
│   ├── index.ts                    # Entry point, hook registration, plugin server
│   ├── config/
│   │   ├── schema.ts               # TypeScript type definitions (UltraPressConfig, etc.)
│   │   └── defaults.ts             # Default config values + merge logic
│   ├── layers/
│   │   ├── layer1-output-filter.ts # RTK engine — routes tool output to domain filters
│   │   ├── layer2-caveman.ts       # Semantic compression orchestrator
│   │   ├── layer3-dcp.ts           # DCP orchestrator — nudge injection, pruning trigger
│   │   └── layer4-cleanup.ts       # Auto-cleanup — dedup + error purging
│   ├── filters/
│   │   ├── git.ts                  # Git-specific output filter
│   │   ├── test.ts                 # Test runner output filter
│   │   ├── bash.ts                 # Shell output filter
│   │   ├── fs.ts                   # Filesystem tool output filter
│   │   └── generic.ts              # Fallback filter
│   ├── dcp/
│   │   ├── compress-state.ts       # In-memory state for pending compression blocks
│   │   ├── compress-tool.ts        # ultrapress_compress tool definition & handler
│   │   ├── context-monitor.ts      # Token usage monitoring + nudge logic
│   │   ├── prune.ts                # Message removal + summary injection engine
│   │   ├── protected-content.ts    # Defines which tool outputs are protected
│   │   └── summary-store.ts        # Stores summaries for nesting support
│   ├── caveman/
│   │   ├── nlp.ts                  # Rule-based NLP compressor
│   │   └── mlm.ts                  # MLM-based compressor (Transformers.js)
│   ├── commands/
│   │   └── slash.ts                # /up slash command handler
│   └── utils/
│       ├── token-count.ts          # Token estimation (char-based approximation)
│       └── logger.ts               # Logging with configurable verbosity
├── tests/
│   ├── layer1.test.ts              # Output filter unit tests
│   ├── layer2.test.ts              # Semantic compression unit tests
│   └── layer3-dcp.test.ts          # DCP pruning + nudge unit tests
├── benchmarks/
│   ├── run.ts                      # Benchmark runner
│   └── fixtures/                   # Benchmark test data
├── docs/
│   └── image/
│       └── banner.svg              # README banner
├── ultrapress.plugin.jsonc.example        # Configuration template (JSONC)
├── tsconfig.json                   # TypeScript config
├── tsup.config.ts                  # Build config (tsup)
├── package.json
├── CHANGELOG.md
├── LICENSE
└── README.md

Hook Registration Map

OpenCode Hook	Trigger	UltraPress Handler	Layer
`tool.execute.after`	After CLI tool completes	Output filtering + token tracking + dedup	L1, L4
`chat.message`	Before user message is sent to LLM	Pruning pending blocks + semantic compression + nudge injection	L2, L3
`command.execute.before`	User types `/up`	Slash command handler	—
`experimental.session.compacting`	OpenCode compacting session	Protected context injection	L4
`config`	Plugin initialization	Register `/up` command	—
`tool` (definition)	Plugin init	Register `ultrapress_compress` tool	L3

Data Flow Detail

1. Plugin Init
   config hook → register /up command
   tool definition → register ultrapress_compress
   load/migrate config from ~/.config/opencode/ultrapress.plugin.json

2. Tool Execution (every tool call)
   tool.execute.after → L1 processToolOutput()
     → Domain routing (git→git.ts, test→test.ts, etc.)
     → Middle-out truncation
     → Deduplication
   → L4 applyCleanup()
     → Dedup check
     → Error registration for purge

3. Chat Message (every user message)
   chat.message → L3 check pending compression blocks
     → applyPruning() — remove old messages, inject summaries
   → L2 processMessageContext() — semantic compression
   → L3 context monitor — check token count
     → if near limit → inject nudge prompt
   → L3 turnTick() — update turn counter

4. LLM calls ultrapress_compress
   tool.execute.after → compress tool handler
     → Create CompressionBlock in compress-state.ts
     → Store summary for nesting
   → Block will be executed on NEXT chat.message

5. Session Compacting
   session.compacting → L4 protected context injection

🧪 Testing

Run the entire test suite:

bun test

Test File	Coverage	Layer
`tests/layer1.test.ts`	Output filtering: domain routing, truncation, tee save, dedup	L1
`tests/layer2.test.ts`	Semantic compression: NLP grammar stripping, code block protection, min length skip	L2
`tests/layer3-dcp.test.ts`	DCP: pruning with preserveLastN, nudge frequency, nesting summaries, protected content	L3

# Run specific layer
bun test tests/layer1.test.ts
bun test tests/layer2.test.ts
bun test tests/layer3-dcp.test.ts

# Run with TypeScript type checking
bun run lint

📊 Benchmark

Run the full benchmark to measure the effectiveness of all 4 layers:

npm run benchmark

Latest Benchmark Results

┌────────────────────────────────┬──────────────────────────────────────────┬────────────┬──────────────┬──────────┐
│Fixture                         │Layer                                     │Original    │Compressed    │Savings   │
├────────────────────────────────┼──────────────────────────────────────────┼────────────┼──────────────┼──────────┤
│git-diff-large.txt              │L1 — Git Filter                           │       1,657│           969│       42%│
│npm-install-log.txt             │L1 — Generic Filter                       │         431│           430│        0%│
│pytest-log.txt                  │L1 — Generic Filter                       │       1,200│         1,199│        0%│
│chat-history.json               │L2 — NLP Semantic                         │         625│           490│       22%│
│dcp-conversation.json           │L3 — DCP Pruning (14→summary)             │       2,347│           645│       73%│
│                                │  ↳ 10 msg removed, 1 summary injected    │            │              │          │
│3x identical npm test           │L4 — Tool Call Dedup                      │       2,244│           854│       62%│
│                                │  ↳ 2 duplicates collapsed                │            │              │          │
│5 errors × 6 turns              │L4 — Error Auto-Purge                     │         845│             0│      100%│
│                                │  ↳ 5 errors purged after threshold       │            │              │          │
└────────────────────────────────┴──────────────────────────────────────────┴────────────┴──────────────┴──────────┘

✅ Total: 9,349 → 4,587 tokens (51% overall savings)

Layer Summary

Layer	Fixtures	Savings (token-weighted)	Characteristics
L1 Output Filter	3 fixtures	21%	Most effective for verbose CLI logs (`git diff`: 42%). Short output less affected.
L2 Semantic NLP	1 fixture	22%	Consistently compresses natural language without destroying meaning. Code blocks fully protected.
L3 DCP Pruning	1 fixture	73%	Biggest saver — removes 10 old messages & replaces with 1 summary. Compounding effect in long sessions.
L4 Auto Cleanup	2 fixtures	72%	Dedup saves 62% from repeated tool calls. Error purge 100% after threshold.
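
The per-layer percentages in this summary match token-weighted totals (tokens saved across a layer's fixtures divided by their original tokens) rather than a mean of the per-fixture percentages, which for L1 would be (42 + 0 + 0) / 3 = 14%. The sketch below recomputes them from the benchmark table; the fixture numbers are copied from the results above, while the type and function names are made up for this example.

```typescript
interface Fixture { layer: string; original: number; compressed: number }

// Token counts copied from the benchmark results table.
const FIXTURES: Fixture[] = [
  { layer: "L1", original: 1657, compressed: 969 },
  { layer: "L1", original: 431, compressed: 430 },
  { layer: "L1", original: 1200, compressed: 1199 },
  { layer: "L2", original: 625, compressed: 490 },
  { layer: "L3", original: 2347, compressed: 645 },
  { layer: "L4", original: 2244, compressed: 854 },
  { layer: "L4", original: 845, compressed: 0 },
];

/** Token-weighted savings in whole percent for a set of fixtures. */
function savingsPercent(fixtures: Fixture[]): number {
  const original = fixtures.reduce((sum, f) => sum + f.original, 0);
  const compressed = fixtures.reduce((sum, f) => sum + f.compressed, 0);
  return original === 0 ? 0 : Math.round(((original - compressed) / original) * 100);
}

function layerSavings(layer: string): number {
  return savingsPercent(FIXTURES.filter((f) => f.layer === layer));
}

// layerSavings("L1") → 21, layerSavings("L4") → 72, savingsPercent(FIXTURES) → 51
```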

💡 Insight: L3 (DCP) is the layer with the highest savings because it removes old messages in bulk. In long sessions (100+ messages), the cumulative effect of L3 + L4 can reach 70-90% token savings. Dataset and scripts are in benchmarks/ — contribute fixtures from your stack for more representative results.

🚀 Local Development

# 1. Clone repository
git clone https://github.com/rahadiana/opencode-ultrapress.git
cd opencode-ultrapress

# 2. Install dependencies
npm install

# 3. Build TypeScript
npm run build          # tsup — compile to dist/

# 4. Development mode (watch)
npm run dev            # tsup --watch

# 5. Run tests
npm test               # bun test

# 6. Type checking
npm run lint           # tsc --noEmit

# 7. Benchmark
npm run benchmark      # tsx benchmarks/run.ts

Development Workflow:

1. Edit files in src/
2. npm run dev for auto-rebuild
3. npm test to verify
4. Restart OpenCode to reload the plugin
5. Test via /up stats in chat

❓ FAQ & Troubleshooting

Plugin doesn't appear after install

Ensure the plugin is registered in ~/.config/opencode/config.json:
```
{ "plugins": ["@rahadiana/opencode-ultrapress"] }
```
Restart OpenCode completely (not just reload window)
Check if the package is installed: npm list -g @rahadiana/opencode-ultrapress

Error "Cannot find module @huggingface/transformers"

MLM mode requires an additional dependency. Install manually:

npm install -g @huggingface/transformers

Or switch to "nlp" mode which does not require external dependencies.

OpenCode feels slow after install

Check semantic mode: "mode": "mlm" — MLM model loading at startup can be slow. Switch to "nlp" for zero latency.
Check notification level: "detailed" prints many logs. Set to "minimal".
Ensure minLengthChars is not too low (default 250 is optimal).

My important messages were deleted by pruning

Increase preserveLastN (default 4 → try 6 or 7)
Important tool output is automatically protected (task, skill, todowrite, todoread, write, edit)
Decision markers are also protected (TODO, FIXME, HACK, ACTION ITEM, ROOT CAUSE, RCA, DECISION, BLOCKER)
If still deleted, report as a bug with detailed logs

How do I disable a specific layer?

Set "enabled": false on the layer you want to turn off:

{
  "semantic": { "enabled": false },
  "summarization": { "enabled": false }
}

TypeScript error during development

Ensure dependencies are installed:

npm install
npm run lint    # tsc --noEmit to check for type errors

🗺 Roadmap

Feature	Status	Target
Layer 1: Domain-aware output filtering	✅ Done	v0.1.0
Layer 2: NLP semantic compression	✅ Done	v0.1.0
Layer 2: MLM mode	⚠️ Experimental	v0.2.0
Layer 2: LLM mode (local summarization)	✅ Done	v0.2.0
Layer 2: All-pairs MLM dedup	✅ Done	v0.2.0
Layer 3: Block-based DCP pruning	✅ Done	v0.1.0
Layer 3: `preserveLastN` protection	✅ Done	v0.1.0
Layer 3: Multi-signal importance scoring	✅ Done	v0.2.0
Layer 3: Reversible compression (`ultrapress_expand`)	✅ Done	v0.2.0
Layer 3: Pre-emptive nudge @70%	✅ Done	v0.2.0
Layer 3: Surgical message pruning	✅ Done	v0.1.0
Layer 4: Error purging & dedup	✅ Done	v0.1.0
`/up` slash commands	✅ Done	v0.1.0
Real token tracking (OpenCode API)	✅ Done	v0.2.0
Custom filter API	✅ Done	v0.1.0
TF-IDF scoring (MLM improvement)	🚧 Planned	v0.2.0
Sentence similarity (MLM improvement)	🚧 Planned	v0.2.0
Sub-agent (`task`) token tracking & compression	💡 Idea	TBD
UI stats dashboard in OpenCode	💡 Idea	TBD
Support more languages (NLP)	💡 Idea	TBD

🤝 Contributing

Contributions are welcome! Areas most in need of help:

New Filters: Add Layer 1 filters for new frameworks/stacks (Kubernetes, Docker, Terraform, Svelte, Flutter, etc.).
MLM Roadmap: Help implement actual TF-IDF scoring or sentence similarity.
Benchmark Dataset: Contribute fixture data from your tech stack.
Multi-language NLP: Expand grammar stripping rules for more languages.
Bug Reports: Report edge cases — poorly filtered tool output, important messages deleted, etc.

Development Setup

git clone https://github.com/rahadiana/opencode-ultrapress.git
cd opencode-ultrapress
npm install
npm run build
npm test
npm run benchmark

Pull Request Process

1. Fork the repository
2. Create a feature branch (git checkout -b feature/amazing-filter)
3. Commit your changes (git commit -m 'Add amazing filter')
4. Push to the branch (git push origin feature/amazing-filter)
5. Open a Pull Request and ensure bun test and npm run lint pass

📝 Changelog

See CHANGELOG.md for the full version history.

📄 License

UltraPress — Because tokens are expensive, but context is priceless. ❤️
