Token compression plugin for OpenCode AI: multi-layer filtering, semantic compression, and dynamic context pruning.

🚀 OpenCode UltraPress

Token Compression Plugin for OpenCode AI

UltraPress saves context-window tokens through four compression layers that run automatically in the background: CLI output filtering, semantic compression, dynamic context pruning, and auto-cleanup. Your LLM stays smart, tokens stay lean.


⚑ Installation & Setup

System Requirements

| Dependency | Minimum Version | Notes |
| --- | --- | --- |
| Node.js | >= 18 | Node 22 LTS recommended |
| OpenCode AI | Latest | Uses @opencode-ai/plugin ^1.14 |
| Git | Any | Required for GitHub install |
| Bun | Latest | For development/testing only |
| @huggingface/transformers | Auto-install | Only used when mlm mode is active |

Compatibility Matrix (Runtime)

| Environment | NLP (default) | MLM | LLM | Notes |
| --- | --- | --- | --- | --- |
| macOS (Intel/Apple Silicon) | ✅ | ✅ | ✅ | Best overall support |
| Linux | ✅ | ✅ | ✅ | Good for server/workstation |
| WSL2 | ✅ | ✅ | ✅ | Prefer enough RAM for MLM/LLM |
| Windows | ✅ | ✅ | ✅ | Works via Node runtime |
| Termux / low-resource mobile shell | ✅ | ⚠️ | ⚠️ | Prefer NLP mode to avoid model load overhead |

Legend: ✅ recommended · ⚠️ possible but resource-sensitive

Cross-Platform Notes

UltraPress is pure TypeScript and works across all platforms OpenCode supports:

  • macOS
  • Linux
  • WSL / Termux
  • Windows

What to expect:

  • mlm / llm modes may download model assets on first use (larger disk/RAM footprint).
  • If your environment is resource-limited (e.g., small VPS/Termux), use default Balanced (NLP) or force semantic.mode: "nlp" for zero-model operation.

1. Install the Plugin

# ✅ RECOMMENDED: auto-registers with OpenCode
opencode plugin add @rahadiana/opencode-ultrapress@latest --global

⚠️ Important: OpenCode caches the plugin at ~/.cache/opencode/packages/. @latest is resolved once on first install, so later versions are not picked up automatically. To upgrade:

rm -rf ~/.cache/opencode/packages/@rahadiana/opencode-ultrapress@latest
opencode plugin add @rahadiana/opencode-ultrapress@latest --global

The plugin will warn you at startup if a newer version is available.

Alternative install methods (not recommended for end users):

# Via npm: requires manual registration (step 2), same cache caveat applies
npm install -g @rahadiana/opencode-ultrapress

# GitHub latest: for testing pre-release changes
npm install -g github:rahadiana/opencode-ultrapress

For plugin development:

git clone https://github.com/rahadiana/opencode-ultrapress.git
cd opencode-ultrapress
npm install
npm run build

# Link globally so OpenCode can find it
npm link

# Then in OpenCode's config (~/.config/opencode/config.json), add:
# { "plugins": ["@rahadiana/opencode-ultrapress"] }
#
# After making code changes, re-run:
# npm run build
#
# No need to re-link: OpenCode loads from the linked directory.
# If using opencode plugin add, remove the cached version first:
# rm -rf ~/.cache/opencode/packages/@rahadiana/opencode-ultrapress@latest

2. Register with OpenCode (npm install only)

If you installed via opencode plugin add, skip this step; registration is automatic.

Add the plugin to OpenCode's config at ~/.config/opencode/config.json:

{
  "plugins": ["@rahadiana/opencode-ultrapress"]
}

3. (Optional) Create Personal Configuration

UltraPress works out-of-the-box with Balanced defaults. For customization:

# Installed from GitHub: copy from the cloned repo
cp ultrapress.plugin.jsonc.example ~/.config/opencode/ultrapress.plugin.json

# Or from global install
cp "$(npm root -g)/@rahadiana/opencode-ultrapress/ultrapress.plugin.jsonc.example" ~/.config/opencode/ultrapress.plugin.json

Then edit ~/.config/opencode/ultrapress.plugin.json as needed. If the file is not found, UltraPress will automatically create it with default values on first run.

Quick-Start Profiles by Device

Use this as a practical starting point:

| Device / Environment | Recommended Mode | Why |
| --- | --- | --- |
| Low-RAM machine / Termux / tiny VPS | semantic.mode: "nlp" | Zero model download, lowest RAM/CPU overhead |
| Typical laptop/dev machine | Balanced defaults | Best trade-off: token savings + context safety |
| High-RAM workstation | mlm or llm (optional) | Higher semantic quality, more resource usage |

Minimal override examples:

// Low-resource profile
{
  "semantic": { "mode": "nlp" }
}

// Higher-quality semantic profile (requires more RAM)
{
  "semantic": {
    "mode": "mlm",
    "model": "Xenova/all-MiniLM-L6-v2"
  }
}

Verify Installation

Restart OpenCode, then type in chat:

/up stats

If a statistics dashboard appears, UltraPress is active. If not:

  1. Installed via npm install? Make sure the plugin is listed in config.json: "plugins": ["@rahadiana/opencode-ultrapress"]
  2. Installed via opencode plugin? Run opencode plugin list to confirm it's registered.
  3. Check OpenCode logs for errors
  4. Ensure Node.js >= 18 is installed (node --version)

Uninstall

# 1. Remove from OpenCode plugin list
#    Edit ~/.config/opencode/config.json and remove "@rahadiana/opencode-ultrapress" from the plugins array

# 2. Purge the cached version
rm -rf ~/.cache/opencode/packages/@rahadiana/opencode-ultrapress@latest

# 3. If installed via npm (global)
npm uninstall -g @rahadiana/opencode-ultrapress

# 4. If using npm link
npm unlink -g @rahadiana/opencode-ultrapress

# 5. Clean up config
rm ~/.config/opencode/ultrapress.plugin.json

🛠 4-Layer Architecture

UltraPress intercepts OpenCode message flow at 4 different points, each with a specific compression strategy.

Pipeline Flow

flowchart LR
    A([Tool Output]) -->|"tool.execute.after"| L1[Layer 1\nOutput Filter]
    B([Chat Message]) -->|"chat.message"| PRUNE{Prune\nPending?}
    PRUNE -->|yes| REMOVE[Remove msgs\n& inject summary]
    PRUNE -->|no| L2[Layer 2\nGSC Semantic]
    REMOVE --> L2
    L2 --> L3[Layer 3\nNudge Monitor]
    L3 -->|"nudge injected"| LLM{LLM}
    LLM -->|"calls tool"| DCP_TOOL[ultrapress_compress]
    DCP_TOOL -.->|"block stored"| PRUNE
    D([Session Compact]) -->|"session.compacting"| L4[Layer 4\nCleanup]

    L1 & L2 & L4 --> CTX[(Context\nWindow)]

Layer 1 β€” Smart Output Filter

Hook: tool.execute.after · File: layer1-output-filter.ts · Filters: src/filters/

Intercepts CLI tool output before it enters the context window. This is the most aggressive layer: it directly cuts unnecessary logs.

Core Strategies:

| Strategy | Description |
| --- | --- |
| Domain Routing | Each CLI tool is routed to a specific filter: git, npm/node, pytest/jest, and filesystem. Unknown tools go to the generic filter. |
| Middle-out Truncation | Truncates logs from the middle, preserving the beginning (context) and end (error/result). Smarter than head/tail truncation. |
| Deduplication | Removes identical repeated log lines in real time. Very effective for build & test logs. |
| Tee Save | If output is truncated, the original log is saved to a temporary .log file so it can still be accessed if needed. |
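
To make the deduplication and middle-out strategies concrete, here is a minimal sketch. The function names, the marker text, and the head/tail sizes are assumptions for illustration only and are not taken from UltraPress's source; the real filters may, for example, dedupe only consecutive lines.

```typescript
// Hypothetical helpers: names and limits are illustrative, not UltraPress's API.

/** Drop lines identical to one already seen, keeping the first occurrence. */
function dedupeLines(lines: string[]): string[] {
  const seen = new Set<string>();
  return lines.filter((line) => {
    if (seen.has(line)) return false;
    seen.add(line);
    return true;
  });
}

/** Keep the first `head` and last `tail` lines; replace the middle with a marker. */
function middleOut(lines: string[], head: number, tail: number): string[] {
  if (lines.length <= head + tail) return lines;
  const omitted = lines.length - head - tail;
  return [
    ...lines.slice(0, head),
    `... [${omitted} lines omitted] ...`,
    ...lines.slice(lines.length - tail),
  ];
}

const log = ["start", "warn: retry", "warn: retry", "step 1", "step 2", "step 3", "FAILED: exit 1"];
const filtered = middleOut(dedupeLines(log), 2, 1);
console.log(filtered.join("\n"));
// start
// warn: retry
// ... [3 lines omitted] ...
// FAILED: exit 1
```

Deduplicating before truncating means the head and tail budgets are spent on distinct lines rather than on repeats.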

Built-in Filters:

| Filter | File | Trigger Tools | Behavior |
| --- | --- | --- | --- |
| Git | filters/git.ts | git diff, git log, git show | Remove redundant diff hunks, keep summary |
| Test | filters/test.ts | pytest, jest, vitest, mocha | Summarize failure output, remove passing tests |
| Bash | filters/bash.ts | Generic shell output | Dedup lines, middle-out truncation |
| Filesystem | filters/fs.ts | ls, cat, find | Limit file count, truncate long content |
| Generic | filters/generic.ts | Fallback for all other tools | Middle-out truncation + dedup |

Layer 2 β€” GSC Semantic Compression

Hook: chat.message · File: layer2-caveman.ts · Engine: src/caveman/

Compresses message text semantically: it removes unimportant words without changing the meaning. Layer 2 and Layer 3 do not compress each other's output, so there is no double compression.

Compression Rules:

  • Function words (that, and, will, which) → removed
  • Excessive pronouns → condensed
  • Redundancy ("I think I will" → "I will") → removed
  • Double spaces, unnecessary whitespace → normalized
  • Code blocks (inside ```) → NEVER touched
  • Error messages & stack traces → fully protected
  • Messages < 200 characters → skipped
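
The rules above can be sketched as a small rule-based pass that skips short messages and leaves fenced code untouched. This is only an illustration: the filler-word list, the 200-character threshold constant, and the function names are assumptions, and the protection of error messages and stack traces is not modelled here.

```typescript
// Illustrative only: the word list and threshold are assumptions, not UltraPress's actual rules.
const FILLER = /\b(that|which|really|basically|just)\b\s*/gi;
const MIN_LENGTH = 200;

/** Strip filler words and collapse runs of spaces/tabs in plain prose. */
function compressProse(text: string): string {
  return text.replace(FILLER, "").replace(/[ \t]{2,}/g, " ");
}

/** Compress prose segments only; fenced code blocks pass through unchanged. */
function compressMessage(message: string): string {
  if (message.length < MIN_LENGTH) return message; // short messages are skipped
  // The capture group keeps the fenced blocks in the result at odd indices.
  const parts = message.split(/(```[\s\S]*?```)/g);
  return parts.map((part, i) => (i % 2 === 1 ? part : compressProse(part))).join("");
}

const prose =
  "I think that the build really failed because the cache which we use is basically stale. ".repeat(3);
const code = "```sh\nnpm run build  # that flag\n```";
const compressed = compressMessage(prose + code);
console.log(compressed.length < (prose + code).length, compressed.endsWith(code));
```

Splitting on the fences first keeps the protection rule independent of whatever word-level rules are applied to the prose segments.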

Operating Modes:

See ❷ MLM & NLP Support for detailed NLP vs MLM comparison.


Layer 3 β€” Dynamic Context Pruning (DCP)

Hook: chat.message (pruning) + tool.execute.after (compress tool) · File: layer3-dcp.ts · Engine: src/dcp/

The most advanced layer: it gives the LLM autonomy to manage its own memory. Unlike Layer 2, which only compresses text, Layer 3 actually removes old messages from the context window and replaces them with summaries.

Mechanism:

1. Context Monitor → detect tokens approaching maxContextLimit
2. Autonomous Nudge → inject prompt into user message: "context window nearly full, call ultrapress_compress"
3. LLM calls → ultrapress_compress(mode="range", from=<id>, to=<id>)
4. Compression Block → stored in memory (compress-state.ts)
5. chat.message hook → check pending blocks → remove messages in range → inject summary as synthetic message
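
Step 5 can be sketched as a pure function over the message array. The Message and CompressionBlock shapes, the "[summary]" prefix, and the choice of role for the synthetic message are hypothetical; UltraPress's real types in src/dcp/ may differ. The sketch also applies the preserveLastN protection so that recent messages survive even when they fall inside the requested range.

```typescript
// Hypothetical shapes: UltraPress's real types in src/dcp/ may differ.
interface Message { id: number; role: "user" | "assistant" | "tool"; text: string; }
interface CompressionBlock { from: number; to: number; summary: string; }

/** Remove messages whose id is in [from, to], except the last `preserveLastN`, and inject one summary. */
function applyPruning(messages: Message[], block: CompressionBlock, preserveLastN: number): Message[] {
  const protectedFrom = Math.max(0, messages.length - preserveLastN); // index of first protected message
  const result: Message[] = [];
  let injected = false;
  messages.forEach((msg, index) => {
    const inRange = msg.id >= block.from && msg.id <= block.to;
    if (!inRange || index >= protectedFrom) {
      result.push(msg);
      return;
    }
    if (!injected) {
      // The first pruned message is replaced by a single synthetic summary message.
      result.push({ id: msg.id, role: "assistant", text: `[summary] ${block.summary}` });
      injected = true;
    }
  });
  return result;
}

const history: Message[] = Array.from({ length: 8 }, (_, i): Message => ({
  id: i + 1,
  role: i % 2 === 0 ? "user" : "assistant",
  text: `message ${i + 1}`,
}));
const pruned = applyPruning(history, { from: 1, to: 6, summary: "set up repo, fixed lint errors" }, 4);
console.log(pruned.map((m) => m.text));
```

With eight messages and preserveLastN set to 4, messages 1 to 4 collapse into one summary while messages 5 and 6 are kept even though the range asked for them.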

Key Features:

| Feature | Description |
| --- | --- |
| Block-based Pruning | When ultrapress_compress is called, the LLM determines the message range to summarize. The block is stored, then executed on the next chat message (not the current one, which avoids a race condition). |
| Prune via chat.message Hook | On each new message, the plugin checks pending blocks, removes messages in range from the context array, and injects the summary. |
| Protected Content | Critical tool output (task, skill, todowrite, todoread, write, edit, ultrapress_compress) is protected from pruning. |
| Marker Protection | Messages containing TODO, FIXME, HACK, ACTION ITEM, ROOT CAUSE, RCA, DECISION, BLOCKER are preserved from pruning. |
| Nesting Support | Compression can be done on top of previous compression. Nested summaries are auto-merged. |
| preserveLastN | Protects the last N messages from pruning, keeping recent conversation context intact. Default: 4. Set to 0 to disable. |
| Multi-Signal Scoring | In addition to preserveLastN, each message is scored from 5 signals (recency, role, tool type, keyword, content size). High-scoring messages are preserved even in old blocks. Default: 0.45 (balanced). |
| Reversible Compression | ultrapress_expand tool: the LLM can "expand" previously summarized blocks to see the original content. Original content is stored in plugin memory (not in LLM context). |
| Nudge @70% | The nudge is sent when context reaches 70% of the limit (not 100%), giving the LLM time to compress before the context is truly full. |
| summaryBuffer | After pruning, provides breathing room (no immediate re-nudge). |
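
The nudge decision can be pictured as a simple threshold check. In this sketch the 0.7 ratio mirrors the "Nudge @70%" row; everything else is an assumption. In particular, summaryBuffer is modelled here as a number of quiet turns after pruning purely for illustration, since its real unit is not stated above.

```typescript
// Hypothetical helper: names are illustrative; summaryBuffer as "turns" is a guess.
interface ContextState {
  currentTokens: number;
  maxContextLimit: number;
  turnsSinceLastPrune: number;
}

/** True when the context has crossed the nudge ratio and the post-prune quiet period is over. */
function shouldNudge(state: ContextState, nudgeRatio = 0.7, summaryBufferTurns = 2): boolean {
  if (state.turnsSinceLastPrune < summaryBufferTurns) return false; // breathing room after pruning
  return state.currentTokens >= state.maxContextLimit * nudgeRatio;
}

console.log(shouldNudge({ currentTokens: 52_000, maxContextLimit: 70_000, turnsSinceLastPrune: 5 })); // true
console.log(shouldNudge({ currentTokens: 40_000, maxContextLimit: 70_000, turnsSinceLastPrune: 5 })); // false
console.log(shouldNudge({ currentTokens: 52_000, maxContextLimit: 70_000, turnsSinceLastPrune: 0 })); // false
```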

Two Pruning Modes:

| Mode | Description |
| --- | --- |
| range | Range-based compression: choose from_id and to_id. All messages in between are summarized into one. |
| message | Surgical compression: choose one or more specific message IDs to summarize. |

Layer 4 β€” Session Auto-Cleanup

Hook: tool.execute.after + session.compacting · File: layer4-cleanup.ts

Automatically cleans "garbage" from the context window.

Features:

| Feature | Description |
| --- | --- |
| Error Purging | Removes error/failed tool messages after N chat turns (default: 4 turns). Stale errors only waste tokens. |
| Tool-Call Dedup | Prevents the LLM from repeating identical tool calls (same tool + args) in the same session. |
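
Both features amount to a small amount of per-session bookkeeping. The sketch below is an assumption about how that could look, not the contents of layer4-cleanup.ts: the class, its method names, and the use of JSON.stringify as the argument key are all hypothetical.

```typescript
// Illustrative sketch: class and method names are hypothetical, not UltraPress's API.
class CleanupTracker {
  private seenCalls = new Set<string>();
  private errors: { id: string; turn: number }[] = [];
  private turn = 0;

  /** Returns true when the same tool + args pair was already seen this session. */
  isDuplicate(tool: string, args: unknown): boolean {
    const key = `${tool}:${JSON.stringify(args)}`; // note: sensitive to object key order
    if (this.seenCalls.has(key)) return true;
    this.seenCalls.add(key);
    return false;
  }

  /** Remember an error message so it can be purged once it is stale. */
  registerError(id: string): void {
    this.errors.push({ id, turn: this.turn });
  }

  /** Advance one chat turn and return the ids of errors at least `purgeAfterTurns` old. */
  nextTurn(purgeAfterTurns = 4): string[] {
    this.turn += 1;
    const stale = this.errors.filter((e) => this.turn - e.turn >= purgeAfterTurns);
    this.errors = this.errors.filter((e) => this.turn - e.turn < purgeAfterTurns);
    return stale.map((e) => e.id);
  }
}

const tracker = new CleanupTracker();
const firstCall = tracker.isDuplicate("bash", { command: "npm test" });
const secondCall = tracker.isDuplicate("bash", { command: "npm test" });
tracker.registerError("err-1");
const purgedPerTurn = [1, 2, 3, 4].map(() => tracker.nextTurn());
console.log(firstCall, secondCall, purgedPerTurn);
```

In this model the error registered before turn 1 is reported as stale on the fourth call to nextTurn, matching the default of 4 turns.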

⚙️ Configuration

📖 Full configuration documentation (all keys, types, defaults, examples, presets, custom filters, troubleshooting) is at docs/konfigurasi-lengkap.md

Basic Structure

File: ~/.config/opencode/ultrapress.plugin.json

{
  "enabled": true,           // Master switch
  "notification": "minimal", // "off" | "minimal" | "detailed"

  "outputFilter": {},
  "semantic": {},
  "summarization": {},
  "cleanup": {}
}

| Layer | Key | Function |
| --- | --- | --- |
| L1 | outputFilter | Limit CLI output length, filter repetitive lines |
| L2 | semantic | NLP/MLM text compression without destroying meaning |
| L3 | summarization | Remove old messages, replace with summary, preserveLastN protection |
| L4 | cleanup | Dedup tool calls, auto-purge stale errors |

🔒 Safety guard: task is always enforced in outputFilter.skipTools and semantic.skipTools even if removed in user config. This prevents accidental sub-agent context loss.
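
A guard like this can be applied after the user config is merged with the defaults. The following is a minimal sketch under that assumption; the interface and function names are made up for illustration, and only the outputFilter, semantic, and skipTools keys come from the text above.

```typescript
// Hypothetical sketch of the guard: only the config key names come from the README.
interface LayerConfig { enabled?: boolean; skipTools?: string[]; }
interface GuardedConfig { outputFilter?: LayerConfig; semantic?: LayerConfig; }

const ENFORCED_SKIP_TOOLS = ["task"];

/** Return a copy of the layer config whose skipTools always contains the enforced tools. */
function withEnforcedSkipTools(layer: LayerConfig = {}): LayerConfig {
  const merged = new Set([...(layer.skipTools ?? []), ...ENFORCED_SKIP_TOOLS]);
  return { ...layer, skipTools: Array.from(merged) };
}

function applySafetyGuard(config: GuardedConfig): GuardedConfig {
  return {
    ...config,
    outputFilter: withEnforcedSkipTools(config.outputFilter),
    semantic: withEnforcedSkipTools(config.semantic),
  };
}

// The user removed "task" from both lists; the guard puts it back.
const guarded = applySafetyGuard({ outputFilter: { skipTools: ["read"] }, semantic: { skipTools: [] } });
console.log(guarded.outputFilter?.skipTools, guarded.semantic?.skipTools);
// [ 'read', 'task' ] [ 'task' ]
```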

👉 Open the full documentation: it covers all keys, types, defaults, custom filters, presets (Balanced / Aggressive / Conservative / NLP-only), and troubleshooting.


⌨️ /up Slash Command

All interactions with UltraPress go through a single command: /up.

Sub-command List

| Command | Alias | Description |
| --- | --- | --- |
| /up stats | s, stat | Current session token savings dashboard |
| /up context | c, ctx | Context window status: capacity, limit, remaining |
| /up compress | comp | Show layer status + compression guide |
| /up help | h, ? | Command help |

Note: Sub-commands are case-insensitive and support partial fuzzy matching.
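
As a rough picture of how such matching could work, the sketch below resolves a typed word to a sub-command. The alias table mirrors the list above, but the matching rule (exact name or alias first, then a unique prefix) and the function name are assumptions; the actual handler in src/commands/slash.ts may match differently.

```typescript
// Hypothetical resolver: the unique-prefix rule is an assumption, not UltraPress's documented behavior.
const SUBCOMMANDS: Record<string, string[]> = {
  stats: ["s", "stat"],
  context: ["c", "ctx"],
  compress: ["comp"],
  help: ["h", "?"],
};

function resolveSubcommand(input: string): string | undefined {
  const word = input.trim().toLowerCase();
  if (word === "") return undefined;
  // Exact names and aliases win over prefix matches.
  for (const [name, aliases] of Object.entries(SUBCOMMANDS)) {
    if (word === name || aliases.includes(word)) return name;
  }
  // Otherwise accept a prefix only when it identifies exactly one sub-command.
  const matches = Object.keys(SUBCOMMANDS).filter((name) => name.startsWith(word));
  return matches.length === 1 ? matches[0] : undefined;
}

console.log(resolveSubcommand("STATS")); // "stats"
console.log(resolveSubcommand("ctx"));   // "context"
console.log(resolveSubcommand("he"));    // "help"
console.log(resolveSubcommand("co"));    // undefined (ambiguous: context, compress)
```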

Example Output

/up stats:

📊 ULTRAPRESS STATS
──────────────────────────────────────────
  Raw tokens       : 127,450
  Compressed tokens: 89,215
  Tokens saved     : 38,235 (30.0%)

  By Layer:
  L1 Output Filter : 18,400
  L2 Semantic      : 12,100
  L3 Summarization :  5,835
  L4 Cleanup       :  1,900

  Activity:
  Compressions : 3
  Deduplications: 12
  Errors purged: 2

  Session: 2h 15m
──────────────────────────────────────────

/up context:

🧠 CONTEXT STATUS
──────────────────────────────────────────
  Current tokens     : ~52,000
  Max context limit  : 70,000
  Available          : ~18,000 (25.7%)
  Nudge threshold    : 40,000
  Status             : 🟡 Nearing limit (nudge will fire soon)
  Next nudge in      : 3 turns
──────────────────────────────────────────

❷ MLM & NLP Support

NLP Mode (Default)

Rule-based grammar stripping. Negligible latency (< 1 ms), no external model required.

How it works:

  1. Detect sentence structure (subject, predicate, object)
  2. Remove conjunctions, excessive pronouns, filler words
  3. Condense redundancy without changing meaning
  4. Protect code blocks & error messages

MLM Mode (Experimental)

Uses a masked language model (MLM) via @huggingface/transformers (Transformers.js) for more accurate tokenization.

Activation:

{
  "semantic": {
    "mode": "mlm",
    "model": "Xenova/distilbert-base-uncased"
  }
}

Important Notes:

  • ⚠️ Model auto-downloaded on first run (~70MB for distilbert-base)
  • ⚠️ First-run latency 5-15 seconds for model loading
  • ⚠️ RAM usage increases ~200MB when model is active
  • ⚠️ Compatibility: CPU-only (no GPU required)
  • 🌐 For Indonesian: use Xenova/bert-base-multilingual-uncased

Mode Comparison

| Aspect | NLP | MLM | LLM |
| --- | --- | --- | --- |
| Latency | < 1ms | 50-200ms | 1-5s |
| RAM | 0 MB | ~70 MB (q8) | ~300 MB (q8) |
| Accuracy | ~85% | ~95% | ~99% |
| Language | Indonesian + English | 100+ languages | All |
| Internet Connection | ❌ Not needed | ❌ Initial download only | ❌ Initial download only |
| Stable | ✅ | ⚠️ Experimental | ⚠️ Experimental |
| Model | - | all-MiniLM-L6-v2 | t5-small (summarization) |

🏗 Code Architecture

Directory Structure

opencode-ultrapress/
├── src/
│   ├── index.ts                    # Entry point, hook registration, plugin server
│   ├── config/
│   │   ├── schema.ts               # TypeScript type definitions (UltraPressConfig, etc.)
│   │   └── defaults.ts             # Default config values + merge logic
│   ├── layers/
│   │   ├── layer1-output-filter.ts # RTK engine: routes tool output to domain filters
│   │   ├── layer2-caveman.ts       # Semantic compression orchestrator
│   │   ├── layer3-dcp.ts           # DCP orchestrator: nudge injection, pruning trigger
│   │   └── layer4-cleanup.ts       # Auto-cleanup: dedup + error purging
│   ├── filters/
│   │   ├── git.ts                  # Git-specific output filter
│   │   ├── test.ts                 # Test runner output filter
│   │   ├── bash.ts                 # Shell output filter
│   │   ├── fs.ts                   # Filesystem tool output filter
│   │   └── generic.ts              # Fallback filter
│   ├── dcp/
│   │   ├── compress-state.ts       # In-memory state for pending compression blocks
│   │   ├── compress-tool.ts        # ultrapress_compress tool definition & handler
│   │   ├── context-monitor.ts      # Token usage monitoring + nudge logic
│   │   ├── prune.ts                # Message removal + summary injection engine
│   │   ├── protected-content.ts    # Defines which tool outputs are protected
│   │   └── summary-store.ts        # Stores summaries for nesting support
│   ├── caveman/
│   │   ├── nlp.ts                  # Rule-based NLP compressor
│   │   └── mlm.ts                  # MLM-based compressor (Transformers.js)
│   ├── commands/
│   │   └── slash.ts                # /up slash command handler
│   └── utils/
│       ├── token-count.ts          # Token estimation (char-based approximation)
│       └── logger.ts               # Logging with configurable verbosity
├── tests/
│   ├── layer1.test.ts              # Output filter unit tests
│   ├── layer2.test.ts              # Semantic compression unit tests
│   └── layer3-dcp.test.ts          # DCP pruning + nudge unit tests
├── benchmarks/
│   ├── run.ts                      # Benchmark runner
│   └── fixtures/                   # Benchmark test data
├── docs/
│   └── image/
│       └── banner.svg              # README banner
├── ultrapress.plugin.jsonc.example # Configuration template (JSONC)
├── tsconfig.json                   # TypeScript config
├── tsup.config.ts                  # Build config (tsup)
├── package.json
├── CHANGELOG.md
├── LICENSE
└── README.md

Hook Registration Map

| OpenCode Hook | Trigger | UltraPress Handler | Layer |
| --- | --- | --- | --- |
| tool.execute.after | After CLI tool completes | Output filtering + token tracking + dedup | L1, L4 |
| chat.message | Before user message is sent to LLM | Pruning pending blocks + semantic compression + nudge injection | L2, L3 |
| command.execute.before | User types /up | Slash command handler | - |
| experimental.session.compacting | OpenCode compacting session | Protected context injection | L4 |
| config | Plugin initialization | Register /up command | - |
| tool (definition) | Plugin init | Register ultrapress_compress tool | L3 |

Data Flow Detail

1. Plugin Init
   config hook → register /up command
   tool definition → register ultrapress_compress
   load/migrate config from ~/.config/opencode/ultrapress.plugin.json

2. Tool Execution (every tool call)
   tool.execute.after → L1 processToolOutput()
     → Domain routing (git→git.ts, test→test.ts, etc.)
     → Middle-out truncation
     → Deduplication
   → L4 applyCleanup()
     → Dedup check
     → Error registration for purge

3. Chat Message (every user message)
   chat.message → L3 check pending compression blocks
     → applyPruning(): remove old messages, inject summaries
   → L2 processMessageContext(): semantic compression
   → L3 context monitor: check token count
     → if near limit → inject nudge prompt
   → L3 turnTick(): update turn counter

4. LLM calls ultrapress_compress
   tool.execute.after → compress tool handler
     → Create CompressionBlock in compress-state.ts
     → Store summary for nesting
   → Block will be executed on NEXT chat.message

5. Session Compacting
   session.compacting → L4 protected context injection

🧪 Testing

Run the entire test suite:

bun test

| Test File | Coverage | Layer |
| --- | --- | --- |
| tests/layer1.test.ts | Output filtering: domain routing, truncation, tee save, dedup | L1 |
| tests/layer2.test.ts | Semantic compression: NLP grammar stripping, code block protection, min length skip | L2 |
| tests/layer3-dcp.test.ts | DCP: pruning with preserveLastN, nudge frequency, nesting summaries, protected content | L3 |

# Run specific layer
bun test tests/layer1.test.ts
bun test tests/layer2.test.ts
bun test tests/layer3-dcp.test.ts

# Run with TypeScript type checking
bun run lint

📊 Benchmark

Run the full benchmark to measure the effectiveness of all 4 layers:

npm run benchmark

Latest Benchmark Results

| Fixture | Layer | Original | Compressed | Savings |
| --- | --- | ---: | ---: | ---: |
| git-diff-large.txt | L1: Git Filter | 1,657 | 969 | 42% |
| npm-install-log.txt | L1: Generic Filter | 431 | 430 | 0% |
| pytest-log.txt | L1: Generic Filter | 1,200 | 1,199 | 0% |
| chat-history.json | L2: NLP Semantic | 625 | 490 | 22% |
| dcp-conversation.json | L3: DCP Pruning (14→summary); 10 msg removed, 1 summary injected | 2,347 | 645 | 73% |
| 3x identical npm test | L4: Tool Call Dedup; 2 duplicates collapsed | 2,244 | 854 | 62% |
| 5 errors × 6 turns | L4: Error Auto-Purge; 5 errors purged after threshold | 845 | 0 | 100% |

✅ Total: 9,349 → 4,587 tokens (51% overall savings)

Layer Summary

| Layer | Fixtures | Avg Savings (token-weighted) | Characteristics |
| --- | --- | ---: | --- |
| L1 Output Filter | 3 fixtures | 21% | Most effective for verbose CLI logs (git diff: 42%). Short output is less affected. |
| L2 Semantic NLP | 1 fixture | 22% | Consistently compresses natural language without destroying meaning. Code blocks fully protected. |
| L3 DCP Pruning | 1 fixture | 73% | Biggest saver: removes 10 old messages & replaces them with 1 summary. Compounding effect in long sessions. |
| L4 Auto Cleanup | 2 fixtures | 72% | Dedup saves 62% from repeated tool calls. Error purge 100% after threshold. |
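
The per-layer averages in this summary are consistent with a token-weighted calculation (total tokens saved divided by total original tokens across a layer's fixtures) rather than a mean of the per-fixture percentages. The short check below reproduces them from the benchmark figures; the token counts are copied from the results table, and only the helper name is new.

```typescript
// Token counts copied from the benchmark results; savingsPercent is an illustrative helper.
const rows = [
  { layer: "L1", original: 1657, compressed: 969 },
  { layer: "L1", original: 431, compressed: 430 },
  { layer: "L1", original: 1200, compressed: 1199 },
  { layer: "L2", original: 625, compressed: 490 },
  { layer: "L3", original: 2347, compressed: 645 },
  { layer: "L4", original: 2244, compressed: 854 },
  { layer: "L4", original: 845, compressed: 0 },
];

/** Token-weighted savings: 1 - (sum of compressed / sum of original), as a rounded percentage. */
function savingsPercent(subset: typeof rows): number {
  const original = subset.reduce((sum, r) => sum + r.original, 0);
  const compressed = subset.reduce((sum, r) => sum + r.compressed, 0);
  return Math.round((1 - compressed / original) * 100);
}

for (const layer of ["L1", "L2", "L3", "L4"]) {
  console.log(layer, `${savingsPercent(rows.filter((r) => r.layer === layer))}%`);
}
console.log("overall", `${savingsPercent(rows)}%`);
// L1 21%
// L2 22%
// L3 73%
// L4 72%
// overall 51%
```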

💡 Insight: L3 (DCP) is the layer with the highest savings because it removes old messages in bulk. In long sessions (100+ messages), the cumulative effect of L3 + L4 can reach 70-90% token savings. Dataset and scripts are in benchmarks/; contribute fixtures from your stack for more representative results.


🚀 Local Development

# 1. Clone repository
git clone https://github.com/rahadiana/opencode-ultrapress.git
cd opencode-ultrapress

# 2. Install dependencies
npm install

# 3. Build TypeScript
npm run build          # tsup: compile to dist/

# 4. Development mode (watch)
npm run dev            # tsup --watch

# 5. Run tests
npm test               # bun test

# 6. Type checking
npm run lint           # tsc --noEmit

# 7. Benchmark
npm run benchmark      # tsx benchmarks/run.ts

Development Workflow:

  1. Edit files in src/
  2. npm run dev for auto-rebuild
  3. npm test to verify
  4. Restart OpenCode to reload plugin
  5. Test via /up stats in chat

❓ FAQ & Troubleshooting

Plugin doesn't appear after install

  1. Ensure the plugin is registered in ~/.config/opencode/config.json:
    { "plugins": ["@rahadiana/opencode-ultrapress"] }
  2. Restart OpenCode completely (not just reload window)
  3. Check if the package is installed: npm list -g @rahadiana/opencode-ultrapress

Error "Cannot find module @huggingface/transformers"

MLM mode requires an additional dependency. Install manually:

npm install -g @huggingface/transformers

Or switch to "nlp" mode which does not require external dependencies.

OpenCode feels slow after install

  • Check semantic mode: with "mode": "mlm", model loading at startup can be slow. Switch to "nlp" for zero latency.
  • Check notification level: "detailed" prints many logs. Set to "minimal".
  • Ensure minLengthChars is not too low (default 250 is optimal).

My important messages were deleted by pruning

  • Increase preserveLastN (default 4 → try 6 or 7)
  • Important tool output is automatically protected (task, skill, todowrite, todoread, write, edit)
  • Decision markers are also protected (TODO, FIXME, ACTION ITEM, ROOT CAUSE, DECISION, BLOCKER)
  • If still deleted, report as a bug with detailed logs

How do I disable a specific layer?

Set "enabled": false on the layer you want to turn off:

{
  "semantic": { "enabled": false },
  "summarization": { "enabled": false }
}

TypeScript error during development

Ensure dependencies are installed:

npm install
npm run lint    # tsc --noEmit to check for type errors

🗺 Roadmap

| Feature | Status | Target |
| --- | --- | --- |
| Layer 1: Domain-aware output filtering | ✅ Done | v0.1.0 |
| Layer 2: NLP semantic compression | ✅ Done | v0.1.0 |
| Layer 2: MLM mode | ⚠️ Experimental | v0.2.0 |
| Layer 2: LLM mode (local summarization) | ✅ Done | v0.2.0 |
| Layer 2: All-pairs MLM dedup | ✅ Done | v0.2.0 |
| Layer 3: Block-based DCP pruning | ✅ Done | v0.1.0 |
| Layer 3: preserveLastN protection | ✅ Done | v0.1.0 |
| Layer 3: Multi-signal importance scoring | ✅ Done | v0.2.0 |
| Layer 3: Reversible compression (ultrapress_expand) | ✅ Done | v0.2.0 |
| Layer 3: Pre-emptive nudge @70% | ✅ Done | v0.2.0 |
| Layer 3: Surgical message pruning | ✅ Done | v0.1.0 |
| Layer 4: Error purging & dedup | ✅ Done | v0.1.0 |
| /up slash commands | ✅ Done | v0.1.0 |
| Real token tracking (OpenCode API) | ✅ Done | v0.2.0 |
| Custom filter API | ✅ Done | v0.1.0 |
| TF-IDF scoring (MLM improvement) | 🚧 Planned | v0.2.0 |
| Sentence similarity (MLM improvement) | 🚧 Planned | v0.2.0 |
| Sub-agent (task) token tracking & compression | 💡 Idea | TBD |
| UI stats dashboard in OpenCode | 💡 Idea | TBD |
| Support more languages (NLP) | 💡 Idea | TBD |

🀝 Contributing

Contributions are welcome! Areas most in need of help:

  1. New Filters: Add Layer 1 filters for new frameworks/stacks (Kubernetes, Docker, Terraform, Svelte, Flutter, etc.).
  2. MLM Roadmap: Help implement actual TF-IDF scoring or sentence similarity.
  3. Benchmark Dataset: Contribute fixture data from your tech stack.
  4. Multi-language NLP: Expand grammar stripping rules for more languages.
  5. Bug Reports: Report edge cases such as poorly filtered tool output or important messages being deleted.

Development Setup

git clone https://github.com/rahadiana/opencode-ultrapress.git
cd opencode-ultrapress
npm install
npm run build
npm test
npm run benchmark

Pull Request Process

  1. Fork repository
  2. Create feature branch (git checkout -b feature/amazing-filter)
  3. Commit changes (git commit -m 'Add amazing filter')
  4. Push to branch (git push origin feature/amazing-filter)
  5. Open a Pull Request and ensure bun test and npm run lint pass

📝 Changelog

See CHANGELOG.md for the full version history.


📄 License

MIT © rahadiana


UltraPress: because tokens are expensive, but context is priceless. ❤️