open-agents-ai 0.16.0 · MIT

    AI coding agent powered by open-source models (Ollama/vLLM) — interactive TUI with agentic tool-calling loop

    Package Exports

    • open-agents-ai
    • open-agents-ai/dist/index.js

    This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (open-agents-ai) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

    Readme


    freedom of information · freedom of patterns · creating freely · open-weights
    libertad de información · crear libremente · créer librement · liberté d'expression
    Freiheit der Muster · jiyuu ni souzou suru · jayuroun changjak · svoboda tvorchestva
    liberdade de criar · creare liberamente · özgürce yarat · skapa fritt
    vrij creëren · twórz swobodnie · dimiourgia elefthera · khuli soch
    hurriyat al-ibdaa · code is poetry · democratize AI · imagine freely


    Open Agents

    npm i -g open-agents-ai && oa

    AI coding agent powered entirely by open-weight models. No API keys. No cloud. Your code never leaves your machine.

    An autonomous multi-turn tool-calling agent that reads your code, makes changes, runs tests, and fixes failures in an iterative loop until the task is complete. First launch auto-detects your hardware and automatically configures an optimized model with an expanded context window.

    Features

    • 26 autonomous tools — file I/O, shell, grep, web search/fetch, memory, sub-agents, background tasks, image/OCR, git, diagnostics
    • Parallel tool execution — read-only tools run concurrently via Promise.allSettled
    • Sub-agent delegation — spawn independent agents for parallel workstreams
    • Ralph Loop — iterative task execution that keeps retrying until completion criteria are met
    • Dream Mode — creative idle exploration modeled after real sleep architecture (NREM→REM cycles)
    • Live Listen — bidirectional voice communication with real-time Whisper transcription
    • Neural TTS — hear what the agent is doing via GLaDOS or Overwatch ONNX voices
    • Auto-expanding context — detects RAM/VRAM and creates an optimized model variant on first run
    • Mid-task steering — type while the agent works to add context without interrupting
    • Smart compaction — long conversations are compressed while preserving files, commands, errors, and decisions
    • Persistent memory — learned patterns stored in .oa/memory/ across sessions
    • Self-learning — auto-fetches docs from the web when encountering unfamiliar APIs
    • Seamless /update — in-place update and reload without losing context

    How It Works

    You: oa "fix the null check in auth.ts"
    
    Agent: [Turn 1] file_read(src/auth.ts)
           [Turn 2] grep_search(pattern="null", path="src/auth.ts")
           [Turn 3] file_edit(old_string="if (user)", new_string="if (user != null)")
           [Turn 4] shell(command="npm test")
           [Turn 5] task_complete(summary="Fixed null check — all tests pass")

    The agent uses tools autonomously in a loop — reading errors, fixing code, and re-running validation until the task succeeds or the turn limit is reached.

    Ralph Loop — Iteration-First Design

    The Ralph Loop is the core execution philosophy: iteration beats perfection. Instead of trying to get everything right on the first attempt, the agent executes in a retry loop where errors become learning data rather than session-ending failures.

    /ralph "fix all failing tests" --completion "npm test passes with 0 failures"
    /ralph "migrate to TypeScript" --completion "npx tsc --noEmit exits 0" --max-iterations 20
    /ralph "reach 80% coverage" --completion "coverage report shows >80%" --timeout 120

    Each iteration:

    1. Execute — make changes based on the task + all accumulated learnings
    2. Verify — run the completion command (tests, build, lint, coverage)
    3. Learn — if verification fails, extract what went wrong and why
    4. Iterate — retry with the new knowledge until passing or limits reached
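    The cycle above can be sketched as a bounded retry loop. This is an illustrative TypeScript sketch, not the package's actual implementation; the names runRalph, execute, verify, and Attempt are assumptions.

```typescript
// Hypothetical sketch of the execute → verify → learn → iterate cycle.
// All identifiers here are illustrative, not open-agents-ai internals.
interface Attempt { iteration: number; passed: boolean; learning?: string }

async function runRalph(
  execute: (learnings: string[]) => Promise<void>,
  verify: () => Promise<{ passed: boolean; error?: string }>,
  maxIterations = 10                     // safety bound against runaway loops
): Promise<Attempt[]> {
  const history: Attempt[] = [];
  const learnings: string[] = [];
  for (let i = 1; i <= maxIterations; i++) {
    await execute(learnings);            // 1. Execute with accumulated learnings
    const result = await verify();       // 2. Verify via the completion command
    if (result.passed) {
      history.push({ iteration: i, passed: true });
      break;
    }
    const learning = result.error ?? "unknown failure";
    learnings.push(learning);            // 3. Learn: record what went wrong
    history.push({ iteration: i, passed: false, learning });
  }                                      // 4. Iterate until pass or the bound
  return history;
}
```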

    The loop tracks iteration history, generates completion reports saved to .aiwg/ralph/, and supports resume/abort for interrupted sessions. Safety bounds (max iterations, timeout) prevent runaway loops.

    /ralph-status     # Check current/previous loop status
    /ralph-resume     # Resume interrupted loop
    /ralph-abort      # Cancel running loop

    Dream Mode — Creative Idle Exploration

    When you're not actively tasking the agent, Dream Mode lets it creatively explore your codebase and generate improvement proposals autonomously. The system models real human sleep architecture with four stages per cycle:

    Stage    Name                 What Happens
    NREM-1   Light Scan           Quick codebase overview, surface observations
    NREM-2   Pattern Detection    Identify recurring patterns, technical debt, gaps
    NREM-3   Deep Consolidation   Synthesize findings into structured proposals
    REM      Creative Expansion   Novel ideas, cross-domain connections, bold plans

    Each cycle expands through all four stages then contracts (evaluation, pruning of weak ideas). Three modes control how far the agent can go:

    /dream              # Default — read-only exploration, proposals saved to .oa/dreams/
    /dream deep         # Multi-cycle deep exploration with expansion/contraction phases
    /dream lucid        # Full implementation — saves workspace backup, then implements,
                        #   tests, evaluates, and self-plays each proposal with checkpoints
    /dream stop         # Wake up — stop dreaming

    Default and Deep modes are completely safe — the agent can only read your code and write proposals to .oa/dreams/. File writes, edits, and shell commands outside that directory are blocked by sandboxed dream tools.

    Lucid mode unlocks full write access. Before making changes, it saves a workspace checkpoint so you can roll back. Each cycle goes: dream → implement → test → evaluate → checkpoint → next cycle.

    All proposals are indexed in .oa/dreams/PROPOSAL-INDEX.md for easy review.

    Listen Mode — Live Bidirectional Audio

    Listen mode enables real-time voice communication with the agent. Your microphone audio is captured, streamed through Whisper (via transcribe-cli), and the transcription is injected directly into the input line — creating a hands-free coding workflow.

    /listen             # Toggle microphone capture on/off
    /listen auto        # Auto-submit after 3 seconds of silence (hands-free)
    /listen confirm     # Require Enter to submit transcription (default)
    /listen stop        # Stop listening

    Model selection — choose the Whisper model size for your hardware:

    /listen tiny        # Fastest, least accurate (~39MB)
    /listen base        # Good balance (~74MB)
    /listen small       # Better accuracy (~244MB)
    /listen medium      # High accuracy (~769MB)
    /listen large       # Best accuracy, slower (~1.5GB)

    When combined with /voice, you get full bidirectional audio — speak your tasks, hear the agent's progress through TTS, and speak corrections mid-task. The status bar shows a blinking red ● REC indicator with a countdown timer during auto-mode recording.

    Platform support:

    • Linux: arecord (ALSA) or ffmpeg (PulseAudio)
    • macOS: sox (CoreAudio) or ffmpeg (AVFoundation)

    The transcribe-cli dependency auto-installs in the background on first use.

    File transcription: Drag-and-drop audio/video files (.mp3, .wav, .mp4, .mkv, etc.) onto the terminal to transcribe them. Results are saved to .oa/transcripts/.

    Interactive TUI

    Launch without arguments to enter the interactive REPL:

    oa

    The TUI features an animated multilingual phrase carousel, live metrics bar with pastel-colored labels (token in/out, context window usage), rotating tips, syntax-highlighted tool output, and dynamic terminal-width cropping.

    Slash Commands

    Command           Description
    /help             Show all available commands
    /model <name>     Switch to a different Ollama model
    /endpoint <url>   Connect to a remote vLLM or OpenAI-compatible API
    /voice [model]    Toggle TTS voice (GLaDOS, Overwatch)
    /listen [mode]    Toggle live microphone transcription
    /dream [mode]     Start dream mode (default, deep, lucid)
    /stream           Toggle streaming token display
    /bruteforce       Toggle brute-force mode (auto re-engage on turn limit)
    /tools            List available tools
    /skills           List/search available skills
    /update           Check for and install updates (seamless reload)
    /config           Show current configuration
    /clear            Clear the screen
    /exit             Quit

    Mid-Task Steering

    While the agent is working (shown by the + prompt), type to add context:

    > fix the auth bug
      ⎿  Read: src/auth.ts
    + also check the session handling        ← typed while agent works
      ↪ Context added: also check the session handling
      ⎿  Search: session
      ⎿  Edit: src/auth.ts

    Tools (26)

    Tool             Description
    file_read        Read file contents with line numbers (offset/limit)
    file_write       Create or overwrite files
    file_edit        Precise string replacement in files
    shell            Execute any shell command
    grep_search      Search file contents with regex (ripgrep)
    find_files       Find files by glob pattern
    list_directory   List directory contents
    web_search       Search the web via DuckDuckGo
    web_fetch        Fetch and extract text from web pages
    memory_read      Read from persistent memory store
    memory_write     Store patterns for future sessions
    batch_edit       Multiple edits across files in one call
    codebase_map     High-level project structure overview
    diagnostic       Lint/typecheck/test/build validation pipeline
    git_info         Structured git status, log, diff, branch info
    background_run   Run shell command in background
    task_status      Check background task status
    task_output      Read background task output
    task_stop        Stop a background task
    sub_agent        Delegate to an independent agent
    image_read       Read images (base64 + OCR)
    screenshot       Capture screen/window
    ocr              Extract text from images
    aiwg_setup       Deploy AIWG SDLC framework
    aiwg_health      Analyze SDLC health
    aiwg_workflow    Execute AIWG workflows

    Read-only tools execute concurrently when called in the same turn. Mutating tools run sequentially.
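    The read-only/mutating split can be sketched like this. Only the Promise.allSettled mechanism comes from the README; the tool names, types, and the dispatch function itself are illustrative assumptions, not the package's internals.

```typescript
type ToolCall = { name: string; args: Record<string, unknown> };
type ToolResult = { name: string; ok: boolean; output: string };

// Assumed classification; the real package decides this internally.
const READ_ONLY = new Set(["file_read", "grep_search", "find_files", "list_directory", "git_info"]);

async function dispatch(
  calls: ToolCall[],
  run: (c: ToolCall) => Promise<string>
): Promise<ToolResult[]> {
  const readOnly = calls.filter(c => READ_ONLY.has(c.name));
  const mutating = calls.filter(c => !READ_ONLY.has(c.name));

  // Read-only calls fan out concurrently; allSettled keeps one failure
  // from discarding the other results.
  const settled = await Promise.allSettled(readOnly.map(run));
  const results: ToolResult[] = settled.map((s, i) => ({
    name: readOnly[i].name,
    ok: s.status === "fulfilled",
    output: s.status === "fulfilled" ? s.value : String(s.reason),
  }));

  // Mutating calls run strictly one at a time.
  for (const c of mutating) {
    try {
      results.push({ name: c.name, ok: true, output: await run(c) });
    } catch (e) {
      results.push({ name: c.name, ok: false, output: String(e) });
    }
  }
  return results;
}
```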

    Auto-Expanding Context Window

    On startup and /model switch, Open Agents detects your RAM/VRAM and creates an optimized model variant:

    Available Memory   Context Window
    200GB+             128K tokens
    100GB+             64K tokens
    50GB+              32K tokens
    20GB+              16K tokens
    8GB+               8K tokens
    < 8GB              4K tokens
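    As a rough sketch, the tiering maps available memory to a context size. The thresholds are copied from the table; the function name contextTokens is hypothetical.

```typescript
// Hypothetical helper mirroring the memory → context-window table above.
function contextTokens(availableGB: number): number {
  if (availableGB >= 200) return 128 * 1024;
  if (availableGB >= 100) return 64 * 1024;
  if (availableGB >= 50) return 32 * 1024;
  if (availableGB >= 20) return 16 * 1024;
  if (availableGB >= 8) return 8 * 1024;
  return 4 * 1024;
}

// 48 GB falls in the 20GB+ tier, so it gets a 16K-token window.
console.log(contextTokens(48));
```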

    Voice Feedback (TTS)

    /voice              # Toggle on/off (default: GLaDOS)
    /voice glados       # GLaDOS voice
    /voice overwatch    # Overwatch voice

    Auto-downloads the ONNX voice model (~50MB) on first use. Install espeak-ng for best quality (apt install espeak-ng / brew install espeak-ng).

    Configuration

    Config priority: CLI flags > env vars > ~/.open-agents/config.json > defaults.

    open-agents config set model qwen3.5:122b
    open-agents config set backendUrl http://localhost:11434
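    The precedence chain can be sketched as a left-to-right merge where later sources win. The key names and resolveConfig are illustrative, not the package's real schema.

```typescript
// Illustrative config resolution: defaults < config file < env vars < CLI flags.
type Config = { model?: string; backendUrl?: string };

function resolveConfig(
  defaults: Config,
  fileConfig: Config,   // ~/.open-agents/config.json
  envConfig: Config,    // parsed from environment variables
  cliFlags: Config      // highest priority
): Config {
  // Object spread keeps the last-spread value for each key, so the
  // ordering encodes the priority chain directly.
  return { ...defaults, ...fileConfig, ...envConfig, ...cliFlags };
}
```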

    Project Context

    Create AGENTS.md, OA.md, or .open-agents.md in your project root for agent instructions. Context files merge from parent to child directories.
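    Parent-to-child merging might look like the following sketch, which walks from the filesystem root down to the working directory so that files nearer the project appear last. collectContext and the ordering details are assumptions.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical sketch: gather context files from every ancestor directory,
// root first, working directory last.
function collectContext(
  cwd: string,
  names = ["AGENTS.md", "OA.md", ".open-agents.md"]
): string {
  const dirs: string[] = [];
  for (let d = cwd; ; d = path.dirname(d)) {
    dirs.unshift(d);                       // prepend, so root ends up first
    if (d === path.dirname(d)) break;      // reached the filesystem root
  }
  const chunks: string[] = [];
  for (const dir of dirs) {
    for (const name of names) {
      const p = path.join(dir, name);
      if (fs.existsSync(p)) chunks.push(fs.readFileSync(p, "utf8"));
    }
  }
  return chunks.join("\n\n");
}
```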

    .oa/ Project Directory

    .oa/
    ├── config.json        # Project config overrides
    ├── settings.json      # TUI settings
    ├── memory/            # Persistent memory store
    ├── dreams/            # Dream mode proposals & checkpoints
    ├── transcripts/       # Audio/video transcriptions
    ├── index/             # Cached codebase index
    ├── context/           # Auto-generated project context
    └── history/           # Session history

    Model Support

    Primary target: Qwen3.5-122B-A10B via Ollama (MoE, 48GB+ VRAM)

    Any Ollama or OpenAI-compatible API model with tool calling works:

    oa --model qwen2.5-coder:32b "fix the bug"
    oa --backend vllm --backend-url http://localhost:8000/v1 "add tests"
    oa --backend-url http://10.0.0.5:11434 "refactor auth"

    Evaluation Suite

    23 evaluation tasks test the agent's autonomous capabilities across coding, web research, SDLC analysis, and tool creation:

    node eval/run-agentic.mjs                          # Run all 23 tasks
    node eval/run-agentic.mjs 04-add-test              # Single task
    node eval/run-agentic.mjs --model qwen2.5-coder:32b  # Different model

    ID   Task                                  Category
    01   Fix typo in function name             Code Fix
    02   Add isPrime function                  Code Generation
    03   Fix off-by-one bug                    Code Fix
    04   Write comprehensive tests             Test Generation
    05   Extract functions from long method    Refactoring
    06   Fix TypeScript type errors            Type Safety
    07   Add REST API endpoint                 Feature Addition
    08   Add pagination across files           Multi-File Edit
    09   CSS named color lookup (148 colors)   Web Research
    10   HTTP status code lookup (32+ codes)   Web Research
    11   MIME type lookup (30+ types)          Web Research
    12   SDLC health analyzer                  AIWG Analysis
    13   SDLC artifact generator               AIWG Generation
    14   Batch refactor variable names         Multi-File Refactor
    15   Codebase overview from structure      Code Analysis
    16   Diagnostic fix loop                   Error Recovery
    17   Git repository analyzer               Git Integration
    18   Create custom tool from spec          Tool Creation
    19   Tool from usage pattern               Tool Discovery
    20   Tool management operations            Tool Lifecycle
    21   Large file patch                      Precision Editing
    22   Skill discovery                       Skill System
    23   Skill execution                       Skill System

    Benchmark Results (Qwen3.5-122B)

    Pass rate: 100% (8/8 core tasks)
    Total: 39 turns, 55 tool calls, ~10 minutes
    Average: 4.9 turns/task, 6.9 tools/task

    AIWG Integration

    Open Agents integrates with AIWG for AI-augmented software development:

    npm i -g aiwg
    oa "analyze this project's SDLC health and set up documentation"

    Capability          Description
    Structured Memory   .aiwg/ directory persists project knowledge
    SDLC Artifacts      Requirements, architecture, test strategy, deployment docs
    Health Analysis     Score your project's SDLC maturity
    85+ Agents          Specialized AI personas (Test Engineer, Security Auditor, API Designer)
    Traceability        @-mention system links requirements to code to tests

    Architecture

    The core is AgenticRunner — a multi-turn tool-calling loop:

    User task → System prompt + tools → LLM → tool_calls → Execute → Feed results → LLM
                                              (repeat until task_complete or max turns)

    • Tool-first — the model explores via tools, not pre-stuffed context
    • Iterative — tests, sees failures, fixes them
    • Parallel-safe — read-only tools concurrent, mutating tools sequential
    • Observable — every tool call and result emitted as a real-time event
    • Bounded — max turns, timeout, output limits prevent runaway loops
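    The loop in the diagram can be sketched as follows. The message shape, LLM interface, and agenticRun are placeholders (assumptions); only the overall turn structure comes from the diagram above.

```typescript
// Minimal sketch of a bounded multi-turn tool-calling loop.
type Msg = { role: "system" | "user" | "assistant" | "tool"; content: string };
type Turn = { toolCalls: { name: string; args: string }[]; done: boolean };

async function agenticRun(
  task: string,
  llm: (messages: Msg[]) => Promise<Turn>,        // placeholder LLM interface
  execTool: (name: string, args: string) => Promise<string>,
  maxTurns = 25                                    // assumed turn bound
): Promise<Msg[]> {
  const messages: Msg[] = [
    { role: "system", content: "You are a coding agent with tools." },
    { role: "user", content: task },
  ];
  for (let turn = 0; turn < maxTurns; turn++) {
    const { toolCalls, done } = await llm(messages);
    if (done) break;                               // task_complete reached
    for (const call of toolCalls) {
      const output = await execTool(call.name, call.args); // execute the tool
      messages.push({ role: "tool", content: output });    // feed results back
    }
  }
  return messages;
}
```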

    License

    MIT