JSPM

open-agents-ai

0.14.1
    • Downloads 26260
    • License MIT

    AI coding agent powered by open-source models (Ollama/vLLM) — interactive TUI with agentic tool-calling loop

    Package Exports

    • open-agents-ai
    • open-agents-ai/dist/index.js

    This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (open-agents-ai) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.
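    A minimal `exports` field covering the detected entrypoints might look like the following. This is a sketch based on the subpaths listed above, not the package's actual manifest:

```json
{
  "exports": {
    ".": "./dist/index.js"
  }
}
```

    Adding this to the upstream package.json (or to a JSPM override) makes the resolvable subpaths explicit instead of relying on automatic detection.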

    Readme

    Open Agents

    AI coding agent framework powered by open-weight models via Ollama.

    A multi-turn agentic tool-calling loop that iteratively reads code, makes changes, runs tests, and fixes failures until the task is complete — modeled after how Claude Code operates, but running entirely on local open-weight models.

    How It Works

    You: oa "fix the null check in auth.ts"
    
    Agent: [Turn 1] file_read(src/auth.ts)
           [Turn 2] grep_search(pattern="null", path="src/auth.ts")
           [Turn 3] file_edit(old_string="if (user)", new_string="if (user != null)")
           [Turn 4] shell(command="npm test")
           [Turn 5] task_complete(summary="Fixed null check — all tests pass")

    The agent has 26 tools (including 3 AIWG SDLC tools and 4 advanced analysis tools) and uses them autonomously in a loop, reading errors, fixing code, and re-running validation until the task succeeds or the turn limit is reached.

    Quick Start

    # Install globally — provides `open-agents` and `oa` commands
    npm i -g open-agents-ai
    
    # Run it — first launch auto-detects your system and pulls the best model
    oa "fix the null check in auth.ts"

    On first run, the setup wizard detects your RAM/VRAM and recommends the optimal qwen3.5 variant.

    Install from source

    # 1. Install Ollama (https://ollama.com)
    curl -fsSL https://ollama.com/install.sh | sh
    
    # 2. Pull the model
    ollama pull qwen3.5:122b
    
    # 3. Clone and install
    git clone https://github.com/robit-man/open-agents.git && cd open-agents
    ./scripts/install.sh
    
    # 4. Use it
    oa "add pagination to the users endpoint"
    open-agents "refactor the auth module into separate files"

    Installation

    Prerequisites

    • Node.js >= 20
    • pnpm (npm install -g pnpm)
    • Ollama (ollama.com) with a model that supports tool calling

    Install System-Wide

    # Install to ~/.local/bin (no sudo needed)
    ./scripts/install.sh
    
    # Install to /usr/local/bin
    sudo ./scripts/install.sh --global
    
    # Custom prefix
    ./scripts/install.sh --prefix ~/bin
    
    # Uninstall
    ./scripts/install.sh --uninstall

    The installer will:

    1. Check Node.js and pnpm versions
    2. Install workspace dependencies
    3. Build all packages
    4. Create open-agents and oa symlinks
    5. Configure an optimized Ollama model (auto-detects RAM for context window sizing)

    Manual Build

    pnpm install
    pnpm -r build
    pnpm -r test   # 911 tests across 77 files

    Tools

    The agent has access to 26 tools that it calls autonomously:

    Tool Description
    file_read Read file contents with line numbers (supports offset/limit)
    file_write Create or overwrite files
    file_edit Precise string replacement in files (preferred over full rewrites)
    shell Execute any shell command (tests, builds, git, etc.)
    grep_search Search file contents with regex (uses ripgrep when available)
    find_files Find files by glob pattern
    list_directory List directory contents with types and sizes
    web_search Search the web via DuckDuckGo
    web_fetch Fetch and extract text from web pages (docs, MDN, w3schools)
    memory_read Read from persistent memory store
    memory_write Store patterns and solutions for future tasks
    aiwg_setup Deploy AIWG SDLC framework in the project
    aiwg_health Analyze project SDLC health and readiness
    aiwg_workflow Execute AIWG commands and workflows
    batch_edit Multiple precise edits across files in one call
    codebase_map High-level project structure overview
    diagnostic Run lint/typecheck/test/build validation pipeline
    git_info Structured git status, log, diff, and branch info
    background_run Run a shell command in the background (returns task ID)
    task_status Check status of background tasks
    task_output Read output from a background task
    task_stop Stop a running background task
    sub_agent Delegate a sub-task to an independent agent
    image_read Read image files (base64 + dimensions + OCR text)
    screenshot Capture screen or window to file
    ocr Extract text from images (supports region cropping/zoom)

    Parallel Execution & Sub-Agents

    The agent can run multiple operations in parallel:

    You: oa "run the test suite and lint checks in parallel, then fix any issues"
    
    Agent: [Turn 1] background_run(command="npm test")        → task-1
           [Turn 2] background_run(command="npm run lint")     → task-2
           [Turn 3] task_status()                              → task-1: running, task-2: completed
           [Turn 4] task_output(task_id="task-2")              → 3 lint errors
           [Turn 5] file_edit(...)                             → fix lint errors
           [Turn 6] task_output(task_id="task-1")              → all tests pass
           [Turn 7] task_complete(summary="Fixed lint, tests pass")

    Sub-agents can be delegated independent tasks:

    Agent: [Turn 1] sub_agent(task="refactor auth module", background=true)  → task-3
           [Turn 2] sub_agent(task="add pagination to users API")            → completed
           [Turn 3] task_output(task_id="task-3")                            → auth refactored

    Image & Visual Context

    Drag-and-drop image files onto the terminal to provide visual context:

    # Drop an image file path while agent is working → injected as context
    # Drop an image file path at idle prompt → agent describes and analyzes it

    The agent can also take screenshots and extract text via OCR:

    Agent: [Turn 1] screenshot(region="active")     → captured window
           [Turn 2] ocr(path="/tmp/screenshot.png")  → extracted text
           [Turn 3] image_read(path="mockup.png")    → base64 + OCR text

    Mid-Task Steering

    While the agent is working (shown by the + prompt), you can type to add context:

    > fix the auth bug
      ⎿  📄 Read: src/auth.ts
    + also check the session handling        ← typed while agent works
      ↪ Context added: also check the session handling
      ⎿  🔍 Search: session
      ⎿  ✏️  Edit: src/auth.ts

    Press Ctrl+C to abort the current task. Slash commands (/model, /help) work during active tasks.

    Self-Learning

    When the agent encounters an unfamiliar API or language feature, it automatically:

    1. Searches the web for documentation
    2. Fetches the relevant page (w3schools.com, MDN, official docs)
    3. Stores the learned pattern in persistent memory
    4. Applies the knowledge to the current task

    Error Recovery

    The agent follows an iterative fix loop:

    1. Run validation (tests/build/lint)
    2. Read the full error output
    3. Identify the exact file, line, and failure
    4. Fix with file_edit
    5. Re-run validation
    6. Repeat until passing
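    The loop above can be sketched as a plain shell script. This is illustrative only; the real agent drives each step through tool calls such as `shell` and `file_edit`, and the `validate` function and `fixed.flag` file here are stand-ins invented for this sketch:

```shell
attempts=0
validate() {           # stand-in for "run validation" (e.g. tests/build/lint)
  [ -f fixed.flag ]    # passes once a "fix" has been applied
}
until validate; do
  attempts=$((attempts + 1))
  if [ "$attempts" -ge 5 ]; then   # bounded: give up after 5 turns
    break
  fi
  touch fixed.flag                 # stand-in for file_edit applying a fix
done
echo "validation passed after $attempts fix attempt(s)"
```

    The bound on `attempts` mirrors the agent's turn limit: the loop always terminates, either by passing validation or by hitting the cap.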

    Dynamic System Prompt

    The agent's system prompt is dynamically enriched at task start with:

    Source Description
    Project context files .open-agents.md, AGENTS.md, or .open-agents/context.md — loaded from project root and parent directories
    Git state Current branch, working tree status, recent commits
    Persistent memory Learned patterns from previous sessions (project-local and global)
    Environment Working directory, Node version, OS, date

    Create a .open-agents.md file in your project root to give the agent project-specific instructions:

    # Project Context
    
    - This is a TypeScript monorepo using pnpm workspaces
    - Run tests with: pnpm -r test
    - Build with: pnpm -r build
    - Always use file_edit over file_write for existing files
    - Database migrations are in src/db/migrations/

    Context files are merged from parent → child directories, so you can set global defaults at ~/.open-agents.md and override per-project.
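    For example, the parent → child merge order lets a project file extend global defaults. The sketch below uses a temporary directory in place of `~` and the project root; the file names are the real ones, the contents are made up:

```shell
tmp=$(mktemp -d)          # stands in for your home directory
mkdir -p "$tmp/project"   # stands in for a project root

# "Global" defaults (would live at ~/.open-agents.md)
printf -- '- Prefer small, focused commits\n' > "$tmp/.open-agents.md"

# Project-specific instructions (would live at <repo>/.open-agents.md)
printf -- '- Run tests with: pnpm -r test\n' > "$tmp/project/.open-agents.md"

# Parent is merged before child, so the agent sees both sets of instructions
cat "$tmp/.open-agents.md" "$tmp/project/.open-agents.md"
```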

    .oa/ Project Directory

    Each project gets a .oa/ directory (similar to .claude/ for Claude Code) that persists artifacts across sessions:

    .oa/
    ├── config.json              # Per-project configuration overrides
    ├── memory/                  # Persistent memory store
    │   └── {topic}.json         # Topic-based key-value memories
    ├── index/                   # Cached codebase index
    │   ├── repo-profile.json    # Repository metadata
    │   ├── file-summaries.json  # Per-file purpose, exports, domain, risk
    │   ├── symbols.json         # Symbol table cache
    │   ├── graph.json           # Import/dependency graph
    │   └── meta.json            # Index metadata (timestamp, hash)
    ├── context/                 # Auto-generated project context
    │   └── project-map.md       # Generated overview for system prompt
    └── history/                 # Session history
        └── {session-id}.json    # Per-session task log

    The agent auto-discovers AGENTS.md, OA.md, CLAUDE.md, and README.md from the project root and parent directories, injecting them into the system prompt for project-specific awareness.

    Smart Context Compaction

    When conversations exceed the context window, the agent compacts older messages while preserving:

    • Files that were read and modified
    • Shell commands that were run and their outcomes
    • Errors that were encountered
    • Key decisions that were made

    This structured summary prevents the agent from repeating work or losing track of what's been done.

    Commands

    Command Description
    oa "task" Run a coding task (short alias)
    open-agents "task" Run a coding task
    open-agents run "task" --repo /path Run against a specific repo
    open-agents index /path Index a repository
    open-agents status Show system status
    open-agents config Show/set configuration
    open-agents serve Start/verify backend server
    open-agents eval Run evaluation suite

    Flags

    -m, --model <name>         Model name (default: qwen3.5:122b)
    -b, --backend-url <url>    Backend URL (default: http://localhost:11434)
        --backend <type>       Backend type: ollama (default), vllm, fake
    -r, --repo <path>          Repository root (default: cwd)
        --dry-run              Show what would happen without writing files
        --offline              Skip backend health check
    -v, --verbose              Show model responses and debug info
        --timeout-ms <ms>      Per-request timeout (default: 300000)
    -h, --help                 Show help
    -V, --version              Show version

    Voice Feedback (TTS)

    The agent can speak what it's doing using neural TTS voices. Enable it in the interactive REPL:

    /voice              # Toggle voice on/off (default: GLaDOS)
    /voice glados       # Switch to GLaDOS voice
    /voice overwatch    # Switch to Overwatch voice

    On first enable, the agent auto-downloads the ONNX voice model (~50 MB) and installs onnxruntime-node in `~/.open-agents/voice/`. For best quality, install `espeak-ng`:

    # Ubuntu/Debian
    sudo apt install espeak-ng
    
    # macOS
    brew install espeak-ng

    When enabled, the agent speaks brief descriptions of each tool call ("Reading auth.ts", "Running tests", "Editing config.js") through your system speakers.

    Configuration

    Config priority: CLI flags > environment variables > ~/.open-agents/config.json > defaults.

    # Set defaults
    open-agents config set model qwen3.5:122b
    open-agents config set backendUrl http://localhost:11434
    open-agents config set backendType ollama
    
    # Environment variables
    export OPEN_AGENTS_MODEL=qwen3.5:122b
    export OPEN_AGENTS_BACKEND_URL=http://localhost:11434
    export OPEN_AGENTS_BACKEND_TYPE=ollama

    Model Support

    Primary target: Qwen3.5-122B-A10B via Ollama (MoE, runs on 48GB+ VRAM)

    The setup-model.sh script auto-configures the context window based on available RAM:

    RAM Context Window
    300GB+ 128K tokens
    128GB+ 64K tokens
    64GB+ 32K tokens
    < 64GB 16K tokens

    Other Models

    Any model that supports tool calling via Ollama or an OpenAI-compatible API works:

    # Use a different Ollama model
    oa --model qwen2.5-coder:32b "fix the bug"
    
    # Use vLLM backend
    oa --backend vllm --backend-url http://localhost:8000/v1 "add tests"
    
    # Use any OpenAI-compatible API
    oa --backend-url http://10.0.0.5:11434 "refactor auth"

    AIWG Integration

    Open Agents integrates with AIWG (AI Writing Guide) — a cognitive architecture for AI-augmented software development. When AIWG is installed, the agent gains SDLC superpowers:

    # Install AIWG globally
    npm i -g aiwg
    
    # The agent can now use AIWG tools automatically:
    oa "analyze this project's SDLC health and set up proper documentation"
    oa "create requirements and architecture docs for this codebase"

    What AIWG Adds

    Capability Description
    Structured Memory .aiwg/ directory persists project knowledge across sessions
    SDLC Artifacts Requirements, architecture, test strategy, deployment docs
    Health Analysis Score your project's SDLC maturity (testing, CI/CD, docs, etc.)
    85+ Agents Specialized AI personas (Test Engineer, Security Auditor, API Designer)
    Traceability @-mention system links requirements → code → tests

    AIWG Tools

    The 3 AIWG tools are available when aiwg is installed globally:

    • aiwg_setup — Deploy an AIWG framework (sdlc, marketing, forensics, research)
    • aiwg_health — Analyze project SDLC readiness (works even without AIWG installed)
    • aiwg_workflow — Run any AIWG CLI command (runtime-info, list, mcp info)

    If AIWG is not installed, the tools return helpful install instructions. The aiwg_health tool provides native analysis without requiring AIWG.
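    A quick way to check which behavior you will get is a small sketch like this; only the `npm i -g aiwg` install command comes from this README:

```shell
if command -v aiwg >/dev/null 2>&1; then
  aiwg_status=available   # the agent can shell out to the aiwg CLI
else
  aiwg_status=missing     # aiwg_setup/aiwg_workflow will return install hints
fi
echo "aiwg: $aiwg_status (install with: npm i -g aiwg)"
```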

    Architecture

    Agentic Loop

    The core is AgenticRunner — a multi-turn tool-calling loop:

    User task
        ↓
    System prompt + tools → LLM
        ↓
    LLM returns tool_calls → Execute tools → Feed results back → LLM
        ↓  (repeat until task_complete or max turns)
    Result: completed/incomplete, turns, tool calls, duration

    Key design decisions:

    • Tool-first: The model explores via tools rather than pre-stuffed context
    • Iterative: The agent runs tests, sees failures, and fixes them — no need for perfect one-shot output
    • Context compaction: Long conversations are compressed into a structured summary plus recent context
    • Bounded: Maximum turns, timeout, and output limits prevent runaway loops
    • Observable: Every tool call and result is emitted as a real-time event

    Package Structure

    packages/
      orchestrator/   - AgenticRunner, OllamaAgenticBackend, RALPH loop
      execution/      - 11 tools (file, shell, grep, web, memory), validation pipeline
      schemas/        - Zod schemas and TypeScript types
      backend-vllm/   - Ollama + vLLM backend clients (OpenAI-compatible)
      memory/         - SQLite-backed persistent memory stores
      indexer/        - Codebase scanning and symbol extraction
      retrieval/      - Multi-stage retrieval (lexical + semantic + graph)
      prompts/        - Prompt contracts for each agent role
      cli/            - CLI entry point, commands, config, UI
    
    apps/
      api/            - Express API server
      worker/         - Background task processor
    
    eval/             - 17 evaluation tasks with agentic runner
    scripts/          - install.sh, setup-model.sh, bootstrap.sh

    Evaluation

    The framework includes 17 evaluation tasks that test the agent's ability to autonomously resolve coding problems:

    # Run all 17 tasks with the agentic tool-calling loop
    node eval/run-agentic.mjs
    
    # Single task
    node eval/run-agentic.mjs 04-add-test
    
    # Different model
    node eval/run-agentic.mjs --model qwen2.5-coder:32b

    Results (Qwen3.5-122B)

    TASK                 RESULT   TIME       TURNS    TOOLS
    01-fix-typo          PASS     39.1s      4        7
    02-add-function      PASS     24.5s      4        5
    03-fix-bug           PASS     26.9s      4        5
    04-add-test          PASS     198.1s     6        8
    05-refactor          PASS     73.1s      4        5
    06-type-error        PASS     143.2s     5        7
    07-add-endpoint      PASS     40.0s      4        5
    08-multi-file        PASS     75.5s      8        13
    
    Pass rate: 100% (8/8)
    Total: 39 turns, 55 tool calls, ~10 minutes

    Task Descriptions

    ID Task Difficulty
    01 Fix typo in function name Easy
    02 Add isPrime function Easy
    03 Fix off-by-one bug Easy
    04 Write comprehensive tests for untested functions Medium
    05 Extract functions from long method (refactor) Medium
    06 Fix TypeScript type errors Medium
    07 Add REST API endpoint Medium
    08 Add pagination across multiple files Hard
    09 CSS named color lookup (148 colors, web search) Medium
    10 HTTP status code lookup (32+ codes, web search) Medium
    11 MIME type lookup (30+ types, web search) Medium
    12 SDLC health analyzer (AIWG-style scoring) Medium
    13 SDLC artifact generator (requirements, arch, tests) Hard
    14 Batch refactor variable names across files Medium
    15 Codebase overview generator from structure analysis Medium
    16 Diagnostic fix loop (find and fix buggy code) Medium
    17 Git repository analyzer Medium

    Test Suite

    Package          Tests
    ─────────────────────────
    schemas          216
    backend-vllm     162
    execution        136
    indexer            94
    cli                72
    orchestrator       70
    retrieval          66
    memory             58
    prompts            34
    apps/api            1
    apps/worker         2
    ─────────────────────────
    Total             911 passing

    Development

    pnpm install          # Install dependencies
    pnpm -r build         # Build all packages
    pnpm -r test          # Run all 911 tests
    pnpm -r dev           # Watch mode

    License

    MIT