JSPM

open-agents-ai

0.14.1
    • Downloads 26260
    • License MIT

    AI coding agent powered by open-source models (Ollama/vLLM) — interactive TUI with agentic tool-calling loop

    Package Exports

    • open-agents-ai
    • open-agents-ai/dist/index.js

    This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (open-agents-ai) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.
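    A minimal `exports` field covering the detected entrypoints might look like the following. This is a sketch based on the subpaths listed above, not the package's actual manifest:

```json
{
  "exports": {
    ".": "./dist/index.js"
  }
}
```

    Adding this to the upstream package.json (or to a JSPM override) makes the resolvable subpaths explicit instead of relying on automatic detection.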

    Readme

    Open Agents

    AI coding agent framework powered by open-weight models via Ollama.

    A multi-turn agentic tool-calling loop that iteratively reads code, makes changes, runs tests, and fixes failures until the task is complete — modeled after how Claude Code operates, but running entirely on local open-weight models.

    How It Works

    You: oa "fix the null check in auth.ts"
    
    Agent: [Turn 1] file_read(src/auth.ts)
           [Turn 2] grep_search(pattern="null", path="src/auth.ts")
           [Turn 3] file_edit(old_string="if (user)", new_string="if (user != null)")
           [Turn 4] shell(command="npm test")
           [Turn 5] task_complete(summary="Fixed null check — all tests pass")

    The agent has 26 tools (including 3 AIWG SDLC tools and 4 advanced analysis tools) and uses them autonomously in a loop, reading errors, fixing code, and re-running validation until the task succeeds or the turn limit is reached.

    Quick Start

    # Install globally — provides `open-agents` and `oa` commands
    npm i -g open-agents-ai
    
    # Run it — first launch auto-detects your system and pulls the best model
    oa "fix the null check in auth.ts"

    On first run, the setup wizard detects your RAM/VRAM and recommends the optimal qwen3.5 variant.

    Install from source

    # 1. Install Ollama (https://ollama.com)
    curl -fsSL https://ollama.com/install.sh | sh
    
    # 2. Pull the model
    ollama pull qwen3.5:122b
    
    # 3. Clone and install
    git clone https://github.com/robit-man/open-agents.git && cd open-agents
    ./scripts/install.sh
    
    # 4. Use it
    oa "add pagination to the users endpoint"
    open-agents "refactor the auth module into separate files"

    Installation

    Prerequisites

    • Node.js >= 20
    • pnpm (npm install -g pnpm)
    • Ollama (ollama.com) with a model that supports tool calling

    Install System-Wide

    # Install to ~/.local/bin (no sudo needed)
    ./scripts/install.sh
    
    # Install to /usr/local/bin
    sudo ./scripts/install.sh --global
    
    # Custom prefix
    ./scripts/install.sh --prefix ~/bin
    
    # Uninstall
    ./scripts/install.sh --uninstall

    The installer will:

    1. Check Node.js and pnpm versions
    2. Install workspace dependencies
    3. Build all packages
    4. Create open-agents and oa symlinks
    5. Configure an optimized Ollama model (auto-detects RAM for context window sizing)

    Manual Build

    pnpm install
    pnpm -r build
    pnpm -r test   # 911 tests across 77 files

    Tools

    The agent has access to 26 tools that it calls autonomously:

    Tool Description
    file_read Read file contents with line numbers (supports offset/limit)
    file_write Create or overwrite files
    file_edit Precise string replacement in files (preferred over full rewrites)
    shell Execute any shell command (tests, builds, git, etc.)
    grep_search Search file contents with regex (uses ripgrep when available)
    find_files Find files by glob pattern
    list_directory List directory contents with types and sizes
    web_search Search the web via DuckDuckGo
    web_fetch Fetch and extract text from web pages (docs, MDN, w3schools)
    memory_read Read from persistent memory store
    memory_write Store patterns and solutions for future tasks
    aiwg_setup Deploy AIWG SDLC framework in the project
    aiwg_health Analyze project SDLC health and readiness
    aiwg_workflow Execute AIWG commands and workflows
    batch_edit Multiple precise edits across files in one call
    codebase_map High-level project structure overview
    diagnostic Run lint/typecheck/test/build validation pipeline
    git_info Structured git status, log, diff, and branch info
    background_run Run a shell command in the background (returns task ID)
    task_status Check status of background tasks
    task_output Read output from a background task
    task_stop Stop a running background task
    sub_agent Delegate a sub-task to an independent agent
    image_read Read image files (base64 + dimensions + OCR text)
    screenshot Capture screen or window to file
    ocr Extract text from images (supports region cropping/zoom)

    Parallel Execution & Sub-Agents

    The agent can run multiple operations in parallel:

    You: oa "run the test suite and lint checks in parallel, then fix any issues"
    
    Agent: [Turn 1] background_run(command="npm test")        → task-1
           [Turn 2] background_run(command="npm run lint")     → task-2
           [Turn 3] task_status()                              → task-1: running, task-2: completed
           [Turn 4] task_output(task_id="task-2")              → 3 lint errors
           [Turn 5] file_edit(...)                             → fix lint errors
           [Turn 6] task_output(task_id="task-1")              → all tests pass
           [Turn 7] task_complete(summary="Fixed lint, tests pass")

    Sub-agents can be delegated independent tasks:

    Agent: [Turn 1] sub_agent(task="refactor auth module", background=true)  → task-3
           [Turn 2] sub_agent(task="add pagination to users API")            → completed
           [Turn 3] task_output(task_id="task-3")                            → auth refactored

    Image & Visual Context

    Drag-and-drop image files onto the terminal to provide visual context:

    # Drop an image file path while agent is working → injected as context
    # Drop an image file path at idle prompt → agent describes and analyzes it

    The agent can also take screenshots and extract text via OCR:

    Agent: [Turn 1] screenshot(region="active")     → captured window
           [Turn 2] ocr(path="/tmp/screenshot.png")  → extracted text
           [Turn 3] image_read(path="mockup.png")    → base64 + OCR text

    Mid-Task Steering

    While the agent is working (shown by the + prompt), you can type to add context:

    > fix the auth bug
      ⎿  📄 Read: src/auth.ts
    + also check the session handling        ← typed while agent works
      ↪ Context added: also check the session handling
      ⎿  🔍 Search: session
      ⎿  ✏️  Edit: src/auth.ts

    Press Ctrl+C to abort the current task. Slash commands (/model, /help) work during active tasks.

    Self-Learning

    When the agent encounters an unfamiliar API or language feature, it automatically:

    1. Searches the web for documentation
    2. Fetches the relevant page (w3schools.com, MDN, official docs)
    3. Stores the learned pattern in persistent memory
    4. Applies the knowledge to the current task

    Error Recovery

    The agent follows an iterative fix loop:

    1. Run validation (tests/build/lint)
    2. Read the full error output
    3. Identify the exact file, line, and failure
    4. Fix with file_edit
    5. Re-run validation
    6. Repeat until passing
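    The loop above can be sketched as a plain shell script. This is illustrative only; the real agent drives each step through tool calls such as `shell` and `file_edit`, and the `validate` function and `fixed.flag` file here are stand-ins invented for this sketch:

```shell
attempts=0
validate() {           # stand-in for "run validation" (e.g. tests/build/lint)
  [ -f fixed.flag ]    # passes once a "fix" has been applied
}
until validate; do
  attempts=$((attempts + 1))
  if [ "$attempts" -ge 5 ]; then   # bounded: give up after 5 turns
    break
  fi
  touch fixed.flag                 # stand-in for file_edit applying a fix
done
echo "validation passed after $attempts fix attempt(s)"
```

    The bound on `attempts` mirrors the agent's turn limit: the loop always terminates, either by passing validation or by hitting the cap.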

    Dynamic System Prompt

    The agent's system prompt is dynamically enriched at task start with:

    Source Description
    Project context files .open-agents.md, AGENTS.md, or .open-agents/context.md — loaded from project root and parent directories
    Git state Current branch, working tree status, recent commits
    Persistent memory Learned patterns from previous sessions (project-local and global)
    Environment Working directory, Node version, OS, date

    Create a .open-agents.md file in your project root to give the agent project-specific instructions:

    # Project Context
    
    - This is a TypeScript monorepo using pnpm workspaces
    - Run tests with: pnpm -r test
    - Build with: pnpm -r build
    - Always use file_edit over file_write for existing files
    - Database migrations are in src/db/migrations/

    Context files are merged from parent → child directories, so you can set global defaults at ~/.open-agents.md and override per-project.
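    For example, the parent → child merge order lets a project file extend global defaults. The sketch below uses a temporary directory in place of `~` and the project root; the file names are the real ones, the contents are made up:

```shell
tmp=$(mktemp -d)          # stands in for your home directory
mkdir -p "$tmp/project"   # stands in for a project root

# "Global" defaults (would live at ~/.open-agents.md)
printf -- '- Prefer small, focused commits\n' > "$tmp/.open-agents.md"

# Project-specific instructions (would live at <repo>/.open-agents.md)
printf -- '- Run tests with: pnpm -r test\n' > "$tmp/project/.open-agents.md"

# Parent is merged before child, so the agent sees both sets of instructions
cat "$tmp/.open-agents.md" "$tmp/project/.open-agents.md"
```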

    .oa/ Project Directory

    Each project gets a .oa/ directory (similar to .claude/ for Claude Code) that persists artifacts across sessions:

    .oa/
    ├── config.json              # Per-project configuration overrides
    ├── memory/                  # Persistent memory store
    │   └── {topic}.json         # Topic-based key-value memories
    ├── index/                   # Cached codebase index
    │   ├── repo-profile.json    # Repository metadata
    │   ├── file-summaries.json  # Per-file purpose, exports, domain, risk
    │   ├── symbols.json         # Symbol table cache
    │   ├── graph.json           # Import/dependency graph
    │   └── meta.json            # Index metadata (timestamp, hash)
    ├── context/                 # Auto-generated project context
    │   └── project-map.md       # Generated overview for system prompt
    └── history/                 # Session history
        └── {session-id}.json    # Per-session task log

    The agent auto-discovers AGENTS.md, OA.md, CLAUDE.md, and README.md from the project root and parent directories, injecting them into the system prompt for project-specific awareness.

    Smart Context Compaction

    When conversations exceed the context window, the agent compacts older messages while preserving:

    • Files that were read and modified
    • Shell commands that were run and their outcomes
    • Errors that were encountered
    • Key decisions that were made

    This structured summary prevents the agent from repeating work or losing track of what's been done.

    Commands

    Command Description
    oa "task" Run a coding task (short alias)
    open-agents "task" Run a coding task
    open-agents run "task" --repo /path Run against a specific repo
    open-agents index /path Index a repository
    open-agents status Show system status
    open-agents config Show/set configuration
    open-agents serve Start/verify backend server
    open-agents eval Run evaluation suite

    Flags

    -m, --model <name>         Model name (default: qwen3.5:122b)
    -b, --backend-url <url>    Backend URL (default: http://localhost:11434)
        --backend <type>       Backend type: ollama (default), vllm, fake
    -r, --repo <path>          Repository root (default: cwd)
        --dry-run              Show what would happen without writing files
        --offline              Skip backend health check
    -v, --verbose              Show model responses and debug info
        --timeout-ms <ms>      Per-request timeout (default: 300000)
    -h, --help                 Show help
    -V, --version              Show version

    Voice Feedback (TTS)

    The agent can speak what it's doing using neural TTS voices. Enable it in the interactive REPL:

    /voice              # Toggle voice on/off (default: GLaDOS)
    /voice glados       # Switch to GLaDOS voice
    /voice overwatch    # Switch to Overwatch voice

    On first enable, the agent auto-downloads the ONNX voice model (~50 MB) and installs onnxruntime-node in `~/.open-agents/voice/`. For best quality, install `espeak-ng`:

    # Ubuntu/Debian
    sudo apt install espeak-ng
    
    # macOS
    brew install espeak-ng

    When enabled, the agent speaks brief descriptions of each tool call ("Reading auth.ts", "Running tests", "Editing config.js") through your system speakers.

    Configuration

    Config priority: CLI flags > environment variables > ~/.open-agents/config.json > defaults.

    # Set defaults
    open-agents config set model qwen3.5:122b
    open-agents config set backendUrl http://localhost:11434
    open-agents config set backendType ollama
    
    # Environment variables
    export OPEN_AGENTS_MODEL=qwen3.5:122b
    export OPEN_AGENTS_BACKEND_URL=http://localhost:11434
    export OPEN_AGENTS_BACKEND_TYPE=ollama

    Model Support

    Primary target: Qwen3.5-122B-A10B via Ollama (MoE, runs on 48GB+ VRAM)

    The setup-model.sh script auto-configures the context window based on available RAM:

    RAM Context Window
    300GB+ 128K tokens
    128GB+ 64K tokens
    64GB+ 32K tokens
    < 64GB 16K tokens

    Other Models

    Any model that supports tool calling via Ollama or an OpenAI-compatible API works:

    # Use a different Ollama model
    oa --model qwen2.5-coder:32b "fix the bug"
    
    # Use vLLM backend
    oa --backend vllm --backend-url http://localhost:8000/v1 "add tests"
    
    # Use any OpenAI-compatible API
    oa --backend-url http://10.0.0.5:11434 "refactor auth"

    AIWG Integration

    Open Agents integrates with AIWG (AI Writing Guide) — a cognitive architecture for AI-augmented software development. When AIWG is installed, the agent gains SDLC superpowers:

    # Install AIWG globally
    npm i -g aiwg
    
    # The agent can now use AIWG tools automatically:
    oa "analyze this project's SDLC health and set up proper documentation"
    oa "create requirements and architecture docs for this codebase"

    What AIWG Adds

    Capability Description
    Structured Memory .aiwg/ directory persists project knowledge across sessions
    SDLC Artifacts Requirements, architecture, test strategy, deployment docs
    Health Analysis Score your project's SDLC maturity (testing, CI/CD, docs, etc.)
    85+ Agents Specialized AI personas (Test Engineer, Security Auditor, API Designer)
    Traceability @-mention system links requirements → code → tests

    AIWG Tools

    The 3 AIWG tools are available when aiwg is installed globally:

    • aiwg_setup — Deploy an AIWG framework (sdlc, marketing, forensics, research)
    • aiwg_health — Analyze project SDLC readiness (works even without AIWG installed)
    • aiwg_workflow — Run any AIWG CLI command (runtime-info, list, mcp info)

    If AIWG is not installed, the tools return helpful install instructions. The aiwg_health tool provides native analysis without requiring AIWG.
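    A quick way to check which behavior you will get is a small sketch like this; only the `npm i -g aiwg` install command comes from this README:

```shell
if command -v aiwg >/dev/null 2>&1; then
  aiwg_status=available   # the agent can shell out to the aiwg CLI
else
  aiwg_status=missing     # aiwg_setup/aiwg_workflow will return install hints
fi
echo "aiwg: $aiwg_status (install with: npm i -g aiwg)"
```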

    Architecture

    Agentic Loop

    The core is AgenticRunner — a multi-turn tool-calling loop:

    User task
        ↓
    System prompt + tools → LLM
        ↓
    LLM returns tool_calls → Execute tools → Feed results back → LLM
        ↓  (repeat until task_complete or max turns)
    Result: completed/incomplete, turns, tool calls, duration

    Key design decisions:

    • Tool-first: The model explores via tools rather than pre-stuffed context
    • Iterative: The agent runs tests, sees failures, and fixes them — no need for perfect one-shot output
    • Context compaction: Long conversations are compressed into a structured summary plus recent context
    • Bounded: Maximum turns, timeout, and output limits prevent runaway loops
    • Observable: Every tool call and result is emitted as a real-time event

    Package Structure

    packages/
      orchestrator/   - AgenticRunner, OllamaAgenticBackend, RALPH loop
      execution/      - 11 tools (file, shell, grep, web, memory), validation pipeline
      schemas/        - Zod schemas and TypeScript types
      backend-vllm/   - Ollama + vLLM backend clients (OpenAI-compatible)
      memory/         - SQLite-backed persistent memory stores
      indexer/        - Codebase scanning and symbol extraction
      retrieval/      - Multi-stage retrieval (lexical + semantic + graph)
      prompts/        - Prompt contracts for each agent role
      cli/            - CLI entry point, commands, config, UI
    
    apps/
      api/            - Express API server
      worker/         - Background task processor
    
    eval/             - 17 evaluation tasks with agentic runner
    scripts/          - install.sh, setup-model.sh, bootstrap.sh

    Evaluation

    The framework includes 17 evaluation tasks that test the agent's ability to autonomously resolve coding problems:

    # Run all 17 tasks with the agentic tool-calling loop
    node eval/run-agentic.mjs
    
    # Single task
    node eval/run-agentic.mjs 04-add-test
    
    # Different model
    node eval/run-agentic.mjs --model qwen2.5-coder:32b

    Results (Qwen3.5-122B)

    TASK                 RESULT   TIME       TURNS    TOOLS
    01-fix-typo          PASS     39.1s      4        7
    02-add-function      PASS     24.5s      4        5
    03-fix-bug           PASS     26.9s      4        5
    04-add-test          PASS     198.1s     6        8
    05-refactor          PASS     73.1s      4        5
    06-type-error        PASS     143.2s     5        7
    07-add-endpoint      PASS     40.0s      4        5
    08-multi-file        PASS     75.5s      8        13
    
    Pass rate: 100% (8/8)
    Total: 39 turns, 55 tool calls, ~10 minutes

    Task Descriptions

    ID Task Difficulty
    01 Fix typo in function name Easy
    02 Add isPrime function Easy
    03 Fix off-by-one bug Easy
    04 Write comprehensive tests for untested functions Medium
    05 Extract functions from long method (refactor) Medium
    06 Fix TypeScript type errors Medium
    07 Add REST API endpoint Medium
    08 Add pagination across multiple files Hard
    09 CSS named color lookup (148 colors, web search) Medium
    10 HTTP status code lookup (32+ codes, web search) Medium
    11 MIME type lookup (30+ types, web search) Medium
    12 SDLC health analyzer (AIWG-style scoring) Medium
    13 SDLC artifact generator (requirements, arch, tests) Hard
    14 Batch refactor variable names across files Medium
    15 Codebase overview generator from structure analysis Medium
    16 Diagnostic fix loop (find and fix buggy code) Medium
    17 Git repository analyzer Medium

    Test Suite

    Package          Tests
    ─────────────────────────
    schemas          216
    backend-vllm     162
    execution        136
    indexer            94
    cli                72
    orchestrator       70
    retrieval          66
    memory             58
    prompts            34
    apps/api            1
    apps/worker         2
    ─────────────────────────
    Total             911 passing

    Development

    pnpm install          # Install dependencies
    pnpm -r build         # Build all packages
    pnpm -r test          # Run all 911 tests
    pnpm -r dev           # Watch mode

    License

    MIT