open-agents-ai 0.16.0 · MIT

    AI coding agent powered by open-source models (Ollama/vLLM) — interactive TUI with agentic tool-calling loop

    Package Exports

    • open-agents-ai
    • open-agents-ai/dist/index.js

    This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (open-agents-ai) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

    Readme


    freedom of information · freedom of patterns · creating freely · open-weights
    libertad de información · crear libremente · créer librement · liberté d'expression
    Freiheit der Muster · jiyuu ni souzou suru · jayuroun changjak · svoboda tvorchestva
    liberdade de criar · creare liberamente · özgürce yarat · skapa fritt
    vrij creëren · twórz swobodnie · dimiourgia elefthera · khuli soch
    hurriyat al-ibdaa · code is poetry · democratize AI · imagine freely


    Open Agents

    npm i -g open-agents-ai && oa

    AI coding agent powered entirely by open-weight models. No API keys. No cloud. Your code never leaves your machine.

    An autonomous multi-turn tool-calling agent that reads your code, makes changes, runs tests, and fixes failures in an iterative loop until the task is complete. First launch auto-detects your hardware and automatically configures an optimized model with an expanded context window.

    Features

    • 26 autonomous tools — file I/O, shell, grep, web search/fetch, memory, sub-agents, background tasks, image/OCR, git, diagnostics
    • Parallel tool execution — read-only tools run concurrently via Promise.allSettled
    • Sub-agent delegation — spawn independent agents for parallel workstreams
    • Ralph Loop — iterative task execution that keeps retrying until completion criteria are met
    • Dream Mode — creative idle exploration modeled after real sleep architecture (NREM→REM cycles)
    • Live Listen — bidirectional voice communication with real-time Whisper transcription
    • Neural TTS — hear what the agent is doing via GLaDOS or Overwatch ONNX voices
    • Auto-expanding context — detects RAM/VRAM and creates an optimized model variant on first run
    • Mid-task steering — type while the agent works to add context without interrupting
    • Smart compaction — long conversations are compressed while preserving files, commands, errors, and decisions
    • Persistent memory — learned patterns stored in .oa/memory/ across sessions
    • Self-learning — auto-fetches docs from the web when encountering unfamiliar APIs
    • Seamless /update — in-place update and reload without losing context

    How It Works

    You: oa "fix the null check in auth.ts"
    
    Agent: [Turn 1] file_read(src/auth.ts)
           [Turn 2] grep_search(pattern="null", path="src/auth.ts")
           [Turn 3] file_edit(old_string="if (user)", new_string="if (user != null)")
           [Turn 4] shell(command="npm test")
           [Turn 5] task_complete(summary="Fixed null check — all tests pass")

    The agent uses tools autonomously in a loop — reading errors, fixing code, and re-running validation until the task succeeds or the turn limit is reached.

    Ralph Loop — Iteration-First Design

    The Ralph Loop is the core execution philosophy: iteration beats perfection. Instead of trying to get everything right on the first attempt, the agent executes in a retry loop where errors become learning data rather than session-ending failures.

    /ralph "fix all failing tests" --completion "npm test passes with 0 failures"
    /ralph "migrate to TypeScript" --completion "npx tsc --noEmit exits 0" --max-iterations 20
    /ralph "reach 80% coverage" --completion "coverage report shows >80%" --timeout 120

    Each iteration:

    1. Execute — make changes based on the task + all accumulated learnings
    2. Verify — run the completion command (tests, build, lint, coverage)
    3. Learn — if verification fails, extract what went wrong and why
    4. Iterate — retry with the new knowledge until passing or limits reached
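    The cycle above can be sketched as a bounded retry loop. This is an illustrative TypeScript sketch, not the package's actual implementation; the names runRalph, execute, verify, and Attempt are assumptions.

```typescript
// Hypothetical sketch of the execute → verify → learn → iterate cycle.
// All identifiers here are illustrative, not open-agents-ai internals.
interface Attempt { iteration: number; passed: boolean; learning?: string }

async function runRalph(
  execute: (learnings: string[]) => Promise<void>,
  verify: () => Promise<{ passed: boolean; error?: string }>,
  maxIterations = 10                     // safety bound against runaway loops
): Promise<Attempt[]> {
  const history: Attempt[] = [];
  const learnings: string[] = [];
  for (let i = 1; i <= maxIterations; i++) {
    await execute(learnings);            // 1. Execute with accumulated learnings
    const result = await verify();       // 2. Verify via the completion command
    if (result.passed) {
      history.push({ iteration: i, passed: true });
      break;
    }
    const learning = result.error ?? "unknown failure";
    learnings.push(learning);            // 3. Learn: record what went wrong
    history.push({ iteration: i, passed: false, learning });
  }                                      // 4. Iterate until pass or the bound
  return history;
}
```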

    The loop tracks iteration history, generates completion reports saved to .aiwg/ralph/, and supports resume/abort for interrupted sessions. Safety bounds (max iterations, timeout) prevent runaway loops.

    /ralph-status     # Check current/previous loop status
    /ralph-resume     # Resume interrupted loop
    /ralph-abort      # Cancel running loop

    Dream Mode — Creative Idle Exploration

    When you're not actively tasking the agent, Dream Mode lets it creatively explore your codebase and generate improvement proposals autonomously. The system models real human sleep architecture with four stages per cycle:

    Stage    Name                 What Happens
    NREM-1   Light Scan           Quick codebase overview, surface observations
    NREM-2   Pattern Detection    Identify recurring patterns, technical debt, gaps
    NREM-3   Deep Consolidation   Synthesize findings into structured proposals
    REM      Creative Expansion   Novel ideas, cross-domain connections, bold plans

    Each cycle expands through all four stages then contracts (evaluation, pruning of weak ideas). Three modes control how far the agent can go:

    /dream              # Default — read-only exploration, proposals saved to .oa/dreams/
    /dream deep         # Multi-cycle deep exploration with expansion/contraction phases
    /dream lucid        # Full implementation — saves workspace backup, then implements,
                        #   tests, evaluates, and self-plays each proposal with checkpoints
    /dream stop         # Wake up — stop dreaming

    Default and Deep modes are completely safe — the agent can only read your code and write proposals to .oa/dreams/. File writes, edits, and shell commands outside that directory are blocked by sandboxed dream tools.

    Lucid mode unlocks full write access. Before making changes, it saves a workspace checkpoint so you can roll back. Each cycle goes: dream → implement → test → evaluate → checkpoint → next cycle.

    All proposals are indexed in .oa/dreams/PROPOSAL-INDEX.md for easy review.

    Listen Mode — Live Bidirectional Audio

    Listen mode enables real-time voice communication with the agent. Your microphone audio is captured, streamed through Whisper (via transcribe-cli), and the transcription is injected directly into the input line — creating a hands-free coding workflow.

    /listen             # Toggle microphone capture on/off
    /listen auto        # Auto-submit after 3 seconds of silence (hands-free)
    /listen confirm     # Require Enter to submit transcription (default)
    /listen stop        # Stop listening

    Model selection — choose the Whisper model size for your hardware:

    /listen tiny        # Fastest, least accurate (~39MB)
    /listen base        # Good balance (~74MB)
    /listen small       # Better accuracy (~244MB)
    /listen medium      # High accuracy (~769MB)
    /listen large       # Best accuracy, slower (~1.5GB)

    When combined with /voice, you get full bidirectional audio — speak your tasks, hear the agent's progress through TTS, and speak corrections mid-task. The status bar shows a blinking red ● REC indicator with a countdown timer during auto-mode recording.

    Platform support:

    • Linux: arecord (ALSA) or ffmpeg (PulseAudio)
    • macOS: sox (CoreAudio) or ffmpeg (AVFoundation)

    The transcribe-cli dependency auto-installs in the background on first use.

    File transcription: Drag-and-drop audio/video files (.mp3, .wav, .mp4, .mkv, etc.) onto the terminal to transcribe them. Results are saved to .oa/transcripts/.

    Interactive TUI

    Launch without arguments to enter the interactive REPL:

    oa

    The TUI features an animated multilingual phrase carousel, live metrics bar with pastel-colored labels (token in/out, context window usage), rotating tips, syntax-highlighted tool output, and dynamic terminal-width cropping.

    Slash Commands

    Command           Description
    /help             Show all available commands
    /model <name>     Switch to a different Ollama model
    /endpoint <url>   Connect to a remote vLLM or OpenAI-compatible API
    /voice [model]    Toggle TTS voice (GLaDOS, Overwatch)
    /listen [mode]    Toggle live microphone transcription
    /dream [mode]     Start dream mode (default, deep, lucid)
    /stream           Toggle streaming token display
    /bruteforce       Toggle brute-force mode (auto re-engage on turn limit)
    /tools            List available tools
    /skills           List/search available skills
    /update           Check for and install updates (seamless reload)
    /config           Show current configuration
    /clear            Clear the screen
    /exit             Quit

    Mid-Task Steering

    While the agent is working (shown by the + prompt), type to add context:

    > fix the auth bug
      ⎿  Read: src/auth.ts
    + also check the session handling        ← typed while agent works
      ↪ Context added: also check the session handling
      ⎿  Search: session
      ⎿  Edit: src/auth.ts

    Tools (26)

    Tool             Description
    file_read        Read file contents with line numbers (offset/limit)
    file_write       Create or overwrite files
    file_edit        Precise string replacement in files
    shell            Execute any shell command
    grep_search      Search file contents with regex (ripgrep)
    find_files       Find files by glob pattern
    list_directory   List directory contents
    web_search       Search the web via DuckDuckGo
    web_fetch        Fetch and extract text from web pages
    memory_read      Read from persistent memory store
    memory_write     Store patterns for future sessions
    batch_edit       Multiple edits across files in one call
    codebase_map     High-level project structure overview
    diagnostic       Lint/typecheck/test/build validation pipeline
    git_info         Structured git status, log, diff, branch info
    background_run   Run shell command in background
    task_status      Check background task status
    task_output      Read background task output
    task_stop        Stop a background task
    sub_agent        Delegate to an independent agent
    image_read       Read images (base64 + OCR)
    screenshot       Capture screen/window
    ocr              Extract text from images
    aiwg_setup       Deploy AIWG SDLC framework
    aiwg_health      Analyze SDLC health
    aiwg_workflow    Execute AIWG workflows

    Read-only tools execute concurrently when called in the same turn. Mutating tools run sequentially.
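    The read-only/mutating split can be sketched like this. Only the Promise.allSettled mechanism comes from the README; the tool names, types, and the dispatch function itself are illustrative assumptions, not the package's internals.

```typescript
type ToolCall = { name: string; args: Record<string, unknown> };
type ToolResult = { name: string; ok: boolean; output: string };

// Assumed classification; the real package decides this internally.
const READ_ONLY = new Set(["file_read", "grep_search", "find_files", "list_directory", "git_info"]);

async function dispatch(
  calls: ToolCall[],
  run: (c: ToolCall) => Promise<string>
): Promise<ToolResult[]> {
  const readOnly = calls.filter(c => READ_ONLY.has(c.name));
  const mutating = calls.filter(c => !READ_ONLY.has(c.name));

  // Read-only calls fan out concurrently; allSettled keeps one failure
  // from discarding the other results.
  const settled = await Promise.allSettled(readOnly.map(run));
  const results: ToolResult[] = settled.map((s, i) => ({
    name: readOnly[i].name,
    ok: s.status === "fulfilled",
    output: s.status === "fulfilled" ? s.value : String(s.reason),
  }));

  // Mutating calls run strictly one at a time.
  for (const c of mutating) {
    try {
      results.push({ name: c.name, ok: true, output: await run(c) });
    } catch (e) {
      results.push({ name: c.name, ok: false, output: String(e) });
    }
  }
  return results;
}
```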

    Auto-Expanding Context Window

    On startup and /model switch, Open Agents detects your RAM/VRAM and creates an optimized model variant:

    Available Memory   Context Window
    200GB+             128K tokens
    100GB+             64K tokens
    50GB+              32K tokens
    20GB+              16K tokens
    8GB+               8K tokens
    < 8GB              4K tokens
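    As a rough sketch, the tiering maps available memory to a context size. The thresholds are copied from the table; the function name contextTokens is hypothetical.

```typescript
// Hypothetical helper mirroring the memory → context-window table above.
function contextTokens(availableGB: number): number {
  if (availableGB >= 200) return 128 * 1024;
  if (availableGB >= 100) return 64 * 1024;
  if (availableGB >= 50) return 32 * 1024;
  if (availableGB >= 20) return 16 * 1024;
  if (availableGB >= 8) return 8 * 1024;
  return 4 * 1024;
}

// 48 GB falls in the 20GB+ tier, so it gets a 16K-token window.
console.log(contextTokens(48));
```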

    Voice Feedback (TTS)

    /voice              # Toggle on/off (default: GLaDOS)
    /voice glados       # GLaDOS voice
    /voice overwatch    # Overwatch voice

    Auto-downloads the ONNX voice model (~50MB) on first use. Install espeak-ng for best quality (apt install espeak-ng / brew install espeak-ng).

    Configuration

    Config priority: CLI flags > env vars > ~/.open-agents/config.json > defaults.

    open-agents config set model qwen3.5:122b
    open-agents config set backendUrl http://localhost:11434
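    The precedence chain can be sketched as a left-to-right merge where later sources win. The key names and resolveConfig are illustrative, not the package's real schema.

```typescript
// Illustrative config resolution: defaults < config file < env vars < CLI flags.
type Config = { model?: string; backendUrl?: string };

function resolveConfig(
  defaults: Config,
  fileConfig: Config,   // ~/.open-agents/config.json
  envConfig: Config,    // parsed from environment variables
  cliFlags: Config      // highest priority
): Config {
  // Object spread keeps the last-spread value for each key, so the
  // ordering encodes the priority chain directly.
  return { ...defaults, ...fileConfig, ...envConfig, ...cliFlags };
}
```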

    Project Context

    Create AGENTS.md, OA.md, or .open-agents.md in your project root for agent instructions. Context files merge from parent to child directories.
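    Parent-to-child merging might look like the following sketch, which walks from the filesystem root down to the working directory so that files nearer the project appear last. collectContext and the ordering details are assumptions.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical sketch: gather context files from every ancestor directory,
// root first, working directory last.
function collectContext(
  cwd: string,
  names = ["AGENTS.md", "OA.md", ".open-agents.md"]
): string {
  const dirs: string[] = [];
  for (let d = cwd; ; d = path.dirname(d)) {
    dirs.unshift(d);                       // prepend, so root ends up first
    if (d === path.dirname(d)) break;      // reached the filesystem root
  }
  const chunks: string[] = [];
  for (const dir of dirs) {
    for (const name of names) {
      const p = path.join(dir, name);
      if (fs.existsSync(p)) chunks.push(fs.readFileSync(p, "utf8"));
    }
  }
  return chunks.join("\n\n");
}
```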

    .oa/ Project Directory

    .oa/
    ├── config.json        # Project config overrides
    ├── settings.json      # TUI settings
    ├── memory/            # Persistent memory store
    ├── dreams/            # Dream mode proposals & checkpoints
    ├── transcripts/       # Audio/video transcriptions
    ├── index/             # Cached codebase index
    ├── context/           # Auto-generated project context
    └── history/           # Session history

    Model Support

    Primary target: Qwen3.5-122B-A10B via Ollama (MoE, 48GB+ VRAM)

    Any Ollama or OpenAI-compatible API model with tool calling works:

    oa --model qwen2.5-coder:32b "fix the bug"
    oa --backend vllm --backend-url http://localhost:8000/v1 "add tests"
    oa --backend-url http://10.0.0.5:11434 "refactor auth"

    Evaluation Suite

    23 evaluation tasks test the agent's autonomous capabilities across coding, web research, SDLC analysis, and tool creation:

    node eval/run-agentic.mjs                          # Run all 23 tasks
    node eval/run-agentic.mjs 04-add-test              # Single task
    node eval/run-agentic.mjs --model qwen2.5-coder:32b  # Different model

    ID   Task                                  Category
    01   Fix typo in function name             Code Fix
    02   Add isPrime function                  Code Generation
    03   Fix off-by-one bug                    Code Fix
    04   Write comprehensive tests             Test Generation
    05   Extract functions from long method    Refactoring
    06   Fix TypeScript type errors            Type Safety
    07   Add REST API endpoint                 Feature Addition
    08   Add pagination across files           Multi-File Edit
    09   CSS named color lookup (148 colors)   Web Research
    10   HTTP status code lookup (32+ codes)   Web Research
    11   MIME type lookup (30+ types)          Web Research
    12   SDLC health analyzer                  AIWG Analysis
    13   SDLC artifact generator               AIWG Generation
    14   Batch refactor variable names         Multi-File Refactor
    15   Codebase overview from structure      Code Analysis
    16   Diagnostic fix loop                   Error Recovery
    17   Git repository analyzer               Git Integration
    18   Create custom tool from spec          Tool Creation
    19   Tool from usage pattern               Tool Discovery
    20   Tool management operations            Tool Lifecycle
    21   Large file patch                      Precision Editing
    22   Skill discovery                       Skill System
    23   Skill execution                       Skill System

    Benchmark Results (Qwen3.5-122B)

    Pass rate: 100% (8/8 core tasks)
    Total: 39 turns, 55 tool calls, ~10 minutes
    Average: 4.9 turns/task, 6.9 tools/task

    AIWG Integration

    Open Agents integrates with AIWG for AI-augmented software development:

    npm i -g aiwg
    oa "analyze this project's SDLC health and set up documentation"

    Capability          Description
    Structured Memory   .aiwg/ directory persists project knowledge
    SDLC Artifacts      Requirements, architecture, test strategy, deployment docs
    Health Analysis     Score your project's SDLC maturity
    85+ Agents          Specialized AI personas (Test Engineer, Security Auditor, API Designer)
    Traceability        @-mention system links requirements to code to tests

    Architecture

    The core is AgenticRunner — a multi-turn tool-calling loop:

    User task → System prompt + tools → LLM → tool_calls → Execute → Feed results → LLM
                                              (repeat until task_complete or max turns)

    • Tool-first — the model explores via tools, not pre-stuffed context
    • Iterative — tests, sees failures, fixes them
    • Parallel-safe — read-only tools concurrent, mutating tools sequential
    • Observable — every tool call and result emitted as a real-time event
    • Bounded — max turns, timeout, output limits prevent runaway loops
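    The loop in the diagram can be sketched as follows. The message shape, LLM interface, and agenticRun are placeholders (assumptions); only the overall turn structure comes from the diagram above.

```typescript
// Minimal sketch of a bounded multi-turn tool-calling loop.
type Msg = { role: "system" | "user" | "assistant" | "tool"; content: string };
type Turn = { toolCalls: { name: string; args: string }[]; done: boolean };

async function agenticRun(
  task: string,
  llm: (messages: Msg[]) => Promise<Turn>,        // placeholder LLM interface
  execTool: (name: string, args: string) => Promise<string>,
  maxTurns = 25                                    // assumed turn bound
): Promise<Msg[]> {
  const messages: Msg[] = [
    { role: "system", content: "You are a coding agent with tools." },
    { role: "user", content: task },
  ];
  for (let turn = 0; turn < maxTurns; turn++) {
    const { toolCalls, done } = await llm(messages);
    if (done) break;                               // task_complete reached
    for (const call of toolCalls) {
      const output = await execTool(call.name, call.args); // execute the tool
      messages.push({ role: "tool", content: output });    // feed results back
    }
  }
  return messages;
}
```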

    License

    MIT