JSPM

agentic-browser-cli

1.0.0
  • License MIT

AI-powered browser automation CLI — automate the web with natural language using Ollama, Anthropic, OpenAI, Azure, AWS Bedrock, Google Vertex AI, or Groq

Package Exports

  • agentic-browser-cli
  • agentic-browser-cli/src/cli.js

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (agentic-browser-cli) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.
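Based on the two subpaths detected above, a minimal `exports` field the package could declare in its `package.json` might look like this (illustrative only; the maintainer may prefer different subpath names):

```json
{
  "name": "agentic-browser-cli",
  "type": "module",
  "exports": {
    ".": "./src/cli.js",
    "./src/cli.js": "./src/cli.js"
  }
}
```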

Readme

AI Browser CLI

An AI-powered browser automation CLI that accepts natural-language queries and executes web tasks autonomously. It can run fully local via Ollama (no cloud APIs required), or use your choice of cloud provider.


How it works

User Query (CLI)
      │
      ▼
Provider Selection  ←  ollama | anthropic | openai | azure | bedrock | vertexai | groq
      │
      ▼
LangChain ReAct Agent
      │
      ├── LLM Layer (chosen provider)
      │     ├── Ollama          (local, no API key)
      │     ├── Anthropic Claude
      │     ├── OpenAI
      │     ├── Azure OpenAI
      │     ├── AWS Bedrock
      │     ├── Google Vertex AI
      │     └── Groq
      │
      ├── Tools Layer
      │     ├── Primary  : @playwright/mcp  (MCP subprocess)
      │     └── Fallback : Direct Playwright (in-process)
      │
      └── Session Memory       ← optional context carry-over
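The provider-selection step in the diagram can be sketched as a small lookup table plus a credential check. This is an illustrative stand-in, not the package's actual code: the real implementation wires each entry to a LangChain chat-model class, while here each provider maps to a plain config object so the dispatch logic is visible without dependencies.

```javascript
// Hypothetical sketch of provider selection. The `env` lists name the
// credentials each provider needs (per the Prerequisites section below);
// Ollama is the only provider that needs no key.
const PROVIDERS = {
  ollama:    { local: true,  env: [] },
  anthropic: { local: false, env: ['ANTHROPIC_API_KEY'] },
  openai:    { local: false, env: ['OPENAI_API_KEY'] },
  azure:     { local: false, env: ['AZURE_OPENAI_API_KEY', 'AZURE_OPENAI_ENDPOINT'] },
  bedrock:   { local: false, env: ['AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY'] },
  vertexai:  { local: false, env: ['GOOGLE_CLOUD_PROJECT'] },
  groq:      { local: false, env: ['GROQ_API_KEY'] },
};

function selectProvider(name, env = process.env) {
  const spec = PROVIDERS[name];
  if (!spec) throw new Error(`Unknown provider: ${name}`);
  // Fail fast with a setup hint when required credentials are absent.
  const missing = spec.env.filter((v) => !env[v]);
  if (missing.length > 0) {
    throw new Error(`Missing credentials for ${name}: ${missing.join(', ')}`);
  }
  return { name, ...spec };
}
```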

Prerequisites

Requirement   Version   Notes
Node.js       ≥ 18      nodejs.org

At least one LLM provider — choose any:

Provider            Setup
Ollama (local)      ollama serve + ollama pull llama3
Anthropic Claude    ANTHROPIC_API_KEY in .env
OpenAI              OPENAI_API_KEY in .env
Azure OpenAI        AZURE_OPENAI_API_KEY + AZURE_OPENAI_ENDPOINT
AWS Bedrock         AWS credentials in .env or IAM role
Google Vertex AI    GOOGLE_CLOUD_PROJECT + gcloud auth
Groq                GROQ_API_KEY in .env

Installation

# 1. Install npm dependencies
npm install

# 2. Install the Chromium browser binary used by Playwright
npm run install:browsers

# 3. Copy and edit environment variables
cp .env.example .env

Optional: global install

npm link
# then call: ai-browser "…"

Configuration (.env)

Copy .env.example to .env and fill in the values for the provider(s) you want to use.

Common settings

Variable          Default    Description
DEFAULT_PROVIDER  ollama     Provider used when --provider is omitted
HEADLESS          false      Set true to run the browser without a visible window
BROWSER_TIMEOUT   30000      Timeout (ms) for each browser action
MAX_ITERATIONS    50         Maximum agent reasoning steps per query
AGENT_TIMEOUT     300000     Hard timeout (ms) for the full agent run
MEMORY_DIR        .memory    Folder where session data is stored
DEBUG             false      Set true to enable verbose output
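The defaults in the table above can be applied with a small config loader. This is a sketch, not the package's actual implementation; the variable names match the table, but the helper functions and the returned property names are illustrative.

```javascript
// Hypothetical loader applying the documented defaults when a variable
// is unset. Booleans are the strings "true"/"false" in .env files, and
// numeric values arrive as strings, hence the two small parsers.
function loadConfig(env = process.env) {
  const bool = (v, d) => (v === undefined ? d : v === 'true');
  const int = (v, d) => (v === undefined ? d : parseInt(v, 10));
  return {
    provider:       env.DEFAULT_PROVIDER ?? 'ollama',
    headless:       bool(env.HEADLESS, false),
    browserTimeout: int(env.BROWSER_TIMEOUT, 30000),
    maxIterations:  int(env.MAX_ITERATIONS, 50),
    agentTimeout:   int(env.AGENT_TIMEOUT, 300000),
    memoryDir:      env.MEMORY_DIR ?? '.memory',
    debug:          bool(env.DEBUG, false),
  };
}
```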

Ollama (local)

Variable         Default                 Description
OLLAMA_BASE_URL  http://localhost:11434  Ollama server endpoint
DEFAULT_MODEL    llama3                  Default model when --model is omitted

Anthropic Claude

Variable           Description
ANTHROPIC_API_KEY  API key from console.anthropic.com

OpenAI

Variable        Description
OPENAI_API_KEY  API key from platform.openai.com

Azure OpenAI

Variable                  Description
AZURE_OPENAI_API_KEY      Azure resource API key
AZURE_OPENAI_ENDPOINT     https://<resource>.openai.azure.com/
AZURE_OPENAI_DEPLOYMENT   Deployment / model name
AZURE_OPENAI_API_VERSION  API version (default 2024-10-21)

AWS Bedrock

Variable               Description
AWS_ACCESS_KEY_ID      AWS access key (or use an IAM role)
AWS_SECRET_ACCESS_KEY  AWS secret key
AWS_SESSION_TOKEN      Optional session token
AWS_REGION             Region (default us-east-1)

Google Vertex AI

Variable               Description
GOOGLE_CLOUD_PROJECT   GCP project ID
GOOGLE_CLOUD_LOCATION  Region (default us-central1)

Run gcloud auth application-default login before using Vertex AI.

Groq

Variable      Description
GROQ_API_KEY  API key from console.groq.com

Usage

node src/cli.js [options] <query>

Options:
  -P, --provider <provider>    LLM provider to use       (default: ollama)
                               ollama | anthropic | openai | azure | bedrock | vertexai | groq
  -m, --model <model>          Model / deployment name   (skips the picker)
  -H, --headless               Run browser headlessly
  -v, --verbose                Print tool calls and debug info
  -s, --screenshot             Auto-save screenshots
  --no-memory                  Disable session memory for this run
  --max-iterations <number>    Cap agent reasoning steps  (default: 50)
  --timeout <ms>               Agent timeout              (default: 300000)
  -V, --version                Show version
  -h, --help                   Show help

Built-in subcommands

# Check connection status for ALL configured providers
node src/cli.js status

# List models for a specific provider
node src/cli.js models                    # Ollama (default)
node src/cli.js models --provider anthropic
node src/cli.js models --provider groq

Examples

Using Ollama (local)

node src/cli.js --provider ollama "Search best JavaScript frameworks in 2025"
node src/cli.js --provider ollama --model mistral "Find iPhone 16 price on Amazon"

Using Anthropic Claude

node src/cli.js --provider anthropic "Go to news.ycombinator.com and list the top 5 stories"
node src/cli.js --provider anthropic --model claude-3-opus-20240229 "Extract the main headline from bbc.com"

Using OpenAI

node src/cli.js --provider openai "Fill the contact form on example.com with name 'Jane Doe'"
node src/cli.js --provider openai --model gpt-4-turbo --verbose "Go to MDN and summarise the Fetch API page"

Using Azure OpenAI

node src/cli.js --provider azure "Go to github.com/trending and take a screenshot"

Using AWS Bedrock

node src/cli.js --provider bedrock --model anthropic.claude-3-5-sonnet-20241022-v2:0 "Search for Node.js tutorials"

Using Google Vertex AI

node src/cli.js --provider vertexai "Go to google.com/maps and search for coffee near me"

Using Groq

node src/cli.js --provider groq --model llama3-70b-8192 "Summarise the front page of reuters.com"

Interactive mode (no --provider flag)

node src/cli.js
# → shows a provider picker, then a model picker, then a task prompt

Headless + screenshot

node src/cli.js --provider openai --headless --screenshot "Go to github.com/trending"

Project structure

ai-browser-cli/
│
├── src/
│   ├── cli.js                 ← Entry point  (Commander CLI + provider selection)
│   │
│   ├── agent/
│   │   ├── agent.js           ← LangGraph ReAct agent + streaming
│   │   ├── tools.js           ← MCP tools (primary) + direct Playwright (fallback)
│   │   └── prompts.js         ← System prompt + task-planning template
│   │
│   ├── llm/
│   │   ├── providers.js       ← Multi-provider LLM factory (all 7 providers)
│   │   └── ollama.js          ← Ollama-specific helpers (health-check, pull)
│   │
│   ├── browser/
│   │   └── playwright.js      ← BrowserController (chromium singleton)
│   │
│   ├── memory/
│   │   └── memory.js          ← SessionMemory + LongTermMemory
│   │
│   └── utils/
│       ├── logger.js          ← Coloured logger factory
│       └── retry.js           ← withRetry / withTimeout helpers
│
├── .env.example
├── .gitignore
├── package.json
└── README.md
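The `withRetry` / `withTimeout` helpers listed under `src/utils/retry.js` could look roughly like the sketch below. The names come from the project tree above, but the signatures and options here are assumptions, not the package's actual API.

```javascript
// Hypothetical retry helper: call fn up to `retries` times, optionally
// pausing between attempts, and rethrow the last error on failure.
async function withRetry(fn, { retries = 3, delayMs = 0 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries && delayMs > 0) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Hypothetical timeout helper: race the operation against a rejection
// timer, as used for BROWSER_TIMEOUT / AGENT_TIMEOUT style limits.
function withTimeout(promise, ms, label = 'operation') {
  return Promise.race([
    promise,
    new Promise((_, reject) =>
      setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms)
    ),
  ]);
}
```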

Tool system

MCP tools (via @playwright/mcp)

When available, the agent runs @playwright/mcp as a subprocess and loads its tools through the MCP protocol. These include:

browser_navigate · browser_click · browser_fill · browser_snapshot · browser_screenshot · browser_press_key · browser_scroll · browser_wait_for

Direct Playwright tools (fallback)

If the MCP server cannot start, the agent falls back to Playwright running in-process:

Tool              Description
open_url          Navigate to a URL
click_element     Click a CSS-selected element
type_text         Fill an input field
press_key         Send a keyboard key
extract_content   Read page text
get_page_info     Current URL + title
scroll_page       Scroll the viewport
wait_for_element  Wait for an element
wait              Fixed-time pause
take_screenshot   Capture a screenshot
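The primary/fallback behaviour described above amounts to "try the MCP subprocess, and if it fails to start, load the in-process tools instead". A minimal sketch, with hypothetical loader callbacks standing in for the real code that spawns `@playwright/mcp` and builds direct Playwright tools:

```javascript
// Illustrative tool loading with graceful fallback. `loadMcpTools` and
// `loadDirectTools` are placeholders for the package's actual loaders.
async function loadTools(loadMcpTools, loadDirectTools, log = () => {}) {
  try {
    const tools = await loadMcpTools();
    log('Using MCP tools (@playwright/mcp subprocess)');
    return { mode: 'mcp', tools };
  } catch (err) {
    // MCP server failed to start: fall back to in-process Playwright.
    log(`MCP unavailable (${err.message}); falling back to direct Playwright`);
    return { mode: 'direct', tools: await loadDirectTools() };
  }
}
```

With `--verbose`, the log line tells you which mode the agent ended up in.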

Memory system

Session memory (default on)

  • Stores the last 10 query/answer pairs in .memory/session.json
  • Injected as context into the next run's system prompt
  • Disable for a single run: --no-memory
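The "last 10 query/answer pairs" behaviour is a simple bounded history. Sketched here in memory only; the package persists this structure to `.memory/session.json`:

```javascript
// Keep at most the 10 most recent query/answer pairs, dropping the
// oldest entries as new ones arrive.
const MAX_PAIRS = 10;

function addPair(history, query, answer) {
  const next = [...history, { query, answer }];
  return next.slice(-MAX_PAIRS);
}
```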

Long-term memory (optional)

  • Key/value JSON store in .memory/long-term.json
  • Enable via ENABLE_LONG_TERM_MEMORY=true in .env

Troubleshooting

Ollama not connecting

ollama serve                          # start the server
curl http://localhost:11434/api/tags  # verify it responds

Model not installed (Ollama)

ollama pull llama3
node src/cli.js models --provider ollama   # confirm it appears

Cloud provider credentials missing

The CLI prints a setup hint with the exact variables to add. Copy them into your .env file and re-run. You can also verify all providers at once:

node src/cli.js status

Vertex AI authentication error

gcloud auth application-default login

AWS Bedrock access denied

Ensure the IAM policy attached to your credentials includes bedrock:InvokeModel for the target model ARN.

Playwright / browser not found

npm run install:browsers   # installs Chromium
npx playwright install     # installs all browsers

MCP server fails to start

The agent automatically falls back to direct Playwright. Use --verbose to confirm which mode is active.


Security notes

  • Never pass real passwords as CLI arguments (they appear in shell history).
  • Store credentials in .env (excluded from version control via .gitignore).
  • The agent will not retry login more than twice to avoid account lockouts.
  • No browser cookies or credentials are persisted between runs.

License

MIT