JSPM

agentic-browser-cli

1.0.0
  • License MIT

AI-powered browser automation CLI — automate the web with natural language using Ollama, Anthropic, OpenAI, Azure, AWS Bedrock, Google Vertex AI, or Groq

Package Exports

  • agentic-browser-cli
  • agentic-browser-cli/src/cli.js

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (agentic-browser-cli) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.
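Based on the two subpaths detected above, a minimal `exports` field the package could declare in its `package.json` might look like this (illustrative only; the maintainer may prefer different subpath names):

```json
{
  "name": "agentic-browser-cli",
  "type": "module",
  "exports": {
    ".": "./src/cli.js",
    "./src/cli.js": "./src/cli.js"
  }
}
```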

Readme

AI Browser CLI

An AI-powered browser automation CLI that accepts natural-language queries and executes web tasks autonomously. It can run fully local via Ollama (no cloud APIs required), or use your choice of cloud provider.


How it works

User Query (CLI)
      │
      ▼
Provider Selection  ←  ollama | anthropic | openai | azure | bedrock | vertexai | groq
      │
      ▼
LangChain ReAct Agent
      │
      ├── LLM Layer (chosen provider)
      │     ├── Ollama          (local, no API key)
      │     ├── Anthropic Claude
      │     ├── OpenAI
      │     ├── Azure OpenAI
      │     ├── AWS Bedrock
      │     ├── Google Vertex AI
      │     └── Groq
      │
      ├── Tools Layer
      │     ├── Primary  : @playwright/mcp  (MCP subprocess)
      │     └── Fallback : Direct Playwright (in-process)
      │
      └── Session Memory       ← optional context carry-over
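The provider-selection step in the diagram can be sketched as a small lookup table plus a credential check. This is an illustrative stand-in, not the package's actual code: the real implementation wires each entry to a LangChain chat-model class, while here each provider maps to a plain config object so the dispatch logic is visible without dependencies.

```javascript
// Hypothetical sketch of provider selection. The `env` lists name the
// credentials each provider needs (per the Prerequisites section below);
// Ollama is the only provider that needs no key.
const PROVIDERS = {
  ollama:    { local: true,  env: [] },
  anthropic: { local: false, env: ['ANTHROPIC_API_KEY'] },
  openai:    { local: false, env: ['OPENAI_API_KEY'] },
  azure:     { local: false, env: ['AZURE_OPENAI_API_KEY', 'AZURE_OPENAI_ENDPOINT'] },
  bedrock:   { local: false, env: ['AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY'] },
  vertexai:  { local: false, env: ['GOOGLE_CLOUD_PROJECT'] },
  groq:      { local: false, env: ['GROQ_API_KEY'] },
};

function selectProvider(name, env = process.env) {
  const spec = PROVIDERS[name];
  if (!spec) throw new Error(`Unknown provider: ${name}`);
  // Fail fast with a setup hint when required credentials are absent.
  const missing = spec.env.filter((v) => !env[v]);
  if (missing.length > 0) {
    throw new Error(`Missing credentials for ${name}: ${missing.join(', ')}`);
  }
  return { name, ...spec };
}
```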

Prerequisites

Requirement   Version   Notes
Node.js       ≥ 18      nodejs.org

At least one LLM provider — choose any:

Provider            Setup
Ollama (local)      ollama serve + ollama pull llama3
Anthropic Claude    ANTHROPIC_API_KEY in .env
OpenAI              OPENAI_API_KEY in .env
Azure OpenAI        AZURE_OPENAI_API_KEY + AZURE_OPENAI_ENDPOINT
AWS Bedrock         AWS credentials in .env or IAM role
Google Vertex AI    GOOGLE_CLOUD_PROJECT + gcloud auth
Groq                GROQ_API_KEY in .env

Installation

# 1. Install npm dependencies
npm install

# 2. Install the Chromium browser binary used by Playwright
npm run install:browsers

# 3. Copy and edit environment variables
cp .env.example .env

Optional: global install

npm link
# then call: ai-browser "…"

Configuration (.env)

Copy .env.example to .env and fill in the values for the provider(s) you want to use.

Common settings

Variable          Default    Description
DEFAULT_PROVIDER  ollama     Provider used when --provider is omitted
HEADLESS          false      Set true to run the browser without a visible window
BROWSER_TIMEOUT   30000      Timeout (ms) for each browser action
MAX_ITERATIONS    50         Maximum agent reasoning steps per query
AGENT_TIMEOUT     300000     Hard timeout (ms) for the full agent run
MEMORY_DIR        .memory    Folder where session data is stored
DEBUG             false      Set true to enable verbose output
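The defaults in the table above can be applied with a small config loader. This is a sketch, not the package's actual implementation; the variable names match the table, but the helper functions and the returned property names are illustrative.

```javascript
// Hypothetical loader applying the documented defaults when a variable
// is unset. Booleans are the strings "true"/"false" in .env files, and
// numeric values arrive as strings, hence the two small parsers.
function loadConfig(env = process.env) {
  const bool = (v, d) => (v === undefined ? d : v === 'true');
  const int = (v, d) => (v === undefined ? d : parseInt(v, 10));
  return {
    provider:       env.DEFAULT_PROVIDER ?? 'ollama',
    headless:       bool(env.HEADLESS, false),
    browserTimeout: int(env.BROWSER_TIMEOUT, 30000),
    maxIterations:  int(env.MAX_ITERATIONS, 50),
    agentTimeout:   int(env.AGENT_TIMEOUT, 300000),
    memoryDir:      env.MEMORY_DIR ?? '.memory',
    debug:          bool(env.DEBUG, false),
  };
}
```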

Ollama (local)

Variable         Default                 Description
OLLAMA_BASE_URL  http://localhost:11434  Ollama server endpoint
DEFAULT_MODEL    llama3                  Default model when --model is omitted

Anthropic Claude

Variable           Description
ANTHROPIC_API_KEY  API key from console.anthropic.com

OpenAI

Variable        Description
OPENAI_API_KEY  API key from platform.openai.com

Azure OpenAI

Variable                  Description
AZURE_OPENAI_API_KEY      Azure resource API key
AZURE_OPENAI_ENDPOINT     https://<resource>.openai.azure.com/
AZURE_OPENAI_DEPLOYMENT   Deployment / model name
AZURE_OPENAI_API_VERSION  API version (default 2024-10-21)

AWS Bedrock

Variable               Description
AWS_ACCESS_KEY_ID      AWS access key (or use an IAM role)
AWS_SECRET_ACCESS_KEY  AWS secret key
AWS_SESSION_TOKEN      Optional session token
AWS_REGION             Region (default us-east-1)

Google Vertex AI

Variable               Description
GOOGLE_CLOUD_PROJECT   GCP project ID
GOOGLE_CLOUD_LOCATION  Region (default us-central1)

Run gcloud auth application-default login before using Vertex AI.

Groq

Variable      Description
GROQ_API_KEY  API key from console.groq.com

Usage

node src/cli.js [options] <query>

Options:
  -P, --provider <provider>    LLM provider to use       (default: ollama)
                               ollama | anthropic | openai | azure | bedrock | vertexai | groq
  -m, --model <model>          Model / deployment name   (skips the picker)
  -H, --headless               Run browser headlessly
  -v, --verbose                Print tool calls and debug info
  -s, --screenshot             Auto-save screenshots
  --no-memory                  Disable session memory for this run
  --max-iterations <number>    Cap agent reasoning steps  (default: 50)
  --timeout <ms>               Agent timeout              (default: 300000)
  -V, --version                Show version
  -h, --help                   Show help

Built-in subcommands

# Check connection status for ALL configured providers
node src/cli.js status

# List models for a specific provider
node src/cli.js models                    # Ollama (default)
node src/cli.js models --provider anthropic
node src/cli.js models --provider groq

Examples

Using Ollama (local)

node src/cli.js --provider ollama "Search best JavaScript frameworks in 2025"
node src/cli.js --provider ollama --model mistral "Find iPhone 16 price on Amazon"

Using Anthropic Claude

node src/cli.js --provider anthropic "Go to news.ycombinator.com and list the top 5 stories"
node src/cli.js --provider anthropic --model claude-3-opus-20240229 "Extract the main headline from bbc.com"

Using OpenAI

node src/cli.js --provider openai "Fill the contact form on example.com with name 'Jane Doe'"
node src/cli.js --provider openai --model gpt-4-turbo --verbose "Go to MDN and summarise the Fetch API page"

Using Azure OpenAI

node src/cli.js --provider azure "Go to github.com/trending and take a screenshot"

Using AWS Bedrock

node src/cli.js --provider bedrock --model anthropic.claude-3-5-sonnet-20241022-v2:0 "Search for Node.js tutorials"

Using Google Vertex AI

node src/cli.js --provider vertexai "Go to google.com/maps and search for coffee near me"

Using Groq

node src/cli.js --provider groq --model llama3-70b-8192 "Summarise the front page of reuters.com"

Interactive mode (no --provider flag)

node src/cli.js
# → shows a provider picker, then a model picker, then a task prompt

Headless + screenshot

node src/cli.js --provider openai --headless --screenshot "Go to github.com/trending"

Project structure

ai-browser-cli/
│
├── src/
│   ├── cli.js                 ← Entry point  (Commander CLI + provider selection)
│   │
│   ├── agent/
│   │   ├── agent.js           ← LangGraph ReAct agent + streaming
│   │   ├── tools.js           ← MCP tools (primary) + direct Playwright (fallback)
│   │   └── prompts.js         ← System prompt + task-planning template
│   │
│   ├── llm/
│   │   ├── providers.js       ← Multi-provider LLM factory (all 7 providers)
│   │   └── ollama.js          ← Ollama-specific helpers (health-check, pull)
│   │
│   ├── browser/
│   │   └── playwright.js      ← BrowserController (chromium singleton)
│   │
│   ├── memory/
│   │   └── memory.js          ← SessionMemory + LongTermMemory
│   │
│   └── utils/
│       ├── logger.js          ← Coloured logger factory
│       └── retry.js           ← withRetry / withTimeout helpers
│
├── .env.example
├── .gitignore
├── package.json
└── README.md
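The `withRetry` / `withTimeout` helpers listed under `src/utils/retry.js` could look roughly like the sketch below. The names come from the project tree above, but the signatures and options here are assumptions, not the package's actual API.

```javascript
// Hypothetical retry helper: call fn up to `retries` times, optionally
// pausing between attempts, and rethrow the last error on failure.
async function withRetry(fn, { retries = 3, delayMs = 0 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries && delayMs > 0) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Hypothetical timeout helper: race the operation against a rejection
// timer, as used for BROWSER_TIMEOUT / AGENT_TIMEOUT style limits.
function withTimeout(promise, ms, label = 'operation') {
  return Promise.race([
    promise,
    new Promise((_, reject) =>
      setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms)
    ),
  ]);
}
```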

Tool system

MCP tools (via @playwright/mcp)

When available, the agent runs @playwright/mcp as a subprocess and loads its tools through the MCP protocol. These include:

browser_navigate · browser_click · browser_fill · browser_snapshot · browser_screenshot · browser_press_key · browser_scroll · browser_wait_for

Direct Playwright tools (fallback)

If the MCP server cannot start, the agent falls back to Playwright running in-process:

Tool              Description
open_url          Navigate to a URL
click_element     Click a CSS-selected element
type_text         Fill an input field
press_key         Send a keyboard key
extract_content   Read page text
get_page_info     Current URL + title
scroll_page       Scroll the viewport
wait_for_element  Wait for an element
wait              Fixed-time pause
take_screenshot   Capture a screenshot
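The primary/fallback behaviour described above amounts to "try the MCP subprocess, and if it fails to start, load the in-process tools instead". A minimal sketch, with hypothetical loader callbacks standing in for the real code that spawns `@playwright/mcp` and builds direct Playwright tools:

```javascript
// Illustrative tool loading with graceful fallback. `loadMcpTools` and
// `loadDirectTools` are placeholders for the package's actual loaders.
async function loadTools(loadMcpTools, loadDirectTools, log = () => {}) {
  try {
    const tools = await loadMcpTools();
    log('Using MCP tools (@playwright/mcp subprocess)');
    return { mode: 'mcp', tools };
  } catch (err) {
    // MCP server failed to start: fall back to in-process Playwright.
    log(`MCP unavailable (${err.message}); falling back to direct Playwright`);
    return { mode: 'direct', tools: await loadDirectTools() };
  }
}
```

With `--verbose`, the log line tells you which mode the agent ended up in.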

Memory system

Session memory (default on)

  • Stores the last 10 query/answer pairs in .memory/session.json
  • Injected as context into the next run's system prompt
  • Disable for a single run: --no-memory
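The "last 10 query/answer pairs" behaviour is a simple bounded history. Sketched here in memory only; the package persists this structure to `.memory/session.json`:

```javascript
// Keep at most the 10 most recent query/answer pairs, dropping the
// oldest entries as new ones arrive.
const MAX_PAIRS = 10;

function addPair(history, query, answer) {
  const next = [...history, { query, answer }];
  return next.slice(-MAX_PAIRS);
}
```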

Long-term memory (optional)

  • Key/value JSON store in .memory/long-term.json
  • Enable via ENABLE_LONG_TERM_MEMORY=true in .env

Troubleshooting

Ollama not connecting

ollama serve                          # start the server
curl http://localhost:11434/api/tags  # verify it responds

Model not installed (Ollama)

ollama pull llama3
node src/cli.js models --provider ollama   # confirm it appears

Cloud provider credentials missing

The CLI prints a setup hint with the exact variables to add. Copy them into your .env file and re-run. You can also verify all providers at once:

node src/cli.js status

Vertex AI authentication error

gcloud auth application-default login

AWS Bedrock access denied

Ensure the IAM policy attached to your credentials includes bedrock:InvokeModel for the target model ARN.

Playwright / browser not found

npm run install:browsers   # installs Chromium
npx playwright install     # installs all browsers

MCP server fails to start

The agent automatically falls back to direct Playwright. Use --verbose to confirm which mode is active.


Security notes

  • Never pass real passwords as CLI arguments (they appear in shell history).
  • Store credentials in .env (excluded from version control via .gitignore).
  • The agent will not retry login more than twice to avoid account lockouts.
  • No browser cookies or credentials are persisted between runs.

License

MIT