agent-browser
Headless browser automation CLI for AI agents.
Installation
pnpm install
npx playwright install chromium
pnpm build
Quick Start
agent-browser open example.com
agent-browser click "#submit"
agent-browser fill "#email" "test@example.com"
agent-browser get text "h1"
agent-browser screenshot page.png
agent-browser close
Commands
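Each command is a separate CLI invocation, so a flow such as the Quick Start sequence can be wrapped in a small shell function. The sketch below is illustrative only: the `AB` variable and the `quick_start` function are hypothetical names introduced here, and the `&&` chaining assumes that agent-browser exits with a non-zero status when a command fails.

```shell
#!/bin/sh
# AB is a hypothetical override (not an agent-browser feature) that lets the
# flow be dry-run with a stand-in command such as `echo`.
: "${AB:=agent-browser}"

# Run the Quick Start flow against a URL and stop at the first failing step.
# Assumption: agent-browser returns a non-zero exit status on failure.
quick_start() {
  url="$1"
  shot="${2:-page.png}"
  "$AB" open "$url" &&
  "$AB" fill "#email" "test@example.com" &&
  "$AB" click "#submit" &&
  "$AB" get text "h1" &&
  "$AB" screenshot "$shot" &&
  "$AB" close
}
```

Calling `quick_start example.com` runs the six steps in order. Setting `AB=echo` first prints the commands instead of running them.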
Core Commands
agent-browser open <url> # Navigate to URL
agent-browser click <sel> # Click element
agent-browser dblclick <sel> # Double-click element
agent-browser focus <sel> # Focus element
agent-browser type <sel> <text> # Type into element
agent-browser fill <sel> <text> # Clear and fill
agent-browser press <key> # Press key (Enter, Tab, Control+a)
agent-browser keydown <key> # Hold key down
agent-browser keyup <key> # Release key
agent-browser insert <text> # Insert text (no key events)
agent-browser hover <sel> # Hover element
agent-browser select <sel> <val> # Select dropdown option
agent-browser multiselect <sel> <v1> <v2> # Multi-select
agent-browser check <sel> # Check checkbox
agent-browser uncheck <sel> # Uncheck checkbox
agent-browser scroll <dir> [px] # Scroll (up/down/left/right)
agent-browser scrollinto <sel> # Scroll element into view
agent-browser drag <src> <tgt> # Drag and drop
agent-browser upload <sel> <files> # Upload files
agent-browser download [path] # Wait for download
agent-browser screenshot [path] # Take screenshot (--full for full page)
agent-browser pdf <path> # Save as PDF
agent-browser snapshot # Accessibility tree (best for AI)
agent-browser eval <js> # Run JavaScript
agent-browser close # Close browser
Get Info
agent-browser get text <sel> # Get text content
agent-browser get html <sel> # Get innerHTML
agent-browser get value <sel> # Get input value
agent-browser get attr <sel> <attr> # Get attribute
agent-browser get title # Get page title
agent-browser get url # Get current URL
agent-browser get count <sel> # Count matching elements
agent-browser get box <sel> # Get bounding box
Check State
agent-browser is visible <sel> # Check if visible
agent-browser is enabled <sel> # Check if enabled
agent-browser is checked <sel> # Check if checked
Find Elements (Semantic Locators)
agent-browser find role <role> <action> [value] # By ARIA role
agent-browser find text <text> <action> # By text content
agent-browser find label <label> <action> [value] # By label
agent-browser find placeholder <ph> <action> [value] # By placeholder
agent-browser find alt <text> <action> # By alt text
agent-browser find title <text> <action> # By title attr
agent-browser find testid <id> <action> [value] # By data-testid
agent-browser find first <sel> <action> [value] # First match
agent-browser find last <sel> <action> [value] # Last match
agent-browser find nth <n> <sel> <action> [value] # Nth match
Actions: click, fill, check, hover, text
Examples:
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "test@test.com"
agent-browser find first ".item" click
agent-browser find nth 2 "a" text
Wait
agent-browser wait <selector> # Wait for element
agent-browser wait <ms> # Wait for time
agent-browser wait --text "Welcome" # Wait for text
agent-browser wait --url "**/dash" # Wait for URL pattern
agent-browser wait --load networkidle # Wait for load state
agent-browser wait --fn "window.ready === true" # Wait for JS condition
Load states: load, domcontentloaded, networkidle
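The wait variants above are typically placed between an action and the read that depends on it. As a minimal sketch, assuming a page whose submit button leads to a URL matching `**/dash` and shows the text "Welcome" (both taken from the examples above, not from any real site), the sequence might look as follows. `AB` and `submit_and_read` are hypothetical names, and the chaining again assumes a non-zero exit status on failure or timeout.

```shell
#!/bin/sh
# AB is a hypothetical override used only to allow a dry run (e.g. AB=echo).
: "${AB:=agent-browser}"

# Click, then block until the follow-up page is ready before reading from it.
# Assumption: a wait that fails or times out exits with a non-zero status.
submit_and_read() {
  "$AB" click "#submit" &&
  "$AB" wait --url "**/dash" &&
  "$AB" wait --load networkidle &&
  "$AB" wait --text "Welcome" &&
  "$AB" get text "h1"
}
```

Not every flow needs all three waits. Which one is enough depends on how the target page signals readiness.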
Mouse Control
agent-browser mouse move <x> <y> # Move mouse
agent-browser mouse down [button] # Press button (left/right/middle)
agent-browser mouse up [button] # Release button
agent-browser mouse wheel <dy> [dx] # Scroll wheel
Browser Settings
agent-browser set viewport <w> <h> # Set viewport size
agent-browser set device <name> # Emulate device ("iPhone 14")
agent-browser set geo <lat> <lng> # Set geolocation
agent-browser set offline [on|off] # Toggle offline mode
agent-browser set headers <json> # Extra HTTP headers
agent-browser set credentials <u> <p> # HTTP basic auth
agent-browser set media [dark|light|print] # Emulate media
Cookies & Storage
agent-browser cookies # Get all cookies
agent-browser cookies set <json> # Set cookies
agent-browser cookies clear # Clear cookies
agent-browser storage local # Get all localStorage
agent-browser storage local <key> # Get specific key
agent-browser storage local set <k> <v> # Set value
agent-browser storage local clear # Clear all
agent-browser storage session # Same for sessionStorage
Network
agent-browser network route <url> # Intercept requests
agent-browser network route <url> --abort # Block requests
agent-browser network route <url> --body <json> # Mock response
agent-browser network unroute [url] # Remove routes
agent-browser network requests # View tracked requests
agent-browser network requests --filter api # Filter requests
agent-browser response <url> # Get response body (waits for matching request)
Tabs & Windows
agent-browser tab # List tabs
agent-browser tab new # New tab
agent-browser tab <n> # Switch to tab n
agent-browser tab close [n] # Close tab
agent-browser window new # New window
Frames
agent-browser frame <sel> # Switch to iframe
agent-browser frame main # Back to main frame
Dialogs
agent-browser dialog accept [text] # Accept (with optional prompt text)
agent-browser dialog dismiss # Dismiss
Debug
agent-browser trace start # Start recording trace
agent-browser trace stop <path> # Stop and save trace
agent-browser console # View console messages
agent-browser console --clear # Clear console
agent-browser errors # View page errors
agent-browser highlight <sel> # Highlight element
agent-browser state save <path> # Save auth state
agent-browser state load <path> # Load auth state
agent-browser initscript <js> # Run JS on every page load
Navigation
agent-browser back # Go back
agent-browser forward # Go forward
agent-browser reload # Reload page
Sessions
agent-browser session # Show current session
agent-browser session list # List active sessions
Options
| Option | Description |
|---|---|
| --session <name> | Use isolated session (or AGENT_BROWSER_SESSION env) |
| --json | JSON output (for agents) |
| --full, -f | Full page screenshot |
| --name, -n | Locator name filter |
| --exact | Exact text match |
| --headed | Show browser window (not headless) |
| --debug | Debug output |
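Options can be attached to any command. As a hedged sketch, the helper below runs one command in a named session and requests JSON output. `AB` and `in_session` are hypothetical names, and the sketch assumes that `--session` (placed before the command) and `--json` (placed after it) may be combined in a single invocation. The README shows each placement on its own but not together.

```shell
#!/bin/sh
# AB is a hypothetical override used only to allow a dry run (e.g. AB=echo).
: "${AB:=agent-browser}"

# Run a single command inside a named session and ask for JSON output.
# Assumption: --session and --json can be combined in one invocation.
in_session() {
  name="$1"
  shift
  "$AB" --session "$name" "$@" --json
}
```

For example, `in_session agent1 get text "h1"` expands to `agent-browser --session agent1 get text h1 --json`.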
Sessions
Run multiple isolated browser instances:
# Different sessions
agent-browser --session agent1 open site-a.com
agent-browser --session agent2 open site-b.com
# Or via environment
AGENT_BROWSER_SESSION=agent1 agent-browser click "#btn"
# List all
agent-browser session list
Selectors
# CSS
agent-browser click "#id"
agent-browser click ".class"
agent-browser click "div > button"
# Text
agent-browser click "text=Submit"
# XPath
agent-browser click "xpath=//button"
# Semantic (recommended)
agent-browser find role button click --name "Submit"
agent-browser find label "Email" fill "test@test.com"
Agent Mode
Use --json for machine-readable output:
agent-browser snapshot --json
agent-browser get text "h1" --json
agent-browser is visible ".modal" --json
License
Apache-2.0