JSPM – @orunium/browser-mcp-server@2.0.3

Package Exports

@orunium/browser-mcp-server
@orunium/browser-mcp-server/index.js

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (@orunium/browser-mcp-server) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

Browser MCP

Automate your browser with AI.

About

Browser MCP is an MCP server + Chrome extension that allows you to automate your browser using AI applications like VS Code, Claude, Cursor, and Windsurf.

Features

⚡ Fast: Automation happens locally on your machine, resulting in better performance without network latency.
🔒 Private: Since automation happens locally, your browser activity stays on your device and isn't sent to remote servers.
👤 Logged In: Uses your existing browser profile, keeping you logged into all your services.
🥷🏼 Stealth: Avoids basic bot detection and CAPTCHAs by using your real browser fingerprint.

Project Overview

Browser MCP is a server that acts as a bridge between AI applications (like IDEs) and your web browser. It allows the AI to automate browser actions by sending commands to a companion Chrome Extension.

The key goals and features are:

Local Execution: It runs on your machine, making automation fast by avoiding network latency.
Privacy: Your browser data and activity remain on your device.
Persistent Sessions: It uses your actual Chrome profile, so you remain logged into your accounts.
Stealth: By using a real browser, it's less likely to be blocked by bot detection mechanisms.

The project is an adaptation of the Playwright MCP server, but instead of launching new, clean browser instances, it controls your existing browser.

Architecture and Project Structure

The system is composed of two primary components:

The MCP Server (this project): A Node.js application that exposes a set of tools (e.g., navigate, click, type) that an AI can call.
The Chrome Extension (source not included): A browser extension that listens for commands from the MCP server via a WebSocket connection and executes them in the browser.

The server's codebase is well-structured and can be broken down into these core components:

src/index.ts: This is the main entry point for the server. It uses commander to set up the command-line interface, initializes the server with all the available tools, and starts listening for connections.
src/server.ts: This file contains the core logic for the MCP server. The createServerWithTools function creates a Server instance from the @modelcontextprotocol/sdk, registers handlers for different request types (like listing tools or executing a tool), and manages the WebSocket server for extension communication.
src/context.ts: The Context class is a crucial piece of the architecture. It manages the WebSocket (ws) connection to the Chrome extension. Any tool that needs to interact with the browser does so through context.sendSocketMessage(). It also centralizes error handling for connection issues.
src/ws.ts: A simple module responsible for creating and managing the lifecycle of the WebSocket server that the Chrome Extension connects to.
src/tools/: This directory defines the actions the AI can perform.
- tool.ts: Defines the standard interface for all Tool objects.
- common.ts, snapshot.ts, custom.ts: These files categorize and implement the specific tools. For example, common.ts has navigate, while snapshot.ts has tools like click and type that also capture a snapshot of the page's accessibility tree (ARIA snapshot) after the action.
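The relationship between tool definitions, the context, and the server's dispatch logic can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the ToolResult, BrowserContext, and Tool shapes, the callTool helper, and the "browser_navigate" message type are all hypothetical names chosen for the example. The real interfaces live in src/tools/tool.ts and src/context.ts and may differ.

```typescript
// Hypothetical shapes for illustration only; the real definitions may differ.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Stand-in for the Context class: forwards a typed message to the extension.
interface BrowserContext {
  sendSocketMessage(type: string, payload: unknown): Promise<unknown>;
}

interface Tool {
  name: string;
  description: string;
  handle(context: BrowserContext, params: Record<string, unknown>): Promise<ToolResult>;
}

// A tool in the style of common.ts: it only talks to the browser through the context.
const navigate: Tool = {
  name: "navigate",
  description: "Navigate the active tab to a URL",
  async handle(context, params) {
    const url = String(params.url);
    // "browser_navigate" is an assumed message type, modelled on "browser_click".
    await context.sendSocketMessage("browser_navigate", { url });
    return { content: [{ type: "text", text: `Navigated to ${url}` }] };
  },
};

// Look up a tool by name and run it, turning failures into error results.
async function callTool(
  tools: Tool[],
  context: BrowserContext,
  name: string,
  params: Record<string, unknown>,
): Promise<ToolResult> {
  const tool = tools.find((t) => t.name === name);
  if (!tool) {
    return { content: [{ type: "text", text: `Tool "${name}" not found` }], isError: true };
  }
  try {
    return await tool.handle(context, params);
  } catch (error) {
    return { content: [{ type: "text", text: String(error) }], isError: true };
  }
}
```

Keeping every browser interaction behind a single context method is what lets connection errors be handled in one place rather than in each tool.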

Modules and Dependencies

The project's dependencies are defined in broswer-mcp/package.json.

Key Third-Party Dependencies:

@modelcontextprotocol/sdk: The core dependency for creating an MCP-compliant server.
ws: A popular library for creating WebSocket servers in Node.js, used to communicate with the Chrome extension.
zod & zod-to-json-schema: Used to define strict schemas for tool inputs and automatically generate JSON schemas from them. This ensures that the AI provides valid arguments when calling a tool.
commander: A library for building command-line interfaces.
tsup: A bundler used to compile the TypeScript source code into executable JavaScript.
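To illustrate the role that zod and zod-to-json-schema play, the sketch below derives both a JSON Schema and a runtime validator from one schema definition. It is a hand-rolled stand-in written without either library, so none of the names here (ToolSchema, toJsonSchema, validate, clickSchema) are part of their APIs, and the ref field is a hypothetical argument added for the example.

```typescript
// Hand-rolled stand-in for what zod + zod-to-json-schema provide; not their API.
type FieldType = "string" | "number" | "boolean";

interface ToolSchema {
  [field: string]: { type: FieldType; description: string };
}

// Derive the JSON Schema advertised to the AI client from the single definition.
function toJsonSchema(schema: ToolSchema) {
  return {
    type: "object",
    properties: schema,
    required: Object.keys(schema),
    additionalProperties: false,
  };
}

// Validate incoming arguments against the same definition before running a tool.
function validate(schema: ToolSchema, args: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const [field, spec] of Object.entries(schema)) {
    if (typeof args[field] !== spec.type) {
      errors.push(`${field}: expected ${spec.type}, got ${typeof args[field]}`);
    }
  }
  for (const field of Object.keys(args)) {
    if (!(field in schema)) errors.push(`${field}: unexpected argument`);
  }
  return errors;
}

// Hypothetical input schema for a click tool.
const clickSchema: ToolSchema = {
  element: { type: "string", description: "Human-readable element description" },
  ref: { type: "string", description: "Element reference from the page snapshot" },
};
```

The point of the design is a single source of truth: the schema the AI sees and the check applied to its arguments cannot drift apart.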

Monorepo Dependencies:

This project is part of a larger monorepo, and it relies on several internal packages. These are noted with workspace:* in the package.json:

@orunium/messaging: Provides WebSocket utilities.
@orunium/config, @orunium/types, @orunium/utils: These are shared packages containing common configuration, type definitions (like for tool schemas and WebSocket messages), and utility functions used across the monorepo.

CRITICAL NOTE: The README.md explicitly states:

This repo contains all the core MCP code for Browser MCP, but currently cannot yet be built on its own due to dependencies on utils and types from the monorepo where it's developed.

This means you will not be able to build or run this project without having the rest of the monorepo and its dependencies available.

Chrome Extension

The source code for the Chrome Extension is not present in this project. However, we can infer how it works based on the server's code:

The server starts a WebSocket server using the ws library (broswer-mcp/src/ws.ts).
The Chrome Extension connects to this WebSocket server. The connection is then managed by the Context class (broswer-mcp/src/context.ts).
When a tool is executed (e.g., click), the server sends a typed JSON message over the WebSocket (e.g., { "type": "browser_click", "payload": { "element": "button" } }).
The extension listens for these messages, performs the requested browser action (like finding the element and clicking it), and sends a response back to the server. The sendSocketMessage function waits for this response.
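The send-and-wait behaviour described above could be implemented along these lines. This is a sketch under assumptions, not the project's code: the ExtensionChannel class, the numeric id used to match responses to requests, the result and error response fields, and the 30-second default timeout are all invented for illustration. SocketLike is a minimal stand-in so the example does not depend on the ws package.

```typescript
// Minimal socket surface so the sketch does not depend on the ws package.
interface SocketLike {
  send(data: string): void;
}

interface Pending {
  resolve: (value: unknown) => void;
  reject: (reason: Error) => void;
  timer: ReturnType<typeof setTimeout>;
}

class ExtensionChannel {
  private nextId = 1;
  private readonly pending = new Map<number, Pending>();
  private readonly socket: SocketLike;
  private readonly timeoutMs: number;

  constructor(socket: SocketLike, timeoutMs = 30_000) {
    this.socket = socket;
    this.timeoutMs = timeoutMs;
  }

  // Send a typed message and resolve once the matching response arrives.
  send(type: string, payload: unknown): Promise<unknown> {
    const id = this.nextId++;
    return new Promise((resolve, reject) => {
      const timer = setTimeout(() => {
        this.pending.delete(id);
        reject(new Error(`No response to "${type}" within ${this.timeoutMs} ms`));
      }, this.timeoutMs);
      this.pending.set(id, { resolve, reject, timer });
      this.socket.send(JSON.stringify({ id, type, payload }));
    });
  }

  // Call this from the socket's "message" handler with the raw response text.
  receive(raw: string): void {
    const response = JSON.parse(raw) as { id: number; result?: unknown; error?: string };
    const entry = this.pending.get(response.id);
    if (!entry) return; // Unknown or already timed-out request.
    clearTimeout(entry.timer);
    this.pending.delete(response.id);
    if (response.error !== undefined) entry.reject(new Error(response.error));
    else entry.resolve(response.result);
  }
}
```

With a real WebSocket, receive would be wired to the socket's message event, and a tool would simply await channel.send("browser_click", { element: "button" }).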

How to Build Locally

MCP Server

The package.json provides the necessary scripts to build the server:

  "scripts": {
    "typecheck": "tsc --noEmit",
    "build": "tsup src/index.ts --format esm && shx chmod +x dist/*.js",
    "prepare": "npm run build",
    "watch": "tsup src/index.ts --format esm --watch ",
    "inspector": "CLIENT_PORT=9001 SERVER_PORT=9002 pnpx @modelcontextprotocol/inspector node dist/index.js"
  },

To build the server, run:

npm run build

This command compiles the TypeScript code from src/ into a single, executable JavaScript file in the dist/ directory.

However, as noted above, the build is expected to fail unless the complete monorepo environment is available, because the workspace:* dependencies cannot be resolved on their own.

Chrome Extension

The source code for the Chrome Extension is not included in this project, so build instructions for it cannot be given here.

Summary & Path Forward

This project is the server-side component of a powerful browser automation system. It's well-structured but is tightly coupled with a larger monorepo.

To effectively improve and refactor this project, the following would be necessary:

Full Monorepo Access: Gaining access to the parent monorepo is essential to resolve the workspace dependencies (@orunium/*), build the project successfully, and understand the full context of the shared code.
Extension Source Code: To understand the full picture and debug end-to-end, you would need the source code for the Chrome Extension.
