JSPM

  • Downloads 11
  • License MIT

Maproom MCP server with choice of embedding providers - one setup command, then zero config

Package Exports

  • @crewchief/maproom-mcp
  • @crewchief/maproom-mcp/dist/index.js

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (@crewchief/maproom-mcp) requesting support for the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

@crewchief/maproom-mcp

Semantic code search powered by PostgreSQL, pgvector, and your choice of embedding provider.

Fast semantic search. One setup command. One line config.

Features

  • Choice of Providers - OpenAI (recommended), Google Vertex AI, or local Ollama
  • 🚀 Fast Hybrid Search - Vector similarity + full-text search with PostgreSQL
  • 🎯 Semantic Ranking ✨ NEW - Implementations rank first, not tests or docs
  • 🔄 Auto-Sync - Watch mode keeps your index up-to-date automatically
  • 🌿 Automatic Branch Detection ✨ NEW - Auto-index branches on switch (no manual scan needed)
  • 📦 Fully Containerized - Everything runs in Docker, isolated and clean
  • 🌳 Multi-Language - Tree-sitter parsing for TypeScript, JavaScript, Rust, and more
  • 🔒 Privacy Options - Use local Ollama for 100% private embeddings (no API keys)

Semantic Ranking

Maproom now uses semantic entry point ranking to prioritize code implementations over tests and documentation in search results.

The Problem: Traditional full-text search ranks results by keyword frequency. When you search for "authenticate", documentation mentioning the word 20+ times ranks higher than the actual authenticate() function.

The Solution: Semantic ranking applies kind multipliers to boost implementations (2.5×) and demote tests (0.6×) and docs (0.3-0.6×). Exact symbol matches get an additional 3.0× bonus.
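The multiplier scheme above can be sketched as a small scoring function. This is an illustration only: the multiplier values (2.5×, 0.6×, 3.0×) come from this README, but the function shape, the `ChunkKind` type, and the 0.4× doc value (the README states a 0.3-0.6× range) are assumptions, not Maproom's actual code.

```typescript
// Hypothetical sketch of kind-based score boosting.
type ChunkKind = "implementation" | "test" | "doc";

const KIND_MULTIPLIER: Record<ChunkKind, number> = {
  implementation: 2.5, // boost real code
  test: 0.6,           // demote tests
  doc: 0.4,            // demote docs (README gives a 0.3-0.6x range)
};

function rankScore(
  baseScore: number,
  kind: ChunkKind,
  exactSymbolMatch: boolean
): number {
  let score = baseScore * KIND_MULTIPLIER[kind];
  if (exactSymbolMatch) score *= 3.0; // exact symbol-name match bonus
  return score;
}
```

Under this scheme, an `authenticate()` implementation with a modest base score outranks a doc chunk that mentions the word far more often.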

Example

Query: authenticate

Before:

1. Documentation: "Authentication Guide" ← Not helpful
2. Documentation: "User Authentication"
...
8. Function: authenticate() ← What you wanted!

After:

1. Function: authenticate() ← Found immediately! ✨
2. Function: authenticate_user()
3. Class: Authenticator

Performance

  • 17% faster on average (p95 latency: 48ms → 40ms)
  • 55% of queries improved by >10%
  • All queries <100ms p95 latency

Debug Mode

Enable debug mode to see how scores are calculated:

const results = await search({
  query: 'authenticate',
  debug: true  // Shows score breakdown
})

Learn more: See docs/search-ranking.md for complete documentation.

Quick Start

1. Run Setup (First Time Only)

Recommended: OpenAI (fast, low cost)

export OPENAI_API_KEY=sk-...
npx @crewchief/maproom-mcp setup --provider=openai

Alternative: Google Vertex AI (fast, low cost)

export GOOGLE_PROJECT_ID=my-project
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json
npx @crewchief/maproom-mcp setup --provider=google

Local: Ollama (slower, no API key needed)

npx @crewchief/maproom-mcp setup --provider=ollama

This will (2-5 minutes on first run):

  • Download Docker images
  • Download embedding model (Ollama only)
  • Initialize PostgreSQL with pgvector
  • Validate everything works

Devcontainer Support

The maproom-mcp setup command automatically detects Docker-in-Docker environments (devcontainers) and configures the correct workspace path for volume mounting.

How it works:

  1. Detects if running inside a Docker container
  2. Discovers the actual host path where /workspace is mounted
  3. Automatically sets WORKSPACE_HOST_PATH before starting containers
  4. No manual configuration required
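The detection steps above can be sketched as a pure function over container mount data (as reported by `docker inspect`). This is a hypothetical illustration; Maproom's actual detection code, function names, and data shapes are not shown in this README.

```typescript
// Hypothetical sketch of devcontainer workspace-path resolution.
// Mount entries mimic the Mounts array from `docker inspect`.
interface Mount {
  Source: string;      // path on the Docker host
  Destination: string; // path inside the container
}

function resolveWorkspaceHostPath(
  insideContainer: boolean,
  mounts: Mount[],
  containerWorkspace = "/workspace"
): string | undefined {
  // A host path is only needed for Docker-in-Docker setups.
  if (!insideContainer) return undefined;
  const match = mounts.find((m) => m.Destination === containerWorkspace);
  return match?.Source; // value to export as WORKSPACE_HOST_PATH
}
```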

Supported environments:

  • VS Code devcontainers
  • GitHub Codespaces
  • Cursor devcontainers
  • Local Docker Desktop

Manual override (if needed):

export WORKSPACE_HOST_PATH=/path/to/workspace
npx @crewchief/maproom-mcp setup --provider=openai

Troubleshooting:

  • If detection fails, manually set WORKSPACE_HOST_PATH
  • Verify Docker socket access: docker ps
  • Check container mounts: docker inspect $(hostname)

2. Index Your Codebase

Start the branch watcher to automatically index as you switch branches:

# Set database URL
export MAPROOM_DATABASE_URL="postgresql://maproom:maproom@localhost:5432/maproom"

# Start watcher (Terminal 1)
maproom branch-watch --repo /path/to/your/repo

# Work normally (Terminal 2) - branches auto-index
git checkout feature-auth  # Automatically indexed in <1 minute

The watcher runs continuously and indexes branches automatically when you switch. For more details, see the Automatic Indexing Guide.

Manual Indexing

Alternatively, manually trigger indexing:

With OpenAI:

MAPROOM_EMBEDDING_PROVIDER=openai npx @crewchief/maproom-mcp scan /path/to/your/repo

With Google Vertex AI:

MAPROOM_EMBEDDING_PROVIDER=google npx @crewchief/maproom-mcp scan /path/to/your/repo

With Ollama (local):

MAPROOM_EMBEDDING_PROVIDER=ollama npx @crewchief/maproom-mcp scan /path/to/your/repo

Optional: Auto-sync with watch mode

MAPROOM_EMBEDDING_PROVIDER=openai npx @crewchief/maproom-mcp watch /path/to/your/repo

This keeps your index up-to-date as you edit code. Leave it running in a terminal.
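Watch mode's debouncing can be sketched generically: rapid file events are coalesced so only one re-index runs after a quiet period. This is a minimal illustration of the debounce technique, not Maproom's actual watcher code.

```typescript
// Minimal debounce: repeated calls within `ms` collapse into one
// trailing invocation after the last call.
function debounce<T extends unknown[]>(
  fn: (...args: T) => void,
  ms: number
): (...args: T) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: T) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), ms);
  };
}
```

With a scheme like this, ten quick saves trigger a single re-index once editing pauses, which is the behavior the `--debounce` flag of the watch command controls.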

3. Add to MCP Configuration

Claude Code (.claude/mcp.json in your project):

{
  "mcpServers": {
    "maproom": {
      "command": "docker",
      "args": [
        "exec",
        "-i",
        "maproom-mcp",
        "node",
        "/app/dist/index.js"
      ],
      "env": {
        "MAPROOM_EMBEDDING_PROVIDER": "openai",
        "OPENAI_API_KEY": "${OPENAI_API_KEY}"
      }
    }
  }
}

Cursor (.cursor/mcp.json in your project):

{
  "mcpServers": {
    "maproom": {
      "command": "docker",
      "args": [
        "exec",
        "-i",
        "maproom-mcp",
        "node",
        "/app/dist/index.js"
      ],
      "env": {
        "MAPROOM_EMBEDDING_PROVIDER": "openai",
        "OPENAI_API_KEY": "${OPENAI_API_KEY}"
      }
    }
  }
}

For Google Vertex AI, use:

"env": {
  "MAPROOM_EMBEDDING_PROVIDER": "google",
  "GOOGLE_PROJECT_ID": "${GOOGLE_PROJECT_ID}",
  "GOOGLE_APPLICATION_CREDENTIALS": "${GOOGLE_APPLICATION_CREDENTIALS}"
}

For Ollama (local), use:

"env": {
  "MAPROOM_EMBEDDING_PROVIDER": "ollama"
}

4. Restart Your MCP Client

Restart Claude Code or Cursor to connect to Maproom.

That's it! Use Maproom tools for semantic code search.


Database Setup

Maproom uses a dual-database architecture with separate PostgreSQL instances for development and testing:

  • Development Database (port 5433) - For manual work, CLI commands, and MCP operations
  • Test Database (port 5434) - Isolated database for automated tests only

Starting Databases

The setup command starts the development database only (automatic via depends_on in docker-compose.yml):

npx @crewchief/maproom-mcp setup --provider=openai

For developers/CI needing test isolation, the test database must be started manually (opt-in):

cd ~/.maproom-mcp  # or packages/maproom-mcp/config in monorepo
docker compose up -d postgres-test

Regular maproom users don't need the test database running.

Running Tests

Tests automatically use the test database (port 5434):

cd packages/maproom-mcp
pnpm test

The test database connection is configured via the TEST_MAPROOM_DATABASE_URL environment variable, which defaults to the test database.
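The fallback described above amounts to a one-line resolution rule. This sketch is illustrative; the exact default connection string is an assumption based on the credentials and port documented elsewhere in this README.

```typescript
// Sketch: tests honor TEST_MAPROOM_DATABASE_URL if set, otherwise
// fall back to the isolated test database on port 5434.
function testDatabaseUrl(env: Record<string, string | undefined>): string {
  return (
    env.TEST_MAPROOM_DATABASE_URL ??
    "postgresql://maproom:maproom@localhost:5434/maproom_test" // assumed default
  );
}
```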

Schema Initialization

Both databases require manual schema initialization after first start:

# Development database
docker exec -i maproom-postgres psql -U maproom -d maproom < ~/.maproom-mcp/init.sql

# Test database
docker exec -i maproom-postgres-test psql -U maproom -d maproom_test < ~/.maproom-mcp/init.sql

Need More Details?

See the comprehensive Test Database Setup Guide for:

  • Troubleshooting connection issues
  • Resetting test database
  • Volume management
  • CI/CD configuration
  • Advanced workflows

System Requirements

  • Docker Desktop 4.x+ (Install Docker)
  • 4-8 GB RAM available for Docker
  • 5 GB disk space (images + model + database)
  • Supported OS: macOS, Linux, Windows with WSL2

Verify Docker is running:

docker --version
docker compose version

Provider Comparison

| Provider | Speed | Cost | Setup | Privacy |
|----------|-------|------|-------|---------|
| OpenAI | ⚡ Fast | 💵 ~$0.02/1M tokens | API key | ☁️ Cloud |
| Google | ⚡ Fast | 💵 Similar to OpenAI | GCP setup | ☁️ Cloud |
| Ollama | 🐌 Slow* | 💰 Free | None | 🔒 100% Local |

*Ollama is 5-10x slower without GPU. Requires 8GB+ RAM.

Recommendation: Use OpenAI or Google for best performance. Use Ollama only if you need 100% local processing and have good hardware.


Commands

setup

Initial configuration. Required before first use.

npx @crewchief/maproom-mcp setup --provider=openai
npx @crewchief/maproom-mcp setup --provider=google
npx @crewchief/maproom-mcp setup --provider=ollama

scan

Index a repository (run after cloning or major changes).

npx @crewchief/maproom-mcp scan /path/to/repo
npx @crewchief/maproom-mcp scan .  # Current directory

watch

Monitor repository for changes and auto-reindex.

npx @crewchief/maproom-mcp watch /path/to/repo
npx @crewchief/maproom-mcp watch --debounce=5000  # Custom debounce (ms)

Leave running in a terminal. Press Ctrl+C to stop.


When to Use Spawning vs Daemon

Maproom uses two execution patterns depending on the operation type:

Use Spawning When:

  • One-time operations (scan, upsert single files)
  • Startup/initialization tasks
  • Operations where spawn overhead (<200ms) is negligible
  • Example: Initial workspace scan at startup

Why: Spawning overhead (~100-200ms) is negligible compared to operation time (seconds to minutes for scan).

Use Daemon When:

  • Repeated operations (search queries)
  • Low-latency requirements (<50ms response time)
  • Connection pooling beneficial (reuse database connections)
  • Example: MCP server search operations (20-50x faster)

Why: Daemon eliminates spawn overhead for every request, achieving <50ms latency for search.

Current Implementation:

  • MCP search tool: Uses daemon (correct - repeated operations)
  • MCP upsert tool: Uses spawning (correct - one-time file indexing)
  • VSCode scan: Uses spawning (correct - one-time workspace indexing)
  • VSCode search (future): Will use MCP daemon via extension API

Performance comparison:

  • Spawning: ~100-200ms overhead per operation
  • Daemon: <1ms overhead per operation (after initial startup)

When NOT to migrate:

  • If operation takes >10 seconds (scan, large upserts), spawn overhead is <2% of total time
  • If operation runs once at startup (workspace scan), daemon provides no benefit
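The decision rule above can be expressed as a small function. The thresholds mirror the text (~200ms spawn overhead, >10 seconds as the point where overhead drops below 2%); the function itself is a sketch, not Maproom's implementation.

```typescript
// Sketch of the spawn-vs-daemon heuristic described above.
type ExecutionMode = "spawn" | "daemon";

function chooseExecutionMode(opts: {
  repeated: boolean;          // called many times (e.g. search)?
  expectedDurationMs: number; // typical time for one operation
}): ExecutionMode {
  const SPAWN_OVERHEAD_MS = 200;
  // Long one-shot work: overhead is <2% of total time, spawning is fine.
  if (!opts.repeated && opts.expectedDurationMs >= 50 * SPAWN_OVERHEAD_MS) {
    return "spawn";
  }
  // Repeated, latency-sensitive calls: overhead would dwarf the work.
  if (opts.repeated) return "daemon";
  return "spawn"; // default: one-shot operations spawn
}
```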

Progress Indicators

The scan command now shows real-time progress during indexing, making it easy to track what's happening without slowing down performance.

Scan Command Progress

When you run scan, you'll see:

🔍 Scanning worktree: main @ abc12345
   Repository: my-repo
   Path: /path/to/repo

Processing: 45/100 files (45%)
✅ Completed in 8.3s

📊 Scan Summary:
   Files processed: 100
   Total chunks: 847
   Total size: 2.14 MB

Features:

  • Real-time progress updates (throttled to every 200-500ms to avoid console flooding)
  • File and chunk counts as indexing progresses
  • Completion timing prominently displayed
  • Works in both TTY (interactive terminal) and non-TTY (CI/logging) environments

Default Directory Behavior: You don't need to specify . for the current directory - it's the default:

# These are equivalent:
npx @crewchief/maproom-mcp scan
npx @crewchief/maproom-mcp scan .
npx @crewchief/maproom-mcp scan /path/to/repo  # Or specify a path

Verbose Mode

For more detailed output during debugging:

npx @crewchief/maproom-mcp scan --verbose

This flag currently produces the same output as the default mode; it is reserved for future detailed diagnostics.

Performance

Progress tracking adds minimal overhead (<5%) through:

  • Atomic counters for thread-safe updates
  • Smart throttling (200ms minimum between updates)
  • Efficient TTY detection
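The throttling idea above (drop updates unless at least 200ms has passed, but always print the final one) can be sketched as follows. The 200ms figure is from the text; the class shape and injected clock are illustrative assumptions.

```typescript
// Sketch of a throttled progress reporter: emits at most one line per
// minIntervalMs, plus a guaranteed final line at 100%.
class ProgressReporter {
  private lastEmit = -Infinity;
  private processed = 0;
  constructor(
    private total: number,
    private emit: (line: string) => void,
    private minIntervalMs = 200,
    private now: () => number = Date.now // injectable clock for testing
  ) {}

  tick(): void {
    this.processed++;
    const t = this.now();
    const done = this.processed === this.total;
    if (done || t - this.lastEmit >= this.minIntervalMs) {
      this.lastEmit = t;
      const pct = Math.round((100 * this.processed) / this.total);
      this.emit(`Processing: ${this.processed}/${this.total} files (${pct}%)`);
    }
  }
}
```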

Troubleshooting

"Connection refused" errors to localhost:11434

Problem: OpenAI or Cohere provider attempting to connect to local Ollama endpoint.

Solution: This was a bug in earlier versions (< 1.2.0). Update to the latest version where provider-aware endpoint validation prevents this issue:

npx @crewchief/maproom-mcp@latest setup --provider=openai

The fix ensures cloud providers only use their official endpoints, preventing cross-provider endpoint pollution.

Custom endpoint not used

Problem: Set EMBEDDING_API_ENDPOINT but provider uses default.

Solution: Ensure the endpoint domain matches your provider:

  • OpenAI: Must contain "openai.com"
  • Cohere: Must contain "cohere"
  • Ollama/Local: Any endpoint accepted
  • Google: Ignores EMBEDDING_API_ENDPOINT (uses region-based endpoint)

Example of correct custom endpoint:

# ✅ Correct: OpenAI custom endpoint (contains "openai.com")
export EMBEDDING_API_ENDPOINT=https://api.openai.com/v1/embeddings

# ❌ Wrong: Ollama endpoint for OpenAI provider (ignored)
export EMBEDDING_API_ENDPOINT=http://localhost:11434
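The provider-aware validation described in this section can be sketched as a resolver: a custom endpoint is honored only when its domain matches the provider, otherwise the provider default wins. The OpenAI and Ollama defaults appear elsewhere in this README; the Cohere default and the function itself are assumptions for illustration.

```typescript
// Sketch of provider-aware endpoint resolution.
type Provider = "openai" | "cohere" | "ollama" | "local" | "google";

function resolveEndpoint(provider: Provider, custom?: string): string | undefined {
  const defaults: Record<Provider, string | undefined> = {
    openai: "https://api.openai.com/v1/embeddings",
    cohere: "https://api.cohere.com/v1/embed", // assumed default
    ollama: "http://localhost:11434/api/embed",
    local: undefined,  // must be set explicitly via EMBEDDING_API_ENDPOINT
    google: undefined, // region-based endpoint; custom value ignored
  };
  if (custom) {
    if (provider === "openai" && custom.includes("openai.com")) return custom;
    if (provider === "cohere" && custom.includes("cohere")) return custom;
    if (provider === "ollama" || provider === "local") return custom;
    // mismatched domain (or google): fall through to the default
  }
  return defaults[provider];
}
```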

Database "column updated_at does not exist" errors

Problem: Missing column in database schema.

Solution: Run database migrations. The maproom binary automatically applies migrations on startup:

npx @crewchief/maproom-mcp setup --provider=<your-provider>

Or manually apply migrations by restarting containers:

docker compose -f ~/.maproom-mcp/docker-compose.yml restart

"Setup required!" error

Run the setup command with your chosen provider:

npx @crewchief/maproom-mcp setup --provider=openai

Containers not starting

  1. Verify Docker is running: docker info
  2. Check for port conflicts:
    lsof -i :5433  # PostgreSQL
    lsof -i :11434 # Ollama (if using)
  3. Re-run setup

Database errors

Reset everything:

docker compose -f ~/.maproom-mcp/docker-compose.yml down -v
npx @crewchief/maproom-mcp setup --provider=<your-provider>

Slow indexing with Ollama

Ollama is CPU-bound without GPU. Consider:

  • Using OpenAI or Google instead (much faster)
  • Adding a GPU to your system
  • Reducing batch size: EMBEDDING_BATCH_SIZE=10 (slower but lower memory)

Enable diagnostic mode

MAPROOM_MCP_DEBUG=true npx @crewchief/maproom-mcp setup

Data Persistence

All data is stored in Docker volumes:

  • maproom-data - PostgreSQL database (indexed code + embeddings)
  • ollama-models - Downloaded Ollama models (if using Ollama)
  • maproom-logs - MCP server logs

Your indexed code persists between sessions. To completely reset:

docker volume rm maproom-data ollama-models maproom-logs

Database Schema

Core Tables

chunks table - Code chunks with worktree tracking

  • chunk_id - UUID primary key
  • blob_sha - Content-addressed SHA (links to embeddings)
  • relpath - File path relative to repository root
  • symbol_name - Function/class/symbol name
  • content - Source code text
  • worktree_ids - JSONB array of worktree IDs containing this chunk
  • start_line, end_line - Line range in file
  • created_at, updated_at - Timestamps

worktree_index_state table - Tracks last indexed git tree SHA per worktree

  • worktree_id - Foreign key to worktrees table
  • last_tree_sha - Git tree SHA from git rev-parse HEAD^{tree}
  • last_indexed - Timestamp of last successful scan
  • chunks_processed - Cumulative count for monitoring
  • embeddings_generated - Cost tracking metric

code_embeddings table - Cached embeddings for content deduplication

  • blob_sha - Primary key (content-addressed)
  • embedding - Vector embedding (pgvector type)
  • model - Embedding model name
  • dimension - Vector dimension

Indexes

GIN index on worktree_ids - Enables efficient worktree filtering

CREATE INDEX idx_chunks_worktree_ids
ON maproom.chunks USING gin(worktree_ids);

Supports JSONB operators:

  • WHERE worktree_ids ? '2' - Find chunks in worktree 2
  • WHERE worktree_ids ?| ARRAY['2', '5'] - Find chunks in any of multiple worktrees

Branch-Aware Features

Content deduplication: Same code across branches shares single embedding (via blob_sha)

Incremental updates: Tree SHA comparison enables instant "no changes" detection (<100ms)

Worktree filtering: Search code from specific branch/worktree
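The two branch-aware checks above reduce to simple comparisons: matching tree SHAs mean the scan can be skipped, and a known blob SHA means the cached embedding is reused. These helper names are assumptions; the comparisons themselves are what the text describes.

```typescript
// Sketch: tree-SHA comparison for instant "no changes" detection.
function needsReindex(currentTreeSha: string, lastTreeSha?: string): boolean {
  return currentTreeSha !== lastTreeSha;
}

// Sketch: content-addressed dedup. The same code on two branches has
// the same blob SHA, so its embedding is generated only once.
function shouldEmbed(blobSha: string, cachedBlobShas: Set<string>): boolean {
  return !cachedBlobShas.has(blobSha);
}
```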

See also: Branch-Aware Indexing Architecture for complete technical details


Database Connection

The Maproom MCP server uses intelligent connection fallback to detect and connect to the PostgreSQL database automatically.

Connection Priority

The system tries these methods in order:

  1. MAPROOM_DATABASE_URL (explicit config) - If set, uses this connection string exactly

    export MAPROOM_DATABASE_URL="postgresql://user:pass@host:port/dbname"
  2. MAPROOM_DB_HOST (component override) - If MAPROOM_DATABASE_URL not set, constructs connection from parts

    export MAPROOM_DB_HOST="custom-host"
    export MAPROOM_DB_PORT="5432"  # optional, defaults to 5432
  3. maproom-postgres (auto-detection) - Attempts to connect to maproom-postgres hostname

    • Works automatically in Docker environments
    • No configuration needed if maproom-postgres container is running (default)
  4. localhost:5433 (fallback) - Development fallback for local testing

    • Useful for local postgres instances on non-standard port
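The four-step precedence above can be sketched as an ordered candidate list that callers try until one connects. The candidate strings follow this README's documented credentials and ports; the function shape is an assumption.

```typescript
// Sketch of the documented connection precedence, as an ordered list.
function connectionCandidates(env: Record<string, string | undefined>): string[] {
  const out: string[] = [];
  // 1. Explicit connection string wins outright.
  if (env.MAPROOM_DATABASE_URL) out.push(env.MAPROOM_DATABASE_URL);
  // 2. Component override: build from host (+ optional port).
  if (env.MAPROOM_DB_HOST) {
    const port = env.MAPROOM_DB_PORT ?? "5432";
    out.push(`postgresql://maproom:maproom@${env.MAPROOM_DB_HOST}:${port}/maproom`);
  }
  // 3. Docker auto-detection, 4. local development fallback.
  out.push("postgresql://maproom:maproom@maproom-postgres:5432/maproom");
  out.push("postgresql://maproom:maproom@localhost:5433/maproom");
  return out; // caller tries each in order until a connection succeeds
}
```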

Troubleshooting Connection Issues

Can't connect to database:

  1. Verify maproom-postgres is running:

    docker ps | grep maproom-postgres
  2. Start if needed:

    docker compose -f ~/.maproom-mcp/docker-compose.yml up -d
  3. Check logs:

    docker logs maproom-postgres

Connection refused:

  • Verify port 5432 (internal) or 5433 (host) is not blocked
  • Check network connectivity:
    docker network inspect maproom-network

Hostname not found:

  • Verify you're in correct Docker network
  • Try setting MAPROOM_DATABASE_URL explicitly:
    export MAPROOM_DATABASE_URL="postgresql://maproom:maproom@127.0.0.1:5433/maproom"

Custom database setup: If you want to use your own PostgreSQL instance instead of the bundled one:

export MAPROOM_DATABASE_URL="postgresql://myuser:mypass@myhost:5432/mydb"
npx @crewchief/maproom-mcp scan /path/to/code

Advanced Configuration

Custom Database

Override the default database connection:

{
  "mcpServers": {
    "maproom": {
      "command": "docker",
      "args": [
        "exec",
        "-i",
        "maproom-mcp",
        "node",
        "/app/dist/index.js"
      ],
      "env": {
        "MAPROOM_DATABASE_URL": "postgresql://user:pass@custom-host:5432/mydb",
        "MAPROOM_EMBEDDING_PROVIDER": "openai",
        "OPENAI_API_KEY": "${OPENAI_API_KEY}"
      }
    }
  }
}

Custom Embedding Models

OpenAI:

"env": {
  "MAPROOM_EMBEDDING_PROVIDER": "openai",
  "MAPROOM_EMBEDDING_MODEL": "text-embedding-3-large",
  "EMBEDDING_DIMENSION": "3072"
}

Google:

"env": {
  "MAPROOM_EMBEDDING_PROVIDER": "google",
  "MAPROOM_EMBEDDING_MODEL": "textembedding-gecko@003"
}

Ollama:

"env": {
  "MAPROOM_EMBEDDING_PROVIDER": "ollama",
  "MAPROOM_EMBEDDING_MODEL": "mxbai-embed-large"
}

Batch Size Tuning

Adjust embedding batch size (default: 50):

"env": {
  "EMBEDDING_BATCH_SIZE": "100"
}

Higher = faster but more memory. Lower = slower but less memory.


New in v2.1.0: The search tool now automatically scopes results to your current git branch, eliminating result duplication and making search results more relevant to your active work.

The search MCP tool performs semantic code search across your indexed codebase using hybrid search (vector similarity + full-text search).

Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| repo | string | Required. Repository name (must match indexed name) |
| query | string | Required. Search query (concept or keywords) |
| worktree | string \| null \| undefined | Optional. Worktree scope: undefined (default) auto-detects the current branch; "branch-name" searches a specific branch; null searches all worktrees |
| limit | number | Optional. Max results (default: 10) |
| mode | string | Optional. Search mode: "vector", "fts", or "hybrid" (default) |
| debug | boolean | Optional. Include ranking details (default: false) |

Search Modes

The search tool supports three modes:

  • FTS (Full-Text Search): Fast keyword-based search using PostgreSQL FTS

    • Best for: Finding specific function names, error messages, exact terms
    • Latency: ~50-100ms
    • Requires: Indexed repository
  • Vector (Semantic Search): AI-powered similarity search using embeddings

    • Best for: Conceptual queries, "code that does X", finding similar patterns
    • Latency: ~100-200ms
    • Requires: Indexed repository + generated embeddings (run generate-embeddings)
  • Hybrid (Combined): Merges FTS and vector results with reciprocal rank fusion

    • Best for: Most searches - combines precision of FTS with recall of vector
    • Latency: ~200-300ms (runs both searches)
    • Requires: Same as vector mode
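Reciprocal rank fusion, the merging technique named for hybrid mode, can be sketched generically: each result earns 1/(k + rank) from every list it appears in, and the sums decide the final order. The constant k=60 is the common RRF default, not a documented Maproom value, and this is an illustration rather than Maproom's implementation.

```typescript
// Generic reciprocal rank fusion over ranked ID lists.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, i) => {
      // rank is 1-based: first place contributes 1/(k+1)
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

A result that appears high in both the FTS and vector lists accumulates two large contributions and rises to the top, which is why hybrid mode combines FTS precision with vector recall.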

Examples

// Fast keyword search
{ mode: "fts", query: "handleSearch", repo: "crewchief" }

// Semantic similarity search
{ mode: "vector", query: "authentication logic", repo: "crewchief" }

// Best of both worlds
{ mode: "hybrid", query: "error handling patterns", repo: "crewchief" }

Mode Selection Guide

  • Use FTS when: Looking for specific identifiers, known terms
  • Use Vector when: Exploring concepts, finding related code
  • Use Hybrid when: Unsure which mode fits, or want comprehensive results

Worktree-Scoped Search (Auto-Detection)

Default behavior (v2.1.0+): When the worktree parameter is omitted, the search tool automatically detects your current git branch and scopes results to that branch only.

Example 1: Auto-detection (recommended)

// In feature-auth branch, searches only feature-auth worktree
const results = await mcp__maproom__search({
  repo: "my-repo",
  query: "authentication flow"
})
// Returns: { hits: [...], worktree: "feature-auth", auto_detected: true, mode: "auto" }

Example 2: Explicit worktree override

// In feature-auth branch, but search main worktree instead
const results = await mcp__maproom__search({
  repo: "my-repo",
  query: "authentication flow",
  worktree: "main"
})
// Returns: { hits: [...], worktree: "main", auto_detected: false, mode: "explicit" }

Example 3: Search all worktrees

// Search across all indexed branches
const results = await mcp__maproom__search({
  repo: "my-repo",
  query: "authentication flow",
  worktree: null
})
// Returns: { hits: [...], worktree: null, mode: "all" }

File Type Filtering

Filter search results by file extension to focus on specific languages or file types.

Single extension:

const result = await mcp__maproom__search({
  repo: 'crewchief',
  query: 'authentication',
  filters: { file_type: 'ts' }
})
// Returns only TypeScript (.ts) files

Multiple extensions:

const result = await mcp__maproom__search({
  repo: 'crewchief',
  query: 'authentication',
  filters: { file_type: 'ts,tsx,js' }
})
// Returns TypeScript or JavaScript files

Common patterns:

// Search only documentation
filters: { file_type: 'md,mdx' }

// Search Rust code
filters: { file_type: 'rs' }

// Search frontend code
filters: { file_type: 'tsx,jsx,vue,svelte' }

// Combine with recency filter
filters: {
  file_type: 'ts,tsx',
  recency_threshold: '7 days'
}
// Returns recent TypeScript files only

Syntax:

  • Comma-separated for multiple extensions
  • Case insensitive: "TS" same as "ts"
  • With or without dot: ".ts" same as "ts"
  • Maximum 20 extensions per filter

Error handling:

  • Empty filter ("") searches all files (no error)
  • Too many extensions (>20) returns error with helpful message
  • Invalid input normalized or filtered out gracefully
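The syntax and error-handling rules above amount to a normalization pass: split on commas, lowercase, strip a leading dot, drop empties, dedupe, and reject more than 20 extensions. This sketch restates those documented rules; the function name is an assumption.

```typescript
// Sketch of file_type filter normalization per the rules above.
function normalizeFileTypes(filter: string): string[] {
  const exts = filter
    .split(",")
    .map((e) => e.trim().toLowerCase().replace(/^\./, "")) // ".TS" -> "ts"
    .filter((e) => e.length > 0); // empty filter searches all files
  if (exts.length > 20) {
    throw new Error("Too many extensions (maximum 20 per filter)");
  }
  return [...new Set(exts)]; // deduplicate
}
```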

Fallback Behavior

When auto-detection is enabled but the current branch is not indexed, the search tool gracefully falls back to the main worktree with a helpful hint:

// In unindexed feature-xyz branch
const results = await mcp__maproom__search({
  repo: "my-repo",
  query: "authentication"
})

// Returns:
{
  hits: [...],  // Results from 'main' worktree
  worktree: "main",
  mode: "fallback",
  hint: "Current branch 'feature-xyz' is not indexed.\n\n" +
        "To search your current code:\n" +
        "1. Run: mcp__maproom__scan({repo: \"my-repo\", worktree: \"feature-xyz\"})\n\n" +
        "Searching 'main' worktree instead."
}

If the main worktree is also not indexed, the tool falls back to searching all worktrees.
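The resolution order described in this section and the fallback above can be condensed into one function: explicit values win, then the auto-detected branch if indexed, then main, then all worktrees. The mode strings match the documented result metadata; the function itself is a sketch.

```typescript
// Sketch of worktree resolution per the documented precedence.
type Resolution = {
  worktree: string | null;
  mode: "explicit" | "auto" | "fallback" | "all";
};

function resolveWorktree(
  requested: string | null | undefined,
  currentBranch: string,
  indexed: Set<string>
): Resolution {
  if (requested === null) return { worktree: null, mode: "all" };
  if (requested !== undefined) return { worktree: requested, mode: "explicit" };
  if (indexed.has(currentBranch)) return { worktree: currentBranch, mode: "auto" };
  if (indexed.has("main")) return { worktree: "main", mode: "fallback" };
  return { worktree: null, mode: "all" }; // nothing else indexed: search everything
}
```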

Result Metadata

Search results include metadata about worktree resolution:

| Field | Type | Description |
|-------|------|-------------|
| hits | array | Search results with content, file paths, and scores |
| total | number | Total number of results returned |
| worktree | string \| null | Which worktree was searched |
| auto_detected | boolean | Was the worktree auto-detected from git? |
| mode | string | Resolution mode: "explicit", "auto", "fallback", or "all" |
| hint | string \| undefined | Helpful message when fallback occurs |
| debug | object \| undefined | Ranking details (only if debug: true) |

Performance

  • Cache hit rate: >99% for git branch detection (60s TTL)
  • Search latency: <10ms with warm cache
  • Memory overhead: Minimal (<100 KB for LRU caches)

Troubleshooting

See Troubleshooting section for common issues.


Open Tool - File Retrieval

The open MCP tool retrieves file contents from your indexed codebase with intelligent path resolution and security validation.

Multi-Candidate Fallback

When multiple worktrees exist with the same name (common after repeated indexing), the open tool automatically tries each candidate in order:

  1. Queries database for all matching worktrees (ordered by most recent ID first)
  2. Validates each candidate path against the filesystem
  3. Returns content from the first valid worktree found

This gracefully handles database pollution from:

  • Repeated indexing from different working directories
  • Repository moves or renames
  • Stale database entries

Security Features

Path Traversal Protection:

  • Validates all relative paths before filesystem access
  • Rejects paths containing ../, absolute paths, or null bytes
  • Prevents access outside repository boundaries

Symlink Validation:

  • Detects symlinks using fs.lstat() before reading
  • Resolves symlink targets with fs.realpath()
  • Blocks symlinks pointing outside repository boundaries
  • Allows legitimate internal symlinks (e.g., shared configs)

File Type Checking:

  • Only returns content for regular files
  • Directories and special files are rejected
  • Ensures fileExists() helper validates both readability AND file type
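The input-validation rules above (reject traversal, absolute paths, and null bytes before any filesystem access) can be sketched as a predicate. This restates the documented checks; the function name is an assumption and the real implementation also performs the symlink and file-type checks listed above.

```typescript
// Sketch of the path-traversal input validation described above.
function isSafeRelativePath(relpath: string): boolean {
  if (relpath.includes("\0")) return false; // null bytes not allowed
  // Reject absolute paths (POSIX and Windows drive letters).
  if (relpath.startsWith("/") || /^[A-Za-z]:/.test(relpath)) return false;
  // Reject any parent-directory segment, regardless of separator.
  const segments = relpath.split(/[\\/]+/);
  if (segments.includes("..")) return false;
  return true;
}
```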

Error Messages

| Error Message | Meaning | Recommended Action |
|---------------|---------|--------------------|
| File exists in other worktrees: main, develop | File not found in the specified worktree but exists in others | Check the worktree parameter spelling or use a suggested worktree |
| File 'X' not found in worktree 'Y' | No matching database entry | Ensure the repository is indexed and the file path is correct |
| File 'X' not accessible in worktree 'Y'. Tried N candidates... | Database pollution detected: multiple entries but none valid on disk | Run maproom db cleanup-stale to remove stale entries |
| Path traversal detected: ../../../etc/passwd | Security violation in input | Use relative paths only, no parent directory references |
| Path is outside repository boundaries | Symlink or resolved path escapes the repo | Check symlink targets or file paths |
| Null bytes not allowed in path | Invalid characters in the path parameter | Remove null bytes from the file path |

Troubleshooting

Issue: "Tried N candidate paths but none exist on disk"

This indicates database pollution - the database has multiple entries for the same worktree name, but none correspond to valid paths on the filesystem.

Diagnosis:

# Check for duplicate worktree entries
docker exec -it maproom-postgres psql -U maproom -d maproom -c \
  "SELECT w.name, w.abs_path, COUNT(*)
   FROM maproom.worktrees w
   GROUP BY w.name, w.abs_path
   HAVING COUNT(*) > 1;"

Solution:

# Clean up stale database entries
maproom db cleanup-stale

Issue: File not found but file definitely exists

Diagnosis:

  • Verify the repository is indexed: Check maproom status output
  • Verify worktree name: The worktree parameter must match the database entry exactly
  • Check file path: Path must be relative to repository root

Issue: Symlink outside repository

Diagnosis:

# Check where symlink points
readlink /path/to/symlink

# Verify it's within repo boundaries
# Should start with repository root path

Solution:

  • Move symlink target inside repository, or
  • Access target file directly instead of via symlink

Path Resolution Flow

1. Input Validation
   ├─ Reject path traversal (../)
   ├─ Reject absolute paths (/)
   └─ Reject null bytes (\0)

2. Database Query
   └─ SELECT all matching (worktree, relpath) pairs
      ORDER BY worktree.id DESC

3. Multi-Candidate Validation
   ├─ For each candidate:
   │  ├─ Check filesystem existence
   │  ├─ Validate within repo boundaries
   │  └─ Return if valid
   └─ Error if all candidates fail

4. Security Checks
   ├─ Detect symlinks (fs.lstat)
   ├─ Validate symlink target (validateWithinRepo)
   └─ Verify file type (stats.isFile)

5. Content Retrieval
   └─ Read file with size limit validation

Environment Variables

Provider Configuration

  • MAPROOM_EMBEDDING_PROVIDER: (Required) One of: openai, cohere, google, ollama, local
  • MAPROOM_EMBEDDING_MODEL: (Required) Model name for the provider
  • EMBEDDING_DIMENSION: (Required) Vector dimension for embeddings
  • EMBEDDING_API_ENDPOINT: (Optional) Custom endpoint override

Endpoint Configuration

Cloud Providers (OpenAI, Cohere):

  • Use official endpoints by default (https://api.openai.com/v1/embeddings, etc.)
  • EMBEDDING_API_ENDPOINT only used if domain matches provider
  • Example: Setting EMBEDDING_API_ENDPOINT=http://localhost:11434 for OpenAI is ignored

Ollama:

  • Defaults to http://localhost:11434/api/embed
  • Set EMBEDDING_API_ENDPOINT for custom Ollama server location

Google Vertex AI:

  • Endpoint constructed from GOOGLE_VERTEX_REGION (e.g., us-west1)
  • EMBEDDING_API_ENDPOINT is ignored

Local Provider:

  • Requires EMBEDDING_API_ENDPOINT to be set explicitly

Environment Variable Precedence

  1. Explicit configuration in code (if applicable)
  2. EMBEDDING_API_ENDPOINT environment variable (validated by provider)
  3. Provider-specific default endpoint

API Keys

  • OPENAI_API_KEY: For OpenAI provider
  • COHERE_API_KEY: For Cohere provider
  • GOOGLE_APPLICATION_CREDENTIALS: For Google Vertex AI

License

MIT - See LICENSE file for details.