@crewchief/maproom-mcp
Semantic code search powered by PostgreSQL, pgvector, and your choice of embedding provider.
Fast semantic search. One setup command. One-line config.
Features
- ✨ Choice of Providers - OpenAI (recommended), Google Vertex AI, or local Ollama
- 🚀 Fast Hybrid Search - Vector similarity + full-text search with PostgreSQL
- 🎯 Semantic Ranking ✨ NEW - Implementations rank first, not tests or docs
- 🔄 Auto-Sync - Watch mode keeps your index up-to-date automatically
- 🌿 Automatic Branch Detection ✨ NEW - Auto-index branches on switch (no manual scan needed)
- 📦 Fully Containerized - Everything runs in Docker, isolated and clean
- 🌳 Multi-Language - Tree-sitter parsing for TypeScript, JavaScript, Rust, and more
- 🔒 Privacy Options - Use local Ollama for 100% private embeddings (no API keys)
Semantic Ranking
Maproom now uses semantic entry point ranking to prioritize code implementations over tests and documentation in search results.
The Problem: Traditional full-text search ranks results by keyword frequency. When you search for "authenticate", documentation mentioning the word 20+ times ranks higher than the actual authenticate() function.
The Solution: Semantic ranking applies kind multipliers to boost implementations (2.5×) and demote tests (0.6×) and docs (0.3-0.6×). Exact symbol matches get an additional 3.0× bonus.
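The multiplier scheme can be sketched as a small scoring function. This is an illustrative sketch using the multipliers quoted above; `rankScore`, the kind names, and the base scores are hypothetical, not the package's actual code:

```typescript
type ChunkKind = "function" | "class" | "test" | "doc";

// Multipliers quoted above: implementations boosted, tests/docs demoted.
const KIND_MULTIPLIERS: Record<ChunkKind, number> = {
  function: 2.5,
  class: 2.5,
  test: 0.6,
  doc: 0.3, // docs range 0.3-0.6 depending on type; lowest shown here
};

function rankScore(base: number, kind: ChunkKind, exactSymbolMatch: boolean): number {
  let score = base * KIND_MULTIPLIERS[kind];
  if (exactSymbolMatch) score *= 3.0; // exact symbol match bonus
  return score;
}

// A doc chunk mentioning "authenticate" many times still loses to the function itself:
const docScore = rankScore(10, "doc", false);   // high keyword score, demoted
const fnScore = rankScore(4, "function", true); // lower base, boosted + exact match
console.log(fnScore > docScore); // true
```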
Example
Query: authenticate
Before:
1. Documentation: "Authentication Guide" ← Not helpful
2. Documentation: "User Authentication"
...
8. Function: authenticate() ← What you wanted!

After:
1. Function: authenticate() ← Found immediately! ✨
2. Function: authenticate_user()
3. Class: Authenticator

Performance
- 17% faster on average (p95 latency: 48ms → 40ms)
- 55% of queries improved by >10%
- All queries <100ms p95 latency
Debug Mode
Enable debug mode to see how scores are calculated:
```typescript
const results = await search({
  query: 'authenticate',
  debug: true // Shows score breakdown
})
```
Learn more: See docs/search-ranking.md for complete documentation.
Quick Start
1. Run Setup (First Time Only)
Recommended: OpenAI (fast, low cost)
```shell
export OPENAI_API_KEY=sk-...
npx @crewchief/maproom-mcp setup --provider=openai
```
Alternative: Google Vertex AI (fast, low cost)
```shell
export GOOGLE_PROJECT_ID=my-project
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json
npx @crewchief/maproom-mcp setup --provider=google
```
Local: Ollama (slower, no API key needed)
```shell
npx @crewchief/maproom-mcp setup --provider=ollama
```
This will (2-5 minutes on first run):
- Download Docker images
- Download embedding model (Ollama only)
- Initialize PostgreSQL with pgvector
- Validate everything works
Devcontainer Support
The maproom-mcp setup command automatically detects Docker-in-Docker environments (devcontainers) and configures the correct workspace path for volume mounting.
How it works:
- Detects if running inside a Docker container
- Discovers the actual host path where `/workspace` is mounted
- Automatically sets `WORKSPACE_HOST_PATH` before starting containers
- No manual configuration required
Supported environments:
- VS Code devcontainers
- GitHub Codespaces
- Cursor devcontainers
- Local Docker Desktop
Manual override (if needed):
```shell
export WORKSPACE_HOST_PATH=/path/to/workspace
npx @crewchief/maproom-mcp setup --provider=openai
```
Troubleshooting:
- If detection fails, manually set `WORKSPACE_HOST_PATH`
- Verify Docker socket access: `docker ps`
- Check container mounts: `docker inspect $(hostname)`
2. Index Your Codebase
Automatic Indexing (Recommended) ✨ NEW
Start the branch watcher to automatically index as you switch branches:
```shell
# Set database URL
export MAPROOM_DATABASE_URL="postgresql://maproom:maproom@localhost:5432/maproom"

# Start watcher (Terminal 1)
maproom branch-watch --repo /path/to/your/repo

# Work normally (Terminal 2) - branches auto-index
git checkout feature-auth  # Automatically indexed in <1 minute
```
The watcher runs continuously and indexes branches automatically when you switch. For more details, see the Automatic Indexing Guide.
Manual Indexing
Alternatively, manually trigger indexing:
With OpenAI:
```shell
MAPROOM_EMBEDDING_PROVIDER=openai npx @crewchief/maproom-mcp scan /path/to/your/repo
```
With Google Vertex AI:
```shell
MAPROOM_EMBEDDING_PROVIDER=google npx @crewchief/maproom-mcp scan /path/to/your/repo
```
With Ollama (local):
```shell
MAPROOM_EMBEDDING_PROVIDER=ollama npx @crewchief/maproom-mcp scan /path/to/your/repo
```
Optional: Auto-sync with watch mode
```shell
MAPROOM_EMBEDDING_PROVIDER=openai npx @crewchief/maproom-mcp watch /path/to/your/repo
```
This keeps your index up-to-date as you edit code. Leave it running in a terminal.
3. Add to MCP Configuration
Claude Code (.claude/mcp.json in your project):
```json
{
  "mcpServers": {
    "maproom": {
      "command": "docker",
      "args": [
        "exec",
        "-i",
        "maproom-mcp",
        "node",
        "/app/dist/index.js"
      ],
      "env": {
        "MAPROOM_EMBEDDING_PROVIDER": "openai",
        "OPENAI_API_KEY": "${OPENAI_API_KEY}"
      }
    }
  }
}
```
Cursor (.cursor/mcp.json in your project):
```json
{
  "mcpServers": {
    "maproom": {
      "command": "docker",
      "args": [
        "exec",
        "-i",
        "maproom-mcp",
        "node",
        "/app/dist/index.js"
      ],
      "env": {
        "MAPROOM_EMBEDDING_PROVIDER": "openai",
        "OPENAI_API_KEY": "${OPENAI_API_KEY}"
      }
    }
  }
}
```
For Google Vertex AI, use:
"env": {
"MAPROOM_EMBEDDING_PROVIDER": "google",
"GOOGLE_PROJECT_ID": "${GOOGLE_PROJECT_ID}",
"GOOGLE_APPLICATION_CREDENTIALS": "${GOOGLE_APPLICATION_CREDENTIALS}"
}For Ollama (local), use:
"env": {
"MAPROOM_EMBEDDING_PROVIDER": "ollama"
}4. Restart Your MCP Client
Restart Claude Code or Cursor to connect to Maproom.
That's it! Use Maproom tools for semantic code search.
Database Setup
Maproom uses a dual-database architecture with separate PostgreSQL instances for development and testing:
- Development Database (port 5433) - For manual work, CLI commands, and MCP operations
- Test Database (port 5434) - Isolated database for automated tests only
Starting Databases
The setup command starts the development database only (automatic via depends_on in docker-compose.yml):
```shell
npx @crewchief/maproom-mcp setup --provider=openai
```
For developers/CI needing test isolation, the test database must be started manually (opt-in):
```shell
cd ~/.maproom-mcp  # or packages/maproom-mcp/config in monorepo
docker compose up -d postgres-test
```
Regular maproom users don't need the test database running.
Running Tests
Tests automatically use the test database (port 5434):
```shell
cd packages/maproom-mcp
pnpm test
```
The test database connection is configured via the TEST_MAPROOM_DATABASE_URL environment variable and defaults to the test database.
Schema Initialization
Both databases require manual schema initialization after first start:
```shell
# Development database
docker exec -i maproom-postgres psql -U maproom -d maproom < ~/.maproom-mcp/init.sql

# Test database
docker exec -i maproom-postgres-test psql -U maproom -d maproom_test < ~/.maproom-mcp/init.sql
```
Need More Details?
See the comprehensive Test Database Setup Guide for:
- Troubleshooting connection issues
- Resetting test database
- Volume management
- CI/CD configuration
- Advanced workflows
System Requirements
- Docker Desktop 4.x+ (Install Docker)
- 4-8 GB RAM available for Docker
- 5 GB disk space (images + model + database)
- Supported OS: macOS, Linux, Windows with WSL2
Verify Docker is running:
```shell
docker --version
docker compose version
```
Provider Comparison
| Provider | Speed | Cost | Setup | Privacy |
|---|---|---|---|---|
| OpenAI | ⚡ Fast | 💵 ~$0.02/1M tokens | API key | ☁️ Cloud |
| Google Vertex AI | ⚡ Fast | 💵 Similar to OpenAI | GCP setup | ☁️ Cloud |
| Ollama | 🐌 Slow* | 💰 Free | None | 🔒 100% Local |
*Ollama is 5-10x slower without GPU. Requires 8GB+ RAM.
Recommendation: Use OpenAI or Google for best performance. Use Ollama only if you need 100% local processing and have good hardware.
Commands
setup
Initial configuration. Required before first use.
```shell
npx @crewchief/maproom-mcp setup --provider=openai
npx @crewchief/maproom-mcp setup --provider=google
npx @crewchief/maproom-mcp setup --provider=ollama
```
scan
Index a repository (run after cloning or major changes).
```shell
npx @crewchief/maproom-mcp scan /path/to/repo
npx @crewchief/maproom-mcp scan .  # Current directory
```
watch
Monitor repository for changes and auto-reindex.
```shell
npx @crewchief/maproom-mcp watch /path/to/repo
npx @crewchief/maproom-mcp watch --debounce=5000  # Custom debounce (ms)
```
Leave running in a terminal. Press Ctrl+C to stop.
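The debounce behavior behind `--debounce` can be sketched as follows: rapid file-change events are coalesced, and reindexing runs only after changes go quiet for the configured window. This is an illustrative helper, not the package's implementation; `reindex` is a hypothetical name:

```typescript
// Generic debounce: restarts the timer on every call, so the wrapped
// function fires once, after `waitMs` of silence.
function debounce<A extends unknown[]>(fn: (...args: A) => void, waitMs: number) {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// Usage sketch: many save events within the window trigger one reindex.
let reindexCount = 0;
const reindex = debounce(() => { reindexCount += 1; }, 50); // e.g. 5000 via --debounce
reindex(); reindex(); reindex(); // coalesced; fires once after the quiet period
```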
When to Use Spawning vs Daemon
Maproom uses two execution patterns depending on the operation type:
Use Spawning When:
- One-time operations (scan, upsert single files)
- Startup/initialization tasks
- Operations where spawn overhead (<200ms) is negligible
- Example: Initial workspace scan at startup
Why: Spawning overhead (~100-200ms) is negligible compared to operation time (seconds to minutes for scan).
Use Daemon When:
- Repeated operations (search queries)
- Low-latency requirements (<50ms response time)
- Connection pooling beneficial (reuse database connections)
- Example: MCP server search operations (20-50x faster)
Why: Daemon eliminates spawn overhead for every request, achieving <50ms latency for search.
Current Implementation:
- MCP search tool: Uses daemon (correct - repeated operations)
- MCP upsert tool: Uses spawning (correct - one-time file indexing)
- VSCode scan: Uses spawning (correct - one-time workspace indexing)
- VSCode search (future): Will use MCP daemon via extension API
Performance comparison:
- Spawning: ~100-200ms overhead per operation
- Daemon: <1ms overhead per operation (after initial startup)
When NOT to migrate:
- If operation takes >10 seconds (scan, large upserts), spawn overhead is <2% of total time
- If operation runs once at startup (workspace scan), daemon provides no benefit
Progress Indicators
The scan command now shows real-time progress during indexing, making it easy to track what's happening without slowing down performance.
Scan Command Progress
When you run scan, you'll see:
```
🔍 Scanning worktree: main @ abc12345
Repository: my-repo
Path: /path/to/repo
Processing: 45/100 files (45%)
✅ Completed in 8.3s
📊 Scan Summary:
Files processed: 100
Total chunks: 847
Total size: 2.14 MB
```
Features:
- Real-time progress updates (throttled to every 200-500ms to avoid console flooding)
- File and chunk counts as indexing progresses
- Completion timing prominently displayed
- Works in both TTY (interactive terminal) and non-TTY (CI/logging) environments
Default Directory Behavior:
You don't need to specify . for the current directory - it's the default:
```shell
# These are equivalent:
npx @crewchief/maproom-mcp scan
npx @crewchief/maproom-mcp scan .
npx @crewchief/maproom-mcp scan /path/to/repo  # Or specify a path
```
Verbose Mode
For more detailed output during debugging:
```shell
npx @crewchief/maproom-mcp scan --verbose
```
Currently shows the same output as default mode, but it is reserved for future detailed diagnostics.
Performance
Progress tracking adds minimal overhead (<5%) through:
- Atomic counters for thread-safe updates
- Smart throttling (200ms minimum between updates)
- Efficient TTY detection
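The throttling idea can be sketched as a small reporter that drops updates arriving faster than the minimum interval. An illustrative sketch, not the actual implementation; `makeThrottledReporter` is a hypothetical name:

```typescript
// Progress callbacks are ignored unless at least `minIntervalMs` has
// passed since the last emitted update, preventing console flooding.
function makeThrottledReporter(minIntervalMs: number, emit: (msg: string) => void) {
  let lastEmit = 0;
  return (processed: number, total: number) => {
    const now = Date.now();
    if (now - lastEmit < minIntervalMs) return; // drop noisy updates
    lastEmit = now;
    const pct = total > 0 ? Math.round((processed / total) * 100) : 0;
    emit(`Processing: ${processed}/${total} files (${pct}%)`);
  };
}

const lines: string[] = [];
const report = makeThrottledReporter(200, (m) => lines.push(m));
report(45, 100); // emitted
report(46, 100); // dropped: arrives within 200ms of the previous update
```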
Troubleshooting
"Connection refused" errors to localhost:11434
Problem: OpenAI or Cohere provider attempting to connect to local Ollama endpoint.
Solution: This was a bug in earlier versions (< 1.2.0). Update to the latest version where provider-aware endpoint validation prevents this issue:
```shell
npx @crewchief/maproom-mcp@latest setup --provider=openai
```
The fix ensures cloud providers only use their official endpoints, preventing cross-provider endpoint pollution.
Custom endpoint not used
Problem: You set EMBEDDING_API_ENDPOINT, but the provider uses its default endpoint.
Solution: Ensure the endpoint domain matches your provider:
- OpenAI: Must contain "openai.com"
- Cohere: Must contain "cohere"
- Ollama/Local: Any endpoint accepted
- Google: Ignores `EMBEDDING_API_ENDPOINT` (uses region-based endpoint)
Example of correct custom endpoint:
```shell
# ✅ Correct: OpenAI custom endpoint (contains "openai.com")
export EMBEDDING_API_ENDPOINT=https://api.openai.com/v1/embeddings

# ❌ Wrong: Ollama endpoint for OpenAI provider (ignored)
export EMBEDDING_API_ENDPOINT=http://localhost:11434
```
Database "column updated_at does not exist" errors
Problem: Missing column in database schema.
Solution: Run database migrations. The maproom binary automatically applies migrations on startup:
```shell
npx @crewchief/maproom-mcp setup --provider=<your-provider>
```
Or manually apply migrations by restarting containers:
```shell
docker compose -f ~/.maproom-mcp/docker-compose.yml restart
```
"Setup required!" error
Run the setup command with your chosen provider:
```shell
npx @crewchief/maproom-mcp setup --provider=openai
```
Containers not starting
- Verify Docker is running: `docker info`
- Check for port conflicts: `lsof -i :5433` (PostgreSQL), `lsof -i :11434` (Ollama, if using)
- Re-run setup
Database errors
Reset everything:
```shell
docker compose -f ~/.maproom-mcp/docker-compose.yml down -v
npx @crewchief/maproom-mcp setup --provider=<your-provider>
```
Slow indexing with Ollama
Ollama is CPU-bound without GPU. Consider:
- Using OpenAI or Google instead (much faster)
- Adding a GPU to your system
- Reducing batch size: `EMBEDDING_BATCH_SIZE=10` (slower but lower memory)
Enable diagnostic mode
```shell
MAPROOM_MCP_DEBUG=true npx @crewchief/maproom-mcp setup
```
Data Persistence
All data is stored in Docker volumes:
- `maproom-data` - PostgreSQL database (indexed code + embeddings)
- `ollama-models` - Downloaded Ollama models (if using Ollama)
- `maproom-logs` - MCP server logs
Your indexed code persists between sessions. To completely reset:
```shell
docker volume rm maproom-data ollama-models maproom-logs
```
Database Schema
Core Tables
chunks table - Code chunks with worktree tracking
- `chunk_id` - UUID primary key
- `blob_sha` - Content-addressed SHA (links to embeddings)
- `relpath` - File path relative to repository root
- `symbol_name` - Function/class/symbol name
- `content` - Source code text
- `worktree_ids` - JSONB array of worktree IDs containing this chunk
- `start_line`, `end_line` - Line range in file
- `created_at`, `updated_at` - Timestamps
worktree_index_state table - Tracks last indexed git tree SHA per worktree
- `worktree_id` - Foreign key to worktrees table
- `last_tree_sha` - Git tree SHA from `git rev-parse HEAD^{tree}`
- `last_indexed` - Timestamp of last successful scan
- `chunks_processed` - Cumulative count for monitoring
- `embeddings_generated` - Cost tracking metric
code_embeddings table - Cached embeddings for content deduplication
- `blob_sha` - Primary key (content-addressed)
- `embedding` - Vector embedding (pgvector type)
- `model` - Embedding model name
- `dimension` - Vector dimension
Indexes
GIN index on worktree_ids - Enables efficient worktree filtering
```sql
CREATE INDEX idx_chunks_worktree_ids
ON maproom.chunks USING gin(worktree_ids);
```
Supports JSONB operators:
- `WHERE worktree_ids ? '2'` - Find chunks in worktree 2
- `WHERE worktree_ids ?| ARRAY['2', '5']` - Find chunks in any of multiple worktrees
Branch-Aware Features
Content deduplication: Same code across branches shares single embedding (via blob_sha)
Incremental updates: Tree SHA comparison enables instant "no changes" detection (<100ms)
Worktree filtering: Search code from specific branch/worktree
See also: Branch-Aware Indexing Architecture for complete technical details
Database Connection
The Maproom MCP server uses intelligent connection fallback to detect and connect to the PostgreSQL database automatically.
Connection Priority
The system tries these methods in order:
1. MAPROOM_DATABASE_URL (explicit config) - If set, uses this connection string exactly
```shell
export MAPROOM_DATABASE_URL="postgresql://user:pass@host:port/dbname"
```
2. MAPROOM_DB_HOST (component override) - If MAPROOM_DATABASE_URL is not set, constructs the connection from parts
```shell
export MAPROOM_DB_HOST="custom-host"
export MAPROOM_DB_PORT="5432"  # optional, defaults to 5432
```
3. maproom-postgres (auto-detection) - Attempts to connect to the maproom-postgres hostname
- Works automatically in Docker environments
- No configuration needed if the maproom-postgres container is running (default)
4. localhost:5433 (fallback) - Development fallback for local testing
- Useful for local postgres instances on a non-standard port
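The first two steps of this priority order can be sketched as a pure resolver. This is an illustrative sketch: the environment variable names come from this README, but `resolveDatabaseUrl` and the default credentials used for the constructed URL are assumptions. The last two steps (auto-detection and localhost fallback) involve real connection probing and are shown only as the final candidate:

```typescript
function resolveDatabaseUrl(env: Record<string, string | undefined>): string {
  // 1. Explicit connection string wins outright.
  if (env.MAPROOM_DATABASE_URL) return env.MAPROOM_DATABASE_URL;
  // 2. Component override: build the URL from MAPROOM_DB_* parts.
  if (env.MAPROOM_DB_HOST) {
    const port = env.MAPROOM_DB_PORT ?? "5432"; // optional, defaults to 5432
    return `postgresql://maproom:maproom@${env.MAPROOM_DB_HOST}:${port}/maproom`;
  }
  // 3./4. Auto-detection then localhost:5433 would be probed with real
  // connection attempts; the Docker hostname is the first candidate.
  return "postgresql://maproom:maproom@maproom-postgres:5432/maproom";
}
```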
Troubleshooting Connection Issues
Can't connect to database:
1. Verify maproom-postgres is running: `docker ps | grep maproom-postgres`
2. Start it if needed: `docker compose -f ~/.maproom-mcp/docker-compose.yml up -d`
3. Check the logs: `docker logs maproom-postgres`
Connection refused:
- Verify port 5432 (internal) or 5433 (host) is not blocked
- Check network connectivity: `docker network inspect maproom-network`
Hostname not found:
- Verify you're in correct Docker network
- Try setting MAPROOM_DATABASE_URL explicitly: `export MAPROOM_DATABASE_URL="postgresql://maproom:maproom@127.0.0.1:5433/maproom"`
Custom database setup: If you want to use your own PostgreSQL instance instead of the bundled one:
```shell
export MAPROOM_DATABASE_URL="postgresql://myuser:mypass@myhost:5432/mydb"
npx @crewchief/maproom-mcp scan /path/to/code
```
Advanced Configuration
Custom Database
Override the default database connection:
```json
{
  "mcpServers": {
    "maproom": {
      "command": "docker",
      "args": [
        "exec",
        "-i",
        "maproom-mcp",
        "node",
        "/app/dist/index.js"
      ],
      "env": {
        "MAPROOM_DATABASE_URL": "postgresql://user:pass@custom-host:5432/mydb",
        "MAPROOM_EMBEDDING_PROVIDER": "openai",
        "OPENAI_API_KEY": "${OPENAI_API_KEY}"
      }
    }
  }
}
```
Custom Embedding Models
OpenAI:
"env": {
"MAPROOM_EMBEDDING_PROVIDER": "openai",
"MAPROOM_EMBEDDING_MODEL": "text-embedding-3-large",
"EMBEDDING_DIMENSION": "3072"
}Google:
"env": {
"MAPROOM_EMBEDDING_PROVIDER": "google",
"MAPROOM_EMBEDDING_MODEL": "textembedding-gecko@003"
}Ollama:
"env": {
"MAPROOM_EMBEDDING_PROVIDER": "ollama",
"MAPROOM_EMBEDDING_MODEL": "mxbai-embed-large"
}Batch Size Tuning
Adjust embedding batch size (default: 50):
"env": {
"EMBEDDING_BATCH_SIZE": "100"
}Higher = faster but more memory. Lower = slower but less memory.
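The memory/throughput trade-off comes from grouping chunks into fixed-size batches before each embedding API call. A minimal sketch of that grouping; `toBatches` is an illustrative helper, not the package's code:

```typescript
// Split items into batches of at most `batchSize`; each batch becomes
// one embedding API request, so bigger batches = fewer calls, more memory.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (batchSize < 1) throw new Error("batchSize must be >= 1");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// 120 chunks with EMBEDDING_BATCH_SIZE=50 → 3 API calls (50, 50, 20).
const batches = toBatches(Array.from({ length: 120 }, (_, i) => i), 50);
```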
Search Tool - Semantic Code Search
New in v2.1.0: The search tool now automatically scopes results to your current git branch, eliminating result duplication and making search results more relevant to your active work.
The search MCP tool performs semantic code search across your indexed codebase using hybrid search (vector similarity + full-text search).
Parameters
| Parameter | Type | Description |
|---|---|---|
| repo | string | Required. Repository name (must match indexed name) |
| query | string | Required. Search query (concept or keywords) |
| worktree | string \| null \| undefined | Optional. Worktree scope: undefined (default) auto-detects the current branch; "branch-name" searches a specific branch; null searches all worktrees |
| limit | number | Optional. Max results (default: 10) |
| mode | string | Optional. Search mode: "vector", "fts", or "hybrid" (default) |
| debug | boolean | Optional. Include ranking details (default: false) |
Search Modes
The search tool supports three modes:
FTS (Full-Text Search): Fast keyword-based search using PostgreSQL FTS
- Best for: Finding specific function names, error messages, exact terms
- Latency: ~50-100ms
- Requires: Indexed repository
Vector (Semantic Search): AI-powered similarity search using embeddings
- Best for: Conceptual queries, "code that does X", finding similar patterns
- Latency: ~100-200ms
- Requires: Indexed repository + generated embeddings (run `generate-embeddings`)
Hybrid (Combined): Merges FTS and vector results with reciprocal rank fusion
- Best for: Most searches - combines precision of FTS with recall of vector
- Latency: ~200-300ms (runs both searches)
- Requires: Same as vector mode
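Reciprocal rank fusion can be sketched as follows. This is a generic RRF sketch, not maproom's implementation: `k = 60` is the conventional RRF constant and an assumption here, and the function name is hypothetical:

```typescript
// Merge two ranked ID lists: each item scores 1/(k + rank) per list it
// appears in, so items ranked well in BOTH lists rise to the top.
function rrfMerge(ftsIds: string[], vectorIds: string[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of [ftsIds, vectorIds]) {
    list.forEach((id, index) => {
      const rank = index + 1; // 1-based rank
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// "b" and "c" appear in both lists, so they outrank single-list hits.
const merged = rrfMerge(["a", "b", "c"], ["b", "c", "d"]);
```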
Examples
```typescript
// Fast keyword search
{ mode: "fts", query: "handleSearch", repo: "crewchief" }

// Semantic similarity search
{ mode: "vector", query: "authentication logic", repo: "crewchief" }

// Best of both worlds
{ mode: "hybrid", query: "error handling patterns", repo: "crewchief" }
```
Mode Selection Guide
- Use FTS when: Looking for specific identifiers, known terms
- Use Vector when: Exploring concepts, finding related code
- Use Hybrid when: Unsure which mode fits, or want comprehensive results
### Worktree-Scoped Search (Auto-Detection)
**Default behavior (v2.1.0+)**: When `worktree` parameter is omitted, the search tool automatically detects your current git branch and scopes results to that branch only.
**Example 1: Auto-detection** (recommended)
```typescript
// In feature-auth branch, searches only feature-auth worktree
const results = await mcp__maproom__search({
repo: "my-repo",
query: "authentication flow"
})
// Returns: { hits: [...], worktree: "feature-auth", auto_detected: true, mode: "auto" }
```

**Example 2: Explicit worktree override**

```typescript
// In feature-auth branch, but search main worktree instead
const results = await mcp__maproom__search({
  repo: "my-repo",
  query: "authentication flow",
  worktree: "main"
})
// Returns: { hits: [...], worktree: "main", auto_detected: false, mode: "explicit" }
```

**Example 3: Search all worktrees**

```typescript
// Search across all indexed branches
const results = await mcp__maproom__search({
  repo: "my-repo",
  query: "authentication flow",
  worktree: null
})
// Returns: { hits: [...], worktree: null, mode: "all" }
```

File Type Filtering
Filter search results by file extension to focus on specific languages or file types.
Single extension:
```typescript
const result = await mcp__maproom__search({
  repo: 'crewchief',
  query: 'authentication',
  filters: { file_type: 'ts' }
})
// Returns only TypeScript (.ts) files
```
Multiple extensions:
```typescript
const result = await mcp__maproom__search({
  repo: 'crewchief',
  query: 'authentication',
  filters: { file_type: 'ts,tsx,js' }
})
// Returns TypeScript or JavaScript files
```
Common patterns:
```typescript
// Search only documentation
filters: { file_type: 'md,mdx' }

// Search Rust code
filters: { file_type: 'rs' }

// Search frontend code
filters: { file_type: 'tsx,jsx,vue,svelte' }

// Combine with recency filter
filters: {
  file_type: 'ts,tsx',
  recency_threshold: '7 days'
}
// Returns recent TypeScript files only
```
Syntax:
- Comma-separated for multiple extensions
- Case insensitive: `"TS"` same as `"ts"`
- With or without dot: `".ts"` same as `"ts"`
- Maximum 20 extensions per filter
Error handling:
- Empty filter (`""`) searches all files (no error)
- Too many extensions (>20) returns an error with a helpful message
- Invalid input normalized or filtered out gracefully
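The normalization rules above can be sketched as a small helper. This is illustrative: `normalizeFileTypes` is a hypothetical name, and only the rules stated in this README (case-insensitive, optional leading dot, max 20, empty entries dropped) are implemented:

```typescript
function normalizeFileTypes(filter: string): string[] {
  const exts = filter
    .split(",")
    .map((e) => e.trim().toLowerCase().replace(/^\./, "")) // ".TS" → "ts"
    .filter((e) => e.length > 0); // empty entries dropped, not an error
  if (exts.length > 20) {
    throw new Error(`Too many extensions (${exts.length}); maximum is 20`);
  }
  return exts;
}
```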
Fallback Behavior
When auto-detection is enabled but the current branch is not indexed, the search tool gracefully falls back to the main worktree with a helpful hint:
```typescript
// In unindexed feature-xyz branch
const results = await mcp__maproom__search({
  repo: "my-repo",
  query: "authentication"
})
// Returns:
{
  hits: [...], // Results from 'main' worktree
  worktree: "main",
  mode: "fallback",
  hint: "Current branch 'feature-xyz' is not indexed.\n\n" +
        "To search your current code:\n" +
        "1. Run: mcp__maproom__scan({repo: \"my-repo\", worktree: \"feature-xyz\"})\n\n" +
        "Searching 'main' worktree instead."
}
```
If the main worktree is also not indexed, the tool falls back to searching all worktrees.
Result Metadata
Search results include metadata about worktree resolution:
| Field | Type | Description |
|---|---|---|
| hits | array | Search results with content, file paths, and scores |
| total | number | Total number of results returned |
| worktree | string \| null | Which worktree was searched |
| auto_detected | boolean | Was the worktree auto-detected from git? |
| mode | string | Resolution mode: "explicit", "auto", "fallback", or "all" |
| hint | string \| undefined | Helpful message when fallback occurs |
| debug | object \| undefined | Ranking details (only if debug: true) |
Performance
- Cache hit rate: >99% for git branch detection (60s TTL)
- Search latency: <10ms with warm cache
- Memory overhead: Minimal (<100 KB for LRU caches)
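A 60-second TTL cache like the branch-detection numbers above imply can be sketched as follows. This is an illustrative data structure, not maproom's code; the class name and the explicit `now` parameter (used to make it testable) are assumptions:

```typescript
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  constructor(private ttlMs: number) {}

  // Returns the cached value, or undefined when missing or expired.
  get(key: string, now = Date.now()): V | undefined {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= now) return undefined;
    return entry.value;
  }

  set(key: string, value: V, now = Date.now()): void {
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
  }
}

// Usage sketch: cache the detected branch per repo path, so git is
// consulted at most once per 60-second window.
const branchCache = new TtlCache<string>(60_000);
branchCache.set("/path/to/repo", "main", 0);
```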
Troubleshooting
See Troubleshooting section for common issues.
Open Tool - File Retrieval
The open MCP tool retrieves file contents from your indexed codebase with intelligent path resolution and security validation.
Multi-Candidate Fallback
When multiple worktrees exist with the same name (common after repeated indexing), the open tool automatically tries each candidate in order:
- Queries database for all matching worktrees (ordered by most recent ID first)
- Validates each candidate path against the filesystem
- Returns content from the first valid worktree found
This gracefully handles database pollution from:
- Repeated indexing from different working directories
- Repository moves or renames
- Stale database entries
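The candidate-selection step above can be sketched as a tiny pure function. This is a hypothetical sketch: the `Candidate` shape and `pickFirstValid` name are illustrative, and the filesystem check is injected so the idea stands alone:

```typescript
type Candidate = { worktreeId: number; absPath: string };

// Candidates are assumed pre-sorted newest-first (ORDER BY id DESC);
// the first path that actually exists on disk wins.
function pickFirstValid(
  candidates: Candidate[],
  pathExists: (p: string) => boolean,
): Candidate | undefined {
  return candidates.find((c) => pathExists(c.absPath));
}

// Stale entry (id 9) is skipped; the older-but-valid entry (id 3) is used.
const found = pickFirstValid(
  [{ worktreeId: 9, absPath: "/stale/clone" }, { worktreeId: 3, absPath: "/real/clone" }],
  (p) => p === "/real/clone",
);
```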
Security Features
Path Traversal Protection:
- Validates all relative paths before filesystem access
- Rejects paths containing `../`, absolute paths, or null bytes
- Prevents access outside repository boundaries
Symlink Validation:
- Detects symlinks using `fs.lstat()` before reading
- Resolves symlink targets with `fs.realpath()`
- Blocks symlinks pointing outside repository boundaries
- Allows legitimate internal symlinks (e.g., shared configs)
File Type Checking:
- Only returns content for regular files
- Directories and special files are rejected
- The `fileExists()` helper validates both readability and file type
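The input-validation rules can be sketched as a standalone check. This is illustrative only: `validateRelPath` is a hypothetical name, and the real implementation additionally resolves symlinks and repository boundaries on the filesystem:

```typescript
// Reject the three invalid-input classes described above:
// null bytes, absolute paths, and parent-directory traversal.
function validateRelPath(relpath: string): void {
  if (relpath.includes("\0")) {
    throw new Error("Null bytes not allowed in path");
  }
  if (relpath.startsWith("/")) {
    throw new Error("Absolute paths not allowed");
  }
  if (relpath.split("/").includes("..")) {
    throw new Error(`Path traversal detected: ${relpath}`);
  }
}

validateRelPath("src/index.ts"); // ok: relative, no traversal
```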
Error Messages
| Error Message | Meaning | Recommended Action |
|---|---|---|
| File exists in other worktrees: main, develop | File not found in the specified worktree but exists in others | Check worktree parameter spelling or use a suggested worktree |
| File 'X' not found in worktree 'Y' | No matching database entry | Ensure the repository is indexed and the file path is correct |
| File 'X' not accessible in worktree 'Y'. Tried N candidates... | Database pollution detected - multiple entries but none valid on disk | Run `maproom db cleanup-stale` to remove stale entries |
| Path traversal detected: ../../../etc/passwd | Security violation in input | Use relative paths only, no parent directory references |
| Path is outside repository boundaries | Symlink or resolved path escapes the repo | Check symlink targets or file paths |
| Null bytes not allowed in path | Invalid characters in the path parameter | Remove null bytes from the file path |
Troubleshooting
Issue: "Tried N candidate paths but none exist on disk"
This indicates database pollution - the database has multiple entries for the same worktree name, but none correspond to valid paths on the filesystem.
Diagnosis:
```shell
# Check for duplicate worktree entries
docker exec -it maproom-postgres psql -U maproom -d maproom -c \
  "SELECT w.name, w.abs_path, COUNT(*)
   FROM maproom.worktrees w
   GROUP BY w.name, w.abs_path
   HAVING COUNT(*) > 1;"
```
Solution:
```shell
# Clean up stale database entries
maproom db cleanup-stale
```
Issue: File not found but the file definitely exists
Diagnosis:
- Verify the repository is indexed: check `maproom status` output
- Verify the worktree name: the `worktree` parameter must match the database entry exactly
- Check the file path: it must be relative to the repository root
Issue: Symlink outside repository
Diagnosis:
```shell
# Check where the symlink points
readlink /path/to/symlink

# Verify it's within repo boundaries
# (the target should start with the repository root path)
```
Solution:
- Move symlink target inside repository, or
- Access target file directly instead of via symlink
Path Resolution Flow
```
1. Input Validation
   ├─ Reject path traversal (../)
   ├─ Reject absolute paths (/)
   └─ Reject null bytes (\0)
2. Database Query
   └─ SELECT all matching (worktree, relpath) pairs
      ORDER BY worktree.id DESC
3. Multi-Candidate Validation
   ├─ For each candidate:
   │  ├─ Check filesystem existence
   │  ├─ Validate within repo boundaries
   │  └─ Return if valid
   └─ Error if all candidates fail
4. Security Checks
   ├─ Detect symlinks (fs.lstat)
   ├─ Validate symlink target (validateWithinRepo)
   └─ Verify file type (stats.isFile)
5. Content Retrieval
   └─ Read file with size limit validation
```
Environment Variables
Provider Configuration
- `MAPROOM_EMBEDDING_PROVIDER`: (Required) One of: openai, cohere, google, ollama, local
- `MAPROOM_EMBEDDING_MODEL`: (Required) Model name for the provider
- `EMBEDDING_DIMENSION`: (Required) Vector dimension for embeddings
- `EMBEDDING_API_ENDPOINT`: (Optional) Custom endpoint override
Endpoint Configuration
Cloud Providers (OpenAI, Cohere):
- Use official endpoints by default (https://api.openai.com/v1/embeddings, etc.)
- `EMBEDDING_API_ENDPOINT` is only used if its domain matches the provider
- Example: Setting `EMBEDDING_API_ENDPOINT=http://localhost:11434` for OpenAI is ignored
Ollama:
- Defaults to `http://localhost:11434/api/embed`
- Set `EMBEDDING_API_ENDPOINT` for a custom Ollama server location
Google Vertex AI:
- Endpoint constructed from `GOOGLE_VERTEX_REGION` (e.g., us-west1)
- `EMBEDDING_API_ENDPOINT` is ignored
Local Provider:
- Requires `EMBEDDING_API_ENDPOINT` to be set explicitly
Environment Variable Precedence
1. Explicit configuration in code (if applicable)
2. `EMBEDDING_API_ENDPOINT` environment variable (validated by provider)
3. Provider-specific default endpoint
API Keys
- `OPENAI_API_KEY`: For the OpenAI provider
- `COHERE_API_KEY`: For the Cohere provider
- `GOOGLE_APPLICATION_CREDENTIALS`: For Google Vertex AI
License
MIT - See LICENSE file for details.