Package Exports

pageindex-mcp
pageindex-mcp/build/index.js

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (pageindex-mcp) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

PageIndex MCP

A Model Context Protocol (MCP) server for PageIndex - Next-Generation Reasoning-based RAG.

What is PageIndex?

PageIndex is a revolutionary document processing system that uses reasoning-based RAG instead of traditional vector-based similarity search. Unlike conventional RAG systems that rely on semantic similarity, PageIndex uses multi-step reasoning and tree search to retrieve information like a human expert would.

Key Advantages over Vector-based RAG

Higher Accuracy: Relevance beyond similarity - ideal for domain-specific documents where semantics are similar
Better Transparency: Clear reasoning trajectory with traceable search paths
Like A Human: Retrieve information like a human expert navigates documents
No Vector DB: No extra infrastructure overhead
No Chunking: Preserve full document context and structure
No Top-K: Retrieve all relevant passages automatically

Architecture

This MCP server acts as a bridge between Claude/LLM clients and the PageIndex platform:

[Claude/LLM] <--> [PageIndex Local MCP Server] <--> [PageIndex Remote MCP Server]
                         |
                         ├── process_document (local file handling)
                         │   ├── Uploads to PageIndex via signed URLs
                         │   └── Processes with PageIndex OCR + Tree Generation
                         └── Other tools (proxy to PageIndex MCP API)

Features

Local PDF Processing: Upload local PDF files directly without manual uploads
URL Support: Process documents from URLs
Full PageIndex Integration: Access all PageIndex capabilities (OCR, tree generation, reasoning-based retrieval)
Secure Authentication: API key authentication with PageIndex platform
TypeScript: Full type safety with MCP SDK
Claude Desktop Ready: Easy integration with Claude Desktop

Usage

Getting Started

First, you'll need to create an API key:

Visit https://dash.pageindex.ai/api-keys
Create a new API key for your application
Copy the API key for use in the configuration below

Option 1: Local MCP Server (with local PDF upload)

Requirements: Node.js ≥18.0.0

Add to your MCP configuration:

{
  "mcpServers": {
    "pageindex": {
      "command": "npx",
      "args": ["-y", "pageindex-mcp"],
      "env": {
        "PAGEINDEX_API_KEY": "<YOUR_PAGEINDEX_API_KEY>"
      }
    }
  }
}

Option 2: Remote MCP Server

Alternatively, connect directly to PageIndex without this wrapper:

{
  "mcpServers": {
    "pageindex": {
      "type": "http",
      "url": "https://dash.pageindex.ai/api/mcp/mcp",
      "headers": {
        "Authorization": "Bearer <YOUR_PAGEINDEX_API_KEY>"
      }
    }
  }
}

Note: Option 1 provides local PDF upload capabilities, while Option 2 connects directly to PageIndex but requires manual PDF uploads via the dashboard.

Available Tools

Tool	Description	Key Parameters
process_document	Upload and process PDF documents with PageIndex OCR and tree generation	`url` - Local file path or PDF URL
ask_document	Query documents using natural language with reasoning-based retrieval	`doc_id`, `query`, `thinking` (optional)
get_page_content	Extract specific page content from processed documents	`doc_id`, `pages` ("5", "3-7", "1,5,10")
get_document_structure	Extract hierarchical document structure with configurable detail levels	`doc_id`, `mode` ("summary", "outline", "full"), `max_depth` (1-10)
list_documents	List documents with comprehensive status information	`status`, `limit`, `offset` (all optional)
remove_document	Permanently delete documents and associated data	`doc_ids` (array of document IDs)
search_documents	Search document library using keywords	`query`, `status`, `limit` (optional)

Quick Example: Process a local PDF with process_document, then query it with ask_document using the returned document ID.

License

This project is licensed under the terms of the MIT open source license. Please refer to MIT for the full terms.