ParseFlow MCP Server
Model Context Protocol (MCP) server for comprehensive PDF parsing and analysis.
🚀 Features
- Text Extraction: Extract text from PDF files with multiple formatting strategies
- Metadata Retrieval: Get PDF document information (title, author, pages, etc.)
- Keyword Search: Search for specific text within PDF documents
- Image Extraction: Extract images from PDF files (requires poppler-utils)
- Table of Contents: Get bookmarks and navigation structure
📦 Installation
Global Installation (Recommended for MCP)
npm install -g parseflow-mcp-server
Local Installation
npm install parseflow-mcp-server
🔧 Usage
With Claude Desktop
Add to your Claude Desktop configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "parseflow": {
      "command": "parseflow"
    }
  }
}
With Windsurf / Cursor
Add to your MCP settings:
{
  "mcpServers": {
    "parseflow": {
      "command": "parseflow",
      "args": []
    }
  }
}
Standalone Usage
# Run the server
parseflow
# Or run the entry point directly with Node.js
node /path/to/parseflow-mcp-server/dist/index.js
🛠️ Available Tools
When connected via MCP, the following tools are available:
1. extract_text
Extract text content from PDF files.
Parameters:
- path (string, required): Absolute path to PDF file
- page (number, optional): Extract specific page
- range (string, optional): Extract page range (e.g., "1-10")
- strategy (string, optional): Extraction strategy - raw, formatted, or clean
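MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below shows how a request for extract_text might be assembled from the parameters above. The helper name `buildExtractTextRequest` is hypothetical, and the validation rules (1-based page numbers, a strict "start-end" range format) are assumptions rather than documented server behavior.

```javascript
// Hypothetical helper, not part of parseflow-mcp-server.
// Tool and argument names come from the parameter list above;
// the validation rules are assumptions for illustration.
const STRATEGIES = ["raw", "formatted", "clean"];

function buildExtractTextRequest(id, args) {
  const { path, page, range, strategy } = args;

  if (typeof path !== "string" || path.length === 0) {
    throw new TypeError("path is required and must be a non-empty string");
  }
  // Assumption: pages are numbered from 1.
  if (page !== undefined && (!Number.isInteger(page) || page < 1)) {
    throw new RangeError("page must be a positive integer");
  }
  // Assumption: ranges always look like "1-10".
  if (range !== undefined && !/^\d+-\d+$/.test(range)) {
    throw new RangeError('range must look like "1-10"');
  }
  if (strategy !== undefined && !STRATEGIES.includes(strategy)) {
    throw new RangeError(`strategy must be one of: ${STRATEGIES.join(", ")}`);
  }

  // Send only the optional arguments that were actually provided.
  const toolArgs = { path };
  if (page !== undefined) toolArgs.page = page;
  if (range !== undefined) toolArgs.range = range;
  if (strategy !== undefined) toolArgs.strategy = strategy;

  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "extract_text", arguments: toolArgs },
  };
}
```

In practice an MCP client such as Claude Desktop builds this request for you; the sketch is only meant to show which fields the parameters end up in.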
2. get_metadata
Get PDF document metadata and properties.
Parameters:
- path (string, required): Absolute path to PDF file
3. search_pdf
Search for keywords or phrases within a PDF.
Parameters:
- path (string, required): Absolute path to PDF file
- query (string, required): Search term or phrase
- caseSensitive (boolean, optional): Case-sensitive search (default: false)
- maxResults (number, optional): Maximum results to return (default: 10)
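To illustrate how the caseSensitive and maxResults defaults interact, here is a minimal sketch that applies the same defaults to a plain string. It is not the server's implementation: the function name `searchText` and the result shape (`index`, `context`) are hypothetical.

```javascript
// Illustrative only: mirrors the documented defaults of search_pdf
// (caseSensitive: false, maxResults: 10) on an in-memory string.
function searchText(text, query, options = {}) {
  const { caseSensitive = false, maxResults = 10 } = options;
  if (query.length === 0) return [];

  // Lowercasing both sides gives a case-insensitive match.
  // Assumes lowercasing preserves string length, which holds for ASCII text.
  const haystack = caseSensitive ? text : text.toLowerCase();
  const needle = caseSensitive ? query : query.toLowerCase();

  const results = [];
  let from = 0;
  while (results.length < maxResults) {
    const index = haystack.indexOf(needle, from);
    if (index === -1) break;
    results.push({
      index,
      // Up to 20 characters of surrounding text on each side of the match.
      context: text.slice(Math.max(0, index - 20), index + needle.length + 20),
    });
    from = index + needle.length;
  }
  return results;
}
```

With the defaults, a query of "pdf" also matches "PDF" and "Pdf"; passing `{ caseSensitive: true }` narrows that to exact-case matches, and maxResults caps the list regardless of how many matches exist.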
4. extract_images
Extract images from PDF files (requires poppler-utils).
Parameters:
- path (string, required): Absolute path to PDF file
- outputDir (string, required): Directory to save extracted images
- format (string, optional): Output format - png or jpg (default: png)
5. get_toc
Get table of contents (bookmarks) from PDF.
Parameters:
- path (string, required): Absolute path to PDF file
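All five tools take an absolute path, so it can help to check a path before sending it. The sketch below is one possible check; the function name `looksAbsolute` is hypothetical, and it deliberately accepts both POSIX and Windows styles regardless of the host platform. In Node.js, the built-in `path.isAbsolute` is the platform-aware alternative.

```javascript
// Hypothetical pre-flight check, not part of parseflow-mcp-server.
// Accepts POSIX paths, Windows drive paths, and Windows UNC paths.
function looksAbsolute(p) {
  if (typeof p !== "string" || p.length === 0) return false;
  const posix = p.startsWith("/");                 // /home/user/report.pdf
  const windowsDrive = /^[A-Za-z]:[\\/]/.test(p);  // C:\docs\report.pdf
  const windowsUnc = p.startsWith("\\\\");         // \\server\share\report.pdf
  return posix || windowsDrive || windowsUnc;
}
```

A relative path such as `report.pdf` or `./docs/report.pdf` fails this check and should be resolved to an absolute path first.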
📋 Requirements
- Node.js: >= 18.0.0
- poppler-utils (optional, for image extraction):
  - macOS: brew install poppler
  - Ubuntu/Debian: apt-get install poppler-utils
  - Windows: Download from poppler releases
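To verify the Node.js requirement from a script, a version comparison along these lines may be useful. This is a minimal sketch: the function name `meetsMinimumNode` is hypothetical, and it compares only the numeric major, minor, and patch components.

```javascript
// Hypothetical helper: checks a Node.js version string against a minimum.
// Accepts versions with or without a leading "v" (e.g. "v20.11.1" or "20.11.1").
function meetsMinimumNode(version, minimum = "18.0.0") {
  const parse = (v) =>
    v.replace(/^v/, "").split(".").map((part) => parseInt(part, 10) || 0);
  const actual = parse(version);
  const required = parse(minimum);

  // Compare major, then minor, then patch; the first difference decides.
  for (let i = 0; i < 3; i++) {
    const a = actual[i] || 0;
    const r = required[i] || 0;
    if (a !== r) return a > r;
  }
  return true; // Versions are equal.
}
```

Inside Node.js, the running version is available as `process.versions.node`, so `meetsMinimumNode(process.versions.node)` reports whether the current runtime satisfies the requirement.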
🔗 Related Packages
- parseflow-core: Core PDF parsing library
  - Use parseflow-core directly if you want to integrate PDF parsing into your Node.js applications
📖 Documentation
Full documentation: https://github.com/Libres-coder/ParseFlow
🐛 Bug Reports
Report issues: https://github.com/Libres-coder/ParseFlow/issues
📄 License
MIT © Libres-coder
🌟 MCP Registry
Find this server on the official MCP Registry:
https://registry.modelcontextprotocol.io/
Search for: parseflow