n8n-nodes-pdf-excel
An n8n node for processing PDF and Excel files, with advanced features including OCR and form handling.
Features
PDF Processing
- Basic Features:
  - Extract text from PDF files
  - Get metadata information
  - Process files from path or binary data
- Advanced Features (new):
  - OCR text extraction using Tesseract.js
  - PDF form field processing
  - Support for multiple languages (English, Vietnamese)
  - Memory-efficient processing
Excel Processing
- Read Worksheet: Read data from a specific worksheet
- Get Worksheets: List all worksheets in a file
- Multiple Formats: Support for .xls and .xlsx formats
- Data Validation: Basic data validation for cell values
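Since the node accepts both .xls and .xlsx input, it can help to tell the two apart by their leading bytes rather than by file extension. The following is a minimal sketch of that idea, not the package's actual implementation; `detectSpreadsheetFormat` and `SpreadsheetFormat` are hypothetical names introduced for illustration.

```typescript
// Hypothetical helper: not part of the package's documented API.
type SpreadsheetFormat = "xlsx" | "xls" | "unknown";

// ZIP local-file header. An .xlsx file is a ZIP archive, so it starts with these bytes.
const ZIP_SIGNATURE: number[] = [0x50, 0x4b, 0x03, 0x04];
// OLE2 compound-document header used by legacy .xls files.
const OLE2_SIGNATURE: number[] = [0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1];

// True when `bytes` begins with every byte of `signature`, in order.
function startsWithSignature(bytes: Uint8Array, signature: number[]): boolean {
  if (bytes.length < signature.length) {
    return false;
  }
  return signature.every((value, index) => bytes[index] === value);
}

// Classifies a buffer by its leading bytes only.
// Note: a ZIP signature does not prove the archive is a workbook; this is a first-pass check.
function detectSpreadsheetFormat(bytes: Uint8Array): SpreadsheetFormat {
  if (startsWithSignature(bytes, ZIP_SIGNATURE)) {
    return "xlsx";
  }
  if (startsWithSignature(bytes, OLE2_SIGNATURE)) {
    return "xls";
  }
  return "unknown";
}
```

A check like this would typically run before handing the buffer to a spreadsheet parser, so that unsupported input fails early with a clear message.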
Requirements
- Node.js v18+
- n8n v1.0+
- TypeScript v5.0+
New dependencies:
- Tesseract.js for OCR
- pdf-lib for form processing
Installation
In n8n
- Go to Settings > Community Nodes
- Select Install a node from npm registry
- Enter n8n-nodes-pdf-excel
- Click Install
Manual Installation
# Install with dependencies
npm install n8n-nodes-pdf-excel tesseract.js pdf-lib
# Or link for development
npm link n8n-nodes-pdf-excel
Usage
Basic PDF Processing
- Add "PDF & Excel Processor" node
- Select "PDF" as file type
- Choose operation:
  - Extract Text
  - Get Metadata
- Provide file path or binary data
- Execute node
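The "file path or binary data" choice above can be modelled as a small tagged union. The sketch below is illustrative only: the `PdfInput` shape, the `source` and `filePath` field names, and both functions are assumptions, not the node's real parameters. The header check relies on the fact that a well-formed PDF begins with the ASCII marker `%PDF-`.

```typescript
// Hypothetical input shape; the node's real parameter names may differ.
type PdfInput =
  | { source: "path"; filePath: string }
  | { source: "binary"; data: Uint8Array };

// ASCII bytes for "%PDF-", the marker at the start of a well-formed PDF.
const PDF_MARKER: number[] = [0x25, 0x50, 0x44, 0x46, 0x2d];

// True when the buffer starts with the "%PDF-" marker.
function looksLikePdf(bytes: Uint8Array): boolean {
  if (bytes.length < PDF_MARKER.length) {
    return false;
  }
  return PDF_MARKER.every((value, index) => bytes[index] === value);
}

// Returns a short description of how the input would be handled.
function describeInput(input: PdfInput): string {
  switch (input.source) {
    case "path":
      return `read from path: ${input.filePath}`;
    case "binary":
      return looksLikePdf(input.data)
        ? `binary PDF (${input.data.length} bytes)`
        : "binary data without a PDF header";
  }
}
```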
Advanced PDF Features (New)
- Add "PDF & Excel Processor" node
- Select "PDF Advanced" as file type
- Choose operation:
  - Extract Text with OCR
  - Process Form Fields
- Optional: Configure OCR settings
- Execute node
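For the OCR settings, Tesseract identifies languages by short codes (`eng` for English, `vie` for Vietnamese) and accepts several at once joined with `+`. The sketch below shows one way a language selection could be turned into such a string. The `OcrLanguage` type and `toTesseractLangs` function are hypothetical, and the English fallback is an assumed default, not documented behaviour of this node.

```typescript
// Hypothetical option type covering only the two languages this README lists.
type OcrLanguage = "English" | "Vietnamese";

// Tesseract language codes for those two languages.
const TESSERACT_CODES: Record<OcrLanguage, string> = {
  English: "eng",
  Vietnamese: "vie",
};

// Builds a "+"-joined language string such as "eng+vie".
function toTesseractLangs(languages: OcrLanguage[]): string {
  // Assumed default: fall back to English when nothing is selected.
  const selected: OcrLanguage[] = languages.length > 0 ? languages : ["English"];
  const codes = selected.map((language) => TESSERACT_CODES[language]);
  // Drop duplicates while preserving the original order.
  const unique = codes.filter((code, index) => codes.indexOf(code) === index);
  return unique.join("+");
}
```

Loading fewer languages generally keeps OCR startup lighter, so passing only the languages a document actually contains is a reasonable habit.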
Excel Processing
- Add "PDF & Excel Processor" node
- Select "Excel" as file type
- Choose operation:
  - Read Worksheet
  - Get Worksheets
- For worksheet reading:
  - Specify sheet name (optional)
- Execute node
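Because the sheet name is optional, some rule has to pick a worksheet when none is given. The README does not say which rule the node uses; the sketch below assumes the common convention of defaulting to the first worksheet. `WorkbookLike` and `resolveSheetName` are hypothetical names for illustration.

```typescript
// Hypothetical minimal workbook shape: just the ordered list of worksheet names.
interface WorkbookLike {
  sheetNames: string[];
}

// Picks the worksheet to read.
// Assumption: when no name is given, the first worksheet is used.
function resolveSheetName(workbook: WorkbookLike, requested?: string): string {
  const first = workbook.sheetNames[0];
  if (first === undefined) {
    throw new Error("Workbook contains no worksheets");
  }
  if (requested === undefined || requested.trim() === "") {
    return first;
  }
  if (workbook.sheetNames.indexOf(requested) === -1) {
    throw new Error(`Worksheet not found: ${requested}`);
  }
  return requested;
}
```

Failing loudly on an unknown sheet name, rather than silently falling back, makes typos in the workflow configuration easier to spot.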
Development
Setup
git clone https://github.com/your-repo/n8n-nodes-pdf-excel.git
cd n8n-nodes-pdf-excel
npm install
Build
npm run build
Test
npm test
Lint
npm run lint
Roadmap
- Basic PDF text extraction
- Basic Excel data reading
- Advanced PDF features (OCR, forms)
- Advanced Excel features (formulas, styling)
- Performance optimizations
License
MIT
Contributing
Contributions are welcome! Please read our contributing guidelines for details.