n8n-nodes-pdf-excel

n8n node for processing PDF and Excel files, with advanced features including OCR and PDF form handling.

Features

PDF Processing

Basic Features:
- Extract text from PDF files
- Get metadata information
- Process files from path or binary data

Advanced Features (new):
- OCR text extraction using Tesseract.js
- PDF form field processing
- Support for multiple languages (English, Vietnamese)
- Memory efficient processing

Excel Processing

- Read Worksheet: read data from a specific worksheet
- Get Worksheets: list all worksheets in a file
- Multiple Formats: supports .xls and .xlsx files
- Data Validation: basic validation of cell values
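
The README does not say which checks "basic data validation" covers. As a rough illustration only, the sketch below assumes two checks (required values and expected types) applied to rows that have already been read from a worksheet. The names `CellValue`, `ColumnRule`, `CellIssue`, and `validateRows` are hypothetical and are not part of this package's API.

```typescript
// Hypothetical value type: what a spreadsheet reader might return for one cell.
type CellValue = string | number | boolean | Date | null;

// Hypothetical per-column rule; not taken from the package.
interface ColumnRule {
  required?: boolean;
  type?: "string" | "number" | "boolean" | "date";
}

interface CellIssue {
  row: number; // 1-based index within the data rows
  column: string;
  reason: string;
}

// True when a non-empty value has the expected type.
function matchesType(
  value: CellValue,
  type: NonNullable<ColumnRule["type"]>,
): boolean {
  switch (type) {
    case "date":
      return value instanceof Date && !Number.isNaN(value.getTime());
    case "number":
      return typeof value === "number" && Number.isFinite(value);
    default:
      return typeof value === type;
  }
}

// Collect every rule violation instead of stopping at the first one.
function validateRows(
  rows: Array<Record<string, CellValue>>,
  rules: Record<string, ColumnRule>,
): CellIssue[] {
  const issues: CellIssue[] = [];
  rows.forEach((row, index) => {
    for (const [column, rule] of Object.entries(rules)) {
      const value = row[column] ?? null;
      const isEmpty = value === null || value === "";
      if (isEmpty) {
        if (rule.required) {
          issues.push({ row: index + 1, column, reason: "missing required value" });
        }
        continue;
      }
      if (rule.type && !matchesType(value, rule.type)) {
        issues.push({ row: index + 1, column, reason: `expected ${rule.type}` });
      }
    }
  });
  return issues;
}
```

Returning a list of issues rather than throwing lets a workflow decide whether to drop, flag, or reject the offending rows.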

Requirements

Node.js v18+
n8n v1.0+
TypeScript v5.0+

New dependencies:

Tesseract.js for OCR
pdf-lib for form processing

Installation

In n8n

1. Go to Settings > Community Nodes
2. Select Install a node from npm registry
3. Enter n8n-nodes-pdf-excel
4. Click Install

Manual Installation

# Install with dependencies
npm install n8n-nodes-pdf-excel tesseract.js pdf-lib

# Or link for development
npm link n8n-nodes-pdf-excel

Usage

Basic PDF Processing

1. Add the "PDF & Excel Processor" node
2. Select "PDF" as the file type
3. Choose an operation:
   - Extract Text
   - Get Metadata
4. Provide a file path or binary data
5. Execute the node

Advanced PDF Features (New)

1. Add the "PDF & Excel Processor" node
2. Select "PDF Advanced" as the file type
3. Choose an operation:
   - Extract Text with OCR
   - Process Form Fields
4. Optional: configure OCR settings
5. Execute the node
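
The OCR settings themselves are not documented here. As a sketch of what a language setting could translate to, the snippet below maps the two supported languages to their Tesseract language codes (`eng`, `vie`) and joins them with `+`, which is the Tesseract convention for recognizing several languages at once. The `OcrSettings` shape, the `toTesseractLangs` helper, and the fallback to English are assumptions made for illustration, not the node's actual parameters.

```typescript
// Tesseract language codes for the two languages this README lists.
const OCR_LANGUAGE_CODES = {
  English: "eng",
  Vietnamese: "vie",
} as const;

type OcrLanguage = keyof typeof OCR_LANGUAGE_CODES;

// Hypothetical settings shape; the real node parameters may differ.
interface OcrSettings {
  languages: OcrLanguage[];
}

// Build a Tesseract-style language string such as "eng+vie".
function toTesseractLangs(settings: OcrSettings): string {
  // Assumption: fall back to English when nothing is selected.
  const selected: OcrLanguage[] =
    settings.languages.length > 0 ? settings.languages : ["English"];
  const codes = selected.map((name) => OCR_LANGUAGE_CODES[name]);
  // Drop duplicates while keeping the order the user chose.
  const unique = codes.filter((code, i) => codes.indexOf(code) === i);
  return unique.join("+");
}
```

For example, selecting both languages would yield a single string that can be handed to the OCR engine, so mixed English and Vietnamese pages are recognized in one pass.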

Excel Processing

1. Add the "PDF & Excel Processor" node
2. Select "Excel" as the file type
3. Choose an operation:
   - Read Worksheet
   - Get Worksheets
4. For Read Worksheet, optionally specify the sheet name
5. Execute the node
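
Because the sheet name is optional, the node needs some rule for picking a worksheet when none is given. The README does not state that rule; the sketch below assumes the first worksheet is used as the default and that an unknown name is reported as an error. `resolveSheetName` is a hypothetical helper, not a function exported by this package.

```typescript
// Pick the worksheet to read from the names reported by "Get Worksheets".
function resolveSheetName(available: string[], requested?: string): string {
  const [first] = available;
  if (first === undefined) {
    throw new Error("Workbook contains no worksheets");
  }

  const wanted = requested?.trim();
  if (!wanted) {
    // Assumption: default to the first worksheet when no name is given.
    return first;
  }

  const match = available.find((name) => name === wanted);
  if (match === undefined) {
    throw new Error(
      `Worksheet "${wanted}" not found. Available: ${available.join(", ")}`,
    );
  }
  return match;
}
```

Listing the available names in the error message makes a misspelled sheet name easy to spot from the node's output.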

Development

Setup

git clone https://github.com/your-repo/n8n-nodes-pdf-excel.git
cd n8n-nodes-pdf-excel
npm install

Build

npm run build

Test

npm test

Lint

npm run lint

Roadmap

Basic PDF text extraction
Basic Excel data reading
Advanced PDF features (OCR, forms)
Advanced Excel features (formulas, styling)
Performance optimizations

License

MIT

Contributing

Contributions are welcome! Please read our contributing guidelines for details.