n8n-nodes-pandoc
This is an n8n community node that allows you to convert DOCX files to Markdown using Pandoc.
n8n is a fair-code licensed workflow automation platform.
Prerequisites
Before using this node, install Pandoc on the machine that runs n8n:
Windows
# Using Chocolatey
choco install pandoc
# Using Scoop
scoop install pandoc
# Or download from: https://github.com/jgm/pandoc/releases
macOS
# Using Homebrew
brew install pandoc
# Using MacPorts
sudo port install pandoc
Linux
# Ubuntu/Debian
sudo apt-get install pandoc
# CentOS/RHEL/Fedora
sudo yum install pandoc
# or
sudo dnf install pandoc
# Arch Linux
sudo pacman -S pandoc
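Whichever installer you use, it can help to confirm that Pandoc is actually reachable on the PATH before adding the node to a workflow. The following is a minimal sketch, not part of this package: the helper name `pandoc_version` is hypothetical, and it relies only on the standard `pandoc --version` command.

```python
import shutil
import subprocess
from typing import Optional


def pandoc_version(executable: str = "pandoc") -> Optional[str]:
    """Return the first line of `<executable> --version`, or None if it is not on PATH.

    Hypothetical helper for illustration; it is not provided by n8n-nodes-pandoc.
    """
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=True
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    print(pandoc_version() or "pandoc not found on PATH")
```

Run it as the same user and in the same environment (for example, inside the same container) as n8n, since the PATH may differ between environments.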
Installation
Follow the installation guide in the n8n community nodes documentation.
- Go to Settings > Community Nodes
- Select Install
- Enter n8n-nodes-pandoc as the package name
- Select Install
After installation, the Pandoc Converter node will be available in your n8n instance.
Operations
Convert DOCX to Markdown
Converts a DOCX file to Markdown format using Pandoc with full image support.
Configuration
- Input Data Field Name: The name of the binary property containing the DOCX file (default: "data")
- Output Data Field Name: The name of the binary property where the converted Markdown will be stored (default: "data")
- Extract Images: Whether to extract and process images from the DOCX file (default: true)
- Image Output Format: How to handle extracted images:
  - Embed as Base64: Images are embedded directly in the Markdown as base64 data URLs (self-contained)
  - Save as Separate Files: Images are saved as separate binary files that can be accessed individually
- Additional Pandoc Options: Optional command-line arguments to pass to Pandoc (e.g., --wrap=none --reference-links)
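To make the Additional Pandoc Options field more concrete, here is a rough sketch of how such an option string could be split and appended to a Pandoc command line. It assumes the conversion behaves like a plain `pandoc` CLI call from DOCX to Markdown; the node's actual internal command is not documented here, and `build_pandoc_command` is a hypothetical name.

```python
import shlex
from typing import List


def build_pandoc_command(
    input_path: str, output_path: str, additional_options: str = ""
) -> List[str]:
    """Build an argv list for a DOCX-to-Markdown pandoc call.

    Assumption: this mirrors what a plain pandoc invocation would look like;
    it is not taken from the node's source code.
    """
    command = [
        "pandoc",
        input_path,
        "--from=docx",
        "--to=markdown",
        f"--output={output_path}",
    ]
    # shlex.split honours shell-style quoting, so quoted values with spaces
    # stay together as one argument.
    command.extend(shlex.split(additional_options))
    return command


if __name__ == "__main__":
    print(build_pandoc_command("document.docx", "document.md",
                               "--wrap=none --reference-links"))
```

Passing the result as a list to `subprocess.run` avoids shell interpretation of the option string.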
Example Workflow
- HTTP Request node to download a DOCX file
- Pandoc Converter node to convert DOCX to Markdown with images
- Write Binary File node to save the Markdown file
- (Optional) Additional nodes to process extracted image files
Image Handling
Base64 Embedding (Default):
- Images are embedded directly in the Markdown as data:image/png;base64,iVBOR... URLs
- Creates a single, self-contained Markdown file
- Perfect for sharing or storing as a single document
- Larger file size, since base64 encoding adds roughly one third to the size of each image
Separate Files:
- Images are extracted as separate binary data properties
- Markdown references images by filename
- Allows individual processing of images
- Smaller Markdown file size
- Each image is accessible as image_1_filename.png, image_2_filename.jpg, etc.
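For readers unfamiliar with data URLs, the sketch below shows how image bytes turn into the kind of string used by the base64 mode. It illustrates the general technique only; how the node performs the embedding internally is an assumption, and `to_data_url` is a hypothetical helper.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a data URL (hypothetical helper, for illustration)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


if __name__ == "__main__":
    # The 8-byte PNG file signature; every PNG file starts with these bytes.
    png_signature = b"\x89PNG\r\n\x1a\n"
    print(to_data_url(png_signature))
    # Embedding in Markdown then looks like: ![alt text](data:image/png;base64,...)
```

Because every PNG starts with the same signature bytes, base64-encoded PNG data always begins with the characters iVBOR.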
Conversion Metadata
The node provides detailed metadata about the conversion:
{
  "conversion": {
    "originalFileName": "document.docx",
    "outputFileName": "document.md",
    "originalSize": 45234,
    "convertedSize": 12456,
    "timestamp": "2024-01-15T10:30:00.000Z",
    "extractedImages": true,
    "imageCount": 3,
    "imageOutputFormat": "base64"
  }
}
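Downstream steps can read this metadata to log or branch on the result. The sketch below parses the sample object shown above and builds a one-line summary. It assumes the metadata sits under a top-level "conversion" key exactly as in the sample; `summarize_conversion` is a hypothetical helper, not part of the node.

```python
import json

# Sample metadata copied from the example above.
SAMPLE_ITEM = json.loads("""
{
  "conversion": {
    "originalFileName": "document.docx",
    "outputFileName": "document.md",
    "originalSize": 45234,
    "convertedSize": 12456,
    "timestamp": "2024-01-15T10:30:00.000Z",
    "extractedImages": true,
    "imageOutputFormat": "base64",
    "imageCount": 3
  }
}
""")


def summarize_conversion(item: dict) -> str:
    """Return a short human-readable summary of one conversion result."""
    meta = item["conversion"]
    # Size of the Markdown output relative to the original DOCX file.
    ratio = meta["convertedSize"] / meta["originalSize"]
    return (
        f"{meta['originalFileName']} -> {meta['outputFileName']}: "
        f"{meta['imageCount']} image(s), {ratio:.1%} of original size"
    )


if __name__ == "__main__":
    print(summarize_conversion(SAMPLE_ITEM))
```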
Additional Pandoc Options Examples
- --wrap=none: Disable text wrapping
- --standalone: Produce a standalone document
- --toc: Include a table of contents (only takes effect together with --standalone)
- --reference-links: Use reference-style links
- --preserve-tabs: Preserve tabs in code blocks
Compatibility
This node has been tested with:
- n8n version 1.0.0+
- Pandoc version 2.0+