n8n-nodes-pandoc

This n8n community node converts DOCX files to Markdown using Pandoc.

n8n is a fair-code licensed workflow automation platform.

Prerequisites

Before using this node, install Pandoc on the system that runs n8n:

Windows

# Using Chocolatey
choco install pandoc

# Using Scoop
scoop install pandoc

# Or download from: https://github.com/jgm/pandoc/releases

macOS

# Using Homebrew
brew install pandoc

# Using MacPorts
sudo port install pandoc

Linux

# Ubuntu/Debian
sudo apt-get install pandoc

# CentOS/RHEL/Fedora
sudo yum install pandoc
# or
sudo dnf install pandoc

# Arch Linux
sudo pacman -S pandoc

Installation

Follow the installation guide in the n8n community nodes documentation.

1. Go to Settings > Community Nodes.
2. Select Install.
3. Enter n8n-nodes-pandoc as the package name.
4. Select Install.

After installation, the Pandoc Converter node will be available in your n8n instance.

Operations

Convert DOCX to Markdown

Converts a DOCX file to Markdown using Pandoc, optionally extracting the images embedded in the document.

Configuration

Input Data Field Name: The name of the binary property containing the DOCX file (default: "data")
Output Data Field Name: The name of the binary property where the converted Markdown will be stored (default: "data")
Extract Images: Whether to extract and process images from the DOCX file (default: true)
Image Output Format: How to handle extracted images:
- Embed as Base64: Images are embedded directly in the Markdown as base64 data URLs (self-contained)
- Save as Separate Files: Images are saved as separate binary files that can be accessed individually
Additional Pandoc Options: Optional command-line arguments to pass to Pandoc (e.g., --wrap=none --reference-links)
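
To picture what these settings amount to on the Pandoc side, here is a minimal TypeScript sketch. It is not the node's actual implementation: the `ConvertOptions` interface, the `buildPandocArgs` function, and the `mediaDir` field are hypothetical names introduced for illustration. The flags themselves (`--from`, `--to`, `--output`, `--extract-media`) are standard Pandoc options.

```typescript
// Hypothetical option names that mirror the configuration fields above.
interface ConvertOptions {
  inputPath: string;           // DOCX file on disk
  outputPath: string;          // where the Markdown should be written
  extractImages: boolean;      // the "Extract Images" setting
  mediaDir: string;            // assumed directory for extracted media
  additionalOptions: string[]; // "Additional Pandoc Options", already split
}

// Build the argument list for a `pandoc` invocation.
function buildPandocArgs(opts: ConvertOptions): string[] {
  const args = [
    "--from=docx",
    "--to=markdown",
    opts.inputPath,
    `--output=${opts.outputPath}`,
  ];
  if (opts.extractImages) {
    // Ask Pandoc to write embedded images into mediaDir.
    args.push(`--extract-media=${opts.mediaDir}`);
  }
  // User-supplied options go last.
  return [...args, ...opts.additionalOptions];
}
```

Keeping the arguments as an array, rather than a single command string, means they can be handed to a process-spawning API without going through a shell.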

Example Workflow

1. HTTP Request node to download a DOCX file
2. Pandoc Converter node to convert DOCX to Markdown with images
3. Write Binary File node to save the Markdown file
4. (Optional) Additional nodes to process extracted image files

Image Handling

Base64 Embedding (Default):

- Images are embedded directly in the Markdown as data:image/png;base64,iVBOR... URLs
- Creates a single, self-contained Markdown file
- Well suited to sharing or storing as a single document
- Preserves original alt text from the document
- Larger file size: base64 encoding adds roughly 33% to each image's size

Base64 Embedding (Compact):

- Images are embedded as base64 with simplified alt text (filename only)
- Smaller Markdown output than the default mode when the original alt text is long
- Still self-contained, with more readable image tags
- Better for documents with many images or very long alt text
- Handles large images (>5 MB) by emitting size warnings

Reference Links (Base64):

- Uses Markdown reference-style links with base64 data
- Cleaner, more readable Markdown content
- Base64 data stored at the end of the document
- Better compatibility with Markdown viewers
- Example: ![image1.png][img_1] in the text, with [img_1]: data:image/png;base64,... at the bottom of the document
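
The reference-link layout can be illustrated with a short TypeScript sketch that rewrites inline base64 images into reference-style links. This is an illustration of the output shape only, not the node's own code: `toReferenceStyleImages` is a hypothetical helper, and the `img_N` label scheme is taken from the example above.

```typescript
// Rewrite inline data-URL images as reference-style links and
// append the link definitions at the end of the document.
function toReferenceStyleImages(markdown: string): string {
  const definitions: string[] = [];
  // Matches ![alt](data:image/<type>;base64,<payload>)
  const inlineImage =
    /!\[([^\]]*)\]\((data:image\/[A-Za-z0-9.+-]+;base64,[A-Za-z0-9+\/=]+)\)/g;

  const body = markdown.replace(
    inlineImage,
    (_match: string, alt: string, dataUrl: string) => {
      const label = `img_${definitions.length + 1}`;
      definitions.push(`[${label}]: ${dataUrl}`);
      return `![${alt}][${label}]`;
    },
  );

  // Leave documents without inline images untouched.
  return definitions.length === 0
    ? body
    : `${body}\n\n${definitions.join("\n")}`;
}
```

With this layout the long base64 payloads sit together at the end of the file, so the body text stays readable in a plain editor.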

Separate Files:

- Images are extracted as separate binary data properties
- Markdown references images by filename
- Allows individual processing of images
- Smaller Markdown file size
- Each image is accessible as image_1_filename.png, image_2_filename.jpg, etc.

Conversion Metadata

The node provides detailed metadata about the conversion:

{
  "conversion": {
    "originalFileName": "document.docx",
    "outputFileName": "document.md",
    "originalSize": 45234,
    "convertedSize": 12456,
    "timestamp": "2024-01-15T10:30:00.000Z",
    "extractedImages": true,
    "imageOutputFormat": "base64",
    "imageCount": 3
  }
}
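
As a sketch of how a downstream step might consume this metadata, the TypeScript below types the object and builds a one-line summary. The `ConversionMetadata` interface and `summarizeConversion` function are hypothetical names. The field list is copied from the example above, and treating the two size fields as byte counts is an assumption.

```typescript
// Shape of the "conversion" object shown above.
interface ConversionMetadata {
  originalFileName: string;
  outputFileName: string;
  originalSize: number;   // assumed to be bytes
  convertedSize: number;  // assumed to be bytes
  timestamp: string;      // ISO 8601 timestamp
  extractedImages: boolean;
  imageOutputFormat: string;
  imageCount: number;
}

// Produce a short human-readable summary of one conversion.
function summarizeConversion(item: { conversion: ConversionMetadata }): string {
  const c = item.conversion;
  // Guard against division by zero for empty inputs.
  const percent =
    c.originalSize > 0 ? (c.convertedSize / c.originalSize) * 100 : 0;
  return (
    `${c.originalFileName} -> ${c.outputFileName}: ` +
    `${c.originalSize} -> ${c.convertedSize} bytes ` +
    `(${percent.toFixed(1)}%), ${c.imageCount} image(s)`
  );
}
```

For the sample metadata above, the converted file comes out at about 27.5% of the original size.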

Additional Pandoc Options Examples

--wrap=none - Disable text wrapping
--standalone - Produce a standalone document
--toc - Include a table of contents (has no effect unless --standalone is also set)
--reference-links - Use reference-style links
--preserve-tabs - Preserve tabs in code blocks
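
Because these options are entered as a single string, something has to split that string into separate arguments before Pandoc sees them. The TypeScript sketch below shows one way to do that while respecting quoted values. How the node itself splits the string is not documented here, so `splitPandocOptions` is a hypothetical helper, not the node's behavior.

```typescript
// Split an options string into arguments, honoring single and double quotes.
function splitPandocOptions(input: string): string[] {
  const args: string[] = [];
  let current = "";
  let quote: string | null = null; // the quote character we are inside, if any
  let hasToken = false;            // true once the current argument has started

  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    if (quote !== null) {
      if (ch === quote) {
        quote = null;              // closing quote
      } else {
        current += ch;             // keep whitespace inside quotes
      }
    } else if (ch === '"' || ch === "'") {
      quote = ch;                  // opening quote
      hasToken = true;
    } else if (/\s/.test(ch)) {
      if (hasToken) {
        args.push(current);        // whitespace ends the current argument
        current = "";
        hasToken = false;
      }
    } else {
      current += ch;
      hasToken = true;
    }
  }

  if (quote !== null) {
    throw new Error("Unterminated quote in Pandoc options");
  }
  if (hasToken) {
    args.push(current);
  }
  return args;
}
```

A plain whitespace split would break a value such as `title="My Doc"` into two arguments, which is the case the quote handling is there to cover.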

Compatibility

This node has been tested with:

- n8n version 1.0.0+
- Pandoc version 2.0+
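
To check an installed Pandoc against the 2.0 minimum, one option is to parse the first line of `pandoc --version`, which normally looks like `pandoc 3.1.9`. The TypeScript sketch below does only the parsing, so it assumes the command output has already been captured as a string. `parsePandocVersion` and `meetsMinimumPandoc` are hypothetical helpers, not part of this package.

```typescript
// Extract the numeric version components from `pandoc --version` output.
function parsePandocVersion(versionOutput: string): number[] | null {
  // Accept "pandoc 3.1.9" as well as "pandoc.exe 2.7.3".
  const match = /^pandoc(?:\.exe)?\s+(\d+(?:\.\d+)*)/i.exec(versionOutput.trim());
  if (match === null) {
    return null;
  }
  const versionText = match[1];
  if (versionText === undefined) {
    return null;
  }
  return versionText.split(".").map((part) => Number(part));
}

// True when the major version is at least minimumMajor (default 2).
function meetsMinimumPandoc(versionOutput: string, minimumMajor = 2): boolean {
  const version = parsePandocVersion(versionOutput);
  if (version === null) {
    return false;
  }
  const major = version[0];
  return major !== undefined && major >= minimumMajor;
}
```

Unrecognized output is treated as failing the check, which errs on the side of reporting a missing or unusable installation.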

License

MIT