pdf2md
JavaScript npm library to parse PDF files and convert them into Markdown
Major Changes
See Releases
Usage
Library
const fs = require('fs')
const path = require('path')
const pdf2md = require('@opendocsg/pdf2md')
// filePath, callbacks, allOutputPaths and i are defined by the calling code
// (for example, a loop over the input files)
const pdfBuffer = fs.readFileSync(filePath)
pdf2md(pdfBuffer, callbacks)
  .then(text => {
    const outputFile = allOutputPaths[i] + '.md'
    console.log(`Writing to ${outputFile}...`)
    fs.writeFileSync(path.resolve(outputFile), text)
    console.log('Done.')
  })
  .catch(err => {
    console.error(err)
  })
CLI tool
$ cd [project_folder]
$ npx @opendocsg/pdf2md --inputFolderPath=[your input folder path] --outputFolderPath=[your output folder path] --recursive
If you are converting a large number of files recursively, you might encounter the error "Allocation failed - JavaScript heap out of memory". In that case, run the CLI script directly with a larger heap limit. The Node.js flag must come before the script path, otherwise it is passed to the script instead of to Node.js:
$ node --max-old-space-size=4096 lib/pdf2md-cli.js --inputFolderPath=[your input folder path] --outputFolderPath=[your output folder path] --recursive
Options:
- --inputFolderPath: input folder path (should exist)
- --outputFolderPath: output folder path (should exist)
- --recursive: also convert PDFs in folders within folders. Pass the flag if you want recursive conversion, and omit it if you don't
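The effect of the recursive option can be sketched with the library API. This is a minimal illustration, not the CLI's actual implementation: findPdfFiles and convertFolder are hypothetical helper names, and the sketch assumes that output files should mirror the input folder layout with a .md extension. The converter is passed in as a parameter so the helpers do not depend on any particular package.

```javascript
const fs = require('fs')
const path = require('path')

// Hypothetical helper: collect .pdf paths under dir.
// Subfolders are visited only when recursive is true.
function findPdfFiles (dir, recursive = false) {
  const found = []
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name)
    if (entry.isDirectory()) {
      if (recursive) found.push(...findPdfFiles(fullPath, true))
    } else if (entry.isFile() && path.extname(entry.name).toLowerCase() === '.pdf') {
      found.push(fullPath)
    }
  }
  return found.sort()
}

// Hypothetical helper: convert the PDFs one at a time and write each result
// to outputDir, mirroring the input folder layout (an assumption of this sketch).
// convert is any function (buffer) => Promise<string>.
async function convertFolder (convert, inputDir, outputDir, recursive = false) {
  const written = []
  for (const pdfPath of findPdfFiles(inputDir, recursive)) {
    const relative = path.relative(inputDir, pdfPath)
    const outputFile = path.join(outputDir, relative.replace(/\.pdf$/i, '.md'))
    fs.mkdirSync(path.dirname(outputFile), { recursive: true })
    const text = await convert(fs.readFileSync(pdfPath))
    fs.writeFileSync(outputFile, text)
    written.push(outputFile)
  }
  return written
}
```

With the library from the Usage section, the converter argument could be a small wrapper such as buf => pdf2md(buf, callbacks), for example convertFolder(buf => pdf2md(buf, callbacks), 'in', 'out', true). The folder names here are placeholders.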
Credits
pdf-to-markdown - original project by Johannes Zillmann
pdf.js - Mozilla's PDF parsing & rendering platform which is used as a raw parser