html‑to‑document
Convert any HTML into production‑ready documents: DOCX out of the box, with PDF, XLSX, and other formats possible through custom adapters.
html‑to‑document parses HTML into an intermediate, format‑agnostic tree and then feeds that tree to adapters (e.g. DOCX, PDF).
Write HTML → get Word, PDFs, spreadsheets, and more — all with one unified TypeScript API.
How It Works
Below is a high-level overview of the conversion pipeline. The library processes the HTML input through optional middleware steps, parses it into a structured intermediate representation, and then delegates to an adapter to generate the desired output format.

The stages are:
- Input: Raw HTML input as a string.
- Middleware: One or more middleware functions can inspect or transform the HTML string before parsing (e.g., sanitization, custom tags).
- Parser: Converts the (possibly modified) HTML string into an array of `DocumentElement` objects, representing a structured AST.
- Adapter: Takes the parsed `DocumentElement[]` and renders it into the target format (e.g., DOCX, PDF, Markdown) via a registered adapter.
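The stages above can be sketched as a small, dependency-free pipeline. This is only an illustration of the flow, not the library's internals: `SketchElement`, `parseFlat`, `stripScripts`, `markdownAdapter`, and `runPipeline` are hypothetical names, and the toy parser only understands flat, non-nested tags.

```typescript
// Simplified stand-in for the library's DocumentElement (hypothetical shape).
type SketchElement = { type: string; text: string };
type Middleware = (html: string) => Promise<string>;
type Adapter = (elements: SketchElement[]) => Promise<string>;

// Middleware: runs on the raw HTML string before parsing.
const stripScripts: Middleware = async (html) =>
  html.replace(/<script[\s\S]*?<\/script>/gi, '');

// Parser: a toy that only handles flat tags such as <h1>text</h1> or <p>text</p>.
function parseFlat(html: string): SketchElement[] {
  const elements: SketchElement[] = [];
  const tag = /<(\w+)[^>]*>([^<]*)<\/\1>/g;
  let match: RegExpExecArray | null;
  while ((match = tag.exec(html)) !== null) {
    const [, name = '', text = ''] = match;
    elements.push({ type: name.toLowerCase(), text: text.trim() });
  }
  return elements;
}

// Adapter: renders the element array into a target format (Markdown here).
const markdownAdapter: Adapter = async (elements) =>
  elements
    .map((el) => (el.type === 'h1' ? `# ${el.text}` : el.text))
    .join('\n\n');

// Pipeline: middleware steps in order, then parse, then hand off to the adapter.
async function runPipeline(
  html: string,
  middleware: Middleware[],
  adapter: Adapter,
): Promise<string> {
  let current = html;
  for (const step of middleware) {
    current = await step(current);
  }
  return adapter(parseFlat(current));
}
```

Calling `runPipeline('<h1>Hello</h1><script>alert(1)</script><p>World</p>', [stripScripts], markdownAdapter)` drops the script in the middleware step and resolves to a Markdown heading followed by a paragraph.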
✨ Key Features
| Feature | Description |
|---|---|
| Format‑agnostic core | Converts HTML into a reusable DocumentElement[] structure |
| DOCX adapter (built‑in) | Powered by docx with rich style support |
| Pluggable adapters | Create and add your own adapter for PDF, XLSX, Markdown, etc. |
| Style mapping engine | Define your own css mappings for the adapters and set per‑format defaults |
| Custom tag handlers | Override or extend how any HTML tag is parsed |
| Middleware pipeline | Transform or sanitise HTML before parsing |
📦 Installation
npm install html-to-document
🚀 Quick Start
import { init, DocxAdapter } from 'html-to-document';
import fs from 'fs';
const converter = init({
  adapters: {
    register: [
      { format: 'docx', adapter: DocxAdapter },
    ],
  },
});
const html = '<h1>Hello World</h1>';
const buffer = await converter.convert(html, 'docx'); // ↩️ Buffer in Node / Blob in browser
fs.writeFileSync('output.docx', buffer);
Registering adapters manually
import { init } from 'html-to-document';
import { DocxAdapter } from 'html-to-document-adapter-docx';
const converter = init({
  adapters: {
    register: [
      { format: 'docx', adapter: DocxAdapter },
    ],
  },
});
Tip: you can bundle multiple adapters:
register: [ { format: 'docx', adapter: DocxAdapter }, { format: 'pdf', adapter: PdfAdapter }, ]
The rest of the API stays the same—convert(html, 'docx'), convert(html, 'pdf'), etc.
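One way to picture why the call site does not change as adapters are added is a registry keyed by format string. The sketch below is an assumption about the general pattern, not the library's actual implementation: `SketchAdapter`, `AdapterRegistry`, `UpperCaseAdapter`, and `CsvAdapter` are hypothetical, and plain strings stand in for parsed elements.

```typescript
// Hypothetical adapter shape for illustration; not the library's actual interface.
interface SketchAdapter {
  convert(elements: string[]): Promise<string>;
}

// Mirrors the { format, adapter } registration entries shown above.
type Registration = { format: string; adapter: new () => SketchAdapter };

class UpperCaseAdapter implements SketchAdapter {
  async convert(elements: string[]): Promise<string> {
    return elements.join('\n').toUpperCase();
  }
}

class CsvAdapter implements SketchAdapter {
  async convert(elements: string[]): Promise<string> {
    return elements.join(',');
  }
}

class AdapterRegistry {
  private readonly adapters = new Map<string, SketchAdapter>();

  constructor(registrations: Registration[]) {
    // Instantiate each adapter class once and index it by its format key.
    for (const { format, adapter } of registrations) {
      this.adapters.set(format, new adapter());
    }
  }

  // The same entry point serves every format; only the key changes.
  async convert(elements: string[], format: string): Promise<string> {
    const adapter = this.adapters.get(format);
    if (!adapter) {
      throw new Error(`No adapter registered for format "${format}"`);
    }
    return adapter.convert(elements);
  }
}

const registry = new AdapterRegistry([
  { format: 'upper', adapter: UpperCaseAdapter },
  { format: 'csv', adapter: CsvAdapter },
]);
```

With this shape, `registry.convert(['a', 'b'], 'csv')` and `registry.convert(['a', 'b'], 'upper')` go through the same method, and an unregistered format rejects with an error.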
Need just the parsed structure?
const elements = await converter.parse('<p>Some HTML</p>');
console.log(elements); // => DocumentElement[]
📚 Documentation & Demo
| Resource | Link |
|---|---|
| Full Docs | https://html-to-document.vercel.app/ |
| Live Demo (TinyMCE) | https://html-to-document-demo.vercel.app |
🛠 Extending
- Style mappings: fine‑tune CSS → DOCX/PDF with `StyleMapper`
- Tag handlers: intercept `<custom-tag>` → your own `DocumentElement`
- Custom adapters: implement `IDocumentConverter` to target new formats
See the Extensibility Guide.
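As a rough picture of the tag-handler idea, the sketch below maps tag names to functions that produce an element and falls back to a default for everything else. All names here (`HandledElement`, `TagHandler`, `handlers`, `handleTag`) are hypothetical; the library's real registration API and `DocumentElement` shape should be taken from the Extensibility Guide.

```typescript
// Hypothetical element and handler shapes; the library's real types may differ.
type HandledElement = {
  type: string;
  text: string;
  meta?: Record<string, string>;
};
type TagHandler = (tagName: string, innerText: string) => HandledElement;

// Fallback used for any tag without a dedicated handler.
const defaultHandler: TagHandler = (tagName, innerText) => ({
  type: tagName === 'p' ? 'paragraph' : tagName,
  text: innerText,
});

const handlers = new Map<string, TagHandler>();

// Intercept <custom-tag> and map it to a paragraph carrying extra metadata.
handlers.set('custom-tag', (_tagName, innerText) => ({
  type: 'paragraph',
  text: innerText,
  meta: { source: 'custom-tag' },
}));

// Look up a handler by lower-cased tag name, falling back to the default.
function handleTag(tagName: string, innerText: string): HandledElement {
  const name = tagName.toLowerCase();
  const handler = handlers.get(name) ?? defaultHandler;
  return handler(name, innerText);
}
```

Keeping handlers in a map means overriding a built-in tag and adding a new one are the same operation: set a function under the tag's name.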
🧑‍💻 Contributing
Contributions are welcome!
Please read CONTRIBUTING.md and follow the Code of Conduct.
📝 Changelog
All notable changes are documented in CHANGELOG.md.
📄 License
ISC — a permissive, MIT‑style license that allows free use, modification, and distribution without requiring permission.