webpage2pdf
A powerful tool to convert web pages, HTML strings, Buffers, or Streams to PDF files. Supports both command-line and function call usage, with advanced features like stream output and multi-page merging.
Installation
Option 1: Global Installation (Recommended)
Using npm:
npm install -g webpage2pdf
Or using pnpm:
pnpm add -g webpage2pdf
Then you can use it directly:
webpage2pdf https://www.example.com
Option 2: Local Installation
Using npm:
npm install webpage2pdf
# Run with npx
npx webpage2pdf https://www.example.com
Or using pnpm:
pnpm add webpage2pdf
# Run with pnpm exec
pnpm exec webpage2pdf https://www.example.com
Option 3: Install as Dependency
Using npm:
npm install webpage2pdf --save
Or using pnpm:
pnpm add webpage2pdf
Usage
Method 1: Command Line
Basic Usage
# Convert a webpage to PDF (default: uses page title as filename)
webpage2pdf https://www.example.com
# Output: ./Example_202512251539.pdf
# Specify output path
webpage2pdf https://www.example.com -o ./my-pdf.pdf
# Specify page size
webpage2pdf https://www.example.com -s A4_PRINT
# Wait longer (ensure page fully loads)
webpage2pdf https://www.example.com -w 5000
# Wait for specific element
webpage2pdf https://www.example.com --selector "button"
Command Line Options
| Option | Short | Description | Default |
|---|---|---|---|
| --output | -o | Output file path | Uses page title (document.title) |
| --size | -s | Page size (A4, A4_PRINT, A3, LETTER) | A4 |
| --wait | -w | Wait time (milliseconds) | 3000 |
| --selector | | Selector to wait for (e.g., button, #content) | None |
| --header | | Custom headers (format: key:value, can be used multiple times) | None |
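The --header option takes key:value pairs and can be repeated, while the function API takes a headers object. A minimal sketch of how one form maps to the other is shown here. parseHeaderArgs is a hypothetical helper, not something exported by webpage2pdf, and splitting on the first colon only is an assumption about how values containing colons should be treated.

```javascript
// Hypothetical helper (not part of webpage2pdf): converts repeated
// "key:value" strings into a headers object.
function parseHeaderArgs(pairs) {
  const headers = {};
  for (const pair of pairs) {
    const index = pair.indexOf(':');
    if (index <= 0) {
      throw new Error(`Invalid header "${pair}", expected key:value`);
    }
    // Assumption: split on the first colon only, so values that contain
    // colons (for example URLs) stay intact.
    const key = pair.slice(0, index).trim();
    const value = pair.slice(index + 1).trim();
    headers[key] = value;
  }
  return headers;
}

console.log(parseHeaderArgs(['Authorization:Bearer token', 'X-Custom-Header:value']));
```

The resulting object has the same shape as the headers option shown in the function call examples.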
Method 2: Function Call
Basic Usage
const { generatePdf } = require('webpage2pdf');
// Basic usage
async function example() {
  const result = await generatePdf('https://www.example.com', './output.pdf');
  if (result.success) {
    console.log('PDF generated successfully:', result.path);
    console.log('File size:', result.size, 'bytes');
    console.log('Page title:', result.title);
  } else {
    console.error('Generation failed:', result.error);
  }
}
example();
Advanced Usage
const { generatePdf, PAGE_SIZE_CONFIG, setVerbose } = require('webpage2pdf');
// Disable verbose logging (recommended for function calls)
setVerbose(false);
async function advancedExample() {
  const result = await generatePdf('https://example.com', './output.pdf', {
    pageSize: 'A4_PRINT', // Page size
    waitTime: 5000, // Wait time (milliseconds)
    selector: '#content', // Wait for specific element
    headers: { // Custom headers
      'Authorization': 'Bearer token',
      'X-Custom-Header': 'value'
    }
  });
  if (result.success) {
    console.log('Success:', result.path);
  }
}
advancedExample();
API Documentation
generatePdf(input, outputPath, options)
Convert a webpage or HTML to PDF.
Parameters:
- input (string|string[]|Buffer|Readable, required) - Input:
  - string - URL or HTML string
  - string[] - URL array (multi-page merge)
  - Buffer - HTML Buffer
  - Readable - HTML Stream
- outputPath (string|null, optional) - Output file path:
  - string - Save to file
  - null - Return Stream
- options (object, optional) - Configuration options:
  - pageSize (string) - Page size, options: A4, A4_PRINT, A3, LETTER, default: A4
  - waitTime (number) - Wait time (milliseconds), default: 3000
  - selector (string) - Selector to wait for (optional), default: null
  - headers (object) - Custom headers (optional), default: {}
  - margin (object) - Page margins, format: {top, right, bottom, left}, unit: mm, default: {top: '0mm', right: '0mm', bottom: '0mm', left: '0mm'}
  - scale (number) - Scale factor (0.1-2), default: 1
  - printBackground (boolean) - Print background, default: true
  - ignore (string|RegExp|Array) - Errors to ignore (string, regex, or array), default: []
  - debug (boolean) - Output debug information, default: false
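To make the documented defaults and ranges concrete, the following sketch fills in missing options and rejects out-of-range values before a call. normalizeOptions, DEFAULTS, and PAGE_SIZES are hypothetical names introduced for illustration; the package may validate differently or not at all, and merging a partial margin object with the default sides is an assumption.

```javascript
// Hypothetical pre-flight check, not part of webpage2pdf. Values mirror
// the defaults and ranges listed in the options documentation.
const PAGE_SIZES = ['A4', 'A4_PRINT', 'A3', 'LETTER'];

const DEFAULTS = {
  pageSize: 'A4',
  waitTime: 3000,
  selector: null,
  headers: {},
  margin: { top: '0mm', right: '0mm', bottom: '0mm', left: '0mm' },
  scale: 1,
  printBackground: true,
  ignore: [],
  debug: false,
};

function normalizeOptions(options = {}) {
  const merged = {
    ...DEFAULTS,
    ...options,
    // Assumption: a partial margin object keeps the default for other sides.
    margin: { ...DEFAULTS.margin, ...(options.margin || {}) },
  };
  if (!PAGE_SIZES.includes(merged.pageSize)) {
    throw new RangeError(`Unknown pageSize: ${merged.pageSize}`);
  }
  if (typeof merged.scale !== 'number' || merged.scale < 0.1 || merged.scale > 2) {
    throw new RangeError('scale must be a number between 0.1 and 2');
  }
  if (!Number.isFinite(merged.waitTime) || merged.waitTime < 0) {
    throw new RangeError('waitTime must be a non-negative number of milliseconds');
  }
  return merged;
}

console.log(normalizeOptions({ scale: 0.9, margin: { top: '20mm' } }));
```

Checking options up front gives an immediate, descriptive error instead of a failure after the browser has already started.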
Return Value:
{
  success: boolean, // Whether successful
  path?: string, // Output file path (when file output)
  stream?: Readable, // PDF Stream (when stream output)
  size: number, // File size (bytes)
  title?: string, // Page title (when URL input)
  error?: string, // Error message (when failed)
  ignored?: boolean // Whether ignored (when error ignored)
}
setVerbose(verbose)
Set whether to output verbose logs.
Parameters:
verbose(boolean) - Whether to output logs, default:true
PAGE_SIZE_CONFIG
Page size configuration object containing all available page sizes.
Examples
Example 1: Convert Public Webpage
# Command line
webpage2pdf https://www.example.com -o example.pdf
// Function call (wrapped in an async function because CommonJS has no top-level await)
const { generatePdf } = require('webpage2pdf');
(async () => {
  await generatePdf('https://www.example.com', './example.pdf');
})();
Example 2: Convert Authenticated Page
# Command line
webpage2pdf https://api.example.com/page \
  --header "Authorization:Bearer token" \
  -o authenticated.pdf
// Function call
const { generatePdf } = require('webpage2pdf');
(async () => {
  await generatePdf('https://api.example.com/page', './authenticated.pdf', {
    headers: {
      'Authorization': 'Bearer token'
    }
  });
})();
Example 3: Wait for Dynamic Content
# Command line
webpage2pdf https://example.com/dynamic-page \
  --selector "#content" \
  -w 10000 \
  -o dynamic-page.pdf
// Function call
const { generatePdf } = require('webpage2pdf');
(async () => {
  await generatePdf('https://example.com/dynamic-page', './dynamic-page.pdf', {
    selector: '#content',
    waitTime: 10000
  });
})();
Example 4: Stream Output
// Function call - Return Stream
const { generatePdf } = require('webpage2pdf');
const fs = require('fs');
(async () => {
  const result = await generatePdf('https://example.com', null);
  if (result.success) {
    result.stream.pipe(fs.createWriteStream('output.pdf'));
  }
})();
Example 5: HTML String Input
const { generatePdf } = require('webpage2pdf');
const html = `
<html>
<head><title>Test</title></head>
<body><h1>Hello World</h1></body>
</html>
`;
(async () => {
  const result = await generatePdf(html, './output.pdf');
})();
Example 6: Multi-page Merge
const { generatePdf } = require('webpage2pdf');
const urls = [
  'https://example.com/page1',
  'https://example.com/page2',
  'https://example.com/page3'
];
// Merge multiple pages into one PDF
(async () => {
  const result = await generatePdf(urls, './combined.pdf', {
    pageSize: 'A4',
    waitTime: 5000
  });
})();
Note: The current multi-page merge uses simple Buffer concatenation, which may not properly handle complex PDF structures. For professional PDF merging (preserving bookmarks, table of contents, etc.), it's recommended to:
- Use pdf-lib or similar libraries to implement merge logic
- Generate PDFs separately first, then merge using professional tools
Example 7: Error Ignoring
const { generatePdf } = require('webpage2pdf');
(async () => {
  const result = await generatePdf('https://example.com', './output.pdf', {
    ignore: ['timeout', /network error/i], // Ignore specific errors
    debug: true // Enable debug mode
  });
})();
Example 8: Custom Margins and Scaling
const { generatePdf } = require('webpage2pdf');
(async () => {
  const result = await generatePdf('https://example.com', './output.pdf', {
    margin: { top: '20mm', right: '15mm', bottom: '20mm', left: '15mm' },
    scale: 0.9, // Scale to 90%
    printBackground: true // Print background
  });
})();
Example 9: Batch Conversion
const { generatePdf, setVerbose } = require('webpage2pdf');
// Disable verbose logging
setVerbose(false);
const urls = [
  'https://example.com/page1',
  'https://example.com/page2',
  'https://example.com/page3'
];
async function batchConvert() {
  for (const url of urls) {
    const result = await generatePdf(url, `./${Date.now()}.pdf`);
    console.log(result.success ? '✓' : '✗', url);
  }
}
batchConvert();
Supported Page Sizes
- A4: 210mm × 297mm (Standard A4)
- A4_PRINT: 216mm × 291mm (A4 Print Size)
- A3: 297mm × 420mm (A3)
- LETTER: 8.5in × 11in (US Letter)
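The sizes above mix millimetres and inches. As a small worked example, the sketch below converts any of them to millimetres. PAGE_DIMENSIONS and toMillimetres are illustrative names; the object simply copies the list above and is not claimed to match the shape of the exported PAGE_SIZE_CONFIG.

```javascript
// Illustrative copy of the documented sizes. This is an assumption for the
// example and may differ from the real PAGE_SIZE_CONFIG structure.
const PAGE_DIMENSIONS = {
  A4: { width: 210, height: 297, unit: 'mm' },
  A4_PRINT: { width: 216, height: 291, unit: 'mm' },
  A3: { width: 297, height: 420, unit: 'mm' },
  LETTER: { width: 8.5, height: 11, unit: 'in' },
};

const MM_PER_INCH = 25.4;

// Returns the page size in millimetres, rounded to one decimal place.
function toMillimetres(name) {
  const size = PAGE_DIMENSIONS[name];
  if (!size) {
    throw new RangeError(`Unknown page size: ${name}`);
  }
  const factor = size.unit === 'in' ? MM_PER_INCH : 1;
  const round = (value) => Math.round(value * factor * 10) / 10;
  return { width: round(size.width), height: round(size.height) };
}

console.log(toMillimetres('LETTER')); // { width: 215.9, height: 279.4 }
console.log(toMillimetres('A4')); // { width: 210, height: 297 }
```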
Technical Details
- Uses Puppeteer for webpage rendering and PDF generation
- Prefers system Chrome (if available), otherwise uses Puppeteer's bundled Chromium
- Supports multiple input types: URL, HTML string, Buffer, Stream, URL array
- Supports stream output: can return Stream instead of just files
- Supports multi-page merging: can merge multiple webpages into one PDF (currently uses simple concatenation, suitable for simple scenarios)
- Supports custom headers (for authenticated pages)
- Supports waiting for specific elements (for dynamic content)
- Supports error ignoring mechanism (can ignore specific errors)
- Default filename: If the -o parameter is not specified, the page's document.title is used as the filename
- Timestamp format: The timestamp in the filename uses the format YYYYMMDDHHmm (e.g., 202512251539)
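The YYYYMMDDHHmm timestamp format can be reproduced with a few lines of plain JavaScript. The sketch below is only an illustration: formatTimestamp and defaultFilename are hypothetical names, the use of local time rather than UTC is an assumption, and the title_timestamp.pdf pattern is inferred from the sample output Example_202512251539.pdf rather than taken from the package source.

```javascript
// Formats a Date as YYYYMMDDHHmm (assumption: local time, not UTC).
function formatTimestamp(date) {
  const pad = (n) => String(n).padStart(2, '0');
  return (
    String(date.getFullYear()) +
    pad(date.getMonth() + 1) + // getMonth() is zero-based
    pad(date.getDate()) +
    pad(date.getHours()) +
    pad(date.getMinutes())
  );
}

// Hypothetical filename builder following the pattern seen in the sample
// output; any title sanitising done by the real tool is not reproduced.
function defaultFilename(title, date) {
  return `${title}_${formatTimestamp(date)}.pdf`;
}

console.log(defaultFilename('Example', new Date(2025, 11, 25, 15, 39)));
// Example_202512251539.pdf
```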
New Features
✨ Stream Output
Supports returning Stream, suitable for pipe operations and stream processing:
const result = await generatePdf('https://example.com', null);
result.stream.pipe(fs.createWriteStream('output.pdf'));
✨ Multiple Input Types
- URL: 'https://example.com'
- HTML String: '<html>...</html>'
- HTML Buffer: Buffer.from('<html>...</html>')
- HTML Stream: fs.createReadStream('input.html')
- URL Array: ['url1', 'url2'] (multi-page merge, currently uses simple concatenation)
✨ Error Ignoring
Can ignore specific errors to avoid interrupting the flow due to non-fatal errors:
await generatePdf(url, './output.pdf', {
  ignore: ['timeout', /network error/i]
});
✨ Enhanced Configuration
- Custom margins: margin: { top: '20mm', right: '15mm', ... }
- Scale factor: scale: 0.9
- Background print control: printBackground: true/false
- Debug mode: debug: true
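For the Error Ignoring feature listed above, the ignore option accepts a string, a RegExp, or an array of either. How the package compares these against an error is not documented here, so the following is only a sketch of one plausible rule. shouldIgnore is a hypothetical function, and treating strings as case-insensitive substrings is an assumption.

```javascript
// Hypothetical matcher, not the package's actual implementation.
function shouldIgnore(message, ignore) {
  const patterns = Array.isArray(ignore) ? ignore : [ignore];
  return patterns.some((pattern) => {
    if (pattern instanceof RegExp) {
      pattern.lastIndex = 0; // keep global/sticky regexes stateless between calls
      return pattern.test(message);
    }
    if (typeof pattern === 'string' && pattern.length > 0) {
      // Assumption: strings match as case-insensitive substrings.
      return message.toLowerCase().includes(pattern.toLowerCase());
    }
    return false;
  });
}

const rules = ['timeout', /network error/i];
console.log(shouldIgnore('Navigation timeout of 30000 ms exceeded', rules)); // true
console.log(shouldIgnore('Protocol error: target closed', rules)); // false
```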
Notes
- First Run: If using Puppeteer's bundled Chromium, it will be automatically downloaded on first run (~200MB)
- Network Connection: Ensure you can access the target webpage
- Page Loading: For pages with lots of dynamic content, consider increasing the wait time or using the --selector option
- Authenticated Pages: If accessing authenticated pages, use the --header option or the headers parameter to add authentication information
- Multi-page Merge: Current implementation uses simple Buffer concatenation, suitable for simple scenarios. For professional PDF merging (preserving bookmarks, table of contents, metadata, etc.), it's recommended to use pdf-lib or similar libraries to implement merge logic
- Stream Output: When using stream output, ensure timely processing of Stream to avoid excessive memory usage
- HTML Input: When using HTML string input, ensure HTML format is correct, otherwise rendering may fail
Troubleshooting
Issue: Chrome Not Found
Solution:
- macOS: Ensure Google Chrome is installed
- Linux: Install Chromium or use the puppeteer bundled version
- Windows: Ensure Chrome is installed
Issue: Page Load Timeout
Solution:
- Increase wait time: -w 10000 or waitTime: 10000
- Use selector wait: --selector "#content" or selector: '#content'
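One way to apply the "increase wait time" advice programmatically is to retry with progressively longer waits. The sketch below assumes that the generator resolves to an object with a success flag instead of throwing, as the Basic Usage example suggests. generateWithRetries is a hypothetical wrapper, and the generator is passed in as an argument so the example can run with a stub.

```javascript
// Hypothetical wrapper: `generate` is expected to behave like generatePdf,
// resolving to an object that carries a `success` flag.
async function generateWithRetries(generate, input, outputPath, waitTimes = [3000, 10000, 20000]) {
  let result = { success: false, error: 'No attempts were made' };
  for (const waitTime of waitTimes) {
    result = await generate(input, outputPath, { waitTime });
    if (result.success) {
      return result; // stop at the first wait time that works
    }
  }
  return result; // last failure, so the caller can inspect result.error
}

// Stub standing in for generatePdf: succeeds only with a wait of 10 s or more.
const stubGenerate = async (input, outputPath, options) =>
  options.waitTime >= 10000
    ? { success: true, path: outputPath }
    : { success: false, error: 'timeout' };

generateWithRetries(stubGenerate, 'https://example.com/dynamic-page', './dynamic-page.pdf')
  .then((result) => console.log(result));
```

With the real package, the call would be generateWithRetries(generatePdf, url, outputPath). Waiting for a selector is usually the more precise fix when a stable element is known.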
Issue: Incomplete PDF Content
Solution:
- Increase wait time
- Use --selector or selector to wait for key elements to load
- Check if the webpage has dynamically loaded content
Issue: Multi-page Merge Fails or Merged PDF Cannot Open Properly
Solution:
- Ensure all URLs are accessible
- Check network connection
- Use the ignore option to ignore non-fatal errors
- Important: Current implementation uses simple Buffer concatenation, which may not properly handle complex PDF structures
- For professional PDF merging, it's recommended to:
  - Generate PDFs separately first
  - Use pdf-lib, pdf-merger-js or similar libraries for merging
  - Or use command-line tools like pdftk, ghostscript for merging
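A quick way to see why concatenation is fragile: every PDF file starts with a %PDF- header, so a concatenated output contains several complete documents back to back rather than one multi-page document. The heuristic below counts those headers in a buffer. countPdfHeaders is a hypothetical diagnostic, the sample buffers are synthetic stand-ins rather than valid PDFs, and the check can over-count if the marker happens to appear inside uncompressed content.

```javascript
// Heuristic diagnostic (hypothetical): counts "%PDF-" header markers.
// More than one usually means files were joined byte-for-byte.
function countPdfHeaders(buffer) {
  const marker = Buffer.from('%PDF-');
  let count = 0;
  let index = buffer.indexOf(marker);
  while (index !== -1) {
    count += 1;
    index = buffer.indexOf(marker, index + marker.length);
  }
  return count;
}

// Synthetic stand-ins for real PDF bytes, used only to exercise the check.
const single = Buffer.from('%PDF-1.4\n...body...\n%%EOF\n');
const concatenated = Buffer.concat([single, single]);

console.log(countPdfHeaders(single)); // 1
console.log(countPdfHeaders(concatenated)); // 2
```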
Professional Merge Example:
const { PDFDocument } = require('pdf-lib');
const { generatePdf } = require('webpage2pdf');
const fs = require('fs');
async function mergeExample() {
  // 1. Generate PDFs separately
  const urls = ['url1', 'url2', 'url3'];
  const pdfFiles = [];
  for (const url of urls) {
    const result = await generatePdf(url, `./temp-${Date.now()}.pdf`);
    if (result.success) {
      pdfFiles.push(result.path);
    }
  }
  // 2. Merge properly using pdf-lib
  const mergedPdf = await PDFDocument.create();
  for (const pdfPath of pdfFiles) {
    const pdfBytes = fs.readFileSync(pdfPath);
    const pdf = await PDFDocument.load(pdfBytes);
    const pages = await mergedPdf.copyPages(pdf, pdf.getPageIndices());
    pages.forEach((page) => mergedPdf.addPage(page));
  }
  const mergedPdfBytes = await mergedPdf.save();
  fs.writeFileSync('./merged.pdf', mergedPdfBytes);
  // 3. Clean up temporary files
  pdfFiles.forEach((file) => fs.unlinkSync(file));
}
mergeExample();
Issue: Stream Output Not Working
Solution:
- Ensure outputPath is set to null
- Check the returned stream object
- Ensure timely processing of Stream to avoid memory leaks
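When consuming the returned stream, Node's pipeline from stream/promises is a reasonable alternative to .pipe(): it rejects if either side fails and destroys both streams, whereas .pipe() does not forward errors. The sketch below uses a stand-in Readable so it runs without webpage2pdf installed. saveStream is a hypothetical helper, and in real use result.stream would take the place of the stand-in.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { Readable } = require('stream');
const { pipeline } = require('stream/promises');

// Writes a readable stream to disk and resolves with the number of bytes
// written once the file has been fully flushed.
async function saveStream(stream, filePath) {
  await pipeline(stream, fs.createWriteStream(filePath));
  return fs.statSync(filePath).size;
}

// Stand-in for result.stream (assumption: the real value is a Readable).
const fakePdfStream = Readable.from([Buffer.from('%PDF-1.4\n'), Buffer.from('%%EOF\n')]);
const target = path.join(os.tmpdir(), `stream-demo-${process.pid}.pdf`);

saveStream(fakePdfStream, target)
  .then((bytes) => console.log(`Wrote ${bytes} bytes to ${target}`))
  .catch((error) => console.error('Stream failed:', error.message));
```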
Related Solutions Comparison
If you need to understand other webpage-to-PDF solutions, refer to the following comparison:
Mainstream Solutions Comparison
| Solution | Rendering Quality | JS Support | Resource Usage | Speed | Maintenance | Cost |
|---|---|---|---|---|---|---|
| Puppeteer (This Project) | ⭐⭐⭐⭐⭐ | ✅ Full | High (~200MB) | Medium | ✅ Active | Free |
| Playwright | ⭐⭐⭐⭐⭐ | ✅ Full | High (~300MB) | Medium | ✅ Active | Free |
| wkhtmltopdf | ⭐⭐⭐ | ⚠️ Limited | Low (~50MB) | Fast | ❌ Stopped | Free |
| html2pdf.js | ⭐⭐⭐ | ⚠️ Limited | Low | Fast | ✅ Active | Free |
| Gotenberg | ⭐⭐⭐⭐⭐ | ✅ Full | High | Medium | ✅ Active | Free |
| Prince XML | ⭐⭐⭐⭐⭐ | ⚠️ Limited | Medium | Fast | ✅ Active | 💰 Commercial |
Solution Details
1. Puppeteer (Used by This Project)
- Tech Stack: Node.js + Chrome DevTools Protocol
- Pros: Full modern web support, strong dynamic content handling, high rendering quality, feature-rich
- Cons: High resource usage (~200MB), slower startup
- Use Cases: Modern web applications, need to wait for dynamic content, need high-quality PDF
2. Playwright
- Tech Stack: Node.js + Multi-browser engines
- Pros: Supports multiple browser engines, smarter auto-wait mechanism, more modern API design
- Cons: Higher resource usage, relatively new
- Use Cases: Need cross-browser compatibility, automation testing + PDF generation
3. wkhtmltopdf
- Tech Stack: C++ + Qt WebKit
- Pros: Lightweight (~50MB), fast startup, low resource usage
- Cons: Based on old WebKit, doesn't support modern JavaScript, limited CSS3 support, maintenance stopped
- Use Cases: Simple static pages, batch processing, resource-constrained environments
4. html2pdf.js / jsPDF
- Tech Stack: Pure frontend JavaScript
- Pros: No backend support needed, client-side generation, lightweight
- Cons: Average rendering quality (Canvas-based), doesn't support complex CSS, limited page control
- Use Cases: Simple page conversion, no backend support needed, client-side generation
5. Gotenberg
- Tech Stack: Docker + Chromium
- Pros: Containerized deployment, Chromium-based high rendering quality, RESTful API
- Cons: Requires Docker environment, requires server resources
- Use Cases: Microservice architecture, containerized deployment, need API interface
Selection Advice
- Modern Web Applications (React/Vue/Angular): Recommend Puppeteer or Playwright
- Simple Static Pages: Recommend wkhtmltopdf or html2pdf.js
- Batch Processing: Choose wkhtmltopdf (simple) or Puppeteer (complex) based on complexity
- Microservice Architecture: Recommend Gotenberg
- Frontend Direct Generation: Recommend html2pdf.js or jsPDF
- Professional Typesetting Needs: Recommend Prince XML (commercial)
Advantages of This Project (webpage2pdf)
Based on Puppeteer, especially suitable for:
- ✅ Modern web applications (React/Vue/Angular)
- ✅ Need to wait for dynamic content loading
- ✅ Need high-quality PDF output
- ✅ Node.js environment
Differentiating Features:
- Supports both command-line and function calls
- Supports stream output
- Supports multiple input types (URL, HTML, Buffer, Stream)
- Supports multi-page merging
- User-friendly API design
For more detailed comparison, refer to the "Related Solutions Comparison" section in the README.
License
MIT
Contributing
Issues and Pull Requests are welcome!