  • License MIT

Parse And Write Web Archive Records (WARC) Files Via Node.js

Package Exports

  • node-warc

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (node-warc) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

node-warc

Parse And Write Web ARChive (WARC) files with node.js.

Run npm install node-warc or yarn add node-warc to get started.


API

Full API documentation is available at n0tan3rd.github.io/node-warc

Example usage

Example 1: Both .warc and .warc.gz

const AutoWARCParser = require('node-warc')

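const parser = new AutoWARCParser('<path-to-warcfile>')
// Emitted for each record parsed from the file
parser.on('record', record => { console.log(record) })
// Emitted once parsing has finished
parser.on('done', finalRecord => { console.log(finalRecord) })
// Emitted if reading or parsing fails
parser.on('error', error => { console.error(error) })
// Begin parsing; the events above fire after this call
parser.start()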

Example 2: Only .warc.gz

const WARCGzParser = require('node-warc').WARCGzParser

const parser = new WARCGzParser('<path-to-warcfile>')
parser.on('record', record => { console.log(record) })
parser.on('done', finalRecord => { console.log(finalRecord) })
parser.on('error', error => { console.error(error) })
parser.start()

Example 3: Only .warc

const WARCParser = require('node-warc').WARCParser

const parser = new WARCParser('<path-to-warcfile>')
parser.on('record', record => { console.log(record) })
parser.on('done', finalRecord => { console.log(finalRecord) })
parser.on('error', error => { console.error(error) })
parser.start()
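
All three parsers above are driven the same way: attach listeners for 'record', 'done', and 'error', then call start(). As a rough sketch of how that event interface could be wrapped in a Promise, the helper below counts records. The name countRecords is hypothetical and not part of node-warc; the helper assumes only the events and the start() method shown in the examples, and it counts 'record' events only, without assuming anything about the value passed to 'done'.

```javascript
// Sketch only: countRecords is a hypothetical helper, not a node-warc API.
// It relies solely on the interface shown in the examples above:
// 'record', 'done', and 'error' events plus a start() method.
function countRecords (parser) {
  return new Promise((resolve, reject) => {
    let count = 0
    // Count every record the parser emits
    parser.on('record', () => { count++ })
    // Resolve with the total once the parser reports completion
    parser.on('done', () => { resolve(count) })
    // Surface parser errors as a rejected Promise
    parser.on('error', reject)
    parser.start()
  })
}

// Possible usage (requires node-warc to be installed):
// const AutoWARCParser = require('node-warc')
// countRecords(new AutoWARCParser('<path-to-warcfile>'))
//   .then(n => console.log(`${n} records`))
//   .catch(console.error)
```

Because the helper takes a parser instance rather than a path, it should work unchanged with whichever of the three parsers is constructed.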

Benchmark

UN-GZIPPED

  • 145.9MB (8,026 records) took 2s. Max node process memory usage 22 MiB
  • 268MB (852 records) took 2s. Max node process memory usage 77 MiB
  • 2GB (76,980 records) took 21s. Max node process memory usage 100 MiB
  • 4.8GB (185,662 records) took 1m. Max node process memory usage 144.3 MiB

GZIPPED

  • 7.7MB (1,269 records) took 297ms. Max node process memory usage 7.1 MiB
  • 819.1MB (34,253 records) took 16s. Max node process memory usage 190.3 MiB
  • 2.3GB (68,020 records) took 45s. Max node process memory usage 197.6 MiB
  • 5.3GB (269,464 records) took 4m. Max node process memory usage 198.2 MiB
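
The README does not say how these figures were collected. One way to gather comparable numbers on your own files is sketched below; this is an assumption about a reasonable approach, not the author's benchmark harness. The helper measureParse is hypothetical. It uses Node's built-in process.hrtime.bigint() for elapsed time and process.memoryUsage().rss as a stand-in for "process memory usage", sampled on each record, so short spikes between records may be missed.

```javascript
// Sketch only: measureParse is a hypothetical helper, not a node-warc API
// and not the method used to produce the figures above.
function measureParse (parser) {
  return new Promise((resolve, reject) => {
    const startedAt = process.hrtime.bigint()
    let records = 0
    let peakRss = process.memoryUsage().rss

    parser.on('record', () => {
      records++
      // Sample resident set size and keep the largest value seen
      const rss = process.memoryUsage().rss
      if (rss > peakRss) peakRss = rss
    })
    parser.on('done', () => {
      // Nanoseconds to milliseconds
      const elapsedMs = Number(process.hrtime.bigint() - startedAt) / 1e6
      resolve({ records, elapsedMs, peakRssMiB: peakRss / (1024 * 1024) })
    })
    parser.on('error', reject)
    parser.start()
  })
}

// Possible usage (requires node-warc to be installed):
// const AutoWARCParser = require('node-warc')
// measureParse(new AutoWARCParser('<path-to-warcfile>')).then(console.log)
```

Sampling memory on every record adds some overhead, so timings taken this way will be slightly pessimistic; sampling every Nth record is a simple alternative.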
