JSPM

  • ESM via JSPM
  • ES Module Entrypoint
  • Export Map
  • Keywords
  • License
  • Repository URL
  • TypeScript Types
  • README
  • Created
  • Published
  • Downloads 5
  • Score
    100M100P100Q35350F
  • License MIT

Ultra-fast streaming CSV and XLSX parser for Node.js with native C++ backend, non-blocking threads, and low memory usage for huge files

Package Exports

  • ultratab

Readme

UltraTab

npm version License: MIT Node

Ultra-fast native CSV and XLSX streaming parser for Node.js. Built as a C++ addon with SIMD acceleration, background-thread parsing, and bounded memory for large files (100MB–10GB+).

Features

  • Streaming: Parses from disk in chunks; does not load entire files into memory
  • Non-blocking: Parsing runs on a C++ background thread; the Node event loop stays responsive
  • Two APIs: Row-based csv() (string[][]) and typed columnar csvColumns() (TypedArrays)
  • Typed output: int32, int64, float64, bool → Int32Array, BigInt64Array, Float64Array, Uint8Array
  • XLSX support: xlsx() parses .xlsx files in low-memory streaming mode
  • SIMD acceleration: AVX2/SSE2 on x86_64 (Linux/Windows); scalar fallback on macOS

Installation

npm install ultratab

Requirements: Node.js >= 18, CMake >= 3.15, C++17 compiler (GCC, Clang, or MSVC)

UltraTab is written in TypeScript and ships with full type definitions. Use import (ESM) or require() (CommonJS).

Quick Start

After npm install ultratab, try the example:

node node_modules/ultratab/examples/csv-demo.js

CSV (row-based)

import { csv } from "ultratab";

for await (const batch of csv("data.csv", { batchSize: 10000, headers: true })) {
  console.log("Rows in batch:", batch.length);
  for (const row of batch) {
    console.log(row); // string[]
  }
}

CSV (typed columnar)

import { csvColumns } from "ultratab";

for await (const batch of csvColumns("data.csv", {
  headers: true,
  schema: { id: "int32", amount: "float64", active: "bool" },
  select: ["id", "amount", "active"],
})) {
  const ids = batch.columns.id;       // Int32Array
  const amounts = batch.columns.amount; // Float64Array
  const nulls = batch.nullMask?.amount; // 1 = null
  for (let i = 0; i < batch.rows; i++) {
    if (nulls?.[i]) continue;
    console.log(ids[i], amounts[i]);
  }
}

XLSX

import { xlsx } from "ultratab";

for await (const batch of xlsx("data.xlsx", {
  sheet: 1,
  headers: true,
  batchSize: 5000,
})) {
  console.log("Headers:", batch.headers);
  console.log("Rows:", batch.rowsCount);
}

API Reference

csv(path, options?)

Returns AsyncIterable&lt;string[][]&gt;. Each batch is an array of rows; each row is string[].

Option Type Default Description
delimiter string "," Field delimiter (use "\t" for TSV)
quote string '"' Quote character
headers boolean false Skip first row as header
batchSize number 10000 Rows per batch (1–10,000,000)
maxQueueBatches number 2 Max batches in queue (backpressure)
useMmap boolean false Use memory-mapped I/O
readBufferSize number 262144 Read buffer size in bytes

csvColumns(path, options?)

Returns AsyncIterable&lt;ColumnarBatch&gt;. Each batch has headers, columns, nullMask, and rows.

Option Type Default Description
delimiter string "," Field delimiter
quote string '"' Quote character
headers boolean true First row is header
batchSize number 10000 Rows per batch
select string[] (all) Columns to keep by header name
schema object (string) Per-column: "string", "int32", "int64", "float64", "bool"
nullValues string[] ["","null","NULL"] Strings treated as null
trim boolean false Trim whitespace
typedFallback string "null" On parse failure: "null" or "string"

xlsx(path, options?)

Returns AsyncIterable&lt;XlsxBatchResult&gt;. Options: sheet, headers, batchSize, select, schema, nullValues, trim, typedFallback.

Performance

Designed for large files: minimal allocations, SIMD-accelerated scanning (x86_64), and bounded backpressure. Typical throughput: hundreds of thousands to millions of rows per second depending on schema and hardware.

Platform Compatibility

  • Linux (x86_64): Full SIMD (AVX2/SSE2)
  • Windows (x64): Full SIMD
  • macOS (Intel/ARM): Scalar path (no SIMD flags; still fast)

Build from Source

git clone https://github.com/haykghukasyan/ultratab.git
cd ultratab
npm install
npm run build

Compiler requirements:

  • Windows: Visual Studio Build Tools
  • macOS: Xcode Command Line Tools (xcode-select --install)
  • Linux: build-essential (or equivalent)

Troubleshooting

"Native addon not found"

  • Run npm run build or npm rebuild ultratab
  • Ensure CMake and a C++17 compiler are installed

Build fails on Windows

Build fails on Linux

  • sudo apt-get install cmake build-essential (Debian/Ubuntu)

License

MIT