JSPM

  • License MIT OR Apache-2.0

High-performance WebAssembly library for Apache Arrow, Feather, and Parquet data with zero-copy semantics, LZ4 compression, and comprehensive model-based testing

Package Exports

  • arrow-rs-wasm
  • arrow-rs-wasm/bundler
  • arrow-rs-wasm/node
  • arrow-rs-wasm/web

Readme

arrow-rs-wasm

High-performance WebAssembly bindings for Apache Arrow that expose zero-copy columnar data—built in Rust, delivered to JavaScript/TypeScript with camelCase APIs.

Features

  • JS-friendly surface – Functions and classes are exported with camelCase/PascalCase via #[wasm_bindgen(js_name = ...)].
  • Zero-copy buffers – Access Arrow vectors through live TypedArray views backed directly by Wasm linear memory.
  • UTF-8 columns – Retrieve values/offsets/validity buffers and lazily decode strings only when needed.
  • Compute helpers – Optional table transforms such as filterTable, takeRows, and sortTable return new handles on the Wasm side.
  • Dual runtime support – Tested with modern Vite/React browser pipelines and Node.js ESM projects.

Getting Started

Installation (local build)

npm i /Users/ods/Documents/arrow-rs-wasm/pkg   # substitute the path to your own pkg directory

Installation (registry build)

npm i arrow-rs-wasm

If consuming from source, regenerate the package with:

# Browser ESM bundle
wasm-pack build --release --target web

# Node.js bundle
wasm-pack build --release --target nodejs

The package ships as an ES module. Call the default export once (await init()) before using any other function.
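
Because init() must finish before any other call, and frameworks such as React's StrictMode may run startup effects more than once in development, it can help to funnel every caller through one shared promise. The following is a minimal sketch, assuming a hypothetical helper named once that is not part of arrow-rs-wasm; a stand-in initializer replaces the real default export so the snippet runs on its own.

```typescript
// Hypothetical helper (not part of arrow-rs-wasm): wraps an async initializer
// so that repeated or concurrent callers share a single in-flight promise.
function once<T>(initializer: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) {
      pending = initializer().catch((err: unknown) => {
        pending = undefined; // forget the failure so a later call can retry
        throw err;
      });
    }
    return pending;
  };
}

// Stand-in for the package's default export, so the sketch runs on its own.
let loads = 0;
const fakeInit = async (): Promise<void> => {
  loads += 1;
};

const ensureReady = once(fakeInit);
void ensureReady();
void ensureReady(); // reuses the first promise; fakeInit ran only once
```

In an application this would likely become const ensureReady = once(() => init()), with init imported from 'arrow-rs-wasm', and every entry point would await ensureReady() before touching a table handle.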

Quick Start (Browser / Vite + React)

  1. Install the local build:

    npm i /Users/ods/Documents/arrow-rs-wasm/pkg
  2. Seed a test handle (for example, in src/test-setup.ts):

    import init, { createTestTable } from 'arrow-rs-wasm';
    
    void (async () => {
      await init();
      (window as any).TEST_HANDLE = createTestTable();
    })();
  3. Render the app.

// src/App.tsx
import { useEffect, useState } from 'react'
import init, { getColumnNames, exportUtf8Buffers, exportPrimitiveBuffers } from 'arrow-rs-wasm'

export default function App() {
  const [ready, setReady] = useState(false)
  const [cols, setCols] = useState<string[]>([])

  useEffect(() => {
    (async () => {
      await init();                                   // Required wasm-bindgen init
      const handle = (window as any).TEST_HANDLE;     // Supply a real handle in your boot script
      const names = await getColumnNames(handle);
      setCols(names);
      setReady(true);

      // Example: peek at buffers
      const primitives = await exportPrimitiveBuffers(handle, 'id');
      console.log('Primitive values buffer', primitives.values);

      const utf = await exportUtf8Buffers(handle, 'name');
      console.log('UTF-8 buffer lengths', utf.values.length, utf.offsets.length);
    })();
  }, []);

  return (
    <main>
      <h1>arrow-rs-wasm (Vite/React)</h1>
      <p>Ready: {String(ready)}</p>
      <pre>{JSON.stringify(cols, null, 2)}</pre>
    </main>
  );
}

// src/main.tsx
import { StrictMode } from 'react'
import { createRoot } from 'react-dom/client'
import './test-setup'           // registers TEST_HANDLE, etc.
import App from './App'

createRoot(document.getElementById('root')!).render(
  <StrictMode>
    <App />
  </StrictMode>,
)

Quick Start (Node.js)

Build a node-compatible bundle (wasm-pack build --release --target nodejs) if you are consuming locally.

// node-example.mjs
import init, { createTestTable, getColumnNames } from 'arrow-rs-wasm';

await init();                     // Loads the Node.js target Wasm
const handle = createTestTable(); // or hydrate from Arrow IPC/parquet bytes
const columns = await getColumnNames(handle);
console.log('Columns:', columns);

Run with:

node node-example.mjs

API Overview (JS Names)

Export                                       Description
init(options?)                               Default async initializer (must be awaited once).
initWithOptions(enableConsoleLogs: boolean)  Optional second-stage setup for debugging.
setPanicHook()                               Routes Rust panics to console (no-op unless enabled).
createTestTable()                            Returns a demo table handle for quick experiments.
readTableFromBytes(data: Uint8Array)         Loads Arrow IPC bytes into Wasm, returns a table handle.
writeTableToIpc(handle, enableLz4)           Serializes a table handle back to IPC bytes.
getColumnNames(handle)                       Resolves string[] of column names.
exportPrimitiveBuffers(handle, columnName)   Returns { values: TypedArray; validity?: Uint8Array | null }.
exportUtf8Buffers(handle, columnName)        Returns { values: Uint8Array; offsets: Int32Array | BigInt64Array; validity?: Uint8Array | null }.
exportBinaryBuffers(handle, columnName)      Returns { values: Uint8Array; offsets?: Int32Array; validity?: Uint8Array | null }.
filterTable(handle, predicateSpec)           Applies a predicate, yields a new table handle.
takeRows(handle, indices)                    Selects a subset of rows.
sortTable(handle, sortKeys)                  Returns a sorted table handle.
getTableInfo(handle)                         Summarizes row/column counts and schema.
freeTable(handle)                            Releases Wasm-side resources.
getMemoryInfo()                              Debug helper describing Wasm memory usage.

Internally, Rust keeps snake_case identifiers; the exported JS API uses camelCase/PascalCase thanks to #[wasm_bindgen(js_name = ...)].
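
Since freeTable(handle) is the only export listed above for releasing Wasm-side resources, pairing each handle with a try/finally keeps leaks out of error paths. The sketch below assumes a hypothetical wrapper named withTable; it takes the release function as a parameter so it runs without the package, and in real code you would presumably pass freeTable.

```typescript
type TableHandle = number; // same alias as in the TypeScript Hints section

// Hypothetical wrapper: runs `use` and always releases the handle afterwards.
async function withTable<T>(
  handle: TableHandle,
  free: (h: TableHandle) => void,
  use: (h: TableHandle) => T | Promise<T>,
): Promise<T> {
  try {
    return await use(handle);
  } finally {
    free(handle); // runs on success and on thrown errors alike
  }
}

// Demo with stand-ins for createTestTable() and freeTable().
const released: TableHandle[] = [];
const demo = withTable(
  1,
  (h) => { released.push(h); },
  (h) => `table #${h}`,
);
demo.then((label) => console.log(label, 'released:', released));
```

Any buffer views exported inside the callback should be treated as invalid once the handle has been freed, so copy out whatever must outlive the call.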

Zero-copy Model & UTF-8 Handling

Every buffer exporter returns a view into Wasm linear memory—no copies are made. Treat these objects as live slices:

  • exportPrimitiveBuffers – a TypedArray (e.g., Int32Array, Float64Array) plus an optional Uint8Array validity bitmap.
  • exportUtf8Buffers – values (Uint8Array of concatenated UTF-8), offsets (Int32Array or BigInt64Array depending on column width), and optional validity.

Decode UTF-8 values lazily, only when needed:

const utf = await exportUtf8Buffers(handle, 'name');
const decoder = new TextDecoder();
const i = 0;
// Number() also covers BigInt64Array offsets (LargeUtf8 columns)
const start = Number(utf.offsets[i]);
const end = Number(utf.offsets[i + 1]);
const firstValue = decoder.decode(utf.values.subarray(start, end));
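
To make the lazy pattern reusable, decoded strings can be cached per row so each value is decoded at most once. This is a sketch under assumptions: LazyUtf8Column is a hypothetical class, validity handling is left out for brevity, and the buffers are synthetic stand-ins for what exportUtf8Buffers would return.

```typescript
interface Utf8Buffers {
  values: Uint8Array;
  offsets: Int32Array | BigInt64Array;
}

// Hypothetical accessor: decodes a row on first access and caches the string.
class LazyUtf8Column {
  private readonly buffers: Utf8Buffers;
  private readonly decoder = new TextDecoder();
  private readonly cache = new Map<number, string>();

  constructor(buffers: Utf8Buffers) {
    this.buffers = buffers;
  }

  // Row count: offsets hold one more entry than there are rows.
  get length(): number {
    return Math.max(0, this.buffers.offsets.length - 1);
  }

  get(i: number): string {
    if (!Number.isInteger(i) || i < 0 || i >= this.length) {
      throw new RangeError(`row ${i} is out of bounds`);
    }
    let text = this.cache.get(i);
    if (text === undefined) {
      // Number() normalizes both 32-bit and 64-bit (bigint) offsets.
      const start = Number(this.buffers.offsets[i]);
      const end = Number(this.buffers.offsets[i + 1]);
      text = this.decoder.decode(this.buffers.values.subarray(start, end));
      this.cache.set(i, text);
    }
    return text;
  }
}

// Synthetic column holding "foo", "" and "héllo" (é takes two UTF-8 bytes).
const column = new LazyUtf8Column({
  values: new TextEncoder().encode('foohéllo'),
  offsets: Int32Array.of(0, 3, 3, 9),
});
console.log(column.get(2));
```

Cached strings are copies on the JS heap, so they remain usable after Wasm memory grows; rows that were never decoded still need a new LazyUtf8Column built from freshly exported buffers.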

Memory Growth & View Refresh

Wasm memory may grow during allocations, producing a new backing ArrayBuffer. Existing views detach and report length 0. After any heavy operation (e.g., filterTable, sortTable, or bulk append), regenerate views:

let { values } = await exportPrimitiveBuffers(handle, 'score');
// ... after operations that may allocate
({ values } = await exportPrimitiveBuffers(handle, 'score')); // refresh view

Long-lived UIs should re-request buffers whenever a major action completes.
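
The detachment itself can be reproduced without the package, using a bare WebAssembly.Memory as a stand-in for the module's linear memory. In this sketch, isDetached is a hypothetical helper; it relies on a detached ArrayBuffer reporting byteLength 0, which makes it a heuristic and not an official API of arrow-rs-wasm.

```typescript
// Stand-in for the module's linear memory: one 64 KiB page, growable to four.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });

// A view over the current backing buffer, similar to what an exporter returns.
let view = new Int32Array(memory.buffer, 0, 4);
view.set([10, 20, 30, 40]);

// Heuristic (hypothetical helper): a detached ArrayBuffer has byteLength 0.
function isDetached(v: ArrayBufferView): boolean {
  return v.buffer.byteLength === 0;
}

const detachedBefore = isDetached(view);
memory.grow(1); // growth swaps in a new ArrayBuffer and detaches the old one
const detachedAfter = isDetached(view);
const staleLength = view.length;

// Rebuild the view against the new buffer; the contents carry over.
view = new Int32Array(memory.buffer, 0, 4);
const refreshed = Array.from(view);

console.log({ detachedBefore, detachedAfter, staleLength, refreshed });
```

With the real package, the rebuild step would be a repeat call to the relevant exporter, since only the Wasm side knows where the column now lives.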

TypeScript Hints

export type TableHandle = number;

export interface PrimitiveBuffers {
  values: Int8Array | Int16Array | Int32Array | Float32Array | Float64Array;
  validity?: Uint8Array | null;
}

export interface Utf8Buffers {
  values: Uint8Array;
  offsets: Int32Array | BigInt64Array; // LargeUtf8 uses 64-bit offsets
  validity?: Uint8Array | null;
}

Type definitions ship in pkg/arrow_rs_wasm.d.ts. Consumers need either a native ESM pipeline or esModuleInterop enabled in tsconfig.json.

End-to-End Examples

Primitive column

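const { values, validity } = await exportPrimitiveBuffers(handle, 'id');
// Read the first element as a little-endian Int32 (assumes 'id' is an Int32 column)
const dataView = new DataView(values.buffer, values.byteOffset, values.byteLength);
const first = dataView.getInt32(0, true);
// Bit 0 of the validity bitmap covers row 0; a missing bitmap means no nulls
const isValid = !validity || (validity[0] & 1) === 1;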
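
The check above only inspects row 0. Arrow validity bitmaps store one bit per row, least-significant bit first, so a general lookup needs both the byte index and the bit position. A sketch with hypothetical helper names (isValidAt, toNullableArray) and synthetic data in place of exported buffers:

```typescript
// Hypothetical helper: true when row i is non-null. A missing bitmap means
// the column has no nulls.
function isValidAt(validity: Uint8Array | null | undefined, i: number): boolean {
  if (!validity) return true;
  return (validity[i >> 3] & (1 << (i & 7))) !== 0;
}

// Hypothetical helper: copies a primitive column into a JS array with nulls.
function toNullableArray(
  values: ArrayLike<number>,
  validity?: Uint8Array | null,
): (number | null)[] {
  const out: (number | null)[] = [];
  for (let i = 0; i < values.length; i++) {
    out.push(isValidAt(validity, i) ? values[i] : null);
  }
  return out;
}

// Synthetic data: bitmap 0b1101 marks row 1 as null.
const ids = Int32Array.of(7, 8, 9, 10);
const bitmap = Uint8Array.of(0b00001101);
console.log(toNullableArray(ids, bitmap)); // [ 7, null, 9, 10 ]
```

toNullableArray copies every element, so it belongs at the UI boundary; for bulk work, reading the typed array directly and calling isValidAt per row keeps the zero-copy benefit.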

UTF-8 column

const utf = await exportUtf8Buffers(handle, 'name');
const decoder = new TextDecoder();

for (let i = 0; i < utf.offsets.length - 1; i++) {
  const isNull = utf.validity && (utf.validity[Math.floor(i / 8)] & (1 << (i % 8))) === 0;
  if (isNull) continue;

  const start = Number(utf.offsets[i]);
  const end = Number(utf.offsets[i + 1]);
  console.log(decoder.decode(utf.values.subarray(start, end)));
}

Performance Notes

  • Favor filterTable, takeRows, and sortTable (where available) to keep computations inside Wasm and reduce host copies.
  • Avoid eagerly decoding UTF-8 strings; decode on demand at the UI boundary.
  • Reuse handles and refresh views after operations that might trigger Wasm memory growth.

Project Layout & Build

  • /Users/ods/Documents/arrow-rs-wasm/pkg contains the ESM wrapper (arrow_rs_wasm.js), the compiled Wasm artifact, type definitions, and package.json.

  • Rebuild with wasm-pack build --release --target web (browser) or --target nodejs (Node).

  • Install into client projects with:

    npm i /Users/ods/Documents/arrow-rs-wasm/pkg

Testing (E2E)

  • Browser (Vite): npm run dev, open http://localhost:5173, ensure await init() runs, verify column names, decode a UTF-8 element, and confirm zero-copy buffers by comparing .buffer to a cached reference.
  • Chromium DevTools MCP: Automate via a DevTools protocol session—assert camelCase exports, zero-copy buffer equality, and lazy decode results.
  • Memory detach test: Execute an operation that grows memory (e.g., heavy filter), then re-run buffer exporters to rebuild views.
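
The buffer comparison in the browser check comes down to reference equality on .buffer: a zero-copy view shares its ArrayBuffer with the source, while a copy owns a new one. A sketch using plain typed arrays as stand-ins for exported buffers (sharesBuffer is a hypothetical helper, not a package export):

```typescript
// Hypothetical helper: true when both views are windows onto the same memory.
function sharesBuffer(a: ArrayBufferView, b: ArrayBufferView): boolean {
  return a.buffer === b.buffer;
}

const source = Float64Array.of(1.5, 2.5, 3.5, 4.5);
const view = source.subarray(1, 3); // zero-copy window onto `source`
const copy = source.slice(1, 3);    // independent copy of the same range

source[1] = 99; // a write through the source...
console.log(sharesBuffer(source, view), view[0]); // ...is visible in the view
console.log(sharesBuffer(source, copy), copy[0]); // ...but not in the copy
```

In an E2E test the same comparison would be made between two exporter results obtained without an intervening memory growth.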

Troubleshooting

  • Module not found: Use the appropriate bundle (--target web for browsers, --target nodejs for Node).
  • Empty buffers: Likely due to Wasm memory growth; call the exporter again.
  • Snake_case exports: Ensure the Rust functions use #[wasm_bindgen(js_name = ...)], then rebuild the pkg directory.

Versioning & License

  • Semantic versioning: MAJOR.MINOR.PATCH.
  • Dual-licensed under MIT and Apache-2.0—use either license at your option.