@aigentic/rabitq-wasm

0.1.1

RaBitQ 1-bit quantized vector index in WebAssembly — 32× embedding compression with high-recall rerank, for browsers, Cloudflare Workers, Deno, and Bun

Package Exports

  • @aigentic/rabitq-wasm
  • @aigentic/rabitq-wasm/ruvector_rabitq_wasm.js

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (@aigentic/rabitq-wasm) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

@ruvector/rabitq-wasm

RaBitQ 1-bit quantized vector index in WebAssembly. Compress embeddings 32× and run approximate nearest-neighbor search in the browser, Cloudflare Workers, Deno, or Bun.

What is RaBitQ?

RaBitQ is a rotation-based 1-bit vector quantization scheme: it compresses each f32 component of an embedding to a single bit while approximately preserving rank order under L2 distance. Exact-distance computations on a small "rerank pool" of top candidates then recover the recall lost to quantization.

For a 768-dimensional f32 embedding (768 × 4 = 3,072 bytes, about 3 KB raw), RaBitQ stores a 96-byte quantized code (768 bits) plus the rotation matrix, a 32× reduction in code size. Search runs in two phases:

  1. Hamming-distance scan over the 1-bit codes: fast and branch-free, with 32× more dimensions per cache line than f32 (512 versus 16 in a 64-byte line).
  2. Exact L2² rerank of the top rerankFactor × k candidates, which restores recall.
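
The two phases above can be sketched in plain JavaScript. This is an illustration under simplifying assumptions, not the package's implementation: it quantizes by taking raw sign bits and skips the random rotation that RaBitQ applies first, and every function name (signBits, popcount32, hamming, l2Squared, twoPhaseSearch) is hypothetical.

```javascript
// Sketch only: sign-bit quantization WITHOUT RaBitQ's random rotation.
// All names below are hypothetical and not part of @ruvector/rabitq-wasm.

// Pack one bit per dimension into 32-bit words (bit = 1 when the value is > 0).
function signBits(vec) {
  const words = new Uint32Array(Math.ceil(vec.length / 32));
  for (let i = 0; i < vec.length; i++) {
    if (vec[i] > 0) words[i >>> 5] |= 1 << (i & 31);
  }
  return words;
}

// Count the set bits of a 32-bit integer (branch-free bit trick).
function popcount32(x) {
  x = x - ((x >>> 1) & 0x55555555);
  x = (x & 0x33333333) + ((x >>> 2) & 0x33333333);
  x = (x + (x >>> 4)) & 0x0f0f0f0f;
  return Math.imul(x, 0x01010101) >>> 24;
}

// Hamming distance between two packed codes of equal length.
function hamming(a, b) {
  let d = 0;
  for (let i = 0; i < a.length; i++) d += popcount32(a[i] ^ b[i]);
  return d;
}

// Exact squared L2 distance between row `row` of a row-major matrix and `query`.
function l2Squared(vectors, row, dim, query) {
  let sum = 0;
  for (let j = 0; j < dim; j++) {
    const diff = vectors[row * dim + j] - query[j];
    sum += diff * diff;
  }
  return sum;
}

function twoPhaseSearch(vectors, dim, codes, query, k, rerankFactor) {
  const qCode = signBits(query);
  // Phase 1: rank all codes by Hamming distance, keep rerankFactor * k candidates.
  const pool = codes
    .map((code, id) => ({ id, h: hamming(code, qCode) }))
    .sort((a, b) => a.h - b.h)
    .slice(0, Math.min(codes.length, rerankFactor * k));
  // Phase 2: recompute exact L2² for the pool only, then keep the best k.
  return pool
    .map(({ id }) => ({ id, distance: l2Squared(vectors, id, dim, query) }))
    .sort((a, b) => a.distance - b.distance)
    .slice(0, k);
}

// Tiny demo: 4 vectors of dimension 4, row-major.
const dim = 4;
const vectors = new Float32Array([
  1, 1, 1, 1,
  -1, -1, -1, -1,
  1, 1, -1, -1,
  0.9, 1.1, 1, 1,
]);
const codes = [];
for (let r = 0; r < vectors.length / dim; r++) {
  codes.push(signBits(vectors.subarray(r * dim, (r + 1) * dim)));
}
const hits = twoPhaseSearch(vectors, dim, codes, new Float32Array([1, 1, 1, 1]), 2, 2);
// hits holds ids 0 and 3, in that order
```

Rows 0 and 3 share the same sign pattern, so the Hamming scan cannot tell them apart; only the exact rerank puts row 0 first. That is the role the rerank pool plays in the real index.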

Index construction is deterministic in (seed, dim, vectors): the same triple always produces bit-identical codes, whether you build on x86_64, aarch64, or wasm32.

Install

npm install @ruvector/rabitq-wasm

Usage (browser)

import init, { RabitqIndex } from "@ruvector/rabitq-wasm";

await init();

const dim = 768;
const n = 10_000;
const vectors = new Float32Array(n * dim);
// ... populate `vectors` with your embeddings (n × dim, row-major) ...

// seed = 42n (a bigint) for reproducibility; rerankFactor = 20 is the typical value
const idx = RabitqIndex.build(vectors, dim, 42n, 20);

const query = new Float32Array(dim);
// ... fill query ...

const results = idx.search(query, 10);
// → [{ id: 7421, distance: 0.0023 }, { id: 9011, distance: 0.0041 }, ...]
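
If your embeddings arrive as an array of per-document arrays rather than one flat buffer, they need to be flattened into the row-major layout first. A minimal sketch, assuming a hypothetical helper named flattenEmbeddings that is not part of the package:

```javascript
// Hypothetical helper (not part of @ruvector/rabitq-wasm): flatten an array of
// equal-length embeddings into one row-major Float32Array of length n * dim.
function flattenEmbeddings(rows) {
  if (rows.length === 0) throw new Error("no embeddings to flatten");
  const dim = rows[0].length;
  const flat = new Float32Array(rows.length * dim);
  rows.forEach((row, i) => {
    if (row.length !== dim) {
      throw new Error(`row ${i} has length ${row.length}, expected ${dim}`);
    }
    flat.set(row, i * dim); // row i occupies flat[i * dim .. (i + 1) * dim)
  });
  return { vectors: flat, dim };
}

const { vectors, dim } = flattenEmbeddings([
  [0.1, 0.2, 0.3],
  [0.4, 0.5, 0.6],
]);
// vectors.length === 6 and dim === 3; vectors[3] is the first component of row 1
```

The resulting vectors and dim can be passed as the first two arguments of RabitqIndex.build. Values are rounded to f32 when copied into the Float32Array.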

Usage (Node.js / Bun)

import { RabitqIndex } from "@ruvector/rabitq-wasm/node/ruvector_rabitq_wasm.js";
// no `init()` needed for the node target
// `vectors` and `query` are prepared as in the browser example

const idx = RabitqIndex.build(vectors, 768, 42n, 20);
const results = idx.search(query, 10);

Usage (bundlers — Vite, Webpack, Rollup)

import { RabitqIndex } from "@ruvector/rabitq-wasm/bundler/ruvector_rabitq_wasm.js";
// the bundler handles the .wasm import transparently

API

class RabitqIndex

RabitqIndex.build(vectors, dim, seed, rerankFactor)

Build an index from a flat Float32Array of length n * dim.

Parameter     Type          Description
vectors       Float32Array  Row-major matrix of n vectors, each of length dim.
dim           number        Vector dimensionality.
seed          bigint        Random rotation seed. The same (seed, dim, vectors) triple yields bit-identical codes.
rerankFactor  number        Multiplier on k for the exact-L2² rerank pool. Typical: 20.

Throws if dim == 0, vectors is empty, or vectors.length is not a multiple of dim.
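
To surface these conditions with a descriptive message before calling into WebAssembly, you could run a pre-flight check in JavaScript. The sketch below mirrors the three documented error conditions; checkBuildInput is a hypothetical name, and the messages are illustrative rather than the ones the package emits.

```javascript
// Hypothetical pre-flight check mirroring the documented error conditions of
// RabitqIndex.build. Returns the number of vectors n when the input is valid.
function checkBuildInput(vectors, dim) {
  if (!(vectors instanceof Float32Array)) {
    throw new TypeError("vectors must be a Float32Array");
  }
  if (!Number.isInteger(dim) || dim <= 0) {
    throw new RangeError("dim must be a positive integer");
  }
  if (vectors.length === 0) {
    throw new RangeError("vectors is empty");
  }
  if (vectors.length % dim !== 0) {
    throw new RangeError(
      `vectors.length (${vectors.length}) is not a multiple of dim (${dim})`
    );
  }
  return vectors.length / dim;
}

const n = checkBuildInput(new Float32Array(6), 3); // 2 vectors of dimension 3
```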

idx.search(query, k)

Find the k nearest neighbors of query. Returns an array of SearchResult ordered ascending by distance.

idx.len (getter, number)

Number of vectors indexed.

idx.isEmpty (getter, boolean)

true iff no vectors have been indexed.

interface SearchResult

{
  id: number;       // vector id: the row index of the vector in the array passed to `build`
  distance: number; // L2² distance to the query, computed during the rerank phase
}
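
Because results come back as { id, distance } pairs, it is straightforward to compare them against a brute-force baseline and estimate recall for a given rerankFactor. A minimal sketch, assuming hypothetical helpers exactTopK and recallAtK that are not part of the package:

```javascript
// Hypothetical helpers for measuring recall; not part of @ruvector/rabitq-wasm.

// Brute-force exact search: squared L2 distance to every row, best k first.
function exactTopK(vectors, dim, query, k) {
  const n = vectors.length / dim;
  const all = [];
  for (let id = 0; id < n; id++) {
    let distance = 0;
    for (let j = 0; j < dim; j++) {
      const diff = vectors[id * dim + j] - query[j];
      distance += diff * diff;
    }
    all.push({ id, distance });
  }
  return all.sort((a, b) => a.distance - b.distance).slice(0, k);
}

// Fraction of the exact top-k ids that also appear in the approximate results.
function recallAtK(approx, exact) {
  if (exact.length === 0) return 1;
  const truth = new Set(exact.map((r) => r.id));
  return approx.filter((r) => truth.has(r.id)).length / exact.length;
}

const vectors = new Float32Array([0, 0, 1, 0, 5, 5]); // 3 vectors, dim = 2
const query = new Float32Array([0.9, 0]);
const exact = exactTopK(vectors, 2, query, 2); // ids 1 then 0
// Stand-in for the output of idx.search(query, 2), with one wrong neighbor.
const approx = [{ id: 1, distance: 0 }, { id: 2, distance: 0 }];
const recall = recallAtK(approx, exact); // 0.5
```

In practice, approx would be the array returned by idx.search(query, k), averaged over a sample of queries.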

version()

Returns the crate version baked in at build time.

Why use this in the browser

  • 32× smaller indices. For a 100 K × 768 embedding store, the 1-bit codes take ~9.6 MB (100,000 × 96 bytes) instead of ~300 MB of f32 (100,000 × 3,072 bytes), small enough to hold in a browser tab.
  • Cache-line-friendly Hamming scan. The 1-bit codes pack 64 dimensions into one u64, so the hot path runs at memory bandwidth.
  • Deterministic across architectures. Build on your x86_64 server and get identical results on the user's ARM phone or in a Cloudflare Worker.
  • No server. Run RAG, semantic search, or recommendation lookup entirely client-side.
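
The size figures above follow from simple arithmetic, sketched below. The function names are hypothetical, and the estimate counts only the packed codes: it ignores the rotation matrix and any per-vector metadata the index may keep.

```javascript
// Back-of-the-envelope footprint estimate (hypothetical helpers).
// Counts only raw f32 storage versus packed 1-bit codes.
function rawBytes(n, dim) {
  return n * dim * 4; // 4 bytes per f32 component
}

function codeBytes(n, dim) {
  return n * Math.ceil(dim / 8); // 1 bit per component, rounded up to whole bytes
}

const count = 100_000;
const dims = 768;
const raw = rawBytes(count, dims);     // 307,200,000 bytes, about 300 MB
const packed = codeBytes(count, dims); // 9,600,000 bytes, about 9.6 MB
const ratio = raw / packed;            // 32
```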

License

MIT OR Apache-2.0