JSPM

@ruvector/acorn-wasm

0.1.0
  • ESM via JSPM
  • ES Module Entrypoint
  • Export Map
  • Keywords
  • License
  • Repository URL
  • TypeScript Types
  • README
  • Created
  • Published
  • Downloads 19
  • Score
    100M100P100Q66445F
  • License MIT OR Apache-2.0

ACORN predicate-agnostic filtered HNSW in WebAssembly — high-recall vector search with arbitrary metadata filters, for browsers, Cloudflare Workers, Deno, and Bun

Package Exports

  • @ruvector/acorn-wasm
  • @ruvector/acorn-wasm/ruvector_acorn_wasm.js

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (@ruvector/acorn-wasm) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

@ruvector/acorn-wasm

ACORN predicate-agnostic filtered HNSW in WebAssembly. High-recall vector search with arbitrary metadata filters, in the browser or at the edge.

npm License

What is ACORN?

ACORN (Patel et al., SIGMOD 2024, arXiv:2403.04871) solves filtered HNSW's recall-collapse problem. Standard post-filter HNSW retrieves k candidates and discards the ones that fail your predicate — but at low selectivity (e.g. 1 % of vectors match) you'd need to retrieve thousands of candidates to expect 10 valid hits, and recall drops to near-zero. ACORN fixes this structurally with two changes:

  1. γ-augmented graph constructionγ × M edges per node instead of M. The denser graph stays navigable even when the predicate prunes most nodes.
  2. Predicate-agnostic traversal — expand all neighbors regardless of predicate. A failing node doesn't enter the result set, but its neighbors enter the candidate frontier. The beam never starves.

Net effect: 96 % recall@10 at 1 % selectivity where post-filter HNSW collapses to near-zero.

Install

npm install @ruvector/acorn-wasm

Usage (browser)

import init, { AcornIndex } from "@ruvector/acorn-wasm";

await init();

const dim = 128;
const n = 5_000;
const vectors = new Float32Array(n * dim);
// ... populate `vectors` with embeddings (n × dim, row-major) ...

// gamma=2 → ACORN-γ (best recall at low selectivity)
// gamma=1 → ACORN-1 (smaller index, fine for moderate selectivity)
const idx = AcornIndex.build(vectors, dim, 2);

const query = new Float32Array(dim);
// ... fill query ...

// Predicate is any JS function (id: number) => boolean
const inStock = (id) => products[id].stockCount > 0;
const results = idx.search(query, 10, inStock);
// → [{ id, distance }, ...]

Usage (Node.js / Bun)

import { AcornIndex } from "@ruvector/acorn-wasm/node/ruvector_acorn_wasm.js";
// no `init()` for the node target

const idx = AcornIndex.build(vectors, 128, 2);
const results = idx.search(query, 10, (id) => metadata[id].published);

Usage (bundlers — Vite, Webpack, Rollup)

import { AcornIndex } from "@ruvector/acorn-wasm/bundler/ruvector_acorn_wasm.js";
// the bundler handles the .wasm import transparently

API

class AcornIndex

AcornIndex.build(vectors, dim, gamma)

Build an index from a flat Float32Array of length n * dim.

Parameter Type Description
vectors Float32Array Row-major matrix of n vectors, each of length dim.
dim number Vector dimensionality.
gamma number Edge multiplier. 1 → ACORN-1 (M=16). 2 → ACORN-γ (M·γ=32, recommended for low selectivity).

Throws if dim == 0, vectors is empty, vectors.length is not a multiple of dim, or gamma == 0.

idx.search(query, k, predicate)

Find the k nearest neighbors of query whose id satisfies predicate. Returns an array of SearchResult ordered ascending by distance.

predicate is invoked as predicate(id: number) => boolean for each node visited during search (≤ ef nodes, ~150 default — bounded). Use it for any metadata filter: equality, range, geo, ACL, composite — there is no schema coupling.

idx.dim (getter, number)

Vector dimensionality of the index.

idx.memoryBytes (getter, number)

Approximate heap size — graph edges + raw vectors, in bytes.

idx.name (getter, string)

Variant label for diagnostics: "ACORN-1 (γ=1, M=16)" or "ACORN-γ (γ=2, M=32)".

interface SearchResult

{
  id: number;       // caller-supplied vector id
  distance: number; // approximate L2² distance
}

version()

Returns the crate version baked at build time.

Recall and performance

Native Rust benchmark (x86_64, n=5K, D=128, k=10):

Selectivity ACORN-γ recall@10 ACORN-γ QPS Flat scan recall Flat scan QPS
50 % 34.5 % 65 K 100.0 % 18 K
10 % 79.7 % 47 K 100.0 % 60 K
1 % 96.0 % 18 K 100.0 % 151 K

The structural win is at low selectivity: ACORN-γ holds high recall as the predicate gets more selective, while post-filter approaches collapse. WASM throughput is typically 30–60 % of native at the same dataset size.

Why use this in the browser

  • Filtered RAG without a server. Query an embedding store with arbitrary metadata filters entirely client-side.
  • Privacy. User vectors never leave the device.
  • Edge runtimes. Cloudflare Workers, Deno Deploy, Vercel Edge — same .wasm, no native binaries.
  • Predicate is just JS. Any (id: number) => boolean function works — your filter logic stays in JS where you already have it.

Sister packages

Source

License

MIT OR Apache-2.0