WebGPU lock-free queue assets for flat and DAG-ready GPU job scheduling.

Package Exports

  • @plasius/gpu-lock-free-queue
  • @plasius/gpu-lock-free-queue/dag-queue.wgsl
  • @plasius/gpu-lock-free-queue/package.json
  • @plasius/gpu-lock-free-queue/queue.wgsl

Readme

@plasius/gpu-lock-free-queue

A WebGPU lock-free queue package that ships two assets: a flat MPMC (multi-producer, multi-consumer) ring queue and a DAG-ready scheduler with dependency-aware ready lanes.

Apache-2.0. ESM + CJS builds. WGSL assets are published in dist/.

Install

npm install @plasius/gpu-lock-free-queue

Usage

import { loadQueueWgsl, queueWgslUrl } from "@plasius/gpu-lock-free-queue";

const shaderCode = await loadQueueWgsl();
// Or, fetch the WGSL file directly:
// const shaderCode = await fetch(queueWgslUrl).then((res) => res.text());
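
For DAG scheduling, normalize the job graph first, then load the matching scheduler asset:

import {
  createDagJobGraph,
  loadDagQueueWgsl,
  loadSchedulerWgsl,
} from "@plasius/gpu-lock-free-queue";

// Jobs without dependencies are roots; "lighting" waits on both passes.
const graph = createDagJobGraph([
  { id: "g-buffer", priority: 4 },
  { id: "shadow", priority: 3 },
  { id: "lighting", dependencies: ["g-buffer", "shadow"], priority: 2 },
]);

console.log(graph.roots); // initial runnable set
console.log(graph.priorityLanes); // used to size ready queues per priority bucket

// Load the DAG asset directly, or select an asset by the graph's mode.
const dagSchedulerWgsl = await loadDagQueueWgsl();
const selectedWgsl = await loadSchedulerWgsl({ mode: graph.mode });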
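
The loaded WGSL is plain shader source, so it can be handed to the standard WebGPU `GPUDevice.createShaderModule` call. A minimal sketch, assuming the caller already holds a `GPUDevice`; the helper name `createQueueShaderModule` and the default label are hypothetical and not part of this package:

```javascript
// Hypothetical helper (not a package export): wraps WGSL source in a GPUShaderModule.
// `device` is assumed to be a GPUDevice the caller obtained through
// navigator.gpu.requestAdapter() and adapter.requestDevice().
function createQueueShaderModule(device, wgslSource, label = "gpu-lock-free-queue") {
  if (typeof wgslSource !== "string" || wgslSource.length === 0) {
    throw new TypeError("wgslSource must be a non-empty WGSL string");
  }
  // createShaderModule is the standard WebGPU entry point for WGSL code.
  return device.createShaderModule({ label, code: wgslSource });
}
```

Passing the device in, instead of requesting it inside the helper, keeps adapter and device selection under the caller's control.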

What this is

  • Lock-free multi-producer, multi-consumer ring queue on the GPU.
  • Multi-root DAG-ready scheduler asset with priority-aware ready queues.
  • Uses per-slot sequence numbers to avoid ABA problems, as long as the counters stay within a single 32-bit epoch (see Limitations).
  • Fixed-size job metadata with payload offsets into a caller-managed data arena or buffer.
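
To make the per-slot sequence idea concrete, here is a CPU-side sketch of the well-known bounded MPMC ring technique (commonly attributed to Dmitry Vyukov), written with JavaScript `Atomics`. It is an illustration only: the shipped WGSL may differ in detail, and the memory layout and function names below are assumptions.

```javascript
// Assumed layout, all u32 words: [head, tail, seq[0..capacity), value[0..capacity)].
const HEAD = 0; // consumer cursor
const TAIL = 1; // producer cursor

function createRing(capacity) {
  if (capacity < 2 || (capacity & (capacity - 1)) !== 0) {
    throw new RangeError("capacity must be a power of two >= 2");
  }
  const mem = new Uint32Array(new SharedArrayBuffer((2 + 2 * capacity) * 4));
  for (let i = 0; i < capacity; i++) mem[2 + i] = i; // slot i starts at sequence i
  return { mem, capacity, mask: capacity - 1 };
}

function tryEnqueue(q, value) {
  for (;;) {
    const pos = Atomics.load(q.mem, TAIL);
    const slot = pos & q.mask;
    const seq = Atomics.load(q.mem, 2 + slot);
    const diff = (seq - pos) | 0; // signed distance, valid while < 2^31 apart
    if (diff === 0) {
      // Slot is free for this ticket; claim it by advancing tail.
      if (Atomics.compareExchange(q.mem, TAIL, pos, (pos + 1) >>> 0) === pos) {
        q.mem[2 + q.capacity + slot] = value >>> 0;
        Atomics.store(q.mem, 2 + slot, (pos + 1) >>> 0); // publish to consumers
        return true;
      }
    } else if (diff < 0) {
      return false; // queue is full
    }
    // Otherwise another producer took this ticket; retry with a fresh tail.
  }
}

function tryDequeue(q) {
  for (;;) {
    const pos = Atomics.load(q.mem, HEAD);
    const slot = pos & q.mask;
    const seq = Atomics.load(q.mem, 2 + slot);
    const diff = (seq - ((pos + 1) >>> 0)) | 0;
    if (diff === 0) {
      if (Atomics.compareExchange(q.mem, HEAD, pos, (pos + 1) >>> 0) === pos) {
        const value = q.mem[2 + q.capacity + slot];
        // Mark the slot free for the producer one lap ahead.
        Atomics.store(q.mem, 2 + slot, (pos + q.capacity) >>> 0);
        return value;
      }
    } else if (diff < 0) {
      return undefined; // queue is empty
    }
  }
}
```

Neither operation waits on a lock: a caller either claims a ticket with a single compare-exchange, retries, or returns immediately when the queue is full or empty.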

Scheduler assets

  • queue.wgsl: flat lock-free ring queue, compatible with the original worker runtime.
  • dag-queue.wgsl: dependency-aware scheduler asset with multi-root publishing, priority ready lanes, and downstream unlock hooks via complete_job(...).

Both assets remain lock-free. Workers pop runnable jobs without blocking, and DAG jobs unlock downstream work via atomics when their dependency count reaches zero.
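
The unlock step can be pictured with a CPU-side analogue. This sketch assumes one unresolved-dependency counter per job and a per-job list of dependents; `createDagState` and `completeJob` are hypothetical names that mirror, but are not, the WGSL `complete_job(...)` hook.

```javascript
// Assumed layout: counts[i] = unresolved dependencies of job i,
// dependents[i] = indices of jobs that wait on job i.
function createDagState(dependencyCounts, dependents) {
  const counts = new Int32Array(new SharedArrayBuffer(dependencyCounts.length * 4));
  counts.set(dependencyCounts);
  return { counts, dependents };
}

// Returns the job indices that became runnable because `job` finished.
function completeJob(state, job) {
  const unlocked = [];
  for (const child of state.dependents[job]) {
    // Atomics.sub returns the value before the subtraction, so exactly one
    // completer observes 1 and is responsible for publishing the child.
    if (Atomics.sub(state.counts, child, 1) === 1) unlocked.push(child);
  }
  return unlocked;
}
```

With jobs 0 (g-buffer) and 1 (shadow) feeding job 2 (lighting), completing job 0 unlocks nothing and completing job 1 then unlocks job 2.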

The JS graph helper is the canonical preflight contract for DAG metadata. It returns:

  • jobIds for stable upload order
  • roots for the initial runnable set
  • topologicalOrder for validation and planning
  • priorityLanes so callers can size ready queues per priority bucket
  • per-job dependencies, dependents, dependencyCount, unresolvedDependencyCount, and dependentCount
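
For intuition, a few of these fields can be derived with Kahn's algorithm. The function below is a hypothetical stand-in, not the package's `createDagJobGraph`; the default priority of 0, the shape of `priorityLanes`, and the highest-number-first ordering are all assumptions made for the sketch.

```javascript
// Hypothetical normalizer: derives jobIds, roots, topologicalOrder, priorityLanes.
function normalizeGraph(jobs) {
  const byId = new Map();
  for (const job of jobs) {
    if (byId.has(job.id)) throw new Error(`duplicate job id: ${job.id}`);
    byId.set(job.id, {
      id: job.id,
      priority: job.priority ?? 0, // assumed default
      dependencies: [...(job.dependencies ?? [])],
      dependents: [],
    });
  }
  for (const node of byId.values()) {
    for (const dep of node.dependencies) {
      const parent = byId.get(dep);
      if (!parent) throw new Error(`unknown dependency: ${dep}`);
      parent.dependents.push(node.id);
    }
  }
  // Kahn's algorithm: repeatedly emit jobs whose dependencies are all emitted.
  const remaining = new Map([...byId.values()].map((n) => [n.id, n.dependencies.length]));
  const roots = [...byId.values()].filter((n) => n.dependencies.length === 0).map((n) => n.id);
  const ready = [...roots];
  const topologicalOrder = [];
  while (ready.length > 0) {
    const id = ready.shift();
    topologicalOrder.push(id);
    for (const child of byId.get(id).dependents) {
      const left = remaining.get(child) - 1;
      remaining.set(child, left);
      if (left === 0) ready.push(child);
    }
  }
  if (topologicalOrder.length !== byId.size) throw new Error("dependency cycle detected");
  // Group job ids per priority so each lane can be sized from its length.
  const lanes = new Map();
  for (const node of byId.values()) {
    if (!lanes.has(node.priority)) lanes.set(node.priority, []);
    lanes.get(node.priority).push(node.id);
  }
  const priorityLanes = [...lanes.entries()]
    .sort((a, b) => b[0] - a[0])
    .map(([priority, jobIds]) => ({ priority, jobIds }));
  return { jobIds: [...byId.keys()], roots, topologicalOrder, priorityLanes };
}
```

A graph that fails this kind of preflight (unknown dependency, duplicate id, or cycle) is rejected on the CPU before any buffers are uploaded.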

Buffer layout (breaking change in v0.4.0)

Bindings are:

  1. @binding(0) queue header: { head, tail, capacity, mask }
  2. @binding(1) slot array (Slot with seq, job_type, payload_offset, payload_words)
  3. @binding(2) input jobs (array<JobMeta> with job_type, payload_offset, payload_words)
  4. @binding(3) output jobs (array<JobMeta> with job_type, payload_offset, payload_words)
  5. @binding(4) input payloads (array<u32>, payload data referenced by input_jobs.payload_offset)
  6. @binding(5) output payloads (array<u32>, length job_count * output_stride)
  7. @binding(6) status flags (array<u32>, length job_count)
  8. @binding(7) params (Params with job_count, output_stride)

output_stride is the per-job output stride (u32 words) used when copying payloads into output_payloads.
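
The binding list translates into buffer sizes roughly as follows. This is a sizing sketch under stated assumptions: every listed field is a tightly packed 32-bit word, `mask` equals `capacity - 1` for a power-of-two capacity, and `head`/`tail` start at 0. The struct definitions in the WGSL source remain authoritative, and `planQueueBuffers` is a hypothetical helper.

```javascript
const WORD_BYTES = 4;
const HEADER_WORDS = 4;   // head, tail, capacity, mask
const SLOT_WORDS = 4;     // seq, job_type, payload_offset, payload_words
const JOB_META_WORDS = 3; // job_type, payload_offset, payload_words
const PARAMS_WORDS = 2;   // job_count, output_stride

// Hypothetical helper: byte lengths for bindings 0 to 7 plus initial header words.
function planQueueBuffers({ capacity, jobCount, inputPayloadWords, outputStride }) {
  if (capacity < 1 || (capacity & (capacity - 1)) !== 0) {
    throw new RangeError("capacity must be a power of two (assumed from the mask field)");
  }
  return {
    header: new Uint32Array([0, 0, capacity, capacity - 1]),
    params: new Uint32Array([jobCount, outputStride]),
    byteLengths: {
      header: HEADER_WORDS * WORD_BYTES,                    // @binding(0)
      slots: capacity * SLOT_WORDS * WORD_BYTES,            // @binding(1)
      inputJobs: jobCount * JOB_META_WORDS * WORD_BYTES,    // @binding(2)
      outputJobs: jobCount * JOB_META_WORDS * WORD_BYTES,   // @binding(3)
      inputPayloads: inputPayloadWords * WORD_BYTES,        // @binding(4)
      outputPayloads: jobCount * outputStride * WORD_BYTES, // @binding(5)
      status: jobCount * WORD_BYTES,                        // @binding(6)
      params: PARAMS_WORDS * WORD_BYTES,                    // @binding(7)
    },
  };
}
```

Because the output payload buffer is sized as job_count * output_stride, output_stride should be at least the largest payload_words among the submitted jobs.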

Limitations

  • Sequence counters are 32-bit. A counter wraps after 2^32 operations, so under sustained high throughput ABA can reappear. If you need true long-running safety, consider a reset protocol, sharding, or a future 64-bit atomic extension.
  • Payload lifetimes are managed by the caller. Ensure payload buffers remain valid until consumers finish, or use frame-bounded arenas/generation handles.
  • The DAG scheduler asset introduces extra buffers for job state and dependency lists; callers still need to build/upload those buffers explicitly.
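
To judge whether the 32-bit limit matters for a workload, it helps to estimate the time to wrap and to compare sequence numbers in a wrap-tolerant way. Both helpers below are illustrative and hypothetical; they are not package exports.

```javascript
const U32_SPAN = 2 ** 32;

// Seconds until a counter that advances `opsPerSecond` times per second wraps.
function secondsUntilWrap(opsPerSecond) {
  if (!(opsPerSecond > 0)) throw new RangeError("opsPerSecond must be positive");
  return U32_SPAN / opsPerSecond;
}

// Wrap-tolerant ordering: true when sequence `a` is ahead of `b`.
// Only meaningful while the two values are less than 2^31 apart.
function isAhead(a, b) {
  return ((a - b) | 0) > 0;
}
```

At one million queue operations per second, `secondsUntilWrap(1e6)` is about 4295 seconds, or roughly 72 minutes, which is why a long-running scheduler may want a periodic reset.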

Run the demo

WebGPU requires a secure context (HTTPS or localhost). Use a local server, for example:

python3 -m http.server

Then open http://localhost:8000 and check the browser console and the page output.
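
If the demo shows nothing, a quick preflight can tell a missing secure context apart from a browser without WebGPU. This is a hypothetical helper, not part of the demo; it relies only on the standard `isSecureContext` and `navigator.gpu` properties.

```javascript
// Hypothetical preflight: reports why WebGPU may be unavailable.
// `env` defaults to globalThis so the function also runs outside a browser.
function webgpuPreflight(env = globalThis) {
  if (env.isSecureContext === false) {
    return { ok: false, reason: "insecure-context" };
  }
  if (!env.navigator || !env.navigator.gpu) {
    return { ok: false, reason: "no-webgpu" };
  }
  return { ok: true, reason: null };
}
```

Calling `webgpuPreflight()` from the browser console on the demo page gives a first hint before digging into adapter or device errors.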

Build Outputs

npm run build emits dist/index.js, dist/index.cjs, dist/queue.wgsl, and dist/dag-queue.wgsl.

Tests

npm run test:unit
npm run test:coverage
npm run test:e2e

Development Checks

npm run lint
npm run typecheck
npm run test:coverage
npm run build
npm run pack:check

Files

  • demo/index.html: Loads the demo.
  • demo/main.js: WebGPU setup, enqueue/dequeue test, FFT spectrogram, and randomness heuristics.
  • src/queue.wgsl: Flat lock-free queue implementation.
  • src/dag-queue.wgsl: DAG-ready scheduler implementation.
  • src/index.js: Package entry point for loading scheduler assets and normalizing DAG graphs.

Architecture Docs

  • docs/adrs/adr-0004-multi-root-dag-ready-queues.md
  • docs/tdrs/tdr-0001-dag-scheduler-contract.md
  • docs/design/dag-scheduler-design.md

Payload shape

Payloads are variable-length chunks stored in a caller-managed buffer. Each job specifies job_type, payload_offset, and payload_words in input_jobs; dequeue copies payloads from input_payloads into output_payloads using output_stride and mirrors the metadata into output_jobs. If you need f32, store bitcast<u32>(value) and reinterpret on the consumer side.
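
On the host side, the same reinterpretation can be done with typed-array views over one buffer. A small sketch, assuming payloads are assembled in a plain array of u32 words before upload; `packF32Payload`, `unpackF32Payload`, and `appendJob` are hypothetical helpers, and the camelCase field names stand in for job_type, payload_offset, and payload_words.

```javascript
// Reinterpret f32 values as u32 words (host-side counterpart of bitcast<u32>).
function packF32Payload(values) {
  const floats = Float32Array.from(values);
  return new Uint32Array(floats.buffer); // same bytes, viewed as u32
}

// Reverse direction for words read back from output_payloads.
function unpackF32Payload(words) {
  const copy = Uint32Array.from(words);
  return new Float32Array(copy.buffer);
}

// Appends payload words to a caller-managed arena and returns the job metadata.
function appendJob(arena, jobType, words) {
  const payloadOffset = arena.length; // offset in u32 words, not bytes
  for (const word of words) arena.push(word >>> 0);
  return { jobType, payloadOffset, payloadWords: words.length };
}
```

Keeping offsets in u32 words matches the `array<u32>` payload bindings, so no byte-to-word conversion is needed when the metadata is uploaded.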