Package Exports

rust-kgdb
rust-kgdb/index.js

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (rust-kgdb) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

rust-kgdb

Production-ready RDF/hypergraph database with 100% W3C SPARQL 1.1 + RDF 1.2 compliance, worst-case optimal joins (WCOJ), and pluggable storage backends.

This npm package provides the high-performance in-memory database. For distributed cluster deployment (1B+ triples, horizontal scaling), contact: gonnect.uk@gmail.com

Deployment Modes

rust-kgdb supports three deployment modes:

Mode	Use Case	Scalability	This Package
In-Memory	Development, embedded apps, testing	Single node, volatile	✅ Included
Single Node (RocksDB/LMDB)	Production, persistence needed	Single node, persistent	Via Rust crate
Distributed Cluster	Enterprise, 1B+ triples	Horizontal scaling, 9+ partitions	Contact us

Distributed Cluster Mode (Enterprise)

For enterprise deployments requiring 1B+ triples and horizontal scaling:

Key Features:

Subject-Anchored Partitioning: All triples for a subject are guaranteed on the same partition for optimal locality
Arrow-Powered OLAP: High-performance analytical queries executed as optimized SQL at scale
Automatic Query Routing: The coordinator intelligently routes queries to the right executors
Kubernetes-Native: StatefulSet-based executors with automatic failover
Linear Horizontal Scaling: Add more executor pods to scale throughput

How It Works:

Your SPARQL queries work unchanged. For large-scale aggregations, the cluster automatically optimizes execution:

-- Your SPARQL query
SELECT (COUNT(*) AS ?count) (AVG(?salary) AS ?avgSalary)
WHERE {
  ?employee <http://ex/type> <http://ex/Employee> .
  ?employee <http://ex/salary> ?salary .
}

-- Cluster executes as optimized SQL internally
-- Results aggregated across all partitions automatically

Request a demo: gonnect.uk@gmail.com

Why rust-kgdb?

Feature	rust-kgdb	Apache Jena	RDFox
Lookup Speed	2.78 µs	~50 µs	50-100 µs
Memory/Triple	24 bytes	50-60 bytes	32 bytes
SPARQL 1.1	100%	100%	95%
RDF 1.2	100%	Partial	No
WCOJ	✅ LeapFrog	❌	❌
Mobile-Ready	✅ iOS/Android	❌	❌

Core Technical Innovations

1. Worst-Case Optimal Joins (WCOJ)

Traditional databases use nested-loop joins with O(n²) to O(n⁴) complexity. rust-kgdb implements the LeapFrog TrieJoin algorithm—a worst-case optimal join that achieves O(n log n) for multi-way joins.

How it works:

Trie Data Structure: Triples indexed hierarchically (S→P→O) using BTreeMap for sorted access
Variable Ordering: Frequency-based analysis orders variables for optimal intersection
LeapFrog Iterator: Binary search across sorted iterators finds intersections without materializing intermediate results

Query: SELECT ?x ?y ?z WHERE { ?x :p ?y . ?y :q ?z . ?x :r ?z }

Nested Loop: O(n³) - examines every combination
WCOJ:        O(n log n) - iterates in sorted order, seeks forward on mismatch

Query Pattern	Before (Nested Loop)	After (WCOJ)	Speedup
3-way star	O(n³)	O(n log n)	50-100x
4+ way complex	O(n⁴)	O(n log n)	100-1000x
Chain queries	O(n²)	O(n log n)	10-20x

2. Sparse Matrix Engine (CSR Format)

Binary relations (e.g., foaf:knows, rdfs:subClassOf) are converted to Compressed Sparse Row (CSR) matrices for cache-efficient join evaluation:

Memory: O(nnz) where nnz = number of edges (not O(n²))
Matrix Multiplication: Replaces nested-loop joins
Transitive Closure: Semi-naive Δ-matrix evaluation (not iterated powers)

// Traditional: O(n²) nested loops
for (s, p, o) in triples { ... }

// CSR Matrix: O(nnz) cache-friendly iteration
row_ptr[i] → col_indices[j] → values[j]

Used for: RDFS/OWL reasoning, transitive closure, Datalog evaluation.

3. SIMD + PGO Compiler Optimizations

Zero code changes—pure compiler-level performance gains.

Optimization	Technology	Effect
SIMD Vectorization	AVX2/BMI2 (Intel), NEON (ARM)	8-wide parallel operations
Profile-Guided Optimization	LLVM PGO	Hot path optimization, branch prediction
Link-Time Optimization	LTO (fat)	Cross-crate inlining, dead code elimination

Benchmark Results (LUBM, Intel Skylake):

Query	Before	After (SIMD+PGO)	Improvement
Q5: 2-hop chain	230ms	53ms	77% faster
Q3: 3-way star	177ms	62ms	65% faster
Q4: 3-hop chain	254ms	101ms	60% faster
Q8: Triangle	410ms	193ms	53% faster
Q7: Hierarchy	343ms	198ms	42% faster
Q6: 6-way complex	641ms	464ms	28% faster
Q2: 5-way star	234ms	183ms	22% faster
Q1: 4-way star	283ms	258ms	9% faster

Average speedup: 44.5% across all queries.

4. Quad Indexing (SPOC)

Four complementary indexes enable O(1) pattern matching regardless of query shape:

Index	Pattern	Use Case
SPOC	`(?s, ?p, ?o, ?g)`	Subject-centric queries
POCS	`(?p, ?o, ?c, ?s)`	Property enumeration
OCSP	`(?o, ?c, ?s, ?p)`	Object lookups (reverse links)
CSPO	`(?c, ?s, ?p, ?o)`	Named graph iteration

Storage Backends

rust-kgdb uses a pluggable storage architecture. Default is in-memory (zero configuration). For persistence, enable RocksDB.

Backend	Feature Flag	Use Case	Status
InMemory	`default`	Development, testing, embedded	✅ Production Ready
RocksDB	`rocksdb-backend`	Production, large datasets	✅ 61 tests passing
LMDB	`lmdb-backend`	Read-heavy workloads	✅ 31 tests passing

InMemory (Default)

Zero configuration, maximum performance. Data is volatile (lost on process exit).

High-Performance Data Structures:

Component	Structure	Why
Triple Store	`DashMap`	Lock-free concurrent hash map, 100K pre-allocation
WCOJ Trie	`BTreeMap`	Sorted iteration for LeapFrog intersection
Dictionary	`FxHashSet`	String interning with rustc-optimized hashing
Hypergraph	`FxHashMap`	Fast node→edge adjacency lists
Reasoning	`AHashMap`	RDFS/OWL inference with DoS-resistant hashing
Datalog	`FxHashMap`	Semi-naive evaluation with delta propagation

Why these structures enable sub-microsecond performance:

DashMap: Sharded locks (16 shards default) → near-linear scaling on multi-core
FxHashMap: Rust compiler's hash function → 30% faster than std HashMap
BTreeMap: O(log n) ordered iteration → enables binary search in LeapFrog
Pre-allocation: 100K capacity avoids rehashing during bulk inserts

use storage::{QuadStore, InMemoryBackend};

let store = QuadStore::new(InMemoryBackend::new());
// Ultra-fast: 2.78 µs lookups, zero disk I/O

RocksDB (Persistent)

LSM-tree based storage with ACID transactions. Tested with 61 comprehensive tests.

# Cargo.toml - Enable RocksDB backend
[dependencies]
storage = { version = "0.1.10", features = ["rocksdb-backend"] }

use storage::{QuadStore, RocksDbBackend};

// Create persistent database
let backend = RocksDbBackend::new("/path/to/data")?;
let store = QuadStore::new(backend);

// Features:
// - ACID transactions
// - Snappy compression (automatic)
// - Crash recovery
// - Range & prefix scanning
// - 1MB+ value support

// Force sync to disk
store.flush()?;

RocksDB Test Coverage:

Basic CRUD operations (14 tests)
Range scanning (8 tests)
Prefix scanning (6 tests)
Batch operations (8 tests)
Transactions (8 tests)
Concurrent access (5 tests)
Unicode & binary data (4 tests)
Large key/value handling (8 tests)

LMDB (Memory-Mapped Persistent)

B+tree based storage with memory-mapped I/O (via heed crate). Optimized for read-heavy workloads with MVCC (Multi-Version Concurrency Control). Tested with 31 comprehensive tests.

# Cargo.toml - Enable LMDB backend
[dependencies]
storage = { version = "0.1.12", features = ["lmdb-backend"] }

use storage::{QuadStore, LmdbBackend};

// Create persistent database (default 10GB map size)
let backend = LmdbBackend::new("/path/to/data")?;
let store = QuadStore::new(backend);

// Or with custom map size (1GB)
let backend = LmdbBackend::with_map_size("/path/to/data", 1024 * 1024 * 1024)?;

// Features:
// - Memory-mapped I/O (zero-copy reads)
// - MVCC for concurrent readers
// - Crash-safe ACID transactions
// - Range & prefix scanning
// - Excellent for read-heavy workloads

// Sync to disk
store.flush()?;

When to use LMDB vs RocksDB:

Characteristic	LMDB	RocksDB
Read Performance	✅ Faster (memory-mapped)	Good
Write Performance	Good	✅ Faster (LSM-tree)
Concurrent Readers	✅ Unlimited	Limited by locks
Write Amplification	Low	Higher (compaction)
Memory Usage	Higher (map size)	Lower (cache-based)
Best For	Read-heavy, OLAP	Write-heavy, OLTP

LMDB Test Coverage:

Basic CRUD operations (8 tests)
Range scanning (4 tests)
Prefix scanning (3 tests)
Batch operations (3 tests)
Large key/value handling (4 tests)
Concurrent access (4 tests)
Statistics & flush (3 tests)
Edge cases (2 tests)

TypeScript SDK

The npm package uses the in-memory backend—ideal for:

Knowledge graph queries
SPARQL execution
Data transformation pipelines
Embedded applications

import { GraphDB } from 'rust-kgdb'

// In-memory database (default, no configuration needed)
const db = new GraphDB('http://example.org/app')

// For persistence, export via CONSTRUCT:
const ntriples = db.queryConstruct('CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }')
fs.writeFileSync('backup.nt', ntriples)

Installation

npm install rust-kgdb

Platform Support (v0.2.1)

Platform	Architecture	Status	Notes
macOS	Intel (x64)	✅ Works out of the box	Pre-built binary included
macOS	Apple Silicon (arm64)	⏳ v0.2.2	Coming soon
Linux	x64	⏳ v0.2.2	Coming soon
Linux	arm64	⏳ v0.2.2	Coming soon
Windows	x64	⏳ v0.2.2	Coming soon

This release (v0.2.1) includes pre-built binary for macOS x64 only. Other platforms will be added in the next release.

Quick Start

Complete Working Example

import { GraphDB } from 'rust-kgdb'

// 1. Create database
const db = new GraphDB('http://example.org/myapp')

// 2. Load data (Turtle format)
db.loadTtl(`
  @prefix foaf: <http://xmlns.com/foaf/0.1/> .
  @prefix ex: <http://example.org/> .

  ex:alice a foaf:Person ;
           foaf:name "Alice" ;
           foaf:age 30 ;
           foaf:knows ex:bob, ex:charlie .

  ex:bob a foaf:Person ;
         foaf:name "Bob" ;
         foaf:age 25 ;
         foaf:knows ex:charlie .

  ex:charlie a foaf:Person ;
             foaf:name "Charlie" ;
             foaf:age 35 .
`, null)

// 3. Query: Find friends-of-friends (WCOJ optimized!)
const fof = db.querySelect(`
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  PREFIX ex: <http://example.org/>

  SELECT ?person ?friend ?fof WHERE {
    ?person foaf:knows ?friend .
    ?friend foaf:knows ?fof .
    FILTER(?person != ?fof)
  }
`)
console.log('Friends of Friends:', fof)
// [{ person: 'ex:alice', friend: 'ex:bob', fof: 'ex:charlie' }]

// 4. Aggregation: Average age
const stats = db.querySelect(`
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>

  SELECT (COUNT(?p) AS ?count) (AVG(?age) AS ?avgAge) WHERE {
    ?p a foaf:Person ; foaf:age ?age .
  }
`)
console.log('Stats:', stats)
// [{ count: '3', avgAge: '30.0' }]

// 5. ASK query
const hasAlice = db.queryAsk(`
  PREFIX ex: <http://example.org/>
  ASK { ex:alice a <http://xmlns.com/foaf/0.1/Person> }
`)
console.log('Has Alice?', hasAlice)  // true

// 6. CONSTRUCT query
const graph = db.queryConstruct(`
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  PREFIX ex: <http://example.org/>

  CONSTRUCT { ?p foaf:knows ?f }
  WHERE { ?p foaf:knows ?f }
`)
console.log('Extracted graph:', graph)

// 7. Count and cleanup
console.log('Triple count:', db.count())  // 11
db.clear()

Save to File

import { writeFileSync } from 'fs'

// Save as N-Triples
const db = new GraphDB('http://example.org/export')
db.loadTtl(`<http://example.org/s> <http://example.org/p> "value" .`, null)

const ntriples = db.queryConstruct(`CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }`)
writeFileSync('output.nt', ntriples)

SPARQL 1.1 Features (100% W3C Compliant)

Query Forms

// SELECT - return bindings
db.querySelect('SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10')

// ASK - boolean existence check
db.queryAsk('ASK { <http://example.org/x> ?p ?o }')

// CONSTRUCT - build new graph
db.queryConstruct('CONSTRUCT { ?s <http://new/prop> ?o } WHERE { ?s ?p ?o }')

Aggregates

db.querySelect(`
  SELECT ?type (COUNT(*) AS ?count) (AVG(?value) AS ?avg)
  WHERE { ?s a ?type ; <http://ex/value> ?value }
  GROUP BY ?type
  HAVING (COUNT(*) > 5)
  ORDER BY DESC(?count)
`)

Property Paths

// Transitive closure (rdfs:subClassOf*)
db.querySelect('SELECT ?class WHERE { ?class rdfs:subClassOf* <http://top/Class> }')

// Alternative paths
db.querySelect('SELECT ?name WHERE { ?x (foaf:name|rdfs:label) ?name }')

// Sequence paths
db.querySelect('SELECT ?grandparent WHERE { ?x foaf:parent/foaf:parent ?grandparent }')

Named Graphs

// Load into named graph
db.loadTtl('<http://s> <http://p> "o" .', 'http://example.org/graph1')

// Query specific graph
db.querySelect(`
  SELECT ?s ?p ?o WHERE {
    GRAPH <http://example.org/graph1> { ?s ?p ?o }
  }
`)

UPDATE Operations

// INSERT DATA - Add new triples
db.updateInsert(`
  PREFIX ex: <http://example.org/>
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>

  INSERT DATA {
    ex:david a foaf:Person ;
             foaf:name "David" ;
             foaf:age 28 ;
             foaf:email "david@example.org" .

    ex:project1 ex:hasLead ex:david ;
                ex:budget 50000 ;
                ex:status "active" .
  }
`)

// Verify insert
const count = db.count()
console.log(`Total triples after insert: ${count}`)

// DELETE WHERE - Remove matching triples
db.updateDelete(`
  PREFIX ex: <http://example.org/>
  DELETE WHERE { ?s ex:status "completed" }
`)

Bulk Data Loading Example

import { GraphDB } from 'rust-kgdb'
import { readFileSync } from 'fs'

const db = new GraphDB('http://example.org/bulk-load')

// Load Turtle file
const ttlData = readFileSync('data/knowledge-graph.ttl', 'utf-8')
db.loadTtl(ttlData, null)  // null = default graph

// Load into named graph
const orgData = readFileSync('data/organization.ttl', 'utf-8')
db.loadTtl(orgData, 'http://example.org/graphs/org')

// Load N-Triples format
const ntData = readFileSync('data/triples.nt', 'utf-8')
db.loadNTriples(ntData, null)

console.log(`Loaded ${db.count()} triples`)

// Query across all graphs
const results = db.querySelect(`
  SELECT ?g (COUNT(*) AS ?count) WHERE {
    GRAPH ?g { ?s ?p ?o }
  }
  GROUP BY ?g
`)
console.log('Triples per graph:', results)

Sample Application

Knowledge Graph Demo

A complete, production-ready sample application demonstrating enterprise knowledge graph capabilities is available in the repository.

Location: examples/knowledge-graph-demo/

Features Demonstrated:

Complete organizational knowledge graph (employees, departments, projects, skills)
SPARQL SELECT queries with star and chain patterns (WCOJ-optimized)
Aggregations (COUNT, AVG, GROUP BY, HAVING)
Property paths for transitive closure (organizational hierarchy)
SPARQL ASK and CONSTRUCT queries
Named graphs for multi-tenant data isolation
Data export to Turtle format

Run the Demo:

cd examples/knowledge-graph-demo
npm install
npm start

Sample Output:

The demo creates a realistic knowledge graph with:

5 employees across 4 departments
13 technical and soft skills
2 software projects
Reporting hierarchies and salary data
Named graph for sensitive compensation data

Example Query from Demo (finds all direct and indirect reports):

const pathQuery = `
  PREFIX ex: <http://example.org/>
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>

  SELECT ?employee ?name WHERE {
    ?employee ex:reportsTo+ ex:alice .  # Transitive closure
    ?employee foaf:name ?name .
  }
  ORDER BY ?name
`
const results = db.querySelect(pathQuery)

Learn More: See the demo README for full documentation, query examples, and how to customize the knowledge graph.

API Reference

GraphDB Class

class GraphDB {
  constructor(baseUri: string)           // Create with base URI
  static inMemory(): GraphDB             // Create anonymous in-memory DB

  // Data Loading
  loadTtl(data: string, graph: string | null): void
  loadNTriples(data: string, graph: string | null): void

  // SPARQL Queries (WCOJ-optimized)
  querySelect(sparql: string): Array<Record<string, string>>
  queryAsk(sparql: string): boolean
  queryConstruct(sparql: string): string  // Returns N-Triples

  // SPARQL Updates
  updateInsert(sparql: string): void
  updateDelete(sparql: string): void

  // Database Operations
  count(): number
  clear(): void
  getVersion(): string
}

Node Class

class Node {
  static iri(uri: string): Node
  static literal(value: string): Node
  static langLiteral(value: string, lang: string): Node
  static typedLiteral(value: string, datatype: string): Node
  static integer(value: number): Node
  static boolean(value: boolean): Node
  static blank(id: string): Node
}

Performance Characteristics

Complexity Analysis

Operation	Complexity	Notes
Triple lookup	O(1)	Hash-based SPOC index
Pattern scan	O(k)	k = matching triples
Star join (WCOJ)	O(n log n)	LeapFrog intersection
Complex join (WCOJ)	O(n log n)	Trie-based
Transitive closure	O(n²) worst	CSR matrix optimization
Bulk insert	O(n)	Batch indexing

Memory Layout

Triple: 24 bytes
├── Subject:   8 bytes (dictionary ID)
├── Predicate: 8 bytes (dictionary ID)
└── Object:    8 bytes (dictionary ID)

String Interning: All URIs/literals stored once in Dictionary
Index Overhead: ~4x base triple size (4 indexes)
Total: ~120 bytes/triple including indexes

Performance Benchmarks

By Deployment Mode

Mode	Lookup	Insert	Memory	Dataset Size
In-Memory (npm)	2.78 µs	146K/sec	24 bytes/triple	<10M triples
Single Node (RocksDB)	5-10 µs	100K/sec	On-disk	<100M triples
Distributed Cluster	10-50 µs	500K+/sec*	Distributed	1B+ triples

*Aggregate throughput across all executors with HDRF partitioning

SIMD + PGO Query Performance (LUBM Benchmark)

Query	Pattern	Time	Improvement
Q5	2-hop chain	53ms	77% faster
Q3	3-way star	62ms	65% faster
Q4	3-hop chain	101ms	60% faster
Q8	Triangle	193ms	53% faster
Q7	Hierarchy	198ms	42% faster

Average: 44.5% speedup with zero code changes (compiler optimizations only).

Version History

v0.2.2 (2025-12-08) - Enhanced Documentation

Added comprehensive INSERT DATA examples with PREFIX syntax
Added bulk data loading example with named graphs
Enhanced SPARQL UPDATE section with real-world patterns
Improved documentation for data import workflows

v0.2.1 (2025-12-08) - npm Platform Fix

Fixed native module loading for platform-specific binaries
This release includes pre-built binary for macOS x64 only
Other platforms coming in next release

v0.2.0 (2025-12-08) - Distributed Cluster Support

NEW: Distributed cluster architecture with HDRF partitioning
Subject-Hash Filter for accurate COUNT deduplication across replicas
Arrow-powered OLAP query path for high-performance analytical queries
Coordinator-Executor pattern with gRPC communication
9-partition default for optimal data distribution
Contact for cluster deployment: gonnect.uk@gmail.com
Coming soon: Embedding support for semantic search (v0.3.0)

v0.1.12 (2025-12-01) - LMDB Backend Release

LMDB storage backend fully implemented (31 tests passing)
Memory-mapped I/O for optimal read performance
MVCC concurrency for unlimited concurrent readers
Complete LMDB vs RocksDB comparison documentation
Sample application with 87 triples demonstrating all features

v0.1.9 (2025-12-01) - SIMD + PGO Release

44.5% average speedup via SIMD + PGO compiler optimizations
WCOJ execution with LeapFrog TrieJoin
Release automation infrastructure
All packages updated to gonnect-uk namespace

v0.1.8 (2025-12-01) - WCOJ Execution

WCOJ execution path activated
Variable ordering analysis for optimal joins
577 tests passing

v0.1.7 (2025-11-30)

Query optimizer with automatic strategy selection
WCOJ algorithm integration (planning phase)

v0.1.3 (2025-11-18)

Initial TypeScript SDK
100% W3C SPARQL 1.1 compliance
100% W3C RDF 1.2 compliance

Use Cases

Domain	Application
Knowledge Graphs	Enterprise ontologies, taxonomies
Semantic Search	Structured queries over unstructured data
Data Integration	ETL with SPARQL CONSTRUCT
Compliance	SHACL validation, provenance tracking
Graph Analytics	Pattern detection, community analysis
Mobile Apps	Embedded RDF on iOS/Android

License

Apache License 2.0

Built with Rust + NAPI-RS

rust-kgdb

Package Exports

Readme

rust-kgdb

Deployment Modes

Distributed Cluster Mode (Enterprise)

Why rust-kgdb?

Core Technical Innovations

1. Worst-Case Optimal Joins (WCOJ)

2. Sparse Matrix Engine (CSR Format)

3. SIMD + PGO Compiler Optimizations

4. Quad Indexing (SPOC)

Storage Backends

InMemory (Default)

RocksDB (Persistent)

LMDB (Memory-Mapped Persistent)

TypeScript SDK

Installation

Platform Support (v0.2.1)

Quick Start

Complete Working Example

Save to File

SPARQL 1.1 Features (100% W3C Compliant)

Query Forms

Aggregates

Property Paths

Named Graphs

UPDATE Operations

Bulk Data Loading Example

Sample Application

Knowledge Graph Demo

API Reference

GraphDB Class

Node Class

Performance Characteristics

Complexity Analysis

Memory Layout

Performance Benchmarks

By Deployment Mode

SIMD + PGO Query Performance (LUBM Benchmark)

Version History

v0.2.2 (2025-12-08) - Enhanced Documentation

v0.2.1 (2025-12-08) - npm Platform Fix

v0.2.0 (2025-12-08) - Distributed Cluster Support

v0.1.12 (2025-12-01) - LMDB Backend Release

v0.1.9 (2025-12-01) - SIMD + PGO Release

v0.1.8 (2025-12-01) - WCOJ Execution

v0.1.7 (2025-11-30)

v0.1.3 (2025-11-18)

Use Cases

Links

License