hyperbeedeebee

A MongoDB-like database built on top of Hyperbee with support for indexing

WIP: There may be breaking changes in the indexing before the v1.0.0 release, so don't use this for anything you would mind migrating in the future.

Based on this design

Usage

npm i --save hyperbeedeebee

const Hyperbee = require('hyperbee')
// This module handles networking and storage of hypercores for you
const SDK = require('hyper-sdk')
const {DB} = require('hyperbeedeebee')

// Note: `await` only works inside an async function or an ES module
const {Hypercore} = await SDK()

// Initialize a hypercore for loading data
const core = new Hypercore('example')
// Initialize the Hyperbee you want to use for storing data and indexes
const bee = new Hyperbee(core)

// Create a new DB
const db = new DB(bee)

// Open up a collection of documents and insert a new document
const doc = await db.collection('example').insert({
  hello: 'World!'
})

// doc._id gets set to an ObjectId if you don't specify it
console.log(doc)

// Iterate through data as it's loaded (streaming)
// Usually faster and more memory / CPU efficient
for await (let doc of db.collection('example').find({
  clout: {
    $gt: 9000
  },
})) {
  console.log(doc)
}

// Create an index for properties in documents
// This drastically speeds up queries and is necessary for sorting by fields
await db.collection('example').createIndex('createdAt')

// Get all results in an array
// Can skip some results and limit total for pagination
const killbots = await db.collection('example')
  .find({type: 'killbot'})
  .sort('createdAt', -1)
  .skip(30)
  .limit(100)

// Get a single document that matches the query
const eggbert = await db.collection('example').findOne({name: 'Eggbert'})
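
The skip/limit pagination shown above can be wrapped in a small helper. This is only a sketch and not part of hyperbeedeebee: `getPage`, `page`, `pageSize` and `sortField` are hypothetical names, and the code relies solely on the `find().sort().skip().limit()` chain from the usage example, assuming the sort field has already been indexed.

```javascript
// Hypothetical helper (not part of hyperbeedeebee): fetch one page of results.
// Assumes `collection` is what db.collection(name) returns and that
// `sortField` was indexed with createIndex() beforehand.
async function getPage (collection, query, { sortField, page = 0, pageSize = 20 }) {
  if (!Number.isInteger(page) || page < 0) {
    throw new RangeError('page must be a non-negative integer')
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError('pageSize must be a positive integer')
  }

  // Returning the cursor from an async function awaits it,
  // which collects the matching documents into an array
  return collection
    .find(query)
    .sort(sortField, -1) // newest first, as in the usage example
    .skip(page * pageSize)
    .limit(pageSize)
}

// Usage (inside an async function), third page of ten killbots:
// const killbots = await getPage(db.collection('example'), { type: 'killbot' },
//   { sortField: 'createdAt', page: 2, pageSize: 10 })
```

Keep in mind that skipped results still have to be iterated over, so deep pages get slower as `page` grows.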

Data Types

HyperbeeDeeBee uses MongoDB's BSON data types for encoding data. You can import the bson library bundled with HyperbeeDeeBee using the following code:

const { BSON } = require('hyperbeedeebee')

From there you can access any of the following data types:

Binary,
Code,
DBRef,
Decimal128,
Double,
Int32,
Long,
UUID,
Map,
MaxKey,
MinKey,
ObjectId,
BSONRegExp,
BSONSymbol,
Timestamp

TODO:

  • Sketch up API
  • Insert (with BSON encoding)
  • Find all docs
  • Find by _id
  • Find by field eq (no index)
  • Find by array field includes
  • Find by number field $gt/$gte/$lt/$lte
    • Numbers
    • Dates
  • Find using $in operator
  • Find using $all operator
  • Find using $exists operator
  • Index fields
  • Sort by index (with find)
  • Indexed find by field $eq
  • Flatten array for indexes
  • Get field values from index key without getting the doc
  • Find on fields that aren't indexed
  • Indexed find for $exists
  • Indexed find by number field
  • Indexed find for $in
  • Indexed find for $all
  • Hint API (specify index to use)
  • Test if iterators clean up properly
  • More efficient support for $gt/$gte/$lt/$lte indexes
  • More efficient support for $all indexes
  • More efficient support for $in indexes
  • Detect when data isn't available from peers and emit an error of some sort instead of waiting indefinitely.
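
To make the operators in the list above more concrete, here is a tiny in-memory matcher. It is a sketch only: it assumes MongoDB-style semantics for `$gt/$gte/$lt/$lte`, `$in`, `$all`, `$exists` and array "includes" matching, and it is not how hyperbeedeebee implements them. `matchesQuery` and the helper names are made up for this illustration.

```javascript
// Illustrative only: NOT hyperbeedeebee's implementation.
// Dates are compared by timestamp, everything else with ===.
function isEqual (a, b) {
  if (a instanceof Date && b instanceof Date) return a.getTime() === b.getTime()
  return a === b
}

// A plain value matches a field if it equals it,
// or if the field is an array that includes it
function equalsOrIncludes (fieldValue, expected) {
  if (Array.isArray(fieldValue)) return fieldValue.some(item => isEqual(item, expected))
  return isEqual(fieldValue, expected)
}

const operators = {
  $gt: (value, expected) => value > expected,
  $gte: (value, expected) => value >= expected,
  $lt: (value, expected) => value < expected,
  $lte: (value, expected) => value <= expected,
  // At least one of the expected values must match
  $in: (value, expected) => expected.some(e => equalsOrIncludes(value, e)),
  // The field must be an array containing every expected value
  $all: (value, expected) => Array.isArray(value) &&
    expected.every(e => value.some(item => isEqual(item, e))),
  $exists: (value, expected) => (value !== undefined) === expected
}

function isOperatorObject (condition) {
  return condition !== null && typeof condition === 'object' &&
    !Array.isArray(condition) && !(condition instanceof Date) &&
    Object.keys(condition).some(key => key.startsWith('$'))
}

// Returns true when every field condition in `query` holds for `doc`
function matchesQuery (doc, query) {
  return Object.entries(query).every(([field, condition]) => {
    const value = doc[field]
    if (!isOperatorObject(condition)) return equalsOrIncludes(value, condition)
    return Object.entries(condition).every(([op, expected]) => {
      const test = operators[op]
      if (!test) throw new Error(`Unsupported operator: ${op}`)
      return test(value, expected)
    })
  })
}
```

For example, `matchesQuery({ clout: 9001 }, { clout: { $gt: 9000 } })` is true, while a document without a `clout` field does not match.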

Important Differences From MongoDB

  • There is a single writer for a hyperbee and multiple readers
  • The indexing means that readers only need to download small subsets of the full dataset (if you index intelligently)
  • There is no way to do "projections", so keep in mind that you always download the full document to disk
  • Only a subset of the find() API is implemented: there is no Map Reduce API, and no $or/$and since they're difficult to optimize
  • You can only sort by indexed fields, otherwise there's no difference from loading all the data and sorting in memory
  • Fully open source under AGPL-3.0 and with mostly MIT dependencies.
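
Since `$or` is not available, one possible workaround is to run each alternative as its own query and merge the results on the client. The sketch below is not part of hyperbeedeebee. `findAny` is a hypothetical helper that only assumes `collection.find(query)` is async-iterable (as in the usage example) and that every document has an `_id`.

```javascript
// Hypothetical helper (not part of hyperbeedeebee): emulate a simple `$or`
// by running each sub-query separately and de-duplicating on `_id`.
async function * findAny (collection, queries) {
  const seen = new Set()
  for (const query of queries) {
    for await (const doc of collection.find(query)) {
      // ObjectId values stringify to their hex form, so this works as a set key
      const id = String(doc._id)
      if (seen.has(id)) continue
      seen.add(id)
      yield doc
    }
  }
}

// Usage (inside an async function):
// for await (const doc of findAny(db.collection('example'), [
//   { type: 'killbot' },
//   { name: 'Eggbert' }
// ])) console.log(doc)
```

Each sub-query can use its own index, but the merged stream is not globally sorted, and the set of seen ids grows with the number of results.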

Indexing considerations:

Indexes are super important to make your applications snappy and to reduce the overall CPU/Bandwidth/Storage usage of queries.

  • If you do a search by fields that aren't indexed, you'll end up downloading the full collection (this is potentially really slow)
  • The order of fields in the index matters: they're used to create an ordered key based on the values
  • If you want to sort by a field, make sure it's the first field in an index
  • You can have indexed fields before the sorted field if they are only used for $eq operations. This works because the database can turn them into a key prefix to speed up the search.
  • If an index cannot be found to satisfy a sort, the query will fail.
  • If you're using $gt/$lt/$gte/$lte in your query, they will perform best if the same considerations as the sort are applied.
  • If the fields in the index can be used to rule out a document as a match, then you avoid loading more documents and do fewer overall comparisons on data.
  • If your field is a unicode string which has 0x00 bytes in it, then the sorting might break due to the way BSON serializes unicode strings. Proceed with caution!
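
The points above about field order and `$eq` prefixes can be illustrated with a simplified in-memory model of an ordered index. This is a sketch under stated assumptions: keys are plain arrays of field values rather than the library's actual key encoding, and `compareKeys`, `buildIndex` and `scanPrefix` are hypothetical names.

```javascript
// Illustrative model only: hyperbeedeebee's real index keys are encoded differently.

// Compare two key tuples element by element, like an ordered key-value store would
function compareKeys (a, b) {
  const length = Math.min(a.length, b.length)
  for (let i = 0; i < length; i++) {
    if (a[i] < b[i]) return -1
    if (a[i] > b[i]) return 1
  }
  return a.length - b.length
}

// Build a sorted "index" over the given fields: one { key, doc } entry per document
function buildIndex (docs, fields) {
  return docs
    .map(doc => ({ key: fields.map(field => doc[field]), doc }))
    .sort((a, b) => compareKeys(a.key, b.key))
}

// Return the docs whose key starts with `prefix` (the values of the $eq fields).
// Because the index is sorted, matches are contiguous and already ordered
// by the field that follows the prefix.
function scanPrefix (index, prefix) {
  const results = []
  for (const entry of index) {
    const cmp = compareKeys(entry.key.slice(0, prefix.length), prefix)
    if (cmp < 0) continue // a real ordered store would seek straight past these
    if (cmp > 0) break // past the prefix range, so stop early
    results.push(entry.doc)
  }
  return results
}

const docs = [
  { _id: 1, type: 'killbot', createdAt: 30 },
  { _id: 2, type: 'eggbot', createdAt: 10 },
  { _id: 3, type: 'killbot', createdAt: 10 },
  { _id: 4, type: 'killbot', createdAt: 20 },
  { _id: 5, type: 'zapbot', createdAt: 5 }
]

// `type` comes first (used with $eq), `createdAt` second (used for sorting)
const index = buildIndex(docs, ['type', 'createdAt'])
const killbotIds = scanPrefix(index, ['killbot']).map(doc => doc._id)
console.log(killbotIds) // [ 3, 4, 1 ], i.e. killbots ordered by createdAt
```

With the fields reversed (`['createdAt', 'type']`), the killbots would no longer be contiguous in the index, so the scan could not stop early.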