What's the Elvish word for "friend"?

Mellon

Offline, fully in-browser hotword / wake-word detection powered by EfficientWord-Net (ResNet-50 ArcFace).

  • 100% offline — ONNX inference runs in the browser via WebAssembly; no server, no cloud.
  • Speaker-independent — the model generalises across voices out of the box.
  • Custom words — enroll any phrase with ≥ 3 audio samples.
  • TypeScript-ready — ships with full .d.ts declarations.

Table of contents

  1. Installation
  2. Quick start
  3. Enrolling custom words
  4. API reference
  5. Science behind the lib

Installation

npm install mellon

Assets setup

Assuming your public/ directory is served from the site root (so public/myfile is reachable at /myfile), copy the bundled assets into it:

mkdir -p public/mellon-assets/
cp -r node_modules/mellon/dist/assets/* public/mellon-assets/

Quick start

import { Mellon } from 'mellon'

const hotWordDetection = new Mellon([
  {
    name: 'openDoors',
    triggers: [{ name: 'mellon', defaultRefPath: '/mellon-assets/mellon_ref.json' }],
    onMatch: () => console.log('opening the doors...')
  },
  {
    name: 'startEngine',
    triggers: [
      { name: 'start', defaultRefPath: '/mellon-assets/start_ref.json' },
      { name: 'go', defaultRefPath: '/mellon-assets/go_ref.json' }
    ],
    onMatch: (triggerNameMatched, confidence) => {
      console.log({ triggerNameMatched, confidence })
      console.log('starting engine...')
    }
  },
  {
    name: 'stopEngine',
    triggers: [
      { name: 'stop', defaultRefPath: '/mellon-assets/stop_ref.json' },
      { name: 'wait', defaultRefPath: '/mellon-assets/wait_ref.json' }
    ],
    onMatch: (triggerNameMatched, confidence) => {
      console.log({ triggerNameMatched, confidence })
      console.log('stopping engine...')
    }
  }
])

await hotWordDetection.start() // opens the mic and listens for all registered triggers

Enrolling custom words

import { Mellon, EnrollmentSession } from 'mellon'

const hotwordDetection = new Mellon([{
  name: 'startEngine',
  triggers: [{ name: 'start' }],
  onMatch: (triggerNameMatched, confidence) => { console.log('starting engine...') }
}])


// 1. Create an enrollment session
const session = new EnrollmentSession('start')

// 2. Record at least 3 samples (1.5 s each)
await session.recordSample()
await session.recordSample()
await session.recordSample()

// 3. Generate reference embeddings
const ref = await session.generateRef()

// 4a. Use immediately in the running detector
hotwordDetection.addCustomWord(ref)
await hotwordDetection.start()

// 4b. Persist for future sessions
hotwordDetection.saveWord(ref)
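
On a later page load, words persisted in step 4b can presumably be fed back into the detector. The sketch below assumes that loadWords() returns the entries stored by saveWord() and that addCustomWord() accepts them unchanged, as the API reference suggests. The restoreSavedWords helper and the WordStore interface are hypothetical names introduced for illustration; they are not part of the library.

```typescript
// Hypothetical interface: only the two Mellon methods this helper relies on.
// R stands in for the library's RefData type.
interface WordStore<R extends { word_name: string }> {
  loadWords(): R[]
  addCustomWord(refData: R): void
}

// Hypothetical helper: re-register every persisted word with the detector
// and return the names that were restored.
function restoreSavedWords<R extends { word_name: string }>(
  detector: WordStore<R>,
): string[] {
  const restored: string[] = []
  for (const ref of detector.loadWords()) {
    detector.addCustomWord(ref)
    restored.push(ref.word_name)
  }
  return restored
}
```

Calling restoreSavedWords(hotwordDetection) before hotwordDetection.start() would then make previously enrolled words available again without re-recording them.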

API reference

Mellon

The easiest way to use the library: a single class that wraps microphone access, AudioWorklet wiring, and detector management.

class Mellon extends EventTarget {
  constructor(opts?: MellonOptions)
  readonly isInitialized: boolean
  readonly isRunning:     boolean

  init(onProgress?: (pct: number) => void): Promise<void>
  start(): Promise<void>
  stop(): void
  addCustomWord(refData: RefData): void

  // persistence of reference data in localStorage
  loadWords(): RefData[]
  saveWord(refData: RefData): void
  deleteWord(wordName: string): void
}
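
The Quick start calls start() directly, so an explicit init() appears to be optional; calling it first is mainly useful for showing loading progress. The sketch below is one way to combine the two, assuming that init() does not need to be repeated once isInitialized is true. The Detector interface and the ensureListening helper are hypothetical names, and the range of the pct value (0 to 1 or 0 to 100) is not stated in this README.

```typescript
// Hypothetical interface: the subset of the Mellon class used below.
interface Detector {
  readonly isInitialized: boolean
  readonly isRunning: boolean
  init(onProgress?: (pct: number) => void): Promise<void>
  start(): Promise<void>
}

// Hypothetical helper: initialize once (reporting progress), then start
// listening if the detector is not already running.
async function ensureListening(
  detector: Detector,
  onProgress: (pct: number) => void = () => {},
): Promise<void> {
  if (!detector.isInitialized) {
    await detector.init(onProgress)
  }
  if (!detector.isRunning) {
    await detector.start()
  }
}
```

Because the helper checks both flags, it can be called from several UI entry points without starting the detector twice.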

EnrollmentSession

Records audio samples from the mic (or uploaded files) and generates reference embeddings for a new custom word.

class EnrollmentSession {
  constructor(wordName: string)
  readonly wordName:    string
  readonly samples:     { audioBuffer: Float32Array; name: string }[]

  recordSample():            Promise<number>   // → 1-based sample index
  generateRef():             Promise<RefData>  // requires ≥ 3 samples
}
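
Since generateRef() requires at least three samples, the record-then-generate sequence can be wrapped in a small helper that enforces the minimum up front. This is only a sketch: Recorder and enrollWord are hypothetical names, and the optional onSample callback is an illustrative addition rather than a library feature.

```typescript
// Hypothetical interface: the subset of EnrollmentSession used below.
// R stands in for the library's RefData type.
interface Recorder<R> {
  recordSample(): Promise<number>
  generateRef(): Promise<R>
}

const MIN_SAMPLES = 3 // generateRef() requires at least 3 samples

// Hypothetical helper: record `sampleCount` samples one after another,
// then generate the reference embeddings.
async function enrollWord<R>(
  session: Recorder<R>,
  sampleCount: number = MIN_SAMPLES,
  onSample?: (index: number) => void,
): Promise<R> {
  if (!Number.isInteger(sampleCount) || sampleCount < MIN_SAMPLES) {
    throw new RangeError(`need at least ${MIN_SAMPLES} samples, got ${sampleCount}`)
  }
  for (let i = 0; i < sampleCount; i++) {
    // recordSample() resolves with the 1-based index of the new sample
    const index = await session.recordSample()
    onSample?.(index)
  }
  return session.generateRef()
}
```

With the real class this might be called as enrollWord(new EnrollmentSession('start'), 5), using onSample to update a "sample 2 of 5" indicator in the UI.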

RefData shape

interface RefData {
  word_name:  string           // e.g. 'hello'
  model_type: 'resnet_50_arc'
  embeddings: number[][]       // N × 256 vectors
}

Compatible with the EfficientWord-Net _ref.json format — you can import reference files generated by the Python toolkit directly.
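
When importing a _ref.json file from an external source, it can help to check its shape before handing it to addCustomWord(). The guard below is a minimal sketch based only on the RefData interface shown above; isRefData and EMBEDDING_SIZE are hypothetical names, and the library may perform its own validation.

```typescript
// Mirrors the RefData interface documented above.
interface RefData {
  word_name: string
  model_type: 'resnet_50_arc'
  embeddings: number[][]
}

const EMBEDDING_SIZE = 256 // each embedding is a 256-dimensional vector

// Hypothetical type guard: true only if `value` matches the RefData shape.
function isRefData(value: unknown): value is RefData {
  if (typeof value !== 'object' || value === null) return false
  const v = value as Record<string, unknown>
  return (
    typeof v.word_name === 'string' &&
    v.model_type === 'resnet_50_arc' &&
    Array.isArray(v.embeddings) &&
    v.embeddings.every(
      (row: unknown) =>
        Array.isArray(row) &&
        row.length === EMBEDDING_SIZE &&
        row.every((x: unknown) => typeof x === 'number' && Number.isFinite(x)),
    )
  )
}
```

A typical use would be to parse the file with JSON.parse, pass the result through isRefData, and only then call addCustomWord() with it.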


Science behind the lib

Check out this paper.

License

MIT