JSPM

@vexaai/transcript-rendering

0.1.1
  • Downloads 28
  • License Apache-2.0

WebSocket transcript rendering: deduplication, speaker grouping, timestamp utilities

Package Exports

  • @vexaai/transcript-rendering

Readme

@vexaai/transcript-rendering

Zero-dependency WebSocket transcript rendering pipeline: deduplication, speaker grouping, and timestamp parsing.

Used by Vexa and DNA.

Install

npm install @vexaai/transcript-rendering

Usage

import { deduplicateSegments, groupSegments, parseUTCTimestamp } from '@vexaai/transcript-rendering'

// Remove duplicate/overlapping segments
const clean = deduplicateSegments(segments)

// Group consecutive segments by speaker (default)
const groups = groupSegments(clean)

// Group by a custom key instead (a distinct name avoids redeclaring `groups`)
const groupsByLanguage = groupSegments(clean, {
  getGroupKey: (seg) => seg.language ?? 'unknown',
  maxCharsPerGroup: 1024,
})

API

deduplicateSegments<T>(segments: T[]): T[]

Removes duplicate and overlapping segments using multiple strategies:

  • Adjacent duplicate detection
  • Full containment — removes segments fully contained in another
  • Expansion — merges partial segments into their completed versions
  • Tail-repeat filtering — removes segments that are just repeated endings

Segments are scored by: known speaker (+10), completion status (+5), duration (0-3). Higher-scoring segments win conflicts.
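
The scoring rule above can be sketched as follows. This is an illustrative sketch, not the package's implementation: `scoreSegment` and `pickWinner` are hypothetical helpers, and the mapping from duration to the 0-3 bonus (seconds, clamped) is an assumption, since the README does not specify it.

```typescript
// Illustrative sketch only: these helpers are hypothetical, not package exports.
interface ScoredSegment {
  text: string
  speaker?: string
  completed?: boolean
  start_time?: number
  end_time?: number
}

function scoreSegment(seg: ScoredSegment): number {
  let score = 0
  // Assumption: a missing or blank speaker counts as "unknown".
  if (seg.speaker && seg.speaker.trim() !== '') score += 10
  if (seg.completed) score += 5
  // Assumption: the duration bonus is the length in seconds, clamped to 0..3.
  const duration = (seg.end_time ?? 0) - (seg.start_time ?? 0)
  score += Math.min(3, Math.max(0, duration))
  return score
}

// The higher-scoring segment wins a conflict; ties keep the first argument.
function pickWinner<T extends ScoredSegment>(a: T, b: T): T {
  return scoreSegment(b) > scoreSegment(a) ? b : a
}

const partial: ScoredSegment = { text: 'hello wor', completed: false, start_time: 0, end_time: 1 }
const final: ScoredSegment = { text: 'hello world', speaker: 'Alice', completed: true, start_time: 0, end_time: 2 }

console.log(scoreSegment(partial))           // 1
console.log(scoreSegment(final))             // 17
console.log(pickWinner(partial, final).text) // "hello world"
```

With these weights a known speaker outranks every other signal combined, so a completed segment without a speaker (at most 8 points) still loses to a partial segment with one (at least 10).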

groupSegments<T>(segments: T[], options?): SegmentGroup<T>[]

Groups consecutive segments by a configurable key (default: speaker). Options:

Option            Default              Description
getGroupKey       seg => seg.speaker   Function returning the grouping key
maxCharsPerGroup  512                  Max characters before splitting a group
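
To make the two options concrete, here is a minimal sketch of consecutive grouping with a character cap. It is a stand-in written for illustration: `groupConsecutive` and `SketchGroup` are hypothetical names, the real `SegmentGroup<T>` shape may differ, and the exact splitting rule (start a new group when adding a segment would exceed the cap) is an assumption.

```typescript
// Illustrative sketch only: not the package's groupSegments.
interface GroupableSegment {
  text: string
  speaker?: string
}

// Hypothetical group shape; the package's SegmentGroup<T> may differ.
interface SketchGroup<T> {
  key: string
  segments: T[]
}

function groupConsecutive<T extends GroupableSegment>(
  segments: T[],
  getGroupKey: (seg: T) => string = (seg) => seg.speaker ?? '',
  maxCharsPerGroup = 512,
): SketchGroup<T>[] {
  const groups: SketchGroup<T>[] = []
  let chars = 0 // characters accumulated in the current (last) group
  for (const seg of segments) {
    const key = getGroupKey(seg)
    const last = groups[groups.length - 1]
    const fits = chars + seg.text.length <= maxCharsPerGroup
    if (last && last.key === key && fits) {
      // Same key and still under the cap: extend the current group.
      last.segments.push(seg)
      chars += seg.text.length
    } else {
      // Key changed or the cap would be exceeded: start a new group.
      groups.push({ key, segments: [seg] })
      chars = seg.text.length
    }
  }
  return groups
}

const demo: GroupableSegment[] = [
  { speaker: 'A', text: 'Hi' },
  { speaker: 'A', text: 'there' },
  { speaker: 'B', text: 'Hello' },
  { speaker: 'A', text: 'Bye' },
]

console.log(groupConsecutive(demo).map((g) => g.key))      // [ 'A', 'B', 'A' ]
console.log(groupConsecutive(demo, undefined, 4).length)   // 4
```

Only consecutive segments are merged, so speaker A appears in two separate groups once B speaks in between.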

parseUTCTimestamp(ts: string): number

Parses UTC timestamp strings (ISO 8601 or HH:MM:SS.mmm) into Unix epoch seconds.
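
As a rough sketch of the ISO 8601 case only, the conversion can be done with the standard `Date.parse`. The helper name `isoToEpochSeconds` is hypothetical, and treating a full date-time string without a zone designator as UTC is an assumption about the intended behavior; the HH:MM:SS.mmm form is not covered here because the README does not say how it maps to an epoch value.

```typescript
// Illustrative sketch only: not the package's parseUTCTimestamp.
function isoToEpochSeconds(ts: string): number {
  // Assumption: a date-time string with no zone designator is meant as UTC,
  // so "Z" is appended before parsing.
  const hasZone = /(Z|[+-]\d{2}:?\d{2})$/i.test(ts)
  const ms = Date.parse(hasZone ? ts : ts + 'Z')
  if (Number.isNaN(ms)) throw new Error(`Unparseable timestamp: ${ts}`)
  return ms / 1000 // Date.parse returns milliseconds
}

console.log(isoToEpochSeconds('2024-01-01T00:00:00Z'))    // 1704067200
console.log(isoToEpochSeconds('2024-01-01T00:00:01.500')) // 1704067201.5
```

Forcing UTC matters because a date-time string without an offset is otherwise interpreted in the local time zone, which would shift segment times depending on where the code runs.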

Segment Interface

Segments must provide at least text, absolute_start_time, and absolute_end_time; fields marked with ? are optional:

interface TranscriptSegment {
  text: string
  speaker?: string
  absolute_start_time: string
  absolute_end_time: string
  completed?: boolean
  start_time?: number
  end_time?: number
}

Extra fields pass through untouched via generics: deduplicateSegments<MySegment>(segments).
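
The pass-through behavior follows from ordinary TypeScript generics, as the sketch below shows. `dropAdjacentDuplicates` is a hypothetical stand-in covering only the simplest strategy (adjacent duplicates with the same text and speaker); it is not the package's `deduplicateSegments`, and the `language` field is an invented extra field.

```typescript
// Illustrative sketch only: a stand-in for a generic pipeline step.
interface TranscriptSegment {
  text: string
  speaker?: string
  absolute_start_time: string
  absolute_end_time: string
  completed?: boolean
  start_time?: number
  end_time?: number
}

// Hypothetical extension with an extra field the pipeline knows nothing about.
interface MySegment extends TranscriptSegment {
  language: string
}

// Because the function is generic over T, it returns the caller's own type,
// so extra fields survive without casts.
function dropAdjacentDuplicates<T extends TranscriptSegment>(segments: T[]): T[] {
  return segments.filter((seg, i) => {
    const prev = segments[i - 1]
    return !(prev && prev.text === seg.text && prev.speaker === seg.speaker)
  })
}

const input: MySegment[] = [
  { text: 'Hello', speaker: 'A', language: 'en', absolute_start_time: '2024-01-01T00:00:00Z', absolute_end_time: '2024-01-01T00:00:01Z' },
  { text: 'Hello', speaker: 'A', language: 'en', absolute_start_time: '2024-01-01T00:00:00Z', absolute_end_time: '2024-01-01T00:00:01Z' },
  { text: 'Hola', speaker: 'B', language: 'es', absolute_start_time: '2024-01-01T00:00:02Z', absolute_end_time: '2024-01-01T00:00:03Z' },
]

const output: MySegment[] = dropAdjacentDuplicates(input)
console.log(output.map((s) => s.language)) // [ 'en', 'es' ]
```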

License

Apache-2.0