JSPM

License: MIT

Odyssey Official Audio & Video SDK using MediaSoup for real-time communication

Package Exports

  • @newgameplusinc/odyssey-official-audio-video-sdk
  • @newgameplusinc/odyssey-official-audio-video-sdk/dist/index.js

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (@newgameplusinc/odyssey-official-audio-video-sdk) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.
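
For reference, a minimal exports field covering the two detected subpaths might look like this in the package's package.json (a sketch, not the actual manifest):

{
  "exports": {
    ".": "./dist/index.js",
    "./dist/index.js": "./dist/index.js"
  }
}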

Readme

Odyssey Audio/Video SDK (MediaSoup + Web Audio)

This package exposes OdysseySpatialComms, a thin TypeScript client that glues together:

  • MediaSoup SFU for ultra-low-latency audio/video routing
  • Web Audio API for Apple-like spatial mixing via SpatialAudioManager
  • Socket telemetry (position + direction) so every browser hears and sees everyone exactly where they are in the 3D world

It mirrors the production SDK used by Odyssey V2 and ships ready to drop into any web UI (Vue, React, plain JS).

Feature Highlights

  • 🔌 One class to rule it all – OdysseySpatialComms wires transports, producers, consumers, and room state.
  • 🧭 Accurate pose propagation – updatePosition() streams listener pose to the SFU while participant-position-updated keeps the local store in sync.
  • 🎧 Studio-grade spatial audio – each remote participant gets a dedicated Web Audio graph: denoiser → high-pass → low-pass → HRTF PannerNode → adaptive gain → master compressor.
  • 🎥 Camera-ready streams – video tracks are exposed separately so UI layers can render muted <video> tags while audio stays inside Web Audio.
  • 🔁 EventEmitter contract – subscribe to room-joined, consumer-created, participant-position-updated, etc., without touching Socket.IO directly (see the sketch below).

Quick Start

import {
    OdysseySpatialComms,
    Direction,
    Position,
} from "@newgameplusinc/odyssey-official-audio-video-sdk";

const sdk = new OdysseySpatialComms("https://mediasoup-server.example.com");

// 1) Join a room
await sdk.joinRoom({
    roomId: "demo-room",
    userId: "user-123",
    deviceId: "device-123",
    position: { x: 0, y: 0, z: 0 },
    direction: { x: 0, y: 1, z: 0 },
});

// 2) Produce local media
const stream = await navigator.mediaDevices.getUserMedia({ audio: true, video: true });
for (const track of stream.getTracks()) {
    await sdk.produceTrack(track);
}

// 3) Handle remote tracks (attachVideo is an app-side helper; a sketch appears under "Video Flow")
sdk.on("consumer-created", async ({ participant, track }) => {
    if (track.kind === "video") {
        attachVideo(track, participant.participantId);
    }
});

// 4) Keep spatial audio honest (pose variables come from your engine; see the loop sketch below)
sdk.updatePosition(currentPos, currentDir);
sdk.setListenerFromLSD(listenerPos, cameraPos, lookAtPos);
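
The Integration Checklist below recommends streaming pose at roughly 10 Hz; a minimal loop might look like this (getEnginePose is a placeholder for your engine's pose source):

// Sketch: push the local pose to the SFU ~10 times per second.
const POSE_INTERVAL_MS = 100;
setInterval(() => {
    const { position, direction, camera, lookAt } = getEnginePose(); // app-provided
    sdk.updatePosition(position, direction);
    sdk.setListenerFromLSD(position, camera, lookAt);
}, POSE_INTERVAL_MS);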

Audio Flow (Server ↔ Browser)

┌──────────────┐   update-position   ┌──────────────┐   pose + tracks   ┌──────────────────┐
│ Browser LSD  │ ───────────────────▶│ MediaSoup SFU│ ─────────────────▶│ SDK Event Bus    │
│ (Unreal data)│                     │ + Socket.IO  │                   │ (EventManager)   │
└──────┬───────┘                     └──────┬───────┘                   └────────┬─────────┘
       │                                    │                              track + pose
       │ audio RTP                          │                                    ▼
       │                           ┌────────▼─────────┐                 ┌──────────────────┐
       └──────────────────────────▶│ consumer-created │◀────────────────│ SpatialAudioMgr  │
                                   │ setup per-user   │                 │ (Web Audio API)  │
                                   └────────┬─────────┘                 │  - Denoiser      │
                                            │                           │  - HP / LP       │
                                            │                           │  - HRTF Panner   │
                                            ▼                           │  - Gain + Comp   │
                                     Web Audio Graph                    └────────┬─────────┘
                                            │                                    │
                                            ▼                                    ▼
                              Listener ears (Left/Right)                  System Output

Web Audio Algorithms

  • Coordinate normalization – Unreal sends centimeters; SpatialAudioManager auto-detects large values and converts to meters once.
  • Orientation math – setListenerFromLSD() builds forward/right/up vectors from camera/LookAt to keep the listener aligned with head movement.
  • Dynamic distance gain – updateSpatialAudio() measures distance from listener → source and applies a smooth rolloff curve, so distant avatars fade to silence (sketched after this list).
  • Noise handling – the AudioWorklet denoiser now runs an adaptive multi-band gate (per W3C AudioWorklet guidance) before the high/low-pass filters, stripping constant HVAC/fan noise even when the speaker is close.
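
The unit conversion and distance rolloff above might reduce to something like this (the 100-unit threshold and quadratic curve are illustrative guesses, not the shipped constants):

// Sketch: treat suspiciously large coordinates as Unreal centimeters; convert once.
function normalizeToMeters(p: { x: number; y: number; z: number }) {
    const magnitude = Math.max(Math.abs(p.x), Math.abs(p.y), Math.abs(p.z));
    const scale = magnitude > 100 ? 0.01 : 1; // > 100 units: assume cm
    return { x: p.x * scale, y: p.y * scale, z: p.z * scale };
}

// Sketch: smooth rolloff so distant avatars fade to silence.
function rolloffGain(distance: number, refDistance = 1, maxDistance = 40): number {
    if (distance <= refDistance) return 1;
    if (distance >= maxDistance) return 0;
    const t = (distance - refDistance) / (maxDistance - refDistance);
    return (1 - t) * (1 - t); // quadratic ease-out
}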

How Spatial Audio Is Built

  1. Telemetry ingestion – each LSD packet is passed through setListenerFromLSD(listenerPos, cameraPos, lookAtPos) so the Web Audio listener matches the player’s real head/camera pose.
  2. Per-participant node graph – when consumer-created yields a remote audio track, setupSpatialAudioForParticipant() spins up an isolated graph: MediaStreamSource → (optional) Denoiser Worklet → High-Pass → Low-Pass → Panner(HRTF) → Gain → Master Compressor (see the sketch after this list).
  3. Position + direction updates – every participant-position-updated event calls updateSpatialAudio(participantId, position, direction). The position feeds the panner’s XYZ, while the direction vector sets the source orientation so voices project forward relative to avatar facing.
  4. Distance-aware gain – the manager stores the latest listener pose and computes the Euclidean distance to each remote participant on every update. A custom rolloff curve adjusts gain before the compressor, giving the “someone on my left / far away” perception without blowing out master levels.
  5. Left/right rendering – because the panner uses panningModel = "HRTF", browsers feed the processed signal into the user’s audio hardware with head-related transfer functions, producing natural interaural time/intensity differences.
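
Condensing steps 2–4, the per-participant chain might be wired like this. This is a sketch built from standard Web Audio calls; the real implementation in src/SpatialAudioManager.ts also inserts the denoiser worklet and shares one master compressor. Position/Direction are the SDK’s exported types, and rolloffGain is the sketch above.

// Sketch: isolated Web Audio graph for one remote participant.
function buildParticipantGraph(ctx: AudioContext, track: MediaStreamTrack, master: DynamicsCompressorNode) {
    const source = ctx.createMediaStreamSource(new MediaStream([track]));

    const highPass = ctx.createBiquadFilter();
    highPass.type = "highpass";
    highPass.frequency.value = 80;   // strip rumble

    const lowPass = ctx.createBiquadFilter();
    lowPass.type = "lowpass";
    lowPass.frequency.value = 12000; // tame hiss

    const panner = ctx.createPanner();
    panner.panningModel = "HRTF";    // interaural time/intensity cues

    const gain = ctx.createGain();   // driven by the distance rolloff

    source.connect(highPass).connect(lowPass).connect(panner).connect(gain).connect(master);
    return { panner, gain };
}

// Sketch: what a participant-position-updated handler would feed in.
function updateParticipant(panner: PannerNode, gain: GainNode, pos: Position, dir: Direction, listener: Position) {
    panner.positionX.value = pos.x;
    panner.positionY.value = pos.y;
    panner.positionZ.value = pos.z;
    panner.orientationX.value = dir.x;
    panner.orientationY.value = dir.y;
    panner.orientationZ.value = dir.z;
    const d = Math.hypot(pos.x - listener.x, pos.y - listener.y, pos.z - listener.z);
    gain.gain.value = rolloffGain(d); // from the rolloff sketch above
}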

Video Flow (Capture ↔ Rendering)

┌──────────────┐  produceTrack   ┌──────────────┐   RTP   ┌──────────────┐
│ getUserMedia │ ───────────────▶│ MediaSoup SDK│ ───────▶│ MediaSoup SFU│
└──────┬───────┘                 │ (Odyssey)    │         └──────┬───────┘
       │                         └──────┬───────┘                │
       │               consumer-created │ track                  │
       ▼                                ▼                        │
┌──────────────┐                 ┌───────────────┐               │
│ Vue/React UI │◀────────────────│ SDK Event Bus │◀──────────────┘
│ (muted video │                 │ exposes media │
│  elements)   │                 │ tracks        │
└──────────────┘                 └───────────────┘
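
The attachVideo helper used in the Quick Start is app code, not part of the SDK; a minimal version might be:

// Sketch: render a remote video track in a muted <video> element.
// Muted is deliberate: audio playback belongs to SpatialAudioManager.
function attachVideo(track: MediaStreamTrack, participantId: string) {
    let video = document.getElementById(`video-${participantId}`) as HTMLVideoElement | null;
    if (!video) {
        video = document.createElement("video");
        video.id = `video-${participantId}`;
        video.autoplay = true;
        video.playsInline = true;
        document.body.appendChild(video); // or your layout container
    }
    video.muted = true;
    video.srcObject = new MediaStream([track]);
}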

Core Classes

  • src/index.ts – OdysseySpatialComms (socket lifecycle, producers/consumers, event surface).
  • src/MediasoupManager.ts – transport helpers for produce/consume/resume.
  • src/SpatialAudioManager.ts – Web Audio orchestration (listener transforms, per-participant chains, denoiser, distance math).
  • src/EventManager.ts – lightweight EventEmitter used by the entire SDK.

Integration Checklist

  1. Instantiate once per page/tab and keep it in a store (Vuex, Redux, Zustand, etc.); a minimal singleton sketch follows this list.
  2. Pipe LSD/Lap data from your rendering engine into updatePosition() + setListenerFromLSD() at ~10 Hz.
  3. Render videos muted – never attach remote audio tracks straight to DOM; let SpatialAudioManager own playback.
  4. Push avatar telemetry back to Unreal so remoteSpatialData can render minimaps/circles (see Odyssey V2 sendMediaSoupParticipantsToUnreal).
  5. Monitor logs – browser console shows 🎧 SDK, 📍 SDK, and 🎚️ [Spatial Audio] statements for every critical hop.
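
Item 1, in its simplest form, is a module-level singleton (framework stores work the same way; the server URL is a placeholder):

// Sketch: one SDK instance per page/tab, shared via a module singleton.
let instance: OdysseySpatialComms | null = null;

export function getComms(): OdysseySpatialComms {
    if (!instance) {
        instance = new OdysseySpatialComms("https://mediasoup-server.example.com");
    }
    return instance;
}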

Server Contract (Socket.IO events)

Event                             Direction         Payload
join-room                         client → server   {roomId, userId, deviceId, position, direction}
room-joined                       server → client   RoomJoinedData (router caps, participants snapshot)
update-position                   client → server   {participantId, conferenceId, position, direction}
participant-position-updated      server → client   {participantId, position, direction, mediaState}
consumer-created                  server → client   {participantId, track(kind), position, direction}
participant-media-state-updated   server → client   {participantId, mediaState}
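
In TypeScript terms, the payloads above roughly correspond to shapes like these (field names come from the table; everything else, e.g. optionality and the mediaState shape, is an assumption):

// Sketch: payload shapes implied by the Server Contract table.
interface Vec3 { x: number; y: number; z: number; }

interface JoinRoomPayload {
    roomId: string;
    userId: string;
    deviceId: string;
    position: Vec3;
    direction: Vec3;
}

interface ParticipantPositionUpdated {
    participantId: string;
    position: Vec3;
    direction: Vec3;
    mediaState?: { audio: boolean; video: boolean }; // shape assumed
}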

Development Tips

  • Run pnpm install && pnpm build inside mediasoup-sdk-test to produce a fresh build.
  • Use pnpm watch while iterating so TypeScript outputs live under dist/.
  • The SDK targets evergreen browsers; for Safari <16.4 you may need to polyfill AudioWorklets or disable the denoiser via new SpatialAudioManager({ denoiser: { enabled: false } }).
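
A defensive construction for older Safari could gate the denoiser on AudioWorklet support (the constructor option comes from the tip above; the feature detection itself is standard Web Audio):

// Sketch: disable the denoiser where AudioWorklet is unavailable (e.g. Safari <16.4 without a polyfill).
const hasWorklet = typeof AudioWorkletNode !== "undefined" &&
    typeof AudioContext !== "undefined" &&
    "audioWorklet" in AudioContext.prototype;

const spatial = new SpatialAudioManager({ denoiser: { enabled: hasWorklet } });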

Have questions or want to extend the SDK? Start with SpatialAudioManager – that’s where most of the “real-world” behavior (distance feel, stereo cues, denoiser) lives.