JSPM


TMLPD PI Extension - Parallel Multi-LLM Processing with Streaming, Caching, Cost Tracking, and Reliability Hardening

Package Exports

  • tmlpd-pi
  • tmlpd-pi/dist/index.js

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to open an issue on the original package (tmlpd-pi) asking for "exports" field support. If that is not possible, create a JSPM override to customize the exports field for this package.
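An exports field for this package might look like the following sketch in package.json, based on the detected entrypoints above; this is illustrative, not the package's actual configuration.

```json
{
  "name": "tmlpd-pi",
  "exports": {
    ".": "./dist/index.js"
  }
}
```

The same shape can be supplied as a JSPM override if the upstream package cannot be changed.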

Readme

TMLPD PI Extension

Parallel Multi-LLM Processing with Streaming, Caching, Cost Tracking, and Reliability


Overview

TMLPD PI is an npm package that brings parallel multi-LLM execution to the PI agent system. It provides:

  • Parallel Execution - Execute prompts across multiple providers simultaneously
  • Response Caching - SHA-256 hashed cache with TTL and LRU eviction
  • Real-time Cost Tracking - Spending by provider, model, daily/monthly breakdowns
  • Reliability Hardening - Retry with jitter, circuit breakers, smart cooldown
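The caching behavior described above (SHA-256 keys, TTL expiry, LRU eviction) can be sketched as follows. The class and method names here are illustrative, not tmlpd-pi's actual internals.

```typescript
import { createHash } from "node:crypto";

// Minimal sketch of a SHA-256-keyed response cache with TTL and LRU eviction.
class ResponseCache {
  private store = new Map<string, { value: string; expires: number }>();
  constructor(private maxEntries = 1000, private ttlSeconds = 3600) {}

  // Key on model + prompt so identical prompts to different models don't collide.
  private key(prompt: string, model: string): string {
    return createHash("sha256").update(`${model}:${prompt}`).digest("hex");
  }

  get(prompt: string, model: string): string | undefined {
    const k = this.key(prompt, model);
    const entry = this.store.get(k);
    if (!entry) return undefined;
    if (Date.now() > entry.expires) { // TTL expired: drop the stale entry
      this.store.delete(k);
      return undefined;
    }
    // Re-insert to mark as most recently used (Map preserves insertion order).
    this.store.delete(k);
    this.store.set(k, entry);
    return entry.value;
  }

  set(prompt: string, model: string, value: string): void {
    if (this.store.size >= this.maxEntries) {
      // Evict the least recently used entry (first key in insertion order).
      const oldest = this.store.keys().next().value;
      if (oldest !== undefined) this.store.delete(oldest);
    }
    this.store.set(this.key(prompt, model), {
      value,
      expires: Date.now() + this.ttlSeconds * 1000,
    });
  }
}
```

Using a `Map` this way gives O(1) LRU bookkeeping without a separate linked list, since JavaScript maps iterate in insertion order.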

Installation

npm install tmlpd-pi

Quick Start

import { createTMLPD } from "tmlpd-pi";

const tmlpd = createTMLPD({
  cache: { ttl_seconds: 3600 },
  budget: { daily_limit: 10.0 },
});

// Single execution with smart routing
const result = await tmlpd.execute("Explain quantum entanglement");

// Parallel execution across multiple models
const parallel = await tmlpd.executeParallel(prompt, [
  "openai/gpt-4o",
  "groq/llama-3.3-70b",
  "cerebras/llama-3.3-70b"
]);

// Check costs
const summary = tmlpd.getCostSummary();
console.log(`Total spent: $${summary.total_cost}`);

Features

Parallel Execution

Execute across multiple LLM providers simultaneously with automatic fallback:

const result = await tmlpd.executeParallel(
  "Compare Python and JavaScript",
  ["openai/gpt-4o", "groq/llama-3.3-70b", "cerebras/llama-3.3-70b"]
);
// Returns: { responses: [...], total_models: 3, successful_models: 3, total_cost: 0.002 }
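A fan-out of this shape can be sketched with Promise.allSettled, so one failed provider does not reject the whole batch. The `fanOut` and `callModel` names are hypothetical, not part of the tmlpd-pi API.

```typescript
// Sketch: run one prompt against several models concurrently and keep
// whichever responses succeed, mirroring the result shape shown above.
async function fanOut(
  prompt: string,
  models: string[],
  callModel: (model: string, prompt: string) => Promise<string>
) {
  const settled = await Promise.allSettled(models.map((m) => callModel(m, prompt)));
  const responses = settled
    .filter((s): s is PromiseFulfilledResult<string> => s.status === "fulfilled")
    .map((s) => s.value);
  return {
    responses,
    total_models: models.length,
    successful_models: responses.length,
  };
}
```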

Response Caching

Avoid redundant API calls with intelligent caching:

// Cache hit = instant response, $0 cost
const cached = await tmlpd.execute("Same prompt again");

// Check cache stats
const stats = tmlpd.getCacheStats();
// { hits: 5, misses: 3, size: 8, hit_rate: 0.625 }

Cost Tracking

Real-time spending monitoring with budget alerts:

const summary = tmlpd.getCostSummary();
// {
//   total_cost: 0.547,
//   by_provider: { openai: 0.30, groq: 0.247 },
//   by_model: { 'gpt-4o': 0.30, 'llama-3.3-70b': 0.247 },
//   daily_costs: { '2024-01-15': 0.547 },
//   request_count: 42,
//   average_cost_per_request: 0.013
// }

Reliability

Built-in retry logic and circuit breakers:

const tmlpd = createTMLPD({
  retry: {
    max_attempts: 5,
    base_delay_ms: 1000,
    max_delay_ms: 60000,
    jitter: 0.3,
  }
});
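The retry parameters above combine in the usual exponential-backoff-with-jitter pattern, which can be sketched like this. `retryWithJitter` is a hypothetical helper, not an exported tmlpd-pi function.

```typescript
// Sketch: exponential backoff capped at max_delay_ms, with random jitter
// of +/- the jitter fraction to avoid synchronized retry storms.
async function retryWithJitter<T>(
  fn: () => Promise<T>,
  opts = { max_attempts: 5, base_delay_ms: 1000, max_delay_ms: 60000, jitter: 0.3 }
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < opts.max_attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Exponential backoff: base * 2^attempt, capped at max_delay_ms.
      const base = Math.min(opts.base_delay_ms * 2 ** attempt, opts.max_delay_ms);
      // Scale by a random factor in [1 - jitter, 1 + jitter].
      const delay = base * (1 + (Math.random() * 2 - 1) * opts.jitter);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```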

Supported Providers

| Provider  | Models                           | Cost / 1M tokens |
|-----------|----------------------------------|------------------|
| OpenAI    | GPT-4o, GPT-4 Turbo, GPT-3.5     | $2.50-10.00      |
| Groq      | Llama 3.3 70B, Llama 3.1 8B      | $0.59-0.79       |
| Cerebras  | Llama 3.3 70B                    | $0.10            |
| Mistral   | Mistral Large, Mistral Small     | $0.20-6.00       |
| xAI       | Grok 2, Grok 2 Mini              | $0.20-8.00       |
| Anthropic | Claude 3.5, Claude 3             | $3.00-75.00      |
| Google    | Gemini 1.5 Pro, Gemini 1.5 Flash | $0.075-5.00      |
| ZAI       | GLM models                       | ~$0.10-0.30      |
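Turning a per-1M-token rate into a per-request cost is a simple proportion, sketched below. The rate map and `estimateCost` function are illustrative; the package tracks pricing internally.

```typescript
// Sketch: estimate request cost from per-1M-token rates like those above.
const COST_PER_1M_TOKENS: Record<string, number> = {
  "cerebras/llama-3.3-70b": 0.10,
  "groq/llama-3.3-70b": 0.59,
  "openai/gpt-4o": 2.50,
};

function estimateCost(model: string, tokens: number): number {
  const rate = COST_PER_1M_TOKENS[model];
  if (rate === undefined) throw new Error(`Unknown model: ${model}`);
  return (tokens / 1_000_000) * rate;
}
```

Note that real provider pricing usually distinguishes input and output tokens; a production tracker would keep separate rates for each.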

PI Integration

For PI agent integration, use the skill definition:

---
name: tmlpd
description: Parallel multi-LLM execution with streaming, caching, cost tracking, and reliability. Use when user wants parallel model execution, cost optimization across providers, or comparing multiple LLM responses simultaneously.
---

API Reference

createTMLPD(config)

Create a configured TMLPD instance.

interface TMLPDConfig {
  cache?: Partial<CacheConfig>;
  budget?: BudgetConfig;
  retry?: Partial<RetryConfig>;
  maxConcurrent?: number;
}

tmlpd.execute(prompt, model?, streaming?)

Execute a prompt with a single model, or with smart routing when no model is given.

tmlpd.executeParallel(prompt, models?, streaming?)

Execute across multiple models in parallel.

tmlpd.getCostSummary()

Get comprehensive cost breakdown.

tmlpd.getCacheStats()

Get cache hit rate and size.

tmlpd.getProviderStatus()

Get all provider statuses.

License

MIT