# ComposeCache
Adaptive compositional semantic caching for LLM APIs and RAG pipelines.
## Why ComposeCache?
Existing semantic caches such as GPTCache treat every query atomically. ComposeCache decomposes compositional queries (e.g., "Compare X and Y") into sub-queries, caches each independently, and enables partial hits: even when the full query has never been seen, cached sub-answers are reused and only the missing pieces go upstream, saving 50%+ on LLM API costs.
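The partial-hit idea can be sketched in a few lines. This is illustrative only: `decompose` and `partialHitCost` are toy helpers (the real decomposition is done by an LLM, not a regex), not part of the ComposeCache API.

```typescript
type SubQuery = string;

// Toy decomposer for "Compare X and Y"-style queries. Atomic queries
// pass through unchanged; compositional ones split into sub-queries.
function decompose(query: string): SubQuery[] {
  const m = query.match(/^Compare (.+) and (.+)$/);
  if (!m) return [query]; // atomic: no decomposition
  return [`Describe ${m[1]}`, `Describe ${m[2]}`];
}

// Given the set of already-cached sub-queries, report which sub-queries
// still need an upstream call and how much the cached ones save.
function partialHitCost(
  query: string,
  cached: Set<SubQuery>,
  costPerCall = 0.002 // hypothetical per-call cost in dollars
): { misses: SubQuery[]; saved: number } {
  const subs = decompose(query);
  const misses = subs.filter((s) => !cached.has(s));
  return { misses, saved: (subs.length - misses.length) * costPerCall };
}
```

If "Describe France" is already cached, "Compare France and Germany" only needs one upstream call for the Germany half; an atomic cache would have treated the whole query as a miss.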
## Quick Start
```sh
npm install composecache
npx composecache init --db postgres://localhost/myapp
```

```ts
import { ComposeCache } from 'composecache';

const cache = new ComposeCache({
  database: process.env.DATABASE_URL,
  openaiApiKey: process.env.OPENAI_API_KEY
});

const response = await cache.complete({
  model: 'gpt-3.5-turbo',
  messages: [{ role: 'user', content: 'Compare France and Germany' }],
  documents: retrievedDocs // Optional: for RAG
});

console.log(response.content);   // The answer
console.log(response.cacheType); // 'exact' | 'semantic' | 'partial' | 'miss'
console.log(response.costSaved); // $ saved
```

## Features
- Compositional query decomposition (novel)
- Document-aware cache keys via MinHash
- Uncertainty-gated population (blocks hallucinations)
- Drop-in SDK for Node.js and Python
- Works with your own PostgreSQL database
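Document-aware cache keys mean an entry depends not only on the query but also on the retrieved documents, so a cached answer is reused only when retrieval returns essentially the same context. A minimal MinHash sketch of that idea (illustrative only; function names and parameters here are not ComposeCache internals):

```typescript
// FNV-1a string hash, seeded per permutation, as a cheap stand-in for
// the hash family used in MinHash.
function fnv1a(s: string, seed: number): number {
  let h = (0x811c9dc5 ^ seed) >>> 0;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Character k-shingles of the concatenated documents.
function shingles(docs: string[], k = 4): Set<string> {
  const text = docs.join(' ').toLowerCase();
  const out = new Set<string>();
  for (let i = 0; i + k <= text.length; i++) out.add(text.slice(i, i + k));
  return out;
}

// MinHash signature: for each of n seeded hash functions, keep the
// minimum hash over all shingles.
function minhash(docs: string[], n = 32): number[] {
  const sh = shingles(docs);
  const sig = new Array(n).fill(0xffffffff);
  for (const s of sh) {
    for (let j = 0; j < n; j++) {
      const h = fnv1a(s, j);
      if (h < sig[j]) sig[j] = h;
    }
  }
  return sig;
}

// Fraction of matching signature slots estimates Jaccard similarity
// of the two document sets.
function similarity(a: number[], b: number[]): number {
  let m = 0;
  for (let i = 0; i < a.length; i++) if (a[i] === b[i]) m++;
  return m / a.length;
}
```

Identical document sets yield similarity 1.0 and unrelated ones a value near 0, so a cache can treat signatures above some threshold as "same context" without storing the documents themselves.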
## Architecture

### Query Processing Flow
```mermaid
flowchart TD
    Q[Incoming query q] --> C{Classify: atomic or compositional?}
    C -->|atomic| A["Compute SHA-256 key<br/>norm(q) || fD || theta"]
    C -->|compositional| D[Decompose into sub-queries<br/>s1 ... sk with deps E]
    A --> P[Probe cache for each query<br/>exact hash, then semantic + doc]
    D --> P
    P --> H{All hits?}
    H -->|yes| R[Return cached response<br/>or compose from subs]
    H -->|no / partial| G[Generate missing sub-answers<br/>via RAG + LLM API]
    R --> F[Compose final response]
    G --> F
    F --> U["Uncertainty gate: u <= umax?<br/>Write to cache if yes"]
```

### System Architecture

```mermaid
flowchart TD
    APP[Developer application<br/>Node.js / Python]
    subgraph SDK[ComposeCache middleware - SDK / npm package]
        direction LR
        S1[1. Decompose] --> S2[2. Probe] --> S3[3. Resolve] --> S4[4. Compose] --> S5[5. Populate]
    end
    subgraph MODS[Core modules]
        direction LR
        E[Embedder<br/>all-MiniLM-L6-v2]
        L[Decomposition LLM<br/>GPT-4o-mini]
        M[MinHash + uncertainty<br/>estimator]
    end
    DB[Developer PostgreSQL + pgvector<br/>exact keys + semantic vectors]
    API[Upstream LLM API<br/>OpenAI / Anthropic]
    APP --> SDK
    SDK --> MODS
    SDK -->|cache read / write| DB
    SDK -->|miss only| API
```

## API Reference
[TODO: full typing reference]
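Pending the full reference, here is a typing sketch inferred from the Quick Start example above. Field names come from that example; everything else (optionality, exact types, comments) is an assumption, not generated from the package.

```typescript
// Approximate shapes inferred from the Quick Start example; treat as
// a sketch, not the package's authoritative typings.
interface ComposeCacheOptions {
  database: string;     // PostgreSQL connection string (pgvector assumed)
  openaiApiKey: string; // used for embeddings and upstream completions
}

interface CompleteRequest {
  model: string;                                 // e.g. 'gpt-3.5-turbo'
  messages: { role: string; content: string }[]; // OpenAI-style chat messages
  documents?: string[];                          // optional RAG context
}

type CacheType = 'exact' | 'semantic' | 'partial' | 'miss';

interface CompleteResponse {
  content: string;      // the answer text
  cacheType: CacheType; // how the cache was (or wasn't) hit
  costSaved: number;    // dollars saved versus a full upstream call
}
```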
## Benchmarks
[TODO: HotpotQA results table]
## License
MIT