@cortex/sdk — TypeScript
The official TypeScript / JavaScript client for Cortex — InfiniteMonkeys' secure LLM gateway. Chat, vision, embeddings, speech, RAG, research — fully typed, streaming-ready, works in Node 18+ and modern browsers.
npm install @nfinitmonkeys/cortex-sdk
import { Cortex } from "@nfinitmonkeys/cortex-sdk";
const cortex = new Cortex({ apiKey: "sk-cortex-..." });
const r = await cortex.chat("Hello, world!");
console.log(r.text);
That's it. Read on for every feature.
Table of contents
- Setup
- Chat
- Streaming chat
- Response style presets
- JSON / structured output
- Embeddings
- Speech-to-text
- Text-to-speech
- Document extraction (Iris)
- OCR + form templates
- Deep Research
- RAG collections
- Error handling
- Checking service health
- Configuration
Setup
Install from npm:
npm install @nfinitmonkeys/cortex-sdk
import { Cortex } from "@nfinitmonkeys/cortex-sdk";
// Picks up CORTEX_API_KEY from the environment (Node)
const cortex = new Cortex();
// Or pass it:
const cortex2 = new Cortex({ apiKey: "sk-cortex-..." });
Works in Node 18+, Bun, Deno, and modern browsers (Edge, Chrome, Safari). In the browser you'll need a server-side proxy unless your key is a public one. Never ship a user-scoped key to the client.
Migrating from v1.x
v2 rewrote the client around a flatter, friendlier API. Your old import
keeps working — CortexClient is now an alias for Cortex — but the
method shape changed. One-time rewrite, no polyfills.
| v1 (resource-group style) | v2 (flat style) |
|---|---|
| client.chat.completions.create({ model: "default", messages: [...] }) | cortex.chat("Hi") |
| .create({ ..., stream: true }) → iterate raw SSE chunks | for await (const c of cortex.chatStream("Hi")) { ... } (yields just the content) |
| client.embeddings.create({ input, model }) | await cortex.embed(input) |
| client.audio.transcriptions.create({ file }) | await cortex.transcribe(audio) |
| client.audio.speech.create({ input, voice }) | await cortex.speak("text", { voice: "james" }) |
| client.iris.extract({ file }) | await cortex.extract(pdf) |
New in v2:
- Style presets ({ style: "concise" } / "markdown" / etc.)
- Typed errors: CortexAuthError, CortexRateLimitError, CortexUpstreamError, …
- ChatResponse.parseJson<T>(): strips markdown fences from LLM JSON
- Auto-retry on 429/5xx, honoring Retry-After
- RAG Collections (cortex.collections.ask(...))
- Deep Research (cortex.research.wait(...))
- Static Cortex.status(): no API key needed
- AbortSignal support on every method
No timeline to remove the CortexClient alias — it stays forever.
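The v2 list above mentions that ChatResponse.parseJson<T>() strips markdown fences from LLM JSON. The SDK's own implementation is not shown in this README. As a rough sketch of the idea, assuming the model wraps its JSON in a single fenced block with an optional language tag, a standalone helper could look like the following. The names stripJsonFences and parseFencedJson are hypothetical.

```typescript
// Hypothetical helpers illustrating fence stripping; not the SDK's internal code.
const FENCE = "\u0060\u0060\u0060"; // three backticks

/** Remove one leading and one trailing markdown code fence, if present. */
function stripJsonFences(raw: string): string {
  return raw
    .trim()
    .replace(/^`{3}[^\n]*\n?/, "") // opening fence plus optional language tag
    .replace(/\n?`{3}$/, "")       // closing fence
    .trim();
}

/** Parse model output as JSON after stripping fences. Throws SyntaxError on invalid JSON. */
function parseFencedJson<T>(raw: string): T {
  return JSON.parse(stripJsonFences(raw)) as T;
}

// Example: a reply wrapped in a "json" fence, as chat models often produce.
const reply = FENCE + "json\n" + '{"name":"John Doe","age":42}' + "\n" + FENCE;
const person = parseFencedJson<{ name: string; age: number }>(reply);
// person.age is 42
```

Input without fences passes through unchanged, so the helper is safe to apply to every reply.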
Chat
Simplest case:
const r = await cortex.chat("What is a vector database?");
console.log(r.text);
Multi-turn:
const r = await cortex.chat([
{ role: "system", content: "You are a concise assistant." },
{ role: "user", content: "Name three NoSQL databases." },
]);
Per-call pool routing:
const r = await cortex.chat(
"Extract names from: Alice, Bob, Carol",
{ pool: "cortex-extract" }
);
Streaming chat
for await (const chunk of cortex.chatStream("Write a limerick about otters")) {
process.stdout.write(chunk);
}
Works with AbortSignal to cancel mid-stream:
const ctrl = new AbortController();
setTimeout(() => ctrl.abort(), 5_000);
for await (const chunk of cortex.chatStream("Long story please", { signal: ctrl.signal })) {
process.stdout.write(chunk);
}
Response style presets
Pick a style instead of writing a system prompt every time:
await cortex.chat("Summarise RAG.", { style: "concise" });
await cortex.chat("Show a Python list", { style: "code-only" });
await cortex.chat("Compare Redis & Mongo",{ style: "markdown" });
await cortex.chat("Deploy nginx", { style: "technical" });
await cortex.chat("Pick a name", { style: "chat" });
Combine with your own system prompt:
await cortex.chat("Find the bug", {
style: "technical",
system: "You are a staff Python engineer reviewing a PR.",
});
JSON / structured output
Schema-enforced — no retries, no regex parsing:
const r = await cortex.chat("John Doe, 42, lives in Boston.", {
responseFormat: {
type: "json_schema",
json_schema: {
name: "person",
schema: {
type: "object",
properties: {
name: { type: "string" },
age: { type: "integer" },
city: { type: "string" },
},
required: ["name", "age", "city"],
},
},
},
});
const data = JSON.parse(r.text); // always valid
Embeddings
const v = await cortex.embed("cortex is a gateway"); // number[] (1024)
const vs = await cortex.embed(["a", "b", "c"]); // number[][]
Speech-to-text (transcription)
Pass a Blob, File, ArrayBuffer, or Uint8Array:
import { readFileSync } from "node:fs";
const audio = readFileSync("meeting.wav");
const t = await cortex.transcribe(audio, { filename: "meeting.wav" });
console.log(t.text);
With speaker diarization:
const t = await cortex.transcribe(audio, { filename: "meeting.wav", diarize: true });
In the browser with an <input type="file">:
const input = document.querySelector<HTMLInputElement>("#audio")!;
const file = input.files![0];
const t = await cortex.transcribe(file, { filename: file.name });
Text-to-speech
import { writeFileSync } from "node:fs";
const audio = await cortex.speak("Welcome to Cortex."); // ArrayBuffer
writeFileSync("hello.wav", new Uint8Array(audio));
Expressive (auto-inserts laughs, sighs, pauses):
const audio = await cortex.speak(
"Wow! That's amazing. I'm so glad you came.",
{ expressive: true }
);
Voice selection:
await cortex.speak("Hello", { voice: "james" });
Document extraction (Iris)
import { readFileSync } from "node:fs";
const pdf = readFileSync("invoice.pdf");
const inv = await cortex.extract(pdf, { filename: "invoice.pdf", type: "invoice" });
console.log(inv.result);
Custom schema:
const medical = await cortex.extract(pdf, {
filename: "discharge.pdf",
schema: {
patient_name: "string",
diagnosis_codes: "string[]",
discharge_date: "date",
},
});
Submit a correction (training signal):
await cortex.correctExtraction(inv.id, [
{ field_name: "total", original_value: "127.43", corrected_value: "1274.30" },
]);
OCR + form templates
Raw OCR:
const ocr = await cortex.ocr(imgBuffer, { filename: "scan.png" });
console.log(ocr.text);
Structured fields via a template (~200 ms):
const fields = await cortex.ocr(claimPdf, { filename: "claim.pdf", template: "cms1500" });
console.log(fields.fields?.patient_name);
Built-in templates: cms1500, ub04, superbill, eob.
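Because the built-in template names form a fixed list, a user-supplied value can be narrowed before it is passed as the template option. The sketch below is illustrative: OcrTemplate and isOcrTemplate are hypothetical names, and the SDK may already export an equivalent type.

```typescript
// The four names come from the list above; the type and guard are illustrative.
const OCR_TEMPLATES = ["cms1500", "ub04", "superbill", "eob"] as const;
type OcrTemplate = (typeof OCR_TEMPLATES)[number];

/** Type guard: true only for one of the built-in template names. */
function isOcrTemplate(value: string): value is OcrTemplate {
  return (OCR_TEMPLATES as readonly string[]).indexOf(value) !== -1;
}

// Example: a value arriving from a form field or query string.
const requested: string = "ub04";
const template: OcrTemplate | undefined = isOcrTemplate(requested) ? requested : undefined;
```

An undefined result can then fall back to raw OCR without a template.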
Deep Research
const job = await cortex.research.submit(
"What do you know about Acme Medical Group?",
{ type: "company_enrichment", depth: "quick" }
);
// Block until done (polls automatically)
const result = await cortex.research.wait(job.job_id, { timeoutMs: 600_000 });
console.log(result.result);
RAG collections
await cortex.collections.create("company-kb");
await cortex.collections.upload("company-kb", handbookBytes, { filename: "handbook.pdf" });
const a = await cortex.collections.ask("company-kb", "What's our PTO policy?");
console.log(a.answer);
for (const s of a.sources) {
console.log(` · ${s.filename} (${Math.round(Number(s.score) * 100)}% match)`);
}
Just search:
const hits = await cortex.collections.search("company-kb", "parental leave", { topK: 3 });
Error handling
Every error extends CortexError:
import { Cortex, CortexAuthError, CortexRateLimitError, CortexError } from "@nfinitmonkeys/cortex-sdk";
try {
await cortex.chat("hello");
} catch (err) {
if (err instanceof CortexAuthError) { /* key invalid or revoked */ }
else if (err instanceof CortexRateLimitError) { /* back off */ }
else if (err instanceof CortexError) {
console.error(`Cortex failed: ${err.statusCode} ${err.detail}`);
} else {
throw err;
}
}
429 / 502 / 503 / 504 are retried automatically with exponential backoff (default 2 retries); you only see them as errors when retries are exhausted.
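The exact retry schedule is not documented in this README. To make the behaviour concrete, here is a minimal sketch of exponential backoff that lets a numeric Retry-After header take priority. The base delay, the cap, and the function names are assumptions for illustration, not values taken from the SDK.

```typescript
// Illustrative only: baseMs, capMs, and the function names are assumptions.
const RETRYABLE_STATUSES = [429, 502, 503, 504]; // the statuses listed above

function isRetryable(status: number): boolean {
  return RETRYABLE_STATUSES.indexOf(status) !== -1;
}

/**
 * Delay before retry number `attempt` (0-based).
 * A numeric Retry-After value (in seconds) wins over the computed backoff.
 * HTTP-date values are not parsed here and fall through to the backoff.
 */
function retryDelayMs(
  attempt: number,
  retryAfterHeader: string | null,
  baseMs: number = 500,
  capMs: number = 30000
): number {
  if (retryAfterHeader !== null && retryAfterHeader.trim() !== "") {
    const seconds = Number(retryAfterHeader);
    if (isFinite(seconds) && seconds >= 0) {
      return seconds * 1000;
    }
  }
  // Exponential backoff: base, 2x base, 4x base, ... limited by the cap.
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}
```

With these assumed defaults, the two default retries would wait 500 ms and then 1000 ms unless the server asks for something else. Production clients usually add random jitter as well, which is omitted here for clarity.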
Checking service health
No API key needed:
import { Cortex } from "@nfinitmonkeys/cortex-sdk";
const status = await Cortex.status();
console.log(`Cortex is ${status.overall}`);
for (const pool of status.pools) {
console.log(` ${pool.pool}: ${pool.status}`);
}
Or just open status.nfinitmonkeys.com in a browser.
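For monitoring scripts it can be handy to reduce the status object to the pools that need attention. The sketch below mirrors the field names used in the snippet above (overall, pools, pool, status). The healthy value "operational", the pool name "cortex-chat", and the sample data are assumptions for illustration, not real output.

```typescript
// Field names mirror the snippet above; the "operational" value is an assumption.
interface PoolStatus {
  pool: string;
  status: string;
}
interface ServiceStatus {
  overall: string;
  pools: PoolStatus[];
}

/** Names of all pools whose status differs from the healthy value. */
function degradedPools(status: ServiceStatus, healthyValue: string = "operational"): string[] {
  return status.pools
    .filter((p) => p.status !== healthyValue)
    .map((p) => p.pool);
}

// Made-up sample data, for illustration only.
const sample: ServiceStatus = {
  overall: "degraded",
  pools: [
    { pool: "cortex-extract", status: "operational" },
    { pool: "cortex-chat", status: "degraded" },
  ],
};
const failing = degradedPools(sample); // ["cortex-chat"]
```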
Configuration
| Option | Default | What |
|---|---|---|
| apiKey | process.env.CORTEX_API_KEY | Your API key |
| baseUrl | https://cortexapi.nfinitmonkeys.com | Override for self-hosted |
| timeoutMs | 120_000 | Default request timeout |
| retries | 2 | Retries on 429/5xx |
| defaultPool | undefined | Default X-Cortex-Pool header |
| fetch | globalThis.fetch | Custom fetch (proxy/testing) |
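The fetch option is the usual hook for tests and request logging. The following is a small sketch of a wrapper that records each call before delegating. FetchLike, withRequestLog, fakeFetch, and the /example path are all hypothetical, and the narrow FetchLike type exists only so the snippet compiles without DOM or Node typings.

```typescript
// Minimal structural type so the sketch compiles on its own.
// A real wrapper would be typed as the platform's `typeof fetch`.
type FetchLike = (input: string, init?: { method?: string }) => Promise<unknown>;

/** Wrap a fetch-like function and record "METHOD url" for every call. */
function withRequestLog(inner: FetchLike, log: string[]): FetchLike {
  return (input, init) => {
    const method = init && init.method ? init.method : "GET";
    log.push(method + " " + input); // recorded synchronously, before the request starts
    return inner(input, init);
  };
}

// A stand-in for the network, useful in unit tests.
const fakeFetch: FetchLike = () => Promise.resolve({ ok: true });

const seen: string[] = [];
const loggedFetch = withRequestLog(fakeFetch, seen);
void loggedFetch("https://cortexapi.nfinitmonkeys.com/example", { method: "POST" });
```

In an application, a wrapper of this kind would presumably be passed as the fetch option when constructing the client.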
const cortex = new Cortex({
timeoutMs: 300_000,
retries: 5,
defaultPool: "cortex-extract",
});
Bundle size
- ESM: ~19 KB (minified, unzipped)
- CJS: ~21 KB
- Zero runtime dependencies: relies on native fetch, FormData, Blob, AbortController
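Since the package leans on these native globals, an older or unusual runtime can be checked up front. This is a hedged sketch rather than anything the SDK ships: missingGlobals is a hypothetical helper that reports which of the four names are undefined.

```typescript
// The four names come from the list above; the helper itself is illustrative.
const REQUIRED_GLOBALS = ["fetch", "FormData", "Blob", "AbortController"];

/** Return the names from `required` that are undefined on the given scope object. */
function missingGlobals(
  required: string[] = REQUIRED_GLOBALS,
  scope: Record<string, unknown> = globalThis as unknown as Record<string, unknown>
): string[] {
  return required.filter((name) => typeof scope[name] === "undefined");
}

// Check the current runtime and build a human-readable report.
const missing = missingGlobals();
const report =
  missing.length === 0
    ? "all required globals present"
    : "missing: " + missing.join(", ");
```

The scope parameter makes the check easy to unit-test with a plain object instead of the real global scope.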
Support
- Docs: this README plus JSDoc on every method
- Status: status.nfinitmonkeys.com
- Issues: file on GitHub
Happy building 🦧