JSPM

  • ESM via JSPM
  • ES Module Entrypoint
  • Export Map
  • Keywords
  • License
  • Repository URL
  • TypeScript Types
  • README
  • Created
  • Published
  • Downloads 27
  • Score
    100M100P100Q79020F
  • License GPL-3.0-only

Agentic orchestration and semantic enforcement for structured LLM output, with native JSON repair and deterministic retries.

Package Exports

  • reforge-ai
  • reforge-ai/anthropic
  • reforge-ai/google
  • reforge-ai/groq
  • reforge-ai/openai-compatible
  • reforge-ai/openrouter
  • reforge-ai/together

Readme

Reforge

Agentic orchestration and semantic enforcement for structured LLM output.

npm version License: GPL v3 TypeScript


The Problem

LLMs are probabilistic and frequently output malformed JSON:

  • Markdown wrappers```json ... ``` blocks around the data
  • Trailing commas{"name": "Alice",}
  • Unquoted keys{name: "Alice"}
  • Single-quoted strings{'name': 'Alice'}
  • Truncated outputs{"items": [1, 2, 3 (hit max_tokens)
  • Escaped-quote anomalies{\"key\": \"value\"}

Network retries to providers (OpenAI, Anthropic, etc.) cost 5000ms+ and real money. Most of these failures are trivially fixable.

The Solution

reforge-ai is a zero-dependency TypeScript library that sits between the LLM output and your application:

  • Natively repairs syntactic JSON errors in sub-5ms local timings
  • Validates against your Zod schema with automatic type coercion
  • Optionally clamps semantic violations locally (too_small, too_big, enum drift) before network retries
  • Generates token-efficient retry prompts when repair isn't enough
  • Orchestrates tools, failovers, and retries with deterministic guardrails
  • Works everywhere: Node.js, Bun, Deno, Cloudflare Workers, Vercel Edge, Browsers

New Capability Highlights

  • Line-aware retry prompts: only relevant error lines are included (including multi-line contexts), reducing noisy retries.
  • Prompt customization: plug in your own retry prompt strategy with retryPromptStrategy.
  • Profiles + toggles: choose safe, standard, or aggressive guard profiles and override individual heuristics.
  • Built-in redaction: redact sensitive paths/patterns from retry contexts.
  • Debug artifacts: inspect extracted/repaired JSON and applied repair passes when needed.
  • Advanced forge orchestration: retry policies, deterministic tool loops, structured lifecycle events, and forgeWithFallback() provider failover.
  • Universal message schema: multi-modal content blocks + normalized tool history across OpenAI-compatible, Anthropic, and Gemini adapters.
  • Stream-safe output hooks: use onChunk for UI output while suppressing internal tool JSON chatter.

Timing Snapshot (Measured)

From timing-summary-2026-03-13.md:

  • guard() samples: 11
  • Min: 0.1442ms
  • Max: 2.5365ms
  • Average: 0.5536ms
  • Under 5ms: 11/11

Installation

npm install reforge-ai zod

To use provider adapters with forge(), also install the provider SDK:

# OpenAI / OpenRouter / Groq / Together / Ollama / etc.
npm install reforge-ai zod openai

# Anthropic
npm install reforge-ai zod @anthropic-ai/sdk

# Google Gemini
npm install reforge-ai zod @google/generative-ai

zod is a required peer dependency. Provider SDKs are optional peer dependencies — only install what you use.

Quick Start

import { z } from "zod";
import { guard } from "reforge-ai";

const UserSchema = z.object({
  name: z.string(),
  age: z.number(),
});

// Raw LLM output — markdown-wrapped with a trailing comma:
const raw = '```json\n{"name": "Alice", "age": 30,}\n```';

const result = guard(raw, UserSchema);

if (result.success) {
  console.log(result.data);       // { name: "Alice", age: 30 }
  console.log(result.isRepaired); // true
  console.log(result.telemetry);  // { durationMs: 0.55, status: "repaired_natively" }
} else {
  // Append result.retryPrompt to your LLM message array
  console.log(result.retryPrompt);
  console.log(result.errors);     // ZodIssue[]
}

guard() with Line-Aware Retry Context

const result = guard(raw, UserSchema, {
  profile: "standard",
  retryPrompt: {
    mode: "line-aware",
    contextRadius: 1,
    maxContextChars: 700,
    redactPaths: ["/user/ssn"],
  },
  debug: true,
});

if (!result.success) {
  console.log(result.retryPrompt); // includes only relevant lines in line-aware mode
  console.log(result.debug?.retryContextBlocks);
}

End-to-End with forge()

forge() wraps the entire flow: call your LLM → repair → validate → auto-retry.

import { z } from "zod";
import { forge } from "reforge-ai";
import { openaiCompatible } from "reforge-ai/openai-compatible";
import OpenAI from "openai";

const provider = openaiCompatible(new OpenAI(), "gpt-4o");

const Colors = z.array(
  z.object({
    name: z.string(),
    hex: z.string(),
  })
);

const result = await forge(
  provider,
  [{ role: "user", content: "List 3 colors with hex codes." }],
  Colors
);

if (result.success) {
  console.log(result.data);
  // → [{ name: "Red", hex: "#FF0000" }, ...]
  console.log(result.telemetry);
  // → { durationMs: 0.55, status: "repaired_natively", attempts: 1, totalDurationMs: 6132 }
}

Semantic Clamp (Local, No Network Retry)

import { z } from "zod";
import { guard } from "reforge-ai";

const Schema = z.object({
  age: z.number().min(0).max(100),
  tier: z.enum(["free", "pro", "enterprise"]),
});

const raw = JSON.stringify({ age: 154, tier: "vip" });

const result = guard(raw, Schema, {
  semanticResolution: { mode: "clamp" },
});

if (result.success) {
  console.log(result.data);     // { age: 100, tier: "free" }
  console.log(result.telemetry); // status: "coerced_locally", coercedPaths: ["/age", "/tier"]
}

Tool Loops + Stream Output

import { z } from "zod";
import { forge } from "reforge-ai";

const result = await forge(provider, messages, schema, {
  tools: {
    lookupCustomer: {
      description: "Lookup customer by id",
      schema: z.object({ id: z.string() }),
      execute: async ({ id }) => ({ id, plan: "enterprise" }),
    },
  },
  toolTimeoutMs: 4000,
  maxAgentIterations: 5,
  onChunk: (text) => uiStream.append(text),
});

forge() Advanced Controls

const result = await forge(provider, messages, Colors, {
  retryPolicy: {
    maxRetries: 4,
    shouldRetry: (failure, attempt) => attempt < 3 && failure.errors.length > 0,
    mutateProviderOptions: (attempt, base) => ({
      ...base,
      temperature: attempt === 1 ? 0.6 : 0.2,
    }),
  },
  onEvent: (event) => {
    if (event.kind === "retry_scheduled") {
      console.log(`Retrying: ${event.attempt} -> ${event.nextAttempt}`);
    }
  },
  guardOptions: {
    retryPrompt: { mode: "line-aware" },
  },
});

Provider Fallback Chain

import { forgeWithFallback } from "reforge-ai";

const result = await forgeWithFallback(
  [
    { provider: openaiCompatible(openaiClient, "gpt-4o"), maxAttempts: 2 },
    { provider: anthropic(anthropicClient, "claude-sonnet-4-20250514"), maxAttempts: 1 },
  ],
  messages,
  schema,
  {
    onProviderFallback: (from, to) => {
      console.log(`Fallback provider: ${from} -> ${to}`);
    },
  },
);

Provider Adapters

Adapter Import Covers
openaiCompatible() reforge-ai/openai-compatible OpenAI, OpenRouter, Groq, Together, Fireworks, Ollama, LM Studio, vLLM
anthropic() reforge-ai/anthropic Anthropic Claude
google() reforge-ai/google Google Gemini, Vertex AI
// OpenRouter — same adapter, different baseURL
import { openaiCompatible } from "reforge-ai/openai-compatible";
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});
const provider = openaiCompatible(client, "anthropic/claude-sonnet-4-20250514");
// Anthropic
import { anthropic } from "reforge-ai/anthropic";
import Anthropic from "@anthropic-ai/sdk";

const provider = anthropic(new Anthropic(), "claude-sonnet-4-20250514");
// Google Gemini
import { google } from "reforge-ai/google";
import { GoogleGenerativeAI } from "@google/generative-ai";

const provider = google(
  new GoogleGenerativeAI(process.env.GOOGLE_API_KEY!),
  "gemini-2.0-flash"
);
// Custom provider — implement a single method
import { forge, type ReforgeProvider } from "reforge-ai";

const myProvider: ReforgeProvider = {
  async call(messages, options) {
    const res = await fetch("https://my-llm-api.com/chat", {
      method: "POST",
      body: JSON.stringify({ messages, ...options }),
    });
    const data = await res.json();
    return data.text;
  },
};

How It Works

guard() runs a deterministic three-stage pipeline:

1. Dirty Parse (Native Repair)

The parser runs a sequence of heuristic passes to fix common LLM output issues:

Issue Before After
Markdown fences ```json\n{"a":1}\n``` {"a":1}
Conversational wrapping Here's the data: {"a":1} {"a":1}
Trailing commas {"a": 1,} {"a": 1}
Unquoted keys {name: "Alice"} {"name": "Alice"}
Single quotes {'key': 'val'} {"key": "val"}
Escaped quotes {\"key\": \"val\"} {"key": "val"}
Truncated output {"items": [1, 2 {"items": [1, 2]}

2. Schema Validation (with Coercion)

After parsing, the data is validated against your Zod schema. Reforge also attempts automatic coercion for common LLM type mismatches:

LLM Output Schema Expects Coerced To
"true" / "false" boolean true / false
"42", "3.14" number 42, 3.14
"null" nullable null

3. Retry Prompt Generation

If validation fails, Reforge generates a token-efficient prompt you can append to your LLM conversation:

Your previous response failed schema validation. Errors: [Path: /user/age, Expected: number, Received: string]. The schema is still in your context — return ONLY corrected valid JSON.

When the LLM returns something that can't be parsed as JSON at all, the raw text is echoed back if it's short enough to be the full picture (\u2264300 chars). For longer outputs, the snippet is omitted — a truncated fragment of the beginning shows nothing useful:

// Short output — full text echoed:
Your previous response could not be parsed as JSON. Got: `{name: Alice age: 30}`. The schema is still in your context — return ONLY valid JSON.

// Long output — snippet omitted:
Your previous response could not be parsed as JSON. The schema is still in your context — return ONLY valid JSON.

No network requests. No retries. Just a string you feed back.

API Reference

guard<T>(llmOutput: string, schema: T, options?: GuardOptions): GuardResult<z.infer<T>>

The main entry-point. Parses, repairs, validates, and returns a typed result.

Parameters:

Name Type Description
llmOutput string The raw string produced by an LLM
schema ZodTypeAny The Zod schema the output must conform to
options GuardOptions Optional profile/toggle config, line-aware retry mode, redaction, custom prompt strategy, and debug artifacts

Returns: GuardResult<T> — a discriminated union:

// Success
{
  success: true;
  data: T;                    // Validated & typed data
  telemetry: TelemetryData;   // { durationMs, status }
  isRepaired: boolean;        // true if the Dirty Parser fixed the input
}

// Failure
{
  success: false;
  retryPrompt: string;        // Token-efficient correction prompt
  errors: ZodIssue[];         // Zod validation issues
  telemetry: TelemetryData;   // { durationMs, status: "failed" }
}

Types

type TelemetryData = {
  durationMs: number;
  status: "clean" | "repaired_natively" | "coerced_locally" | "failed";
  coercedPaths?: string[];
};

forge<T>(provider, messages, schema, options?): Promise<ForgeResult<z.infer<T>>>

End-to-end structured output: call LLM → guard() → auto-retry.

Parameters:

Name Type Description
provider ReforgeProvider An adapter wrapping your LLM SDK
messages Message[] Conversation messages to send
schema ZodTypeAny The Zod schema the output must conform to
options ForgeOptions Optional: maxRetries, retryPolicy, providerOptions, guardOptions, onRetry, onEvent

onRetry is called after each failed attempt that will be retried:

onRetry?: (attempt, failure) => {
  // attempt is 1-based
  // failure.errors are the zod issues from the failed guard() call
  // failure.retryPrompt is the corrective prompt used for the next attempt
}

Returns: Promise<ForgeResult<T>>:

// Success
{
  success: true;
  data: T;
  telemetry: ForgeTelemetry;
  isRepaired: boolean;
}

// Failure
{
  success: false;
  errors: ZodIssue[];
  retryPrompt: string;
  telemetry: ForgeTelemetry;
}

// ForgeTelemetry extends TelemetryData
interface ForgeTelemetry extends TelemetryData {
  attempts: number;        // Total LLM calls made
  totalDurationMs: number; // Wall-clock time for entire forge() call
  networkDurationMs: number;
  toolExecutionDurationMs: number;
  providerHops: Array<{
    providerId: string;
    attempt: number;
    succeeded: boolean;
    durationMs: number;
  }>;
  attemptDetails: Array<{
    attempt: number;
    durationMs: number;
    status: "clean" | "repaired_natively" | "coerced_locally" | "failed";
  }>;
}

Message Model (v0.3+)

type Message = {
  role: "system" | "user" | "assistant" | "tool";
  content:
    | string
    | Array<
        | { type: "text"; text: string }
        | { type: "image_url"; image_url: { url: string; detail?: "auto" | "low" | "high" } }
      >;
  toolCalls?: Array<{ id: string; name: string; arguments: string }>;
  toolResponse?: {
    toolCallId: string;
    name: string;
    content: string | Array<{ type: "text"; text: string }>;
    isError?: boolean;
  };
};

Examples

OpenAI with forge()

import { z } from "zod";
import { forge } from "reforge-ai";
import { openaiCompatible } from "reforge-ai/openai-compatible";
import OpenAI from "openai";

const provider = openaiCompatible(new OpenAI(), "gpt-4o");

const RecipeSchema = z.object({
  title: z.string(),
  ingredients: z.array(z.string()),
  steps: z.array(z.string()),
});

const result = await forge(
  provider,
  [
    { role: "system", content: "Return JSON only." },
    { role: "user", content: "Give me a recipe for chocolate cake." },
  ],
  RecipeSchema,
  { maxRetries: 3, providerOptions: { temperature: 0.2 } }
);

if (result.success) {
  console.log(`Resolved in ${result.telemetry.attempts} attempt(s)`);
  console.log(result.data);
}

Anthropic with forge()

import { z } from "zod";
import { forge } from "reforge-ai";
import { anthropic } from "reforge-ai/anthropic";
import Anthropic from "@anthropic-ai/sdk";

const provider = anthropic(new Anthropic(), "claude-sonnet-4-20250514");

const SummarySchema = z.object({
  title: z.string(),
  summary: z.string(),
  tags: z.array(z.string()),
});

const result = await forge(
  provider,
  [{ role: "user", content: "Summarize: TypeScript 5.7 adds ..." }],
  SummarySchema
);

guard() Only (Manual Retry)

import OpenAI from "openai";
import { z } from "zod";
import { guard } from "reforge-ai";

const client = new OpenAI();

const ProductSchema = z.object({
  name: z.string(),
  price: z.number(),
  tags: z.array(z.string()),
});

async function getProduct(prompt: string) {
  const messages: OpenAI.ChatCompletionMessageParam[] = [
    { role: "user", content: prompt },
  ];

  for (let attempt = 0; attempt < 3; attempt++) {
    const response = await client.chat.completions.create({
      model: "gpt-4o",
      messages,
    });

    const raw = response.choices[0]?.message?.content ?? "";
    const result = guard(raw, ProductSchema);

    if (result.success) return result.data;

    messages.push({ role: "assistant", content: raw });
    messages.push({ role: "user", content: result.retryPrompt });
  }

  throw new Error("Failed after 3 attempts");
}

Edge Runtime (Next.js API Route)

// app/api/parse/route.ts
import { z } from "zod";
import { guard } from "reforge-ai";

export const runtime = "edge";

const PayloadSchema = z.object({
  action: z.string(),
  data: z.record(z.unknown()),
});

export async function POST(request: Request) {
  const body = await request.text();
  const result = guard(body, PayloadSchema);

  if (result.success) {
    return Response.json({ ok: true, data: result.data });
  }

  return Response.json(
    { ok: false, errors: result.errors },
    { status: 422 },
  );
}

Performance

Reforge is designed for < 5ms end-to-end on a 2KB input. The entire pipeline is:

  • Synchronous — no async, no network, no I/O
  • Pure — no global state mutation
  • O(n) — linear time relative to input length
  • Never throws — all error paths return typed result objects

Guarantees

  • Zero dependencies in corezod is a required peer dependency
  • Environment agnostic — no Node-specific APIs (fs, path, Buffer)
  • Tree-shakeable — ESM + CJS dual output via tsup
  • Strict TypeScript — full type safety with discriminated union results

Environment Compatibility

Runtime Status Notes
Node.js 16+ ✅ Supported CJS + ESM
Bun ✅ Supported Native ESM
Deno ✅ Supported Via npm: specifier
Cloudflare Workers ✅ Supported No Node APIs
Vercel Edge ✅ Supported Edge-compatible
Browser ✅ Supported ESM, tree-shakeable

Documentation

Full documentation, interactive demo, and integration guides are available at reforge-ai-97558.web.app.

Contributing

We welcome contributions! Please see CONTRIBUTING.md for:

  • Setup instructions
  • Code standards
  • PR workflow
  • Test guidelines

Reporting Issues

  • Bug reports: Open an issue with the raw input, schema, and unexpected result.
  • Feature requests: Open an issue with the use case and proposed API.

Changelog

See CHANGELOG.md for a detailed history of changes.

License

GNU GPL v3