LOON (LLM-Optimized Object Notation) — Token-efficient serialization for LLM pipelines. JSON/CSV/XML/YAML/trees → LOON with up to ~78% token reduction, lossless round-trip.

Package Exports

  • loon-core
  • loon-core/package.json

LOON — LLM-Optimized Object Notation

A compact wire format for data pipelines and LLM prompts.

LOON rewrites structured data (JSON / CSV / XML / YAML / trees) into a dense, fully reversible encoding that shrinks token counts without losing a single byte on the way back. Most formats lock you into one trade-off; LOON instead exposes three distinct modes — one built for service-to-service transport, one written to be read straight by a model, and one sized for tiny or uneven objects — so a job never carries compression overhead it has no use for.

The mental model is a boundary codec: hold JSON in your application, and switch to LOON only at the edge where every token shows up on the bill.

[!TIP] Across real retrieval runs (Gemini 3 Flash, o3-mini), LOON llm is the most token-efficient format at 100% accuracy — fewer tokens than TOON, JTON, JSON, JSON-compact, YAML and XML — and round-trips every benchmark dataset losslessly. See Benchmarks.

Why LOON?

Context windows keep getting bigger, yet every token is still metered — and JSON spends them freely. Each brace, each quote, each field name repeated row after row lands on the invoice for every call. A uniform array of records is charged for its column labels once per element:

[
  { "id": 1, "name": "Alice", "dept": "Engineering", "salary": 120000, "active": true },
  { "id": 2, "name": "Bob",   "dept": "Sales",       "salary": 98000,  "active": false }
]

LOON llm mode states the schema a single time and then emits bare rows — no braces, no echoed keys, no quotes around values that aren't ambiguous — and a model can follow it cold, with no primer attached:

@T1[2]{id,name,dept,salary,active}
1,Alice,Engineering,120000,true
2,Bob,Sales,98000,false

When the target is storage or a service-to-service hop, full mode pushes harder (Base36 integers, sequences, dictionaries) and the decoder rebuilds the original exactly, byte for byte.
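The llm-mode idea above (schema stated once, bare rows after it) can be illustrated with a few lines of TypeScript. This is a toy sketch, not the loon-core encoder: the helper names `encodeCell` and `toLlmTable` are hypothetical, and it assumes uniform rows whose values contain no commas or newlines.

```typescript
// Toy illustration only: NOT the loon-core implementation.
type Scalar = string | number | boolean | null;
type Row = Record<string, Scalar>;

// Quote a string only when it could be mistaken for a number, boolean, or null.
function encodeCell(value: Scalar): string {
  if (typeof value !== "string") return String(value);
  const ambiguous =
    value === "true" || value === "false" || value === "null" ||
    (value.trim() !== "" && !Number.isNaN(Number(value)));
  return ambiguous ? `"${value}"` : value;
}

// Assumption: every row has the same keys, and no value needs escaping.
function toLlmTable(rows: Row[], tableId = "T1"): string {
  const first = rows[0];
  if (first === undefined) throw new Error("need at least one row");
  const columns = Object.keys(first);
  const header = `@${tableId}[${rows.length}]{${columns.join(",")}}`;
  const lines = rows.map((row) =>
    columns.map((c) => encodeCell(row[c] ?? null)).join(",")
  );
  return [header, ...lines].join("\n");
}

const staff: Row[] = [
  { id: 1, name: "Alice", dept: "Engineering", salary: 120000, active: true },
  { id: 2, name: "Bob", dept: "Sales", salary: 98000, active: false },
];

console.log(toLlmTable(staff));
// @T1[2]{id,name,dept,salary,active}
// 1,Alice,Engineering,120000,true
// 2,Bob,Sales,98000,false
```

The column names appear once in the header, so the per-row cost is only the values and their separators.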

Key Features

  • Measured, Not Promised: in one fully billed retrieval run, LOON llm dropped total tokens 64% against JSON and 47% against TOON while answering at 100% accuracy (see Benchmarks).
  • One Mode Per Job: full covers transport and storage, llm is written for model reading, compact handles small or uneven data — nothing tries to be everything at once.
  • Reversible By Construction: the decoder is deterministic and reproduces the source JSON exactly; every mode round-trips the benchmark datasets 10/10.
  • Zero Setup For Models (llm): decimal numbers as-is, literal true/false/null, schema stated once. The model simply reads it — there is nothing for it to unpack.
  • Format Bridges Built In: a single API moves between JSON, CSV, XML, YAML, and nested trees in both directions.
  • Designed For Prompt Caches: LoonSession peels a reusable schema/spec prefix off the per-call rows, so repeat queries are billed mostly as cache hits.
  • Ships With A CLI: npx loon-core data.json converts, round-trips, and reports token savings under --stats.

The Three Modes

| Mode | What it's for | Reads like |
| --- | --- | --- |
| full | Bulk transmission / .loon storage. System → system, decoder → decoder. Maximum compression; LLM-readability is not a design constraint. | Dense protocol output |
| llm | Direct LLM consumption. A model (cloud or local, reasoning or not) reads the payload with no spec or primer. | Labeled CSV |
| compact | Small / non-uniform datasets. Single deep objects, sparse rows, anything where a column schema can't amortize itself. | key: value blocks |

Plus:

  • compat — JSON-hybrid ({S,T,R} arrays). A 100%-valid-JSON escape hatch for environments that must parse with JSON.parse.
  • local — deprecated alias of llm (kept for back-compat).

Design rule. full is allowed every compression trick because its reader is a decoder. llm rejects every trick that would force the model to compute (Base36, sequences, dictionaries, suffix reattachment) and keeps only the tricks that de-duplicate structure — schema declared once, object arrays tabulated, nested objects grouped. The dividing line is: does this require execution to read?

When Not to Use LOON

  • Deep, irregular object trees: with no uniform arrays to fold into a table, compact mode still helps but the margin narrows — minified JSON can hold its own here.
  • Flat tables read by code, not a model: plain CSV edges it out on size. LOON's extra structure (schema, length markers, type guards) earns its keep with model reliability, not with a CSV reader.
  • You must call JSON.parse directly on the wire: reach for compat mode (which stays valid JSON) and take the smaller token cut that comes with it.
  • Latency-bound local inference: benchmark before you commit. Fewer tokens don't always translate to faster decode on quantized or on-device models.

Benchmarks

Three experiments: retrieval accuracy + token efficiency (real API calls), multi-tokenizer consistency, and round-trip fidelity. Datasets: synthetic (@faker-js/faker, seed 12345 — tabular, nested, analytics, event-logs, nested-config) plus a real one (top-100 GitHub repositories). Token efficiency = (accuracy% ÷ tokens) × 1000 (higher is better).

Retrieval accuracy & token efficiency

Each dataset is serialized in every format, embedded in an identical prompt, and queried with the same deterministic questions; answers are validated against a type-aware ground truth. Question sets are reduced (10–12 Q) to bound API cost.

gemini-3-flash-preview — 10 deterministic questions

| Format | Efficiency | Accuracy | Tokens |
| --- | --- | --- | --- |
| LOON llm | 28.4 | 100% (10/10) | 3,526 |
| LOON local | 28.1 | 100% (10/10) | 3,565 |
| TOON | 27.2 | 100% (10/10) | 3,672 |
| JTON | 24.1 | 100% (10/10) | 4,155 |
| JSON compact | 20.4 | 100% (10/10) | 4,906 |
| LOON full | 18.5 | 100% (10/10) | 5,395 |
| LOON compact | 18.5 | 100% (10/10) | 5,400 |
| YAML | 16.7 | 100% (10/10) | 6,001 |
| JSON | 12.3 | 100% (10/10) | 8,162 |
| XML | 10.7 | 100% (10/10) | 9,358 |

o3-mini — 12 deterministic questions (DRY_RUN + FAST_FORMAT)

| Format | Efficiency | Accuracy | Tokens |
| --- | --- | --- | --- |
| LOON llm | 25.7 | 100% (12/12) | 3,892 |
| LOON local | 25.6 | 100% (12/12) | 3,900 |
| LOON compact | 22.6 | 100% (12/12) | 4,423 |
| TOON | 20.9 | 100% (12/12) | 4,785 |
| JTON | 19.5 | 100% (12/12) | 5,131 |
| LOON full | 16.3 | 91.7% (11/12) | 5,611 |

[!TIP] LOON llm is the most token-efficient format at 100% accuracy: −4% tokens vs TOON on Gemini, −19% vs TOON on o3-mini. full minimizes input, but its Base36 / sequence / dictionary cells force the model to decode mentally, which costs more output tokens and, on o3-mini, accuracy (91.7%, 11/12). Use llm for LLM consumption, full for storage / transport.
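The efficiency column and the percentage comparisons above follow directly from the stated formula, (accuracy% ÷ tokens) × 1000. A small sketch that reproduces a few of the published figures from the table values (the function names are illustrative, not part of loon-core):

```typescript
// Token efficiency as defined in this section: (accuracy% / tokens) * 1000.
function tokenEfficiency(accuracyPercent: number, tokens: number): number {
  if (tokens <= 0) throw new RangeError("tokens must be positive");
  return (accuracyPercent / tokens) * 1000;
}

// Relative token change of `candidate` against `baseline`, in percent.
function tokenDelta(candidate: number, baseline: number): number {
  return ((candidate - baseline) / baseline) * 100;
}

console.log(tokenEfficiency(100, 3526).toFixed(1));  // "28.4"  LOON llm, Gemini run
console.log(tokenEfficiency(91.7, 5611).toFixed(1)); // "16.3"  LOON full, o3-mini run
console.log(tokenDelta(3526, 3672).toFixed(1));      // "-4.0"  LOON llm vs TOON, Gemini run
console.log(tokenDelta(3892, 4785).toFixed(1));      // "-18.7" LOON llm vs TOON, o3-mini run
```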

Multi-tokenizer consistency

Percent token-count difference vs the o200k_base (GPT-4o / GPT-5) baseline on the GitHub dataset. Lower = the format compresses by a similar amount on that tokenizer, so a savings claim transfers across model families.

| Format | GPT-4 | Claude | Gemini | Llama 3.2 | Qwen3 |
| --- | --- | --- | --- | --- | --- |
| JSON | 1.3% | 4.8% | 26.9% | 1.1% | 13.9% |
| JSON compact | 2.1% | 7.7% | 23.3% | 2.0% | 20.4% |
| YAML | 1.4% | 13.1% | 20.5% | 1.2% | 16.9% |
| XML | 1.7% | 8.3% | 23.7% | 1.5% | 12.7% |
| CSV | 1.1% | 6.9% | 30.5% | 1.0% | 29.9% |
| TOON | 1.5% | 7.4% | 27.7% | 1.2% | 25.3% |
| LOON llm | 2.4% | 9.9% | 32.0% | 1.9% | 29.6% |
| LOON full | 2.3% | 10.7% | 27.5% | 1.9% | 25.4% |
| LOON local | 2.4% | 9.9% | 32.0% | 1.9% | 29.6% |
| LOON compact | 2.2% | 10.4% | 32.0% | 1.9% | 28.2% |
| JTON | 1.8% | 8.4% | 21.1% | 1.6% | 20.8% |

Round-trip fidelity

Encode → decode with each format's standard parser → compare to the original. The comparator is type-coercion-aware (123 == "123"); structural diffs — missing keys, dropped nesting, length changes — count as loss.

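| Format | Correct | Lossy | Decode error | Fidelity |
| --- | --- | --- | --- | --- |
| JSON | 10/10 | 0 | 0 | 100% |
| JSON compact | 10/10 | 0 | 0 | 100% |
| YAML | 10/10 | 0 | 0 | 100% |
| XML | 7/10 | 3 | 0 | 70% |
| CSV | 0/10 | 1 | 9 | 0% |
| TOON | 10/10 | 0 | 0 | 100% |
| LOON llm | 10/10 | 0 | 0 | 100% |
| LOON full | 10/10 | 0 | 0 | 100% |
| LOON local | 10/10 | 0 | 0 | 100% |
| LOON compact | 10/10 | 0 | 0 | 100% |
| JTON | 0/10 | 10 | 0 | 0% |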

All four LOON modes round-trip every dataset losslessly — matching JSON / TOON, ahead of XML (lossy on nested), CSV (no standard parser), and JTON (lossy on all).
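The comparison rule described for this experiment (scalars compared with type coercion, structural differences counted as loss) could be sketched as follows. This is a hypothetical helper, not the benchmark's actual comparator; in particular, treating null as equal only to null is an assumption of this sketch.

```typescript
// Hypothetical type-coercion-aware deep comparison (not the benchmark's code).
function looselyEqual(a: unknown, b: unknown): boolean {
  const aScalar = a === null || typeof a !== "object";
  const bScalar = b === null || typeof b !== "object";
  if (aScalar || bScalar) {
    if (!(aScalar && bScalar)) return false;      // scalar vs container: structural loss
    if (a === null || b === null) return a === b; // assumption: null matches only null
    return String(a) === String(b);               // 123 and "123" count as equal
  }
  if (Array.isArray(a) || Array.isArray(b)) {
    if (!Array.isArray(a) || !Array.isArray(b)) return false;
    const arrA: unknown[] = a;
    const arrB: unknown[] = b;
    if (arrA.length !== arrB.length) return false; // length change: loss
    return arrA.every((item, i) => looselyEqual(item, arrB[i]));
  }
  const objA = a as Record<string, unknown>;
  const objB = b as Record<string, unknown>;
  const keysA = Object.keys(objA);
  if (keysA.length !== Object.keys(objB).length) return false; // missing or extra keys: loss
  return keysA.every((k) => k in objB && looselyEqual(objA[k], objB[k]));
}

console.log(looselyEqual({ id: 123 }, { id: "123" }));       // true
console.log(looselyEqual({ id: 1, tags: ["a"] }, { id: 1 })); // false
```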

[!NOTE] The retrieval runs use reduced deterministic question sets (10–12 Q) to bound API cost. At this size every lossless format reaches 100% accuracy, so these results establish token efficiency at parity accuracy — not an accuracy ranking. Larger question sets and weaker models are needed to separate formats on comprehension.

Quick Start

npm install loon-core

import { Loon } from 'loon-core';

const loon = new Loon();

const data = [
  { id: 1, name: 'Alice', dept: 'Engineering', salary: 120000, active: true },
  { id: 2, name: 'Bob',   dept: 'Sales',       salary: 98000,  active: false },
];

// LLM consumption (default for prompts)
const encoded = loon.toLOON(data, { mode: 'llm' });

// Bulk transmission / storage (not for raw LLM reading)
loon.toLOON(data, { mode: 'full' });

// Small / irregular data
loon.toLOON(data, { mode: 'compact' });

// Decode (auto-detects the mode)
loon.fromLOON(encoded);

If mode is omitted it is auto-selected: compact for empty or non-uniform input, micro for 1–4 rows, full for ≥ 5 uniform rows. Override with { mode: 'llm' } when the target is a model.
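The auto-selection rule just stated can be written down as a small decision function. This is a sketch of the stated thresholds only, not the library's mode selector: `pickAutoMode` is a hypothetical name, and "uniform" is assumed here to mean that every row is a plain object with the same key set.

```typescript
type AutoMode = "compact" | "micro" | "full";

// Assumption: rows are uniform when all are plain objects sharing one key set.
function isUniform(rows: unknown[]): boolean {
  const keySignature = (r: unknown): string | null =>
    r !== null && typeof r === "object" && !Array.isArray(r)
      ? Object.keys(r).sort().join("\u0000")
      : null;
  const first = keySignature(rows[0]);
  return first !== null && rows.every((r) => keySignature(r) === first);
}

// compact for empty or non-uniform input, micro for 1-4 rows, full for 5+ uniform rows.
function pickAutoMode(rows: unknown[]): AutoMode {
  if (rows.length === 0 || !isUniform(rows)) return "compact";
  return rows.length <= 4 ? "micro" : "full";
}

console.log(pickAutoMode([]));                   // "compact"
console.log(pickAutoMode([{ a: 1 }, { a: 2 }])); // "micro"
console.log(pickAutoMode([{ a: 1 }, { b: 2 }])); // "compact"
```

Note that the rule never picks llm on its own, which is why the override matters when the reader is a model.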

CLI

No installation required — use it instantly with npx:

npx loon-core data.json

Or install globally and use the loon command:

npm install -g loon-core

Options

| Flag | Alias | Description |
| --- | --- | --- |
| --from <fmt> | -f | Input format: json csv xml yaml loon (auto-detect) |
| --to <fmt> | -t | Output format: loon json csv xml yaml (default: loon) |
| --mode <mode> | -m | Encoding mode: full llm compact (default: auto) |
| --output <file> | -o | Write output to file instead of stdout |
| --indent <n> | -i | JSON output indentation (default: 2) |
| --stats | -s | Show token estimate and savings after encoding |
| --verbose | -v | Print full stack traces on errors |
| --help | -h | Show help |

Token statistics

Pass --stats to see a token estimate before and after encoding. The estimate uses a chars/4 heuristic (roughly one token per four characters), so it is fast and needs no API key.

✔ data.json → output.loon
ℹ Token estimate: ~15,145 (json) → ~8,745 (loon)
✔ Saved ~6,400 tokens (-42%) [strong]

Output is colour-coded in TTY terminals and plain text when piped.
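A chars/4 estimate of the kind --stats reports can be approximated in a couple of lines. This is a sketch under assumptions: rounding up and the use of `String.prototype.length` (UTF-16 code units) are choices made here, and the CLI may round or count differently.

```typescript
// Rough token estimate: one token per four characters (rounding up is an assumption).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Percentage saved when going from `before` tokens to `after` tokens.
function savingsPercent(before: number, after: number): number {
  return Math.round(((before - after) / before) * 100);
}

// Figures from the sample output above: ~15,145 tokens of JSON, ~8,745 of LOON.
console.log(15145 - 8745);                // 6400
console.log(savingsPercent(15145, 8745)); // 42
```

Because the heuristic ignores the tokenizer entirely, it is best read as a relative indicator; the multi-tokenizer table shows how much real counts vary between model families.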

Examples

# JSON → LOON (auto mode)
loon data.json

# JSON → LOON llm mode + token stats
loon data.json -m llm --stats

# JSON → LOON full compression, write to file
loon data.json -m full -o output.loon

# CSV → LOON
loon data.csv -f csv -m llm

# LOON → JSON
loon data.loon -t json

# Pipe from stdin
echo '[{"id":1,"name":"Ada","role":"dev"}]' | loon

# Round-trip: JSON → LOON → JSON
cat data.json | loon | loon -f loon -t json

Wire Format

full mode — maximum compression

The format that lives in a .loon file or rides between two services.

S:@T1[N]=[col:type,...]      schema: row count + columns with type codes
A:fullName,...               column aliases (only if names were abbreviated)
DC:col,...                   integer columns stored decimal, not Base36
C:col=value                  constant column (omitted from every row)
Q:col=start,step             integer arithmetic sequence
QF:col=start,step            float arithmetic sequence
QS:col=start,step,prefix     string sequence (prefix + counter)
FP:d=col,...                 fixed-point: row token ÷ 10^d = value
X:col=suffix                 common suffix stripped, re-appended on decode
D:col={tok:val,...}          semantic dictionary (token → value)
D:__global__={tok:prefix}    shared prefix dictionary (backs `$tok` cells)
D:defaults=col=val,...       per-column default; `~` in a row means "use it"
DL:col=firstValue            delta encoding (row tokens are signed deltas)
NM:col=mean,std,sigmaT,mT    z-score normalization (LOSSY)
LY:NM                        marks payload contains lossy NORM columns
AS:col=k1,k2,...             uniform object-array sub-schema (see below)
@T1:                         start of the data block
F:csv                        rows are comma-delimited
<data rows>

full is not meant to be read by an LLM directly. If you must, prepend getSpec(encoded) to the prompt — and even then expect output-token costs to be higher than llm mode, because the model has to mentally decode every Base36 / sequence / dictionary cell.
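To make the "mental decoding" cost concrete, here is what three of the primitives listed above amount to arithmetically. This is a sketch based only on the one-line header descriptions: the lowercase Base36 alphabet, zero-based row indexing, and all function names are assumptions, not the library's decoder.

```typescript
// Base36 integers via JavaScript's built-in radix conversion (alphabet is an assumption).
const toBase36 = (n: number): string => n.toString(36);
const fromBase36 = (s: string): number => parseInt(s, 36);

// Q:col=start,step  ->  value for row i is start + i * step (i starts at 0 here).
function expandSequence(start: number, step: number, rowCount: number): number[] {
  return Array.from({ length: rowCount }, (_, i) => start + i * step);
}

// FP:d=col  ->  the stored row token divided by 10^d gives the value.
function fromFixedPoint(token: number, d: number): number {
  return token / 10 ** d;
}

console.log(toBase36(120000));            // "2klc"
console.log(fromBase36("2klc"));          // 120000
console.log(expandSequence(1001, 1, 3));  // [ 1001, 1002, 1003 ]
console.log(fromFixedPoint(1999, 2));     // 19.99
```

A decoder runs these in microseconds; a model has to reason through each one per cell, which is where the extra output tokens come from.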

llm mode — self-evident, LLM-readable

One header line. Plain decimal rows. JSON-style literals.

@T1[N]{id,name,email,dept,salary,active}
C:status=active                       (optional, when a constant column exists)
AS:items=sku,name,qty                 (optional, for object-array columns)
1,Alice,alice@x.com,Engineering,120000,true
2,Bob,bob@x.com,Sales,98000,false
3,Carol,carol@x.com,Marketing,null,true

Rules the model can apply at sight:

  • Numbers bare: 120000. Decoded as Number(token).
  • Booleans literal: true / false.
  • Null literal: null (same single BPE token JSON uses).
  • Strings bare when unambiguous: Alice.
  • Strings that look like numbers / bools / null are quote-wrapped: "123", "true", "null". The decoder strips the quotes and keeps the string as a string.
  • Absent key in this row (sparse / non-uniform): +.
  • Inline object (col:o): the cell is plain JSON ({...}) — read as-is.
  • #ex row1 → … (when present): row 1 already decoded, column by column. A #-prefixed line is not data — the decoder skips it. It is emitted only for non-trivial schemas (wide, or with extracted constants / :o / :a columns) to anchor the positional mapping on the first read, cutting the reasoning a model would otherwise spend aligning values to columns.

No :type codes in the header — the model infers types from cell shape. That alone saves ~2 tokens per column (the :a array and :o inline-object tags are the two exceptions — type inference cannot recover a pipe-joined array or distinguish a JSON-object cell from a string). No DC: / @T1: / F:csv scaffolding either; the @…{…} line is its own start marker.
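The cell rules listed above are simple enough to express as a single function. The sketch below is illustrative rather than the library's decoder: `decodeCell` and `decodeRow` are hypothetical names, `undefined` stands in for the `+` (absent key) marker, and the naive comma split ignores inline-object cells and any escaping.

```typescript
type Value = string | number | boolean | null;

// Applies the llm-mode rules to one cell; undefined means "key absent in this row".
function decodeCell(token: string): Value | undefined {
  if (token === "+") return undefined;
  if (token === "null") return null;
  if (token === "true") return true;
  if (token === "false") return false;
  // Quote-wrapped strings stay strings: "123" -> the string 123.
  if (token.length >= 2 && token.startsWith('"') && token.endsWith('"')) {
    return token.slice(1, -1);
  }
  const n = Number(token);
  return token.trim() !== "" && !Number.isNaN(n) ? n : token;
}

// Naive row decode: assumes no cell contains a comma (no inline JSON, no escaping).
function decodeRow(columns: string[], line: string): Record<string, Value> {
  const tokens = line.split(",");
  const row: Record<string, Value> = {};
  columns.forEach((column, i) => {
    const value = decodeCell(tokens[i] ?? "+");
    if (value !== undefined) row[column] = value;
  });
  return row;
}

console.log(decodeRow(["id", "name", "salary", "active"], "3,Carol,null,true"));
// { id: 3, name: 'Carol', salary: null, active: true }
```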

Deep, heterogeneous arrays — sparse-subtree folding

Flattening every nested object to dot-notation explodes a heterogeneous array (e.g. GitHub events: PushEvent, ForkEvent, …) into a giant column union where most columns are absent (+) on most rows. That bloats the header and makes a model transcribe the structure in its answer (large output cost). LOON folds a nested subtree back to a single inline :o object column when it is both deep (nested past the top level) and sparse (present in < 50% of rows). Dense or shallow structure still expands to dot-notation columns, where schema-once dedup pays off. On github.json (30 events) this cut the schema from 100+ columns to 32 with lossless round-trip.
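The folding decision described above combines two tests: is the subtree deep, and is it sparse? A minimal sketch of that check follows. It is not the library's analyzer, and it rests on two assumptions labeled in the code: "deep" is read as "the value under the key contains at least one nested object", and presence is measured on top-level keys only.

```typescript
// Nesting depth of a value: scalars 0, {a: 1} 1, {a: {b: 1}} 2.
function depth(value: unknown): number {
  if (value === null || typeof value !== "object") return 0;
  const children = Object.values(value as Record<string, unknown>);
  return 1 + Math.max(0, ...children.map((child) => depth(child)));
}

// Fold `key` into one inline object column when it is both deep and sparse.
// Assumption: deep = depth >= 2; sparse = present in fewer than 50% of rows.
function shouldFold(rows: Record<string, unknown>[], key: string): boolean {
  const present = rows.filter((row) => key in row);
  if (present.length === 0) return false;
  const sparse = present.length / rows.length < 0.5;
  const deep = present.some((row) => depth(row[key]) >= 2);
  return sparse && deep;
}

const events: Record<string, unknown>[] = [
  { id: 1, type: "PushEvent", payload: { commits: { sha: "abc" } } },
  { id: 2, type: "ForkEvent" },
  { id: 3, type: "WatchEvent" },
  { id: 4, type: "ForkEvent" },
];

console.log(shouldFold(events, "payload")); // true  (1 of 4 rows, nested two levels)
console.log(shouldFold(events, "type"));    // false (present in every row)
```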

compact mode — small or irregular data

id: 1
name: Alice
tags[3]: a,b,c                    scalar array, length 3
items[2]{sku,qty}: A,1;B,2        uniform object array
---
id: 2
...

Single deep objects (configs) use an indented hierarchy instead of repeated dot-notation keys.
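The flat-record layout shown above can be reproduced with a short emitter. This sketch covers only the three line shapes in the example (scalar, scalar array, uniform object array) and leaves out the indented hierarchy used for deep objects; the names are hypothetical and the code is not the library's compact codec.

```typescript
type Leaf = string | number | boolean | null;
type Field = Leaf | Leaf[] | Record<string, Leaf>[];
type CompactRecord = Record<string, Field>;

function isObjectArray(value: Field): value is Record<string, Leaf>[] {
  return (
    Array.isArray(value) &&
    value.length > 0 &&
    (value as unknown[]).every((v) => v !== null && typeof v === "object")
  );
}

// Assumption: values contain no ',' or ';' characters (no escaping in this sketch).
function encodeField(key: string, value: Field): string {
  if (isObjectArray(value)) {
    const keys = Object.keys(value[0] ?? {});
    const cells = value.map((obj) => keys.map((k) => String(obj[k] ?? null)).join(","));
    return `${key}[${value.length}]{${keys.join(",")}}: ${cells.join(";")}`;
  }
  if (Array.isArray(value)) {
    return `${key}[${value.length}]: ${value.map((v) => String(v)).join(",")}`;
  }
  return `${key}: ${String(value)}`;
}

function toCompact(records: CompactRecord[]): string {
  return records
    .map((r) => Object.entries(r).map(([k, v]) => encodeField(k, v)).join("\n"))
    .join("\n---\n");
}

console.log(toCompact([
  { id: 1, name: "Alice", tags: ["a", "b", "c"], items: [{ sku: "A", qty: 1 }, { sku: "B", qty: 2 }] },
  { id: 2 },
]));
// id: 1
// name: Alice
// tags[3]: a,b,c
// items[2]{sku,qty}: A,1;B,2
// ---
// id: 2
```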

Row tokens (shared)

| Token | Meaning |
| --- | --- |
| ^ | null (full mode); llm mode writes the literal null |
| ~ | use the column default (full only) |
| + | key absent from this row, so omit it (sparse / non-uniform schema) |
| !value | raw literal string (bypass the dictionary; full only) |
| . | a row that is entirely defaults |
| *N[...] | run-length: repeat the bracketed row N times (full only) |

AS: — uniform object-array sub-schema

A column holding an array of same-shape objects (items: [{sku,name,qty}, …]) would otherwise be inline JSON with the keys repeated on every element. AS: declares the shared shape once; each cell carries values only:

AS:items=sku,name,qty
cell:  A|Mouse|1;B|Cable|2   →   [{sku:A,name:Mouse,qty:1},{sku:B,name:Cable,qty:2}]

Fields within an object are |-separated; objects are ;-separated.
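Decoding such a cell is a matter of two splits. The sketch below assumes the key list has already been read from the AS: header, that no field contains `|` or `;`, and that numeric-looking fields become numbers; `decodeSubSchemaCell` is a hypothetical name, not a loon-core export.

```typescript
// Rebuild an object array from an AS:-style cell: objects split on ';', fields on '|'.
function decodeSubSchemaCell(
  keys: string[],
  cell: string
): Record<string, string | number>[] {
  if (cell === "") return [];
  return cell.split(";").map((objectText) => {
    const fields = objectText.split("|");
    const out: Record<string, string | number> = {};
    keys.forEach((key, i) => {
      const raw = fields[i] ?? "";
      const n = Number(raw);
      // Assumption: numeric-looking fields decode as numbers, everything else as strings.
      out[key] = raw.trim() !== "" && !Number.isNaN(n) ? n : raw;
    });
    return out;
  });
}

console.log(decodeSubSchemaCell(["sku", "name", "qty"], "A|Mouse|1;B|Cable|2"));
// [ { sku: 'A', name: 'Mouse', qty: 1 }, { sku: 'B', name: 'Cable', qty: 2 } ]
```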

Type codes (full mode)

i integer · f float · s string · b boolean · a array · o object

llm mode omits the codes — types are inferred from cell shape.

getSpec() — when full must talk to an LLM

full is built for parsers. If you need an LLM to read a full payload (for example, data stored in a .loon file that a model now has to query), getSpec(encoded) returns a minimal decode spec (200–600 tokens) covering only the headers that this specific payload actually uses, plus a worked walkthrough of row 0.

import { Loon, getSpec } from 'loon-core';

const loon = new Loon();
// `data` is any JSON array, e.g. the one from Quick Start
const encoded = loon.toLOON(data, { mode: 'full' });
const spec = getSpec(encoded);   // { text, sections, estimatedTokens }
// prepend spec.text to the prompt; the model can now decode `full`.

llm mode does not need a spec — that is the whole point of it.

Sessions & Context Caching

For repeated calls against the same data shape, LoonSession separates the cacheable prompt prefix from the per-call data block. LLM providers cache an identical prefix and bill it at a fraction of the normal rate; put the spec and the schema there and the per-call cost shrinks to just the rows.

import { LoonSession } from 'loon-core';

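// firstBatch / moreBatches are your own row arrays; send() is your own function that calls the model.
const s = new LoonSession();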
s.init(firstBatch, { mode: 'full' });

// System prompt (mark it for prompt caching): s.primer   ← getSpec() + schema
// User message 1:                             s.dataBlock
for (const batch of moreBatches) {
  send(s.encodeRows(batch).dataBlock);          // only the rows
}

splitLoon(encoded) exposes the raw { schema, dataBlock } split for custom integrations.

Prompt caching reduces input cost. It does not touch output tokens. A model that has to reason hard about a format still pays full price on output. That is why llm mode beats full + cached spec for direct LLM consumption: llm's output token cost is small because the format requires no reasoning to read.
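The input-side arithmetic behind this can be sketched with a small cost model. Everything numeric here is hypothetical: the rates are placeholders in arbitrary units, and the model assumes the first call pays the full rate for the prefix while later calls pay the cached rate, which varies by provider.

```typescript
interface Rates {
  input: number;       // price per uncached input token (hypothetical units)
  cachedInput: number; // price per cached input token (hypothetical units)
}

// Input cost over `calls` requests that share one cacheable prefix.
function inputCost(prefixTokens: number, rowTokens: number, calls: number, rates: Rates): number {
  if (calls < 1) return 0;
  const firstCall = (prefixTokens + rowTokens) * rates.input;
  const laterCalls =
    (calls - 1) * (prefixTokens * rates.cachedInput + rowTokens * rates.input);
  return firstCall + laterCalls;
}

const exampleRates: Rates = { input: 1, cachedInput: 0.25 };

// 600-token spec + schema prefix, 400 tokens of rows per call, 10 calls.
console.log(inputCost(600, 400, 10, exampleRates));                        // 5950
console.log(inputCost(600, 400, 10, { input: 1, cachedInput: 1 }));        // 10000 (no caching)
```

Output tokens do not appear anywhere in this model, which is the point made above: caching cannot offset a format that makes the model reason at length.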

API

| Method | Behavior |
| --- | --- |
| toLOON(data, opts?) / encode | JSON array → LOON |
| fromLOON(loon) / decode | LOON → JSON array |
| fromCSV / fromXML / fromYAML | other formats → LOON |
| toCSV / toXML / toYAML | LOON → other formats |
| fromTree(tree, opts?) | tree (nested objects) → LOON (TREE: header) |
| toTree(loon) | LOON → tree |
| chunk(data, opts) | split into context-window-sized LOON chunks |
| encodeStream / fromLOONStream | async streaming codec |
| getSpec(loon) | minimal decode spec for a payload |
| LoonSession | multi-call session: primer, dataBlock, encodeRows, decode |
| splitLoon(loon) | { schema, dataBlock, full } |
| validateDecode(loon, rows) | post-decode structural check |
| repairHint(loon, errors) | minimal retry prompt for an LLM that mis-decoded |
| reset() | clear per-instance schema state |

Options (LoonOptions)

| Option | Effect |
| --- | --- |
| mode | force full / llm / compact / compat |
| fields | column projection |
| maxDecimals | trim float precision before encoding |
| tableId | override the default schema id (T1) |
| outFile | write encoded output to a file (Node) |
| checkpointEvery | emit a #CKP: schema checkpoint every N rows (full) |
| primaryCols | promote columns to the front + force decimal |
| norm | z-score normalize float columns (lossy; full only) |

Architecture

Input (JSON / CSV / XML / YAML / tree)
        │
   Mode Selector ──────────────┐
        │                      │ (auto-pick when mode omitted)
   ┌────┴─────┬──────────┐     │
   ▼          ▼          ▼     ▼
 full       llm      compact  compat
   │          │          │     │
   └── Adaptive ─┘        │     │
   pipeline               │     │
        │                 │     │
        └─────────────────┴─────┘
                  │
            LOON output

| Module | Path | Role |
| --- | --- | --- |
| Public API | src/index.ts | Loon facade, mode routing |
| Adaptive encoder | src/encoder/adaptive/ | analyzer → header → rows |
| Adaptive decoder | src/decoder/adaptive/ | header-parser → row-reconstructor |
| Compact codec | src/encoder/compact.ts, src/decoder/compact.ts | key: value + indent |
| Adaptive engine | src/compression/adaptive.ts | cell compress/decompress, dictionaries, AS: tabular |
| Mode selector | src/compression/mode-selector.ts | dataset-shape heuristics |
| State manager | src/state/state-manager.ts | per-schema context + reverse-dict cache |
| Spec generator | src/utils/get-spec.ts | getSpec() minimal decode spec |
| Session | src/session.ts | LoonSession, splitLoon |
| Codecs | src/codecs/ | CSV / XML / YAML / tree bridges |

full and llm share one analysis pipeline. The analyzer gates off every compute-requiring primitive (Base36, sequences, dictionaries, defaults, suffixes, fixed-point, delta, NORM, RLE, anchor rows) when the mode is llm. What survives is structural-only: the schema, C: constants, and AS: sub-schemas. The header emitter also drops :type codes and replaces ^ (null sentinel) with the literal null for llm mode.

License

MIT © LOON Thesis Team