LOON — LLM-Optimized Object Notation
A compact wire format for data pipelines and LLM prompts.
LOON rewrites structured data (JSON / CSV / XML / YAML / trees) into a dense, fully reversible encoding that shrinks token counts without losing a single byte on the way back. Most formats lock you into one trade-off; LOON instead exposes three distinct modes — one built for service-to-service transport, one written to be read straight by a model, and one sized for tiny or uneven objects — so a job never carries compression overhead it has no use for.
The mental model is a boundary codec: hold JSON in your application, and switch to LOON only at the edge where every token shows up on the bill.
[!TIP] Across real retrieval runs (Gemini 3 Flash, o3-mini), LOON `llm` is the most token-efficient format at 100% accuracy (fewer tokens than TOON, JTON, JSON, JSON-compact, YAML and XML) and round-trips every benchmark dataset losslessly. See Benchmarks.
Table of Contents
- Why LOON?
- Key Features
- The Three Modes
- When Not to Use LOON
- Benchmarks
- Quick Start
- CLI
- Wire Format
- `getSpec()` — when `full` must talk to an LLM
- Sessions & Context Caching
- API
- Architecture
- License
Why LOON?
Context windows keep getting bigger, yet every token is still metered — and JSON spends them freely. Each brace, each quote, each field name repeated row after row lands on the invoice for every call. A uniform array of records is charged for its column labels once per element:
```json
[
  { "id": 1, "name": "Alice", "dept": "Engineering", "salary": 120000, "active": true },
  { "id": 2, "name": "Bob", "dept": "Sales", "salary": 98000, "active": false }
]
```

LOON `llm` mode states the schema a single time and then emits bare rows (no braces, no echoed keys, no quotes around values that aren't ambiguous), and a model can follow it cold, with no primer attached:

```
@T1[2]{id,name,dept,salary,active}
1,Alice,Engineering,120000,true
2,Bob,Sales,98000,false
```

When the target is storage or a service-to-service hop, `full` mode pushes harder (Base36 integers, sequences, dictionaries) and the decoder rebuilds the original exactly, byte for byte.
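The schema-once idea can be sketched in a few lines. This is an illustration only, not the loon-core encoder: `encodeUniformRows`, `encodeCell` and `needsQuotes` are hypothetical names, the sketch handles flat rows that share the same keys, and quoting cells that contain a comma is an assumption the README does not state.

```typescript
type Scalar = string | number | boolean | null;
type FlatRow = Record<string, Scalar>;

// Strings that could be misread as a number, boolean or null get quoted.
// Quoting strings that contain a comma is an assumption made for this sketch.
function needsQuotes(text: string): boolean {
  return (
    text === "true" ||
    text === "false" ||
    text === "null" ||
    text.indexOf(",") !== -1 ||
    /^-?\d+(\.\d+)?$/.test(text)
  );
}

function encodeCell(value: Scalar): string {
  if (typeof value === "string") {
    return needsQuotes(value) ? `"${value}"` : value;
  }
  return String(value); // numbers, booleans and null are written bare
}

// Hypothetical helper: one schema line, then one bare row per record.
function encodeUniformRows(rows: FlatRow[], tableId = "T1"): string {
  const first = rows[0];
  if (first === undefined) throw new RangeError("nothing to tabulate");
  const columns = Object.keys(first);
  const header = `@${tableId}[${rows.length}]{${columns.join(",")}}`;
  const body = rows.map((row) =>
    columns.map((column) => encodeCell(row[column] ?? null)).join(",")
  );
  return [header, ...body].join("\n");
}

const sampleLlmOutput = encodeUniformRows([
  { id: 1, name: "Alice", dept: "Engineering", salary: 120000, active: true },
  { id: 2, name: "Bob", dept: "Sales", salary: 98000, active: false },
]);
```

For the two records used in this section, `sampleLlmOutput` holds the same three lines as the `@T1[2]{...}` example: the column names appear once in the header and never again.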
Key Features
- Measured, Not Promised: in one fully billed retrieval run, LOON `llm` dropped total tokens 64% against JSON and 47% against TOON while answering at 100% accuracy (see Benchmarks).
- One Mode Per Job: `full` covers transport and storage, `llm` is written for model reading, `compact` handles small or uneven data; nothing tries to be everything at once.
- Reversible By Construction: the decoder is deterministic and reproduces the source JSON exactly; every mode round-trips the benchmark datasets 10/10.
- Zero Setup For Models (`llm`): decimal numbers as-is, literal `true` / `false` / `null`, schema stated once. The model simply reads it; there is nothing for it to unpack.
- Format Bridges Built In: a single API moves between JSON, CSV, XML, YAML, and nested trees in both directions.
- Designed For Prompt Caches: `LoonSession` peels a reusable schema/spec prefix off the per-call rows, so repeat queries are billed mostly as cache hits.
- Ships With A CLI: `npx loon-core data.json` converts, round-trips, and reports token savings under `--stats`.
The Three Modes
| Mode | What it's for | Reads like |
|---|---|---|
| `full` | Bulk transmission / `.loon` storage. System → system, decoder → decoder. Maximum compression; LLM-readability is not a design constraint. | Dense protocol output |
| `llm` | Direct LLM consumption. A model (cloud or local, reasoning or not) reads the payload with no spec or primer. | Labeled CSV |
| `compact` | Small / non-uniform datasets. Single deep objects, sparse rows, anything where a column schema can't amortize itself. | `key: value` blocks |

Plus:
- `compat`: JSON-hybrid (`{S,T,R}` arrays). A 100%-valid-JSON escape hatch for environments that must parse with `JSON.parse`.
- `local`: deprecated alias of `llm` (kept for back-compat).
Design rule. `full` is allowed every compression trick because its reader is a decoder. `llm` rejects every trick that would force the model to compute (Base36, sequences, dictionaries, suffix reattachment) and keeps only the tricks that de-duplicate structure: schema declared once, object arrays tabulated, nested objects grouped. The dividing line is: does this require execution to read?
When Not to Use LOON
- Deep, irregular object trees: with no uniform arrays to fold into a table, `compact` mode still helps but the margin narrows; minified JSON can hold its own here.
- Flat tables read by code, not a model: plain CSV edges it out on size. LOON's extra structure (schema, length markers, type guards) earns its keep with model reliability, not with a CSV reader.
- You must call `JSON.parse` directly on the wire: reach for `compat` mode (which stays valid JSON) and take the smaller token cut that comes with it.
- Latency-bound local inference: benchmark before you commit. Fewer tokens doesn't always translate to faster decode on quantized or on-device models.
Benchmarks
Three experiments: retrieval accuracy + token efficiency (real API calls),
multi-tokenizer consistency, and round-trip fidelity. Datasets:
synthetic (@faker-js/faker, seed 12345 — tabular, nested, analytics,
event-logs, nested-config) plus a real one (top-100 GitHub repositories). Token
efficiency = (accuracy% ÷ tokens) × 1000 (higher is better).
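As a quick check on the metric, the formula can be written out directly. The helper names `tokenEfficiency` and `roundTo1` are illustrative, and the inputs are two rows taken from the retrieval tables in this section.

```typescript
// Token efficiency as defined above: (accuracy% ÷ tokens) × 1000.
function tokenEfficiency(accuracyPercent: number, tokens: number): number {
  if (tokens <= 0) throw new RangeError("tokens must be positive");
  return (accuracyPercent / tokens) * 1000;
}

// The tables report the score with one decimal place.
function roundTo1(value: number): number {
  return Math.round(value * 10) / 10;
}

// LOON llm on gemini-3-flash-preview: 100% accuracy, 3,526 tokens.
const llmOnGemini = roundTo1(tokenEfficiency(100, 3526)); // 28.4

// LOON full on o3-mini: 91.7% accuracy, 5,611 tokens.
const fullOnO3Mini = roundTo1(tokenEfficiency(91.7, 5611)); // 16.3
```

Because accuracy sits in the numerator, a format that saves tokens but loses answers is penalised twice: once for the miss and once for whatever extra tokens the model spent.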
Retrieval accuracy & token efficiency
Each dataset is serialized in every format, embedded in an identical prompt, and queried with the same deterministic questions; answers are validated against a type-aware ground truth. Question sets are reduced (10–12 Q) to bound API cost.
gemini-3-flash-preview — 10 deterministic questions
| Format | Efficiency | Accuracy | Tokens |
|---|---|---|---|
| LOON `llm` | 28.4 | 100% (10/10) | 3,526 |
| LOON `local` | 28.1 | 100% (10/10) | 3,565 |
| TOON | 27.2 | 100% (10/10) | 3,672 |
| JTON | 24.1 | 100% (10/10) | 4,155 |
| JSON compact | 20.4 | 100% (10/10) | 4,906 |
| LOON `full` | 18.5 | 100% (10/10) | 5,395 |
| LOON `compact` | 18.5 | 100% (10/10) | 5,400 |
| YAML | 16.7 | 100% (10/10) | 6,001 |
| JSON | 12.3 | 100% (10/10) | 8,162 |
| XML | 10.7 | 100% (10/10) | 9,358 |
o3-mini — 12 deterministic questions (DRY_RUN + FAST_FORMAT)
| Format | Efficiency | Accuracy | Tokens |
|---|---|---|---|
| LOON `llm` | 25.7 | 100% (12/12) | 3,892 |
| LOON `local` | 25.6 | 100% (12/12) | 3,900 |
| LOON `compact` | 22.6 | 100% (12/12) | 4,423 |
| TOON | 20.9 | 100% (12/12) | 4,785 |
| JTON | 19.5 | 100% (12/12) | 5,131 |
| LOON `full` | 16.3 | 91.7% (11/12) | 5,611 |
[!TIP] LOON `llm` is the most token-efficient format at 100% accuracy: −4% tokens vs TOON on Gemini, −19% vs TOON on o3-mini. `full` minimizes input, but its Base36 / sequence / dictionary cells force the model to decode mentally, which means more output tokens and an accuracy drop (91.7%). Use `llm` for LLM consumption, `full` for storage / transport.
Multi-tokenizer consistency
Percent token-count difference vs the o200k_base (GPT-4o / GPT-5) baseline on
the GitHub dataset. Lower = the format compresses by a similar amount on that
tokenizer, so a savings claim transfers across model families.
| Format | GPT-4 | Claude | Gemini | Llama 3.2 | Qwen3 |
|---|---|---|---|---|---|
| JSON | 1.3% | 4.8% | 26.9% | 1.1% | 13.9% |
| JSON compact | 2.1% | 7.7% | 23.3% | 2.0% | 20.4% |
| YAML | 1.4% | 13.1% | 20.5% | 1.2% | 16.9% |
| XML | 1.7% | 8.3% | 23.7% | 1.5% | 12.7% |
| CSV | 1.1% | 6.9% | 30.5% | 1.0% | 29.9% |
| TOON | 1.5% | 7.4% | 27.7% | 1.2% | 25.3% |
| LOON `llm` | 2.4% | 9.9% | 32.0% | 1.9% | 29.6% |
| LOON `full` | 2.3% | 10.7% | 27.5% | 1.9% | 25.4% |
| LOON `local` | 2.4% | 9.9% | 32.0% | 1.9% | 29.6% |
| LOON `compact` | 2.2% | 10.4% | 32.0% | 1.9% | 28.2% |
| JTON | 1.8% | 8.4% | 21.1% | 1.6% | 20.8% |
Round-trip fidelity
Encode → decode with each format's standard parser → compare to the original.
The comparator is type-coercion-aware (123 == "123"); structural diffs —
missing keys, dropped nesting, length changes — count as loss.
| Format | Correct | Lossy | Decode error | Fidelity |
|---|---|---|---|---|
| JSON | 10/10 | 0 | 0 | 100% |
| JSON compact | 10/10 | 0 | 0 | 100% |
| YAML | 10/10 | 0 | 0 | 100% |
| XML | 7/10 | 3 | 0 | 70% |
| CSV | 0/10 | 1 | 9 | 0% |
| TOON | 10/10 | 0 | 0 | 100% |
| LOON `llm` | 10/10 | 0 | 0 | 100% |
| LOON `full` | 10/10 | 0 | 0 | 100% |
| LOON `local` | 10/10 | 0 | 0 | 100% |
| LOON `compact` | 10/10 | 0 | 0 | 100% |
| JTON | 0/10 | 10 | 0 | 0% |
All four LOON modes round-trip every dataset losslessly — matching JSON / TOON, ahead of XML (lossy on nested), CSV (no standard parser), and JTON (lossy on all).
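To make the fidelity criterion concrete, here is a minimal sketch of a type-coercion-aware comparator. It is not the benchmark's own comparator; `looselyEqual` and `isPlainObject` are hypothetical names, and comparing scalars by their printed form is one possible reading of "123 == "123"".

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function isPlainObject(value: Json): value is { [key: string]: Json } {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// True when `decoded` matches `original`. Scalars that print the same
// (123 vs "123") count as equal; any structural difference counts as loss.
function looselyEqual(original: Json, decoded: Json): boolean {
  if (Array.isArray(original) || Array.isArray(decoded)) {
    if (!Array.isArray(original) || !Array.isArray(decoded)) return false;
    if (original.length !== decoded.length) return false; // length change
    for (let i = 0; i < original.length; i++) {
      if (!looselyEqual(original[i] ?? null, decoded[i] ?? null)) return false;
    }
    return true;
  }
  if (isPlainObject(original) || isPlainObject(decoded)) {
    if (!isPlainObject(original) || !isPlainObject(decoded)) return false; // dropped nesting
    const keys = Object.keys(original);
    if (keys.length !== Object.keys(decoded).length) return false; // missing or extra key
    for (const key of keys) {
      if (!(key in decoded)) return false;
      if (!looselyEqual(original[key] ?? null, decoded[key] ?? null)) return false;
    }
    return true;
  }
  // Scalars: compare printed forms, so null and the string "null" also match.
  return String(original) === String(decoded);
}
```

Under this reading, a format that turns every number into a string still scores as correct, while a format that flattens or drops a nested object does not.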
[!NOTE] The retrieval runs use reduced deterministic question sets (10–12 Q) to bound API cost. At this size every lossless format reaches 100% accuracy, so these results establish token efficiency at parity accuracy — not an accuracy ranking. Larger question sets and weaker models are needed to separate formats on comprehension.
Quick Start
```bash
npm install loon-core
```

```typescript
import { Loon } from 'loon-core';

const loon = new Loon();
const data = [
  { id: 1, name: 'Alice', dept: 'Engineering', salary: 120000, active: true },
  { id: 2, name: 'Bob', dept: 'Sales', salary: 98000, active: false },
];

// LLM consumption (default for prompts)
const encoded = loon.toLOON(data, { mode: 'llm' });

// Bulk transmission / storage (not for raw LLM reading)
loon.toLOON(data, { mode: 'full' });

// Small / irregular data
loon.toLOON(data, { mode: 'compact' });

// Decode (auto-detects the mode)
loon.fromLOON(encoded);
```

If `mode` is omitted it is auto-selected: `compact` for empty or non-uniform input, `micro` for 1–4 rows, `full` for ≥ 5 uniform rows. Override with `{ mode: 'llm' }` when the target is a model.
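The auto-selection rule can be restated as a small function. This is a sketch of the rule as worded here, not the code in the library's mode selector; `pickAutoMode` and `hasUniformKeys` are hypothetical names, and "uniform" is assumed to mean that every row has exactly the same set of keys.

```typescript
type AutoMode = "compact" | "micro" | "full";
type LooseRow = Record<string, unknown>;

// Assumption: rows are uniform when they all expose the same key set.
function hasUniformKeys(rows: LooseRow[]): boolean {
  const first = rows[0];
  if (first === undefined) return false;
  const expected = JSON.stringify(Object.keys(first).sort());
  return rows.every((row) => JSON.stringify(Object.keys(row).sort()) === expected);
}

function pickAutoMode(rows: LooseRow[]): AutoMode {
  if (rows.length === 0 || !hasUniformKeys(rows)) return "compact";
  return rows.length <= 4 ? "micro" : "full";
}

const twoRows = pickAutoMode([{ id: 1 }, { id: 2 }]); // "micro"
const raggedRows = pickAutoMode([{ id: 1 }, { name: "Bob" }]); // "compact"
```

Note that the rule never picks `llm` on its own, which is why the override is needed whenever the payload goes into a prompt.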
CLI
No installation required; use it instantly with npx:

```bash
npx loon-core data.json
```

Or install globally and use the `loon` command:

```bash
npm install -g loon-core
```

Options
| Flag | Alias | Description |
|---|---|---|
| `--from <fmt>` | `-f` | Input format: `json`, `csv`, `xml`, `yaml`, `loon` (auto-detect) |
| `--to <fmt>` | `-t` | Output format: `loon`, `json`, `csv`, `xml`, `yaml` (default: `loon`) |
| `--mode <mode>` | `-m` | Encoding mode: `full`, `llm`, `compact` (default: auto) |
| `--output <file>` | `-o` | Write output to file instead of stdout |
| `--indent <n>` | `-i` | JSON output indentation (default: 2) |
| `--stats` | `-s` | Show token estimate and savings after encoding |
| `--verbose` | `-v` | Print full stack traces on errors |
| `--help` | `-h` | Show help |
Token statistics
Pass `--stats` to see a token estimate before and after encoding. It uses a chars/4 heuristic: fast, no API key required.

```
✔ data.json → output.loon
ℹ Token estimate: ~15,145 (json) → ~8,745 (loon)
✔ Saved ~6,400 tokens (-42%) [strong]
```

Output is colour-coded in TTY terminals and plain text when piped.
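The chars/4 heuristic is simple enough to reproduce outside the CLI. The sketch below is an approximation of the idea, not the CLI's source: `estimateTokens` and `compareEstimates` are hypothetical names, and rounding up is an assumption.

```typescript
// chars/4 heuristic: roughly one token per four characters.
// Rounding up is an assumption; the CLI may round differently.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

interface SavingsReport {
  before: number;
  after: number;
  saved: number;
  percent: number; // negative means the encoded text is smaller
}

function compareEstimates(source: string, encoded: string): SavingsReport {
  const before = estimateTokens(source);
  const after = estimateTokens(encoded);
  const saved = before - after;
  const percent = before === 0 ? 0 : Math.round((-saved / before) * 100);
  return { before, after, saved, percent };
}
```

The sample report is consistent with this arithmetic: 15,145 − 8,745 = 6,400, and 6,400 ÷ 15,145 rounds to 42%. Because the estimate depends only on character count, it will not reflect the per-tokenizer differences shown in the multi-tokenizer table.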
Examples
```bash
# JSON → LOON (auto mode)
loon data.json

# JSON → LOON llm mode + token stats
loon data.json -m llm --stats

# JSON → LOON full compression, write to file
loon data.json -m full -o output.loon

# CSV → LOON
loon data.csv -f csv -m llm

# LOON → JSON
loon data.loon -t json

# Pipe from stdin
echo '[{"id":1,"name":"Ada","role":"dev"}]' | loon

# Round-trip: JSON → LOON → JSON
cat data.json | loon | loon -f loon -t json
```

Wire Format
full mode — maximum compression
The format that lives in a .loon file or rides between two services.
```
S:@T1[N]=[col:type,...]      schema: row count + columns with type codes
A:fullName,...               column aliases (only if names were abbreviated)
DC:col,...                   integer columns stored decimal, not Base36
C:col=value                  constant column (omitted from every row)
Q:col=start,step             integer arithmetic sequence
QF:col=start,step            float arithmetic sequence
QS:col=start,step,prefix     string sequence (prefix + counter)
FP:d=col,...                 fixed-point: row token ÷ 10^d = value
X:col=suffix                 common suffix stripped, re-appended on decode
D:col={tok:val,...}          semantic dictionary (token → value)
D:__global__={tok:prefix}    shared prefix dictionary (backs `$tok` cells)
D:defaults=col=val,...       per-column default; `~` in a row means "use it"
DL:col=firstValue            delta encoding (row tokens are signed deltas)
NM:col=mean,std,sigmaT,mT    z-score normalization (LOSSY)
LY:NM                        marks payload contains lossy NORM columns
AS:col=k1,k2,...             uniform object-array sub-schema (see below)
@T1:                         start of the data block
F:csv                        rows are comma-delimited
<data rows>
```
`full` is not meant to be read by an LLM directly. If you must, prepend `getSpec(encoded)` to the prompt, and even then expect output-token costs to be higher than `llm` mode, because the model has to mentally decode every Base36 / sequence / dictionary cell.
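Two of the `full`-mode primitives are easy to illustrate in isolation: Base36 integers and delta encoding. The sketch below shows the underlying arithmetic only. It does not reproduce loon-core's exact cell syntax, and every function name in it is hypothetical.

```typescript
// Base36 integers: fewer characters than decimal for large values.
function toBase36(value: number): string {
  if (!Number.isSafeInteger(value)) throw new RangeError("expected a safe integer");
  return value.toString(36);
}

function fromBase36(token: string): number {
  const negative = token.charAt(0) === "-";
  const digits = negative ? token.slice(1) : token;
  if (!/^[0-9a-z]+$/.test(digits)) throw new SyntaxError(`not Base36: ${token}`);
  const magnitude = parseInt(digits, 36);
  return negative ? -magnitude : magnitude;
}

// Delta encoding: keep the first value, then store signed differences.
function deltaEncode(values: number[]): { first: number; deltas: number[] } {
  const first = values[0];
  if (first === undefined) throw new RangeError("empty column");
  // `i` indexes the sliced array, so values[i] is the previous element.
  const deltas = values.slice(1).map((value, i) => value - (values[i] ?? 0));
  return { first, deltas };
}

function deltaDecode(first: number, deltas: number[]): number[] {
  const out = [first];
  let current = first;
  for (const delta of deltas) {
    current += delta;
    out.push(current);
  }
  return out;
}

const salaryToken = toBase36(120000); // "2klc"
const timestamps = deltaEncode([1000, 1003, 1001]); // first 1000, deltas [3, -2]
```

Both transforms are exact for safe integers, which is what lets a decoder rebuild the original values. They are also exactly the kind of step a model would have to compute in its head, which is the reason `llm` mode leaves them out.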
llm mode — self-evident, LLM-readable
One header line. Plain decimal rows. JSON-style literals.
```
@T1[N]{id,name,email,dept,salary,active}
C:status=active              (optional, when a constant column exists)
AS:items=sku,name,qty        (optional, for object-array columns)
1,Alice,alice@x.com,Engineering,120000,true
2,Bob,bob@x.com,Sales,98000,false
3,Carol,carol@x.com,Marketing,null,true
```

Rules the model can apply at sight:

- Numbers bare: `120000`. Decoded as `Number(token)`.
- Booleans literal: `true` / `false`.
- Null literal: `null` (same single BPE token JSON uses).
- Strings bare when unambiguous: `Alice`.
- Strings that look like numbers / bools / null are quote-wrapped: `"123"`, `"true"`, `"null"`. The decoder strips the quotes and keeps the string as a string.
- Absent key in this row (sparse / non-uniform): `+`.
- Inline object (`col:o`): the cell is plain JSON (`{...}`), read as-is.
- `#ex row1 → …` (when present): row 1 already decoded, column by column. A `#`-prefixed line is not data; the decoder skips it. It is emitted only for non-trivial schemas (wide, or with extracted constants / `:o` / `:a` columns) to anchor the positional mapping on the first read, cutting the reasoning a model would otherwise spend aligning values to columns.
No :type codes in the header — the model infers types from cell shape. That
alone saves ~2 tokens per column (the :a array and :o inline-object tags are
the two exceptions — type inference cannot recover a pipe-joined array or
distinguish a JSON-object cell from a string). No DC: / @T1: / F:csv
scaffolding either; the @…{…} line is its own start marker.
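The reading rules for `llm` cells translate almost one-to-one into code. The following is a minimal sketch, not the library's decoder: `decodeLlmCell` and `decodeLlmRow` are hypothetical names, inline objects (`:o`) and arrays (`:a`) are left out, and the row splitter naively assumes that no cell contains a comma.

```typescript
type LlmCell = string | number | boolean | null | undefined;

// Applies the shape-based rules to one cell. `undefined` stands for "+",
// i.e. the key is absent from this row.
function decodeLlmCell(token: string): LlmCell {
  if (token === "+") return undefined;
  if (token === "null") return null;
  if (token === "true") return true;
  if (token === "false") return false;
  if (token.length >= 2 && token.charAt(0) === '"' && token.charAt(token.length - 1) === '"') {
    return token.slice(1, -1); // quoted: always a string
  }
  if (/^-?\d+(\.\d+)?$/.test(token)) return Number(token);
  return token; // bare string
}

function decodeLlmRow(columns: string[], line: string): Record<string, unknown> {
  const cells = line.split(","); // naive: assumes no commas inside cells
  const row: Record<string, unknown> = {};
  columns.forEach((column, i) => {
    const value = decodeLlmCell(cells[i] ?? "+");
    if (value !== undefined) row[column] = value; // absent keys are omitted
  });
  return row;
}

const carol = decodeLlmRow(["id", "name", "salary", "active"], "3,Carol,null,true");
```

Every branch is a comparison or a pattern check on the cell's shape, with no arithmetic or table lookup involved.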
Deep, heterogeneous arrays — sparse-subtree folding
Flattening every nested object to dot-notation explodes a heterogeneous array
(e.g. GitHub events: PushEvent, ForkEvent, …) into a giant column union
where most columns are absent (+) on most rows. That bloats the header and
makes a model transcribe the structure in its answer (large output cost). LOON
folds a nested subtree back to a single inline :o object column when it is
both deep (nested past the top level) and sparse (present in < 50% of
rows). Dense or shallow structure still expands to dot-notation columns, where
schema-once dedup pays off. On github.json (30 events) this cut the schema from
100+ columns to 32 with lossless round-trip.
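The folding decision can be sketched as a predicate over one top-level key. This is an interpretation rather than the library's analyzer: `shouldFoldSubtree` and `nestingDepth` are hypothetical names, and "deep" is read here as "the subtree contains at least one nested object", which is an assumption about what "nested past the top level" means.

```typescript
type Tree = { [key: string]: unknown };

function isTree(value: unknown): value is Tree {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Object nesting below a value: 0 for scalars, 1 for a flat object, and so on.
function nestingDepth(value: unknown): number {
  if (!isTree(value)) return 0;
  let deepest = 0;
  for (const key of Object.keys(value)) {
    deepest = Math.max(deepest, nestingDepth(value[key]));
  }
  return 1 + deepest;
}

// Fold when the subtree under `key` is sparse (present in < 50% of rows)
// and deep (assumed: it contains at least one nested object).
function shouldFoldSubtree(rows: Tree[], key: string): boolean {
  if (rows.length === 0) return false;
  const present = rows.filter((row) => isTree(row[key]));
  const sparse = present.length / rows.length < 0.5;
  const deep = present.some((row) => nestingDepth(row[key]) >= 2);
  return sparse && deep;
}

const sampleEvents: Tree[] = [
  { type: "PushEvent", payload: { commit: { sha: "abc" } } },
  { type: "ForkEvent" },
  { type: "WatchEvent" },
];
const foldPayload = shouldFoldSubtree(sampleEvents, "payload"); // true
```

In the sample, `payload` appears in one of three rows and nests two levels, so it would be kept as a single inline object column instead of being expanded into mostly empty dot-notation columns.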
compact mode — small or irregular data
```
id: 1
name: Alice
tags[3]: a,b,c                 scalar array, length 3
items[2]{sku,qty}: A,1;B,2     uniform object array
---
id: 2
...
```

Single deep objects (configs) use an indented hierarchy instead of repeated dot-notation keys.
Row tokens (shared)
| Token | Meaning |
|---|---|
| `^` | null (`full` mode); `llm` mode writes the literal `null` |
| `~` | use the column default (`full` only) |
| `+` | key absent from this row; omit it (sparse / non-uniform schema) |
| `!value` | raw literal string (bypass the dictionary; `full` only) |
| `.` | a row that is entirely defaults |
| `*N[...]` | run-length: repeat the bracketed row N times (`full` only) |
AS: — uniform object-array sub-schema
A column holding an array of same-shape objects (items: [{sku,name,qty}, …])
would otherwise be inline JSON with the keys repeated on every element. AS:
declares the shared shape once; each cell carries values only:
```
AS:items=sku,name,qty
cell:  A|Mouse|1;B|Cable|2   →   [{sku:A,name:Mouse,qty:1},{sku:B,name:Cable,qty:2}]
```

Fields within an object are `|`-separated; objects are `;`-separated.
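Expanding an `AS:` cell back into objects takes two splits. The sketch below illustrates the layout described above and is not the library's decoder: `decodeSubSchemaCell` and `parseSubValue` are hypothetical names, numeric-looking fields are converted to numbers as an assumption, and escaping of `|` or `;` inside values is not handled.

```typescript
type SubValue = string | number;

// Assumption: numeric-looking fields become numbers, the rest stay strings.
function parseSubValue(text: string): SubValue {
  return /^-?\d+(\.\d+)?$/.test(text) ? Number(text) : text;
}

// `keys` comes from the AS: header, e.g. "AS:items=sku,name,qty".
function decodeSubSchemaCell(keys: string[], cell: string): Array<Record<string, SubValue>> {
  if (cell === "") return [];
  return cell.split(";").map((objectText) => {
    const fields = objectText.split("|");
    const item: Record<string, SubValue> = {};
    keys.forEach((key, i) => {
      item[key] = parseSubValue(fields[i] ?? "");
    });
    return item;
  });
}

const decodedItems = decodeSubSchemaCell(["sku", "name", "qty"], "A|Mouse|1;B|Cable|2");
// [{ sku: "A", name: "Mouse", qty: 1 }, { sku: "B", name: "Cable", qty: 2 }]
```

The key names travel once in the header, so a cell with many elements pays for its field names a single time per column instead of once per element.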
Type codes (full mode)
i integer · f float · s string · b boolean · a array · o object
llm mode omits the codes — types are inferred from cell shape.
getSpec() — when full must talk to an LLM
full is built for parsers. If you need an LLM to read a full payload (for
example: you stored data in .loon, you now want a model to query it),
getSpec(encoded) returns a minimal decode spec (200–600 tokens) covering only
the headers that this specific payload actually uses, plus a worked walkthrough
of row 0.
```typescript
import { Loon, getSpec } from 'loon-core';

const loon = new Loon();
// `data` is an array of records, as in Quick Start
const encoded = loon.toLOON(data, { mode: 'full' });
const spec = getSpec(encoded); // { text, sections, estimatedTokens }
// prepend spec.text to the prompt; the model can now decode `full`.
```

`llm` mode does not need a spec; that is the whole point of it.
Sessions & Context Caching
For repeated calls against the same data shape, LoonSession separates the
cacheable prompt prefix from the per-call data block. LLM providers cache an
identical prefix and bill it at a fraction of the normal rate; put the spec and
the schema there and the per-call cost shrinks to just the rows.
```typescript
import { LoonSession } from 'loon-core';

// firstBatch, moreBatches and send() are supplied by the caller
const s = new LoonSession();
s.init(firstBatch, { mode: 'full' });

// System prompt (mark it for prompt caching):  s.primer   ← getSpec() + schema
// User message 1:                              s.dataBlock

for (const batch of moreBatches) {
  send(s.encodeRows(batch).dataBlock); // only the rows
}
```

`splitLoon(encoded)` exposes the raw `{ schema, dataBlock }` split for custom integrations.
Prompt caching reduces input cost. It does not touch output tokens. A model that has to reason hard about a format still pays full price on output. That is why `llm` mode beats `full` + cached spec for direct LLM consumption: `llm`'s output token cost is small because the format requires no reasoning to read.
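The cost argument can be made explicit with a little arithmetic. The sketch below is a generic billing model, not a real provider's price sheet: `callCost`, `CallShape` and `Rates` are hypothetical names, and every number in the example is a made-up unit chosen so the sums are easy to follow.

```typescript
interface CallShape {
  prefixTokens: number; // spec + schema, identical on every call
  rowTokens: number;    // per-call data block
  outputTokens: number; // what the model writes back
}

interface Rates {
  inputPerToken: number;
  outputPerToken: number;
  cachedInputFraction: number; // share of the input rate charged on a cache hit
}

function callCost(shape: CallShape, rates: Rates, prefixIsCached: boolean): number {
  const prefixRate = prefixIsCached
    ? rates.inputPerToken * rates.cachedInputFraction
    : rates.inputPerToken;
  return (
    shape.prefixTokens * prefixRate +
    shape.rowTokens * rates.inputPerToken +
    shape.outputTokens * rates.outputPerToken // unaffected by caching
  );
}

// Hypothetical units: 400-token prefix, 100 tokens of rows, 50 output tokens.
const exampleShape: CallShape = { prefixTokens: 400, rowTokens: 100, outputTokens: 50 };
const exampleRates: Rates = { inputPerToken: 1, outputPerToken: 4, cachedInputFraction: 0.25 };

const firstCallCost = callCost(exampleShape, exampleRates, false); // 400 + 100 + 200 = 700
const repeatCallCost = callCost(exampleShape, exampleRates, true); // 100 + 100 + 200 = 400
```

In this toy example the output share (200 units) is the same on both calls and becomes half of the cached call's cost, so a format that makes the model write more erodes much of what the cache saved.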
API
| Method | Behavior |
|---|---|
| `toLOON(data, opts?)` / `encode` | JSON array → LOON |
| `fromLOON(loon)` / `decode` | LOON → JSON array |
| `fromCSV` / `fromXML` / `fromYAML` | other formats → LOON |
| `toCSV` / `toXML` / `toYAML` | LOON → other formats |
| `fromTree(tree, opts?)` | tree (nested objects) → LOON (`TREE:` header) |
| `toTree(loon)` | LOON → tree |
| `chunk(data, opts)` | split into context-window-sized LOON chunks |
| `encodeStream` / `fromLOONStream` | async streaming codec |
| `getSpec(loon)` | minimal decode spec for a payload |
| `LoonSession` | multi-call session: `primer`, `dataBlock`, `encodeRows`, `decode` |
| `splitLoon(loon)` | `{ schema, dataBlock, full }` |
| `validateDecode(loon, rows)` | post-decode structural check |
| `repairHint(loon, errors)` | minimal retry prompt for an LLM that mis-decoded |
| `reset()` | clear per-instance schema state |
Options (LoonOptions)
| Option | Effect |
|---|---|
| `mode` | force `full` / `llm` / `compact` / `compat` |
| `fields` | column projection |
| `maxDecimals` | trim float precision before encoding |
| `tableId` | override the default schema id (`T1`) |
| `outFile` | write encoded output to a file (Node) |
| `checkpointEvery` | emit a `#CKP:` schema checkpoint every N rows (`full`) |
| `primaryCols` | promote columns to the front + force decimal |
| `norm` | z-score normalize float columns (lossy; `full` only) |
Architecture
```
Input (JSON / CSV / XML / YAML / tree)
        │
  Mode Selector ───────────────────┐
        │                          │  (auto-pick when mode omitted)
   ┌────┴─────┬──────────┐         │
   ▼          ▼          ▼         ▼
  full       llm      compact    compat
   │          │          │         │
   └ Adaptive ┘          │         │
     pipeline            │         │
        │                │         │
        └────────────────┴─────────┘
                     │
                LOON output
```

| Module | Path | Role |
|---|---|---|
| Public API | `src/index.ts` | `Loon` facade, mode routing |
| Adaptive encoder | `src/encoder/adaptive/` | analyzer → header → rows |
| Adaptive decoder | `src/decoder/adaptive/` | header-parser → row-reconstructor |
| Compact codec | `src/encoder/compact.ts`, `src/decoder/compact.ts` | `key: value` + indent |
| Adaptive engine | `src/compression/adaptive.ts` | cell compress/decompress, dictionaries, `AS:` tabular |
| Mode selector | `src/compression/mode-selector.ts` | dataset-shape heuristics |
| State manager | `src/state/state-manager.ts` | per-schema context + reverse-dict cache |
| Spec generator | `src/utils/get-spec.ts` | `getSpec()` minimal decode spec |
| Session | `src/session.ts` | `LoonSession`, `splitLoon` |
| Codecs | `src/codecs/` | CSV / XML / YAML / tree bridges |
full and llm share one analysis pipeline. The analyzer gates off every
compute-requiring primitive (Base36, sequences, dictionaries, defaults,
suffixes, fixed-point, delta, NORM, RLE, anchor rows) when the mode is llm.
What survives is structural-only: the schema, C: constants, and AS:
sub-schemas. The header emitter also drops :type codes and replaces ^ (null
sentinel) with the literal null for llm mode.
License
MIT © LOON Thesis Team