@openai-hce/encode

Hierarchical Columnar Encoding (HCE) encoder for compressing structured JSON payloads. It converts JSON arrays into a column-oriented string representation in which field names are written once per group rather than once per record. Payloads are typically 40–60% smaller than raw JSON, and the layout stays deterministic for fast decoding.

Installation

npm install @openai-hce/encode

pnpm add @openai-hce/encode
yarn add @openai-hce/encode

Quick start

import { HCEEncoder } from '@openai-hce/encode';

const data = [
    { type: 'user', id: 1, name: 'Alice', role: 'admin' },
    { type: 'user', id: 2, name: 'Bob', role: 'user' }
];

const encoder = new HCEEncoder();
const hce = encoder.encode(data, 'users');

console.log(hce);
// users(user)[2]:
//   user(id,name,role)[2]:
//     1,Alice,admin|2,Bob,user
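
To make the layout above concrete, here is a minimal, self-contained sketch that rebuilds the same string by hand and compares its length with `JSON.stringify`. It is a toy, not the library's implementation: `toyColumnar` is a hypothetical helper, it assumes every row has the same type and keys, and the alphabetical field order is inferred from the sample output rather than documented.

```typescript
// Toy illustration of the flat columnar layout; NOT the library's implementation.
type Row = Record<string, string | number>;

function toyColumnar(rows: Row[], rootKey: string, typeField = 'type'): string {
    if (rows.length === 0) {
        throw new Error('toyColumnar expects at least one row');
    }
    // Assumption: every row shares the type and key set of the first row.
    const typeName = String(rows[0][typeField]);
    // Assumption: fields are sorted alphabetically, as the sample output suggests.
    const fields = Object.keys(rows[0])
        .filter((key) => key !== typeField)
        .sort();
    // Field names are written once; each record contributes values only.
    const body = rows
        .map((row) => fields.map((field) => String(row[field])).join(','))
        .join('|');
    return [
        `${rootKey}(${typeName})[${rows.length}]:`,
        `  ${typeName}(${fields.join(',')})[${rows.length}]:`,
        `    ${body}`,
    ].join('\n');
}

const sampleUsers: Row[] = [
    { type: 'user', id: 1, name: 'Alice', role: 'admin' },
    { type: 'user', id: 2, name: 'Bob', role: 'user' },
];

const toyOutput = toyColumnar(sampleUsers, 'users');
const jsonLength = JSON.stringify(sampleUsers).length;
console.log(toyOutput);
console.log(`columnar: ${toyOutput.length} chars, JSON: ${jsonLength} chars`);
```

The saving comes from dropping the repeated keys, quotes, and braces, so it grows with the number of records that share a schema.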

Why HCE

Columnar compression for repeated schemas.
Automatic grouping of similar records.
Fully typed TypeScript API.
Deterministic output suitable for caching.
Zero runtime dependencies.

API overview

`new HCEEncoder(options?: HCEOptions)`

interface HCEOptions {
    fieldDelimiter?: string;      // default: ','
    recordDelimiter?: string;     // default: '|'
    nestedDelimiter?: string;     // default: ';'
    missingValue?: string;        // default: ' '
    flattenNested?: boolean;      // default: true
    typeField?: string;           // default: 'type'
    autoDetectGrouping?: boolean; // default: true
    preferTypeGrouping?: boolean; // default: true
    minGroupSizeForSecondaryGrouping?: number; // default: 5
    schemaUniformityThreshold?: number;        // default: 0.9
}
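
Because every option is optional, a partial options object is presumably overlaid on the defaults listed above. The sketch below shows one way that merge could look. `HCEOptionsSketch`, `DOCUMENTED_DEFAULTS`, and `resolveOptions` are hypothetical names used only for illustration; the default values are copied from the interface comments, and the library's own merging logic may differ.

```typescript
// Local mirror of the documented options; the real type is exported by the package.
interface HCEOptionsSketch {
    fieldDelimiter?: string;
    recordDelimiter?: string;
    nestedDelimiter?: string;
    missingValue?: string;
    flattenNested?: boolean;
    typeField?: string;
    autoDetectGrouping?: boolean;
    preferTypeGrouping?: boolean;
    minGroupSizeForSecondaryGrouping?: number;
    schemaUniformityThreshold?: number;
}

// Defaults as stated in the interface comments above.
const DOCUMENTED_DEFAULTS: Required<HCEOptionsSketch> = {
    fieldDelimiter: ',',
    recordDelimiter: '|',
    nestedDelimiter: ';',
    missingValue: ' ',
    flattenNested: true,
    typeField: 'type',
    autoDetectGrouping: true,
    preferTypeGrouping: true,
    minGroupSizeForSecondaryGrouping: 5,
    schemaUniformityThreshold: 0.9,
};

// Hypothetical helper: later spreads win, so overrides replace defaults.
function resolveOptions(overrides: HCEOptionsSketch = {}): Required<HCEOptionsSketch> {
    return { ...DOCUMENTED_DEFAULTS, ...overrides };
}

const resolved = resolveOptions({ fieldDelimiter: '\t', typeField: 'kind' });
console.log(resolved.typeField);       // overridden value
console.log(resolved.recordDelimiter); // untouched default
```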

`encoder.encode(data, rootKey?)`

`data`: an array of objects, or an object of the form `{ key: [...] }`.
`rootKey`: optional explicit group name.
Returns an HCE string.

Examples

Type-only grouping is implicit

The encoder never prints `by type` in the header, because grouping by type is the default.

const products = {
    products: [
        { type: 'product', name: 'Laptop', price: 999 },
        { type: 'product', name: 'Phone', price: 599 }
    ]
};

console.log(new HCEEncoder().encode(products));

Output:

products(product)[2]:
    product(name,price)[2]:
        Laptop,999|Phone,599

Note that the header contains no `by type` suffix: grouping by type is implicit.

Secondary grouping adds `by {field}`

When a suitable secondary field (category, role, status, …) exists, the encoder appends a `by {field}` suffix to the header and omits that field from the records inside each group.

const groupedProducts = {
    products: [
        { type: 'product', category: 'Electronics', name: 'Laptop', price: 999 },
        { type: 'product', category: 'Electronics', name: 'Phone', price: 599 },
        { type: 'product', category: 'Books', name: 'JS Guide', price: 39 },
        { type: 'product', category: 'Books', name: 'TS Handbook', price: 45 }
    ]
};

console.log(new HCEEncoder().encode(groupedProducts));

Output:

products(product by category)[4]:
    Electronics(name,price)[2]:
        Laptop,999|Phone,599
    Books(name,price)[2]:
        JS Guide,39|TS Handbook,45
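
The partition-and-suppress step can be sketched in a few lines. This is a toy under stated assumptions, not the library's code: `toyGroupBy` is a hypothetical helper, the grouping field is passed in by hand (the encoder's automatic field selection is not modelled), groups are kept in first-seen order, and fields are sorted alphabetically to match the sample output.

```typescript
// Toy illustration of secondary grouping; NOT the library's implementation.
type Item = Record<string, string | number>;

function toyGroupBy(items: Item[], rootKey: string, groupField: string, typeField = 'type'): string {
    if (items.length === 0) {
        throw new Error('toyGroupBy expects at least one item');
    }
    // Partition records by the secondary field, remembering first-seen order.
    const order: string[] = [];
    const buckets: Item[][] = [];
    for (const item of items) {
        const key = String(item[groupField]);
        let index = order.indexOf(key);
        if (index === -1) {
            order.push(key);
            buckets.push([]);
            index = order.length - 1;
        }
        buckets[index].push(item);
    }

    const typeName = String(items[0][typeField]);
    const lines = [`${rootKey}(${typeName} by ${groupField})[${items.length}]:`];
    order.forEach((key, i) => {
        const members = buckets[i];
        // The type field and the grouping field are dropped from each record,
        // because the group header already carries that information.
        const fields = Object.keys(members[0])
            .filter((f) => f !== typeField && f !== groupField)
            .sort();
        const body = members
            .map((m) => fields.map((f) => String(m[f])).join(','))
            .join('|');
        lines.push(`    ${key}(${fields.join(',')})[${members.length}]:`);
        lines.push(`        ${body}`);
    });
    return lines.join('\n');
}

const catalog: Item[] = [
    { type: 'product', category: 'Electronics', name: 'Laptop', price: 999 },
    { type: 'product', category: 'Electronics', name: 'Phone', price: 599 },
    { type: 'product', category: 'Books', name: 'JS Guide', price: 39 },
    { type: 'product', category: 'Books', name: 'TS Handbook', price: 45 },
];

const groupedOutput = toyGroupBy(catalog, 'products', 'category');
console.log(groupedOutput);
```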

Uniform schemas stay type-only

If every record shares the same shape, the encoder honours `preferTypeGrouping` and keeps a single type group, even if a secondary field exists.

const uniformUsers = {
    users: [
        { type: 'user', role: 'admin', name: 'Alice', age: 30 },
        { type: 'user', role: 'admin', name: 'Bob', age: 25 },
        { type: 'user', role: 'user', name: 'Charlie', age: 35 }
    ]
};

console.log(new HCEEncoder().encode(uniformUsers));

Output:

users(user)[3]:
    user(age,name,role)[3]:
        30,Alice,admin|25,Bob,admin|35,Charlie,user

Multi-type collections list every type

const items = {
    items: [
        { type: 'book', title: 'HCE Guide', pages: 200 },
        { type: 'product', name: 'Laptop', price: 999 },
        { type: 'service', name: 'Consulting', rate: 150 }
    ]
};

console.log(new HCEEncoder().encode(items));

Output:

items(book,product,service)[3]:
    book(pages,title)[1]:
        200,HCE Guide
    product(name,price)[1]:
        Laptop,999
    service(name,rate)[1]:
        Consulting,150
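
Each type carries its own field list, so a multi-type collection effectively has one schema per type. The sketch below derives the header and the per-type field lists for the sample above. `schemasByType` is a hypothetical helper; listing types in first-seen order and sorting fields alphabetically are assumptions that happen to match the sample output, not documented rules.

```typescript
// Toy illustration of per-type schemas; NOT the library's implementation.
type Entry = Record<string, string | number>;

interface TypeSchemas {
    types: string[];                   // type names in first-seen order
    schemas: Record<string, string[]>; // type name -> sorted field names
}

function schemasByType(entries: Entry[], typeField = 'type'): TypeSchemas {
    const types: string[] = [];
    const schemas: Record<string, string[]> = {};
    for (const entry of entries) {
        const typeName = String(entry[typeField]);
        if (types.indexOf(typeName) === -1) {
            types.push(typeName);
            schemas[typeName] = [];
        }
        // Collect every non-type field seen for this type.
        for (const field of Object.keys(entry)) {
            if (field !== typeField && schemas[typeName].indexOf(field) === -1) {
                schemas[typeName].push(field);
            }
        }
    }
    for (const typeName of types) {
        schemas[typeName].sort();
    }
    return { types, schemas };
}

const mixedItems: Entry[] = [
    { type: 'book', title: 'HCE Guide', pages: 200 },
    { type: 'product', name: 'Laptop', price: 999 },
    { type: 'service', name: 'Consulting', rate: 150 },
];

const summary = schemasByType(mixedItems);
const header = `items(${summary.types.join(',')})[${mixedItems.length}]:`;
console.log(header);
for (const typeName of summary.types) {
    console.log(`    ${typeName}(${summary.schemas[typeName].join(',')})`);
}
```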

Edge case: single-valued secondary field

const adminsOnly = {
    users: [
        { type: 'user', role: 'admin', name: 'Alice' },
        { type: 'user', role: 'admin', name: 'Bob' }
    ]
};

console.log(new HCEEncoder().encode(adminsOnly));

Output:

users(user)[2]:
    user(name,role)[2]:
        Alice,admin|Bob,admin

The encoder keeps type-only grouping because `role` has only one value, so splitting on it would produce a single group and save nothing.

Edge case: missing type field

Objects without a `type` field still encode safely: the encoder collapses the output to a single header and preserves your original schema.

const unnamed = {
    products: [
        { name: 'Laptop', price: 999 },
        { name: 'Phone', price: 599 }
    ]
};

console.log(new HCEEncoder().encode(unnamed));

Output:

products(name,price)[2]:
    Laptop,999|Phone,599

Nested objects and arrays

const posts = [
    {
        type: 'post',
        id: 1,
        title: 'Hello World',
        author: { name: 'Alice', team: 'Platform' },
        tags: ['intro', 'hce']
    },
    {
        type: 'post',
        id: 2,
        title: 'Encoder Tips',
        author: { name: 'Bob', team: 'SDK' },
        tags: ['guide']
    }
];

console.log(new HCEEncoder().encode(posts, 'posts'));

Output:

posts(post)[2]:
    post(id,title,.author,.tags)[2]:
        1,'Hello World'|2,'Encoder Tips'
        .author(name,team)[2]:
            Alice,Platform|Bob,SDK
        .tags: intro;hce|guide
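
The `.tags` line shows two delimiter levels at work: items within one record's array are joined with the nested delimiter, and records are joined with the record delimiter. A minimal sketch of that step, assuming the documented defaults `;` and `|` (`toyArrayColumn` is a hypothetical helper, not part of the package):

```typescript
// Toy illustration of an array-valued column; NOT the library's implementation.
type TaggedRow = Record<string, string[]>;

function toyArrayColumn(
    rows: TaggedRow[],
    field: string,
    nestedDelimiter = ';',
    recordDelimiter = '|',
): string {
    // Inner join: items of one record. Outer join: one cell per record.
    const cells = rows.map((row) => row[field].join(nestedDelimiter));
    return `.${field}: ${cells.join(recordDelimiter)}`;
}

const taggedPosts: TaggedRow[] = [
    { tags: ['intro', 'hce'] },
    { tags: ['guide'] },
];

const tagsLine = toyArrayColumn(taggedPosts, 'tags');
console.log(tagsLine);
```

Any value that itself contains one of the delimiters would be ambiguous in this toy; how the real encoder handles that case is not covered here.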

Custom delimiters

const encoder = new HCEEncoder({
    fieldDelimiter: '\t',
    recordDelimiter: '\n',
    nestedDelimiter: ',',
});

console.log(encoder.encode([{ type: 'row', id: 1, value: 'A' }], 'rows'));

Output:

rows(row)[1]:
    row(id\tvalue)[1]:
        1\tA

Tips

Pass an object such as `{ users: [...] }` when the root key should match an existing JSON property; the key is then used automatically.
Set `autoDetectGrouping` to `false` when you need type-only output regardless of the data.
Decode with `@openai-hce/decode` for round-trip conversions.

Grouping configuration cheat sheet

| Option | Default | Effect |
| --- | --- | --- |
| `autoDetectGrouping` | `true` | Finds the best grouping field automatically. Set `false` to force type-only grouping. |
| `preferTypeGrouping` | `true` | Keeps uniform data in a single type group. Set `false` to allow aggressive secondary grouping. |
| `schemaUniformityThreshold` | `0.9` | Minimum proportion of records sharing the same schema before the encoder prefers type-only grouping. |
| `minGroupSizeForSecondaryGrouping` | `5` | Average group size needed to justify a secondary field split. Lower it to accept smaller groups. |
| `typeField` | `'type'` | Name of the discriminator field. Change when the source data uses `kind`, `category`, etc. |
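
The two numeric thresholds are easier to tune once you can measure the quantities they refer to. The sketch below computes one plausible reading of each: schema uniformity as the share of records with the most common key set, and average group size as record count divided by the number of distinct values of a field. Both helpers are hypothetical, and the encoder's actual heuristic may weigh these differently.

```typescript
// Toy metrics for tuning the grouping options; NOT the library's heuristic.
type Rec = Record<string, unknown>;

// Share of records whose key set equals the most common key set (0..1).
function schemaUniformity(records: Rec[]): number {
    if (records.length === 0) {
        return 1; // assumption: an empty collection counts as uniform
    }
    const counts: Record<string, number> = {};
    let best = 0;
    for (const record of records) {
        // JSON of the sorted keys serves as an order-independent signature.
        const signature = JSON.stringify(Object.keys(record).sort());
        counts[signature] = (counts[signature] ?? 0) + 1;
        best = Math.max(best, counts[signature]);
    }
    return best / records.length;
}

// Records per distinct value of `field`.
function averageGroupSize(records: Rec[], field: string): number {
    const distinct: string[] = [];
    for (const record of records) {
        const value = String(record[field]);
        if (distinct.indexOf(value) === -1) {
            distinct.push(value);
        }
    }
    return distinct.length === 0 ? 0 : records.length / distinct.length;
}

const people: Rec[] = [
    { type: 'user', role: 'admin', name: 'Alice', age: 30 },
    { type: 'user', role: 'admin', name: 'Bob', age: 25 },
    { type: 'user', role: 'user', name: 'Charlie', age: 35 },
];

const uniformity = schemaUniformity(people);
const avgRoleGroup = averageGroupSize(people, 'role');
console.log(`uniformity: ${uniformity}, average group size by role: ${avgRoleGroup}`);
```

Comparing these numbers with `schemaUniformityThreshold` and `minGroupSizeForSecondaryGrouping` gives a rough idea of which way a change to either option would push the encoder.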
