@openai-hce/encode

1.0.3

HCE (Hierarchical Columnar Encoding) encoder for efficient data compression

Hierarchical Columnar Encoding (HCE) encoder for compressing structured JSON payloads. It converts JSON arrays of objects into a column-oriented string representation that is typically 40–60% smaller than the raw JSON, while keeping the layout deterministic for fast decoding.

Installation

npm install @openai-hce/encode
pnpm add @openai-hce/encode
yarn add @openai-hce/encode

Quick start

import { HCEEncoder } from '@openai-hce/encode';

const data = [
    { type: 'user', id: 1, name: 'Alice', role: 'admin' },
    { type: 'user', id: 2, name: 'Bob', role: 'user' }
];

const encoder = new HCEEncoder();
const hce = encoder.encode(data, 'users');

console.log(hce);
// users(user)[2]:
//   user(id,name,role)[2]:
//     1,Alice,admin|2,Bob,user
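To get a feel for the size difference on this input, the documented output can be compared with the equivalent JSON. This is a rough sketch: the HCE text is pasted from the comment above rather than produced by the library, and character counts stand in for bytes because the payload is plain ASCII.

```typescript
// Quick-start input, copied from the example above.
const records = [
    { type: 'user', id: 1, name: 'Alice', role: 'admin' },
    { type: 'user', id: 2, name: 'Bob', role: 'user' },
];

// HCE text copied from the documented output (not generated here).
const hceText = [
    'users(user)[2]:',
    '  user(id,name,role)[2]:',
    '    1,Alice,admin|2,Bob,user',
].join('\n');

// Wrap the records under the same root key so both sides carry the same information.
const jsonLength = JSON.stringify({ users: records }).length;
const hceLength = hceText.length;
const savedRatio = 1 - hceLength / jsonLength;

console.log(`JSON: ${jsonLength} chars, HCE: ${hceLength} chars`);
console.log(`Saved: ${(savedRatio * 100).toFixed(1)}%`);
```

Savings on real data depend on how many records share a schema, since each header is paid once per group.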

Why HCE

  • Columnar compression for repeated schemas.
  • Automatic grouping of similar records.
  • Fully typed TypeScript API.
  • Deterministic output suitable for caching.
  • Zero runtime dependencies.

API overview

new HCEEncoder(options?: HCEOptions)

interface HCEOptions {
    fieldDelimiter?: string;      // default: ','
    recordDelimiter?: string;     // default: '|'
    nestedDelimiter?: string;     // default: ';'
    missingValue?: string;        // default: ' '
    flattenNested?: boolean;      // default: true
    typeField?: string;           // default: 'type'
    autoDetectGrouping?: boolean; // default: true
    preferTypeGrouping?: boolean; // default: true
    minGroupSizeForSecondaryGrouping?: number; // default: 5
    schemaUniformityThreshold?: number;        // default: 0.9
}
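One way to picture how partial options combine with the defaults is a plain object merge. The sketch below is an assumption about the behaviour, not the library's code: DEFAULT_OPTIONS and resolveOptions are hypothetical names, and the default values are copied from the comments in the interface above.

```typescript
interface HCEOptions {
    fieldDelimiter?: string;
    recordDelimiter?: string;
    nestedDelimiter?: string;
    missingValue?: string;
    flattenNested?: boolean;
    typeField?: string;
    autoDetectGrouping?: boolean;
    preferTypeGrouping?: boolean;
    minGroupSizeForSecondaryGrouping?: number;
    schemaUniformityThreshold?: number;
}

// Hypothetical constant: values taken from the documented defaults.
const DEFAULT_OPTIONS: Required<HCEOptions> = {
    fieldDelimiter: ',',
    recordDelimiter: '|',
    nestedDelimiter: ';',
    missingValue: ' ',
    flattenNested: true,
    typeField: 'type',
    autoDetectGrouping: true,
    preferTypeGrouping: true,
    minGroupSizeForSecondaryGrouping: 5,
    schemaUniformityThreshold: 0.9,
};

// Hypothetical helper: caller-supplied keys win, everything else keeps its default.
function resolveOptions(options: HCEOptions = {}): Required<HCEOptions> {
    return { ...DEFAULT_OPTIONS, ...options };
}

console.log(resolveOptions({ fieldDelimiter: '\t' }));
```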

encoder.encode(data, rootKey?)

  • data: array of objects, or { key: [...] }.
  • rootKey: optional explicit group name.
  • Returns an HCE string.
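The two accepted input shapes can be reduced to one internal form before encoding. The following sketch shows one plausible way to do that. normalizeInput is a hypothetical helper, and both the 'root' fallback name and the single-key requirement are assumptions made for illustration; the README does not say what the library does in those cases.

```typescript
type HCERecord = Record<string, unknown>;
type EncodeInput = HCERecord[] | Record<string, HCERecord[]>;

// Hypothetical helper: returns the root key and the record list for either input shape.
function normalizeInput(
    data: EncodeInput,
    rootKey?: string
): { rootKey: string; records: HCERecord[] } {
    if (Array.isArray(data)) {
        // Assumption: a bare array with no explicit rootKey falls back to 'root'.
        return { rootKey: rootKey ?? 'root', records: data };
    }
    const keys = Object.keys(data);
    const [key] = keys;
    // Assumption: a wrapper object must carry exactly one array property.
    if (key === undefined || keys.length !== 1) {
        throw new Error('expected an object with exactly one top-level key');
    }
    const records = data[key];
    if (!Array.isArray(records)) {
        throw new Error(`expected "${key}" to hold an array of records`);
    }
    // An explicit rootKey takes precedence over the property name.
    return { rootKey: rootKey ?? key, records };
}

console.log(normalizeInput({ users: [{ id: 1 }] }).rootKey);
console.log(normalizeInput([{ id: 1 }], 'rows').rootKey);
```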

Examples

Type-only grouping is implicit

The encoder never prints "by type" in the header; grouping by type is the default.

const products = {
    products: [
        { type: 'product', name: 'Laptop', price: 999 },
        { type: 'product', name: 'Phone', price: 599 }
    ]
};

console.log(new HCEEncoder().encode(products));

Output:

products(product)[2]:
    product(name,price)[2]:
        Laptop,999|Phone,599

Notice the absence of "by type" in the header: type grouping is implicit.

Secondary grouping adds by {field}

When a good secondary field (category, role, status, …) exists, the encoder appends "by {field}" to the root header, emits one group per value of that field, and omits the field from the columns inside each group.

const groupedProducts = {
    products: [
        { type: 'product', category: 'Electronics', name: 'Laptop', price: 999 },
        { type: 'product', category: 'Electronics', name: 'Phone', price: 599 },
        { type: 'product', category: 'Books', name: 'JS Guide', price: 39 },
        { type: 'product', category: 'Books', name: 'TS Handbook', price: 45 }
    ]
};

console.log(new HCEEncoder().encode(groupedProducts));

Output:

products(product by category)[4]:
    Electronics(name,price)[2]:
        Laptop,999|Phone,599
    Books(name,price)[2]:
        JS Guide,39|TS Handbook,45
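The split into Electronics and Books groups can be pictured as a group-by that removes the grouping field from each record. This is a minimal sketch of that idea only. groupByField is a hypothetical helper, it does not reproduce the library's grouping heuristics, and it strips just the grouping field (the type field is left alone here).

```typescript
type Row = Record<string, string | number>;

// Hypothetical helper: bucket records by one field and drop that field from each record.
function groupByField(records: Row[], field: string): Map<string, Row[]> {
    const groups = new Map<string, Row[]>();
    for (const record of records) {
        // Pull the grouping value out; `rest` holds the remaining columns.
        const { [field]: groupValue, ...rest } = record;
        const key = String(groupValue);
        const bucket = groups.get(key) ?? [];
        bucket.push(rest);
        groups.set(key, bucket);
    }
    return groups;
}

const catalog: Row[] = [
    { type: 'product', category: 'Electronics', name: 'Laptop', price: 999 },
    { type: 'product', category: 'Electronics', name: 'Phone', price: 599 },
    { type: 'product', category: 'Books', name: 'JS Guide', price: 39 },
    { type: 'product', category: 'Books', name: 'TS Handbook', price: 45 },
];

const grouped = groupByField(catalog, 'category');
console.log(Array.from(grouped.keys()));
```

A Map keeps insertion order, which matches the order of the groups in the output above.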

Uniform schemas stay type-only

If every record shares the same shape, the encoder honours preferTypeGrouping and keeps a single type group—even if a secondary field exists.

const uniformUsers = {
    users: [
        { type: 'user', role: 'admin', name: 'Alice', age: 30 },
        { type: 'user', role: 'admin', name: 'Bob', age: 25 },
        { type: 'user', role: 'user', name: 'Charlie', age: 35 }
    ]
};

console.log(new HCEEncoder().encode(uniformUsers));

Output:

users(user)[3]:
    user(age,name,role)[3]:
        30,Alice,admin|25,Bob,admin|35,Charlie,user

Multi-type collections list every type

const items = {
    items: [
        { type: 'book', title: 'HCE Guide', pages: 200 },
        { type: 'product', name: 'Laptop', price: 999 },
        { type: 'service', name: 'Consulting', rate: 150 }
    ]
};

console.log(new HCEEncoder().encode(items));

Output:

items(book,product,service)[3]:
    book(pages,title)[1]:
        200,HCE Guide
    product(name,price)[1]:
        Laptop,999
    service(name,rate)[1]:
        Consulting,150
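The root header in this output lists each distinct type once, followed by the total record count. A small sketch of how such a header could be assembled is shown below. rootHeader is a hypothetical helper, and keeping the types in first-seen order is an assumption (the example input happens to be alphabetical already, so the README does not settle the ordering).

```typescript
// Hypothetical helper: builds a header such as "items(book,product,service)[3]:".
function rootHeader(rootKey: string, records: Array<{ type: string }>): string {
    const types: string[] = [];
    for (const record of records) {
        // Keep each type once, in the order it first appears (assumed ordering).
        if (types.indexOf(record.type) === -1) {
            types.push(record.type);
        }
    }
    return `${rootKey}(${types.join(',')})[${records.length}]:`;
}

const items = [
    { type: 'book', title: 'HCE Guide', pages: 200 },
    { type: 'product', name: 'Laptop', price: 999 },
    { type: 'service', name: 'Consulting', rate: 150 },
];

console.log(rootHeader('items', items));
```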

Edge case: single-valued secondary field

const adminsOnly = {
    users: [
        { type: 'user', role: 'admin', name: 'Alice' },
        { type: 'user', role: 'admin', name: 'Bob' }
    ]
};

console.log(new HCEEncoder().encode(adminsOnly));

Output:

users(user)[2]:
    user(name,role)[2]:
        Alice,admin|Bob,admin

The encoder keeps type-only grouping because role has only one value.
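The check behind this edge case can be approximated by counting the distinct values of a candidate field. The sketch below illustrates that single condition only. distinctValueCount is a hypothetical helper, and the real encoder presumably weighs other factors too, such as the group-size and uniformity options.

```typescript
// Hypothetical helper: how many different values does `field` take across the records?
function distinctValueCount(records: Array<Record<string, unknown>>, field: string): number {
    const seen = new Set<unknown>();
    for (const record of records) {
        if (field in record) {
            seen.add(record[field]);
        }
    }
    return seen.size;
}

const adminsOnly = [
    { type: 'user', role: 'admin', name: 'Alice' },
    { type: 'user', role: 'admin', name: 'Bob' },
];

// A field with a single value cannot split the data into more than one group.
console.log(distinctValueCount(adminsOnly, 'role') > 1);
```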

Edge case: missing type field

Objects without a type field still encode safely—the encoder collapses the output to a single header and preserves your original schema.

const unnamed = {
    products: [
        { name: 'Laptop', price: 999 },
        { name: 'Phone', price: 599 }
    ]
};

console.log(new HCEEncoder().encode(unnamed));

Output:

products(name,price)[2]:
    Laptop,999|Phone,599
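The single-header layout above is the simplest form of the columnar idea: column names are written once, then each record contributes only its values. A minimal sketch that reproduces this layout for flat records follows. encodeFlat is a hypothetical helper, not the library's implementation. Sorting the columns alphabetically is inferred from the outputs shown in this README, and filling absent fields with a single space mirrors the documented missingValue default.

```typescript
type FlatRecord = Record<string, string | number>;

// Hypothetical helper: one header line, then all rows joined on a single line.
function encodeFlat(rootKey: string, records: FlatRecord[], missingValue = ' '): string {
    // Collect the union of keys so records with missing fields still line up.
    const columnSet = new Set<string>();
    for (const record of records) {
        for (const key of Object.keys(record)) {
            columnSet.add(key);
        }
    }
    // Assumption: alphabetical column order, as the README outputs suggest.
    const columns = Array.from(columnSet).sort();
    const rows = records.map((record) =>
        columns
            .map((column) => (column in record ? String(record[column]) : missingValue))
            .join(',')
    );
    return `${rootKey}(${columns.join(',')})[${records.length}]:\n    ${rows.join('|')}`;
}

console.log(
    encodeFlat('products', [
        { name: 'Laptop', price: 999 },
        { name: 'Phone', price: 599 },
    ])
);
```

The sketch does no quoting or escaping, so values containing a delimiter would make the output ambiguous.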

Nested objects and arrays

const posts = [
    {
        type: 'post',
        id: 1,
        title: 'Hello World',
        author: { name: 'Alice', team: 'Platform' },
        tags: ['intro', 'hce']
    },
    {
        type: 'post',
        id: 2,
        title: 'Encoder Tips',
        author: { name: 'Bob', team: 'SDK' },
        tags: ['guide']
    }
];

console.log(new HCEEncoder().encode(posts, 'posts'));

Output:

posts(post)[2]:
    post(id,title,.author,.tags)[2]:
        1,'Hello World'|2,'Encoder Tips'
        .author(name,team)[2]:
            Alice,Platform|Bob,SDK
        .tags: intro;hce|guide
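In the .tags line, items inside one record are joined with the nested delimiter and records are joined with the record delimiter. The sketch below reproduces just that line under those two defaults. encodeArrayColumn is a hypothetical helper written for illustration and is not part of the documented API.

```typescript
// Hypothetical helper: encodes one array-valued field across all records.
function encodeArrayColumn(
    fieldName: string,
    values: string[][],
    nestedDelimiter = ';',
    recordDelimiter = '|'
): string {
    // Inner join: the items of a single record.
    const cells = values.map((entries) => entries.join(nestedDelimiter));
    // Outer join: one cell per record.
    return `.${fieldName}: ${cells.join(recordDelimiter)}`;
}

console.log(encodeArrayColumn('tags', [['intro', 'hce'], ['guide']]));
```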

Custom delimiters

const encoder = new HCEEncoder({
    fieldDelimiter: '\t',
    recordDelimiter: '\n',
    nestedDelimiter: ',',
});

console.log(encoder.encode([{ type: 'row', id: 1, value: 'A' }], 'rows'));

Output:

rows(row)[1]:
    row(id\tvalue)[1]:
        1\tA

Tips

  • Pass { users: [...] } to have the root key taken from the existing property name, with no need for the rootKey argument.
  • Set autoDetectGrouping to false when you always want a single type-only group.
  • Decode using @openai-hce/decode for round-trip conversions.

Grouping configuration cheat sheet

Option | Default | Effect
autoDetectGrouping | true | Finds the best grouping field automatically. Set false to force type-only grouping.
preferTypeGrouping | true | Keeps uniform data in a single type group. Set false to allow aggressive secondary grouping.
schemaUniformityThreshold | 0.9 | Minimum proportion of records sharing the same schema before the encoder prefers type-only grouping.
minGroupSizeForSecondaryGrouping | 5 | Average group size needed to justify a secondary field split. Lower it to accept smaller groups.
typeField | 'type' | Name of the discriminator field. Change when the source data uses kind, category, etc.
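The proportion that schemaUniformityThreshold is compared against can be estimated by hand when tuning these options. The sketch below computes the share of records that have the most common set of keys. schemaUniformity is a hypothetical helper based on the description in the cheat sheet; treating an empty list as fully uniform is an assumption, and the library's own measure may differ.

```typescript
// Hypothetical helper: fraction of records sharing the most common key set.
function schemaUniformity(records: Array<Record<string, unknown>>): number {
    if (records.length === 0) {
        return 1; // Assumption: nothing to disagree on.
    }
    const counts = new Map<string, number>();
    let largest = 0;
    for (const record of records) {
        // Sorted key list acts as a schema signature, independent of key order.
        const signature = Object.keys(record).sort().join(',');
        const count = (counts.get(signature) ?? 0) + 1;
        counts.set(signature, count);
        if (count > largest) {
            largest = count;
        }
    }
    return largest / records.length;
}

const uniformUsers = [
    { type: 'user', role: 'admin', name: 'Alice', age: 30 },
    { type: 'user', role: 'admin', name: 'Bob', age: 25 },
    { type: 'user', role: 'user', name: 'Charlie', age: 35 },
];

console.log(schemaUniformity(uniformUsers) >= 0.9);
```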

License

MIT © OpenAI HCE Team