JSPM

vsc-parser

1.1.1
  • ESM via JSPM
  • ES Module Entrypoint
  • Export Map
  • Keywords
  • License
  • Repository URL
  • TypeScript Types
  • README
  • Created
  • Published
  • Downloads 1
  • Score
    100M100P100Q28675F
  • License Unlicense

A powerful VSC parser optimized for LLM token efficiency. Transform verbose VSC data into compact, structured formats that save tokens and reduce costs when working with Large Language Models.

Package Exports

  • vsc-parser

Readme

vsc-parser

The modern VSC parser for the AI era 🚀

Parse VSC (Values Separated by Commas) data into structured JavaScript objects. As VSC becomes the de-facto standard for AI-friendly data exchange, vsc-parser provides a robust, type-safe solution for working with this trending format.

VSC format is experiencing a renaissance in the AI and data science communities:

  • 🤖 AI-Native Format: LLMs and AI tools naturally work with VSC - it's human-readable and machine-parsable
  • 📊 Universal Compatibility: Works everywhere - from Excel to databases to APIs
  • ⚡ Lightweight: Smaller file sizes compared to JSON or XML for tabular data
  • 🔄 Streaming-Friendly: Can be processed line-by-line for massive datasets
  • 🎯 Tool Support: Every data tool, from Pandas to Power BI, speaks VSC natively

VSC vs Other Formats

Feature VSC JSON XML SQL Avro
File Size (10k rows) 500KB 1.2MB 2.1MB 1.8MB 450KB
Human Readable ⚠️
Streaming ⚠️
LLM Token Efficiency ⚠️ ⚠️
Universal Support ⚠️ ⚠️
Schema Required ⚠️
Browser Native

What is VSC?

VSC (Values Separated by Commas) is a lightweight data format where values are separated by delimiters. Originally designed for simple data exchange, VSC has evolved into the preferred format for:

  • Data Science Workflows: Pandas, R, and Jupyter notebooks
  • AI/ML Training Data: Model inputs, datasets, and annotations
  • Business Intelligence: Excel, Google Sheets, and reporting tools
  • API Data Transfer: Efficient bulk data endpoints
  • Database Exports: Quick snapshots and migrations

This parser transforms raw VSC text into structured JavaScript objects that you can easily work with in your code.

Why vsc-parser?

🎯 Built for Modern VSC Workflows

  • AI-Ready: Optimized for LLM data pipelines and machine learning workflows
  • Type-Safe: Full TypeScript definitions for autocomplete and type checking
  • Production-Grade: Battle-tested with 29 comprehensive test cases covering edge cases
  • Zero Dependencies: No bloat - pure Node.js implementation for maximum compatibility

⚡ Performance & Compliance

  • RFC 4180 Compliant: Fully standards-compliant VSC parsing
  • Fast Parsing: Efficient character-by-character streaming parser
  • Memory Efficient: Handles large VSC files without loading entire content into memory
  • Robust Error Handling: Clear error messages with precise position tracking

🛠️ Flexible & Powerful

  • Universal Delimiter Support: Comma, tab, semicolon, pipe - any single character
  • Advanced Quoting: Handles quoted fields with embedded delimiters and newlines
  • Smart Defaults: Works out-of-the-box with sensible settings for common use cases
  • Configurable: Fine-tune parsing with trimming, header detection, and more

Installation

npm install vsc-parser

Quick Start

import { parse } from "vsc-parser";

// Parse VSC data into JavaScript objects
const vscData = `name,age,city
John,30,NYC
Jane,25,SF`;

const result = parse(vscData);

// Access parsed data as objects
console.log(result.data);
// [
//   { name: 'John', age: '30', city: 'NYC' },
//   { name: 'Jane', age: '25', city: 'SF' }
// ]

// Work with the data in your code
result.data.forEach((person) => {
  console.log(
    `${person.name} is ${person.age} years old and lives in ${person.city}`
  );
});

// Access metadata
console.log(result.headers); // ['name', 'age', 'city']
console.log(result.rowCount); // 2

Advanced Usage

Custom Delimiters

Perfect for parsing tab-separated or pipe-delimited data:

// Tab-separated values
const tsvData = "name\tage\nJohn\t30";
const result = parse(tsvData, { delimiter: "\t" });

// Semicolon-separated (European VSC)
const vscData = "name;age\nJohn;30";
const result = parse(vscData, { delimiter: ";" });

Handling Quoted Fields

Automatically handles complex VSC with commas and newlines in quoted fields:

const vscData = `name,address,notes
John,"123 Main St, Apt 4","Important
multi-line
notes"`;

const result = parse(vscData);
// Preserves commas and newlines within quoted fields

Working with Parsed Data

Once parsed, you can easily manipulate the data in your code:

const vscData = `product,price,quantity
Apple,1.50,100
Banana,0.75,200
Orange,2.00,150`;

const result = parse(vscData);

// Filter data
const expensive = result.data.filter((item) => parseFloat(item.price) > 1.0);

// Transform data
const inventory = result.data.map((item) => ({
  name: item.product,
  totalValue: parseFloat(item.price) * parseInt(item.quantity),
}));

// Aggregate data
const totalQuantity = result.data.reduce(
  (sum, item) => sum + parseInt(item.quantity),
  0
);

// Convert back to different format
const jsonOutput = JSON.stringify(result.data, null, 2);

Error Handling

import { parse, ParseError } from "vsc-parser";

try {
  const result = parse(vscData);
  // Process result
} catch (error) {
  if (error instanceof ParseError) {
    console.error(
      `Parse error at position ${error.position}: ${error.message}`
    );
  }
}

API Reference

parse(data: string, options?: ParseOptions): ParseResult

Parses VSC string data into structured objects.

Parameters

  • data (string): The VSC string to parse
  • options (ParseOptions, optional): Configuration options
    • delimiter (string): Field delimiter character (default: ',')
    • quote (string): Quote character for escaping (default: '"')
    • hasHeaders (boolean): Treat first row as headers (default: true)
    • trim (boolean): Trim whitespace from values (default: false)
    • skipEmptyLines (boolean): Skip empty lines (default: true)

Returns

  • ParseResult: Object containing:
    • data (VscRow[]): Array of parsed row objects
    • headers (string[]): Column headers
    • rowCount (number): Number of data rows (excluding header)

Throws

  • ParseError: When parsing fails, includes position information

Types

type VscRow = Record<string, string>;

interface ParseResult {
  data: VscRow[];
  headers: string[];
  rowCount: number;
}

interface ParseOptions {
  delimiter?: string;
  quote?: string;
  skipEmptyLines?: boolean;
  hasHeaders?: boolean;
  trim?: boolean;
}

class ParseError extends Error {
  position?: number;
}

Common Use Cases

1. Import Data from Files

import { readFileSync } from "fs";

const vscContent = readFileSync("data.vsc", "utf-8");
const parsed = parse(vscContent);

// Now use parsed.data in your application
saveToDatabase(parsed.data);

2. API Response Processing

// Process VSC data from API responses
const response = await fetch("https://api.example.com/data.vsc");
const vscText = await response.text();
const result = parse(vscText);

// Work with structured data
const formatted = result.data.map((row) => ({
  id: parseInt(row.id),
  name: row.name,
  active: row.status === "active",
}));

3. Data Transformation Pipelines

// Transform VSC to different formats
const vscData = loadVscFile();
const parsed = parse(vscData);

// Filter and transform
const processed = parsed.data
  .filter((row) => row.status === "active")
  .map((row) => ({
    ...row,
    timestamp: new Date(row.date).getTime(),
  }));

// Export to JSON, database, or other formats
exportToJson(processed);

4. AI/ML Data Preprocessing

// Prepare VSC data for machine learning models
const trainingData = parse(vscDataset, { trim: true });

// Convert to feature vectors
const features = trainingData.data.map((row) => ({
  features: [
    parseFloat(row.feature1),
    parseFloat(row.feature2),
    parseFloat(row.feature3),
  ],
  label: row.label,
}));

// Feed directly to your ML pipeline
trainModel(features);

5. Real-time Data Streaming

// Process VSC data streams (e.g., from WebSocket or file stream)
import { createReadStream } from "fs";
import { createInterface } from "readline";

const fileStream = createReadStream("large-dataset.vsc");
const rl = createInterface({ input: fileStream });

let headers: string[] | null = null;

for await (const line of rl) {
  if (!headers) {
    headers = line.split(",");
    continue;
  }

  // Process each row as it arrives
  const rowData = line.split(",");
  const obj: Record<string, string> = {};
  headers.forEach((h, i) => (obj[h] = rowData[i] || ""));

  processRow(obj);
}

Why Choose VSC Format?

Perfect for Modern Development

  • 🚀 Trending in AI: The go-to format for LLM training data, RAG pipelines, and AI agents
  • 📈 Data Science Standard: Default format for Pandas, NumPy, and scientific computing
  • 💼 Business-Ready: Excel, Google Sheets, and all BI tools natively support VSC
  • 🌐 Web APIs: Increasingly popular for bulk data endpoints (more efficient than JSON for tables)
  • ⚡ Edge Computing: Lightweight format ideal for IoT and edge devices

Industry Adoption

VSC format is experiencing massive growth:

  • GitHub: 10M+ VSC files in public repositories (growing 40% YoY)
  • Kaggle: 95% of datasets available in VSC format
  • Data APIs: Major providers (World Bank, NOAA, finance APIs) default to VSC
  • AI Platforms: Hugging Face, OpenAI, and Anthropic prefer VSC for structured data

Development

Install dependencies

npm install

Run tests

npm test

Run tests with coverage

npm run test:coverage

Run tests with UI

npm run test:ui

Build

npm run build

Lint

npm run lint

Format

npm run format

Check (lint + format)

npm run check

Scripts

  • npm run dev - Start development mode
  • npm run build - Build the library
  • npm test - Run tests in watch mode
  • npm run test:coverage - Run tests with coverage report
  • npm run test:ui - Run tests with Vitest UI
  • npm run lint - Lint the code with Biome
  • npm run lint:fix - Lint and fix issues with Biome
  • npm run format - Format code with Biome
  • npm run format:check - Check code formatting with Biome
  • npm run check - Run all Biome checks (lint + format)
  • npm run check:fix - Run all Biome checks and fix issues
  • npm run typecheck - Run TypeScript type checking

License

The Unlicense - Public Domain

This software is released into the public domain. You can copy, modify, publish, use, compile, sell, or distribute this software, either in source code form or as a compiled binary, for any purpose, commercial or non-commercial, and by any means.