Package Exports
- xml-introspect
- xml-introspect/browser
xml-introspect
TypeScript library and CLI for analyzing large XML files and generating representative samples.
Install
npm install xml-introspect
Quick Start
import { XMLIntrospector } from 'xml-introspect';
const introspector = new XMLIntrospector();
await introspector.generateSample('input.xml', 'sample.xml', {
  maxElements: 100,
  maxDepth: 3
});
await introspector.generateSchema('input.xml', 'schema.xsd');
CLI Usage
xml-introspect sample input.xml output.xml
xml-introspect schema input.xml output.xsd
xml-introspect sample https://en-word.net/static/english-wordnet-2024.xml.gz sample.xml
API
Core Methods:
// Analyze structure
const analysis = await introspector.analyzeStructure('input.xml');
// Generate sample
await introspector.generateSample('input.xml', 'output.xml', {
  maxElements: 100,
  maxDepth: 3,
  strategy: 'balanced'
});
// Generate schema
await introspector.generateSchema('input.xml', 'schema.xsd', {
  namespace: 'http://example.com/schema'
});
// Validate XML
const isValid = await introspector.validateXML('data.xml', 'schema.xsd');
// Generate realistic data
await introspector.generateRealisticXML('template.xml', 'realistic.xml', {
  seed: 42,
  maxElements: 200
});
Data Processing:
import { FormatProcessor } from 'xml-introspect/data-loader';
const processor = new FormatProcessor();
const result = await processor.processData(arrayBuffer, {
  projectId: 'oewn:2024',
  enableTarExtraction: true
});
Options
Sampling:
- maxElements: Maximum number of elements in the sample (default: 100)
- maxDepth: Maximum nesting depth (default: 5)
- strategy: 'balanced', 'random', or 'first'
Schema:
- namespace: Target namespace
- elementForm: 'qualified' or 'unqualified'
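The sampling options above can be pictured with a small standalone sketch. Everything in it is hypothetical and independent of the library: the `SamplingOptions` shape, `resolveOptions`, and `pickNames` are illustrative names, and the meaning given to each strategy ('first' keeps the leading elements, 'random' shuffles, 'balanced' takes turns across element names) is an assumption, not documented behavior. Only the defaults of 100 and 5 come from the list above; no default strategy is stated, so none is set here.

```typescript
// Hypothetical types mirroring the documented sampling options.
// The library's real exported types may differ.
type Strategy = 'balanced' | 'random' | 'first';

interface SamplingOptions {
  maxElements?: number;
  maxDepth?: number;
  strategy?: Strategy;
}

// Fill in the documented defaults (maxElements: 100, maxDepth: 5).
function resolveOptions(opts: SamplingOptions = {}) {
  return {
    maxElements: opts.maxElements ?? 100,
    maxDepth: opts.maxDepth ?? 5,
    strategy: opts.strategy, // no documented default, left as given
  };
}

// Assumed semantics for each strategy, applied to a flat list of element names.
function pickNames(
  names: string[],
  limit: number,
  strategy: Strategy,
  rng: () => number = Math.random
): string[] {
  if (strategy === 'first') {
    return names.slice(0, limit);
  }
  if (strategy === 'random') {
    // Fisher-Yates shuffle on a copy, then keep the first `limit` entries.
    const copy = names.slice();
    for (let i = copy.length - 1; i > 0; i--) {
      const j = Math.floor(rng() * (i + 1));
      const tmp = copy[i];
      copy[i] = copy[j];
      copy[j] = tmp;
    }
    return copy.slice(0, limit);
  }
  // 'balanced': group by name in first-seen order, then take one per group per round.
  const groups: Record<string, string[]> = Object.create(null);
  const order: string[] = [];
  for (const name of names) {
    if (!groups[name]) {
      groups[name] = [];
      order.push(name);
    }
    groups[name].push(name);
  }
  const out: string[] = [];
  for (let round = 0; out.length < limit; round++) {
    let took = false;
    for (const key of order) {
      const group = groups[key];
      if (round < group.length && out.length < limit) {
        out.push(group[round]);
        took = true;
      }
    }
    if (!took) break; // every group is exhausted
  }
  return out;
}
```

Under these assumed semantics, a 'balanced' sample of three names from `['a', 'a', 'a', 'b', 'c']` covers all three element names, while 'first' returns only `a` entries.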
Features
- XML Analysis: Structure analysis and sampling
- XSD Generation: Create schemas from XML
- Real Data: Process WordNet LMF files
- Memory Efficient: Streams large files
- TypeScript: Full type safety
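To give a feel for what streaming analysis means in practice, here is a minimal sketch of counting elements from text that arrives in chunks. It is not the library's implementation: `TagCounter` is a hypothetical class, and the scanner is deliberately simplified (it assumes no `>` characters inside attribute values, comments, or CDATA sections). The point is that only an incomplete trailing tag is buffered between chunks, so memory use does not grow with file size.

```typescript
// Hypothetical chunk-based element counter; a sketch, not xml-introspect internals.
class TagCounter {
  counts: Record<string, number> = Object.create(null);
  maxDepth = 0;
  private depth = 0;
  private pending = ''; // holds an unfinished tag cut off at a chunk boundary

  push(chunk: string): void {
    const text = this.pending + chunk;
    let pos = 0;
    while (true) {
      const open = text.indexOf('<', pos);
      if (open === -1) {
        this.pending = ''; // only character data remains; nothing to keep
        return;
      }
      const close = text.indexOf('>', open + 1);
      if (close === -1) {
        this.pending = text.slice(open); // tag continues in the next chunk
        return;
      }
      this.handleTag(text.slice(open + 1, close));
      pos = close + 1;
    }
  }

  private handleTag(tag: string): void {
    const first = tag.charAt(0);
    if (first === '?' || first === '!') return; // declaration, comment, doctype
    if (first === '/') {
      this.depth--; // closing tag
      return;
    }
    const name = tag.split(/[\s/]/)[0];
    this.counts[name] = (this.counts[name] || 0) + 1;
    this.depth++;
    if (this.depth > this.maxDepth) this.maxDepth = this.depth;
    if (tag.charAt(tag.length - 1) === '/') this.depth--; // self-closing tag
  }
}

// Feed two chunks that split a tag in the middle.
const counter = new TagCounter();
counter.push('<root><item id="1"/><it');
counter.push('em>text</item></root>');
```

In Node.js, chunks like these could come from `fs.createReadStream(path, { encoding: 'utf8' })`, with each `data` event passed to `push`.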
Development
pnpm install
pnpm test
pnpm build
License
MIT - see LICENSE