Package Exports
- @stnickza/pricepatrol-parser
 - @stnickza/pricepatrol-parser/browser
 - @stnickza/pricepatrol-parser/processing
 
@pricepatrol/parser
A structured data parsing library for the Price Patrol ecosystem. It extracts and processes structured data from web pages, including JSON-LD, meta tags, data layers, and microdata.
Features
- Universal Compatibility: Works in both browser and Node.js environments
- Multiple Data Sources: Supports JSON-LD, meta tags, data layers, and microdata
- Flexible Processing: Advanced path evaluation and data transformations
- Confidence Scoring: Built-in confidence metrics for extraction reliability
- Well Tested: Comprehensive test suite with 100% coverage
- Tree Shakable: Separate exports for browser and processing functionality
 
Installation
npm install @pricepatrol/parser
Quick Start
Browser Environment
import { createBrowserExtractor, StructuredDataProcessor } from '@pricepatrol/parser';
// Extract data from current page
const extractor = createBrowserExtractor();
const structuredData = extractor.extractAll();
// Process with custom selectors
const processor = new StructuredDataProcessor();
const selectors = {
  productName: { jsonLd: '0.name' },
  price: { jsonLd: '0.offers.price', transformations: [{ type: 'parseNumber' }] },
  brand: { metaTags: 'product:brand' }
};
const result = processor.processRecipe(structuredData, selectors);
console.log(result.productName?.value); // Extracted product name
Node.js Environment
import { StructuredDataProcessor } from '@pricepatrol/parser/processing';
const processor = new StructuredDataProcessor();
// Process pre-extracted structured data
const structuredData = {
  jsonLd: [{ "@type": "Product", "name": "Example Product", "offers": { "price": "29.99" } }],
  metaTags: { "og:title": "Example Product Page" },
  dataLayers: { dataLayer: [{ product: { name: "Example" } }] },
  url: "https://example.com/product",
  pageTitle: "Example Product",
  timestamp: new Date().toISOString(),
  extractorVersion: "1.0.0"
};
const result = processor.processSelector(structuredData, {
  jsonLd: '0.name'
});
console.log(result?.value); // "Example Product"
API Reference
Core Classes
StructuredDataProcessor
Main processing class for extracting data using selectors.
const processor = new StructuredDataProcessor();
// Process single field
const field = processor.processSelector(data, selector);
// Process multiple fields
const results = processor.processRecipe(data, selectors);
// Validate data structure
const isValid = StructuredDataProcessor.validateStructuredData(data);
BrowserDataExtractor
Browser-specific extraction from DOM elements.
const extractor = new BrowserDataExtractor(document, window);
// Extract all structured data
const data = extractor.extractAll();
// Check capabilities
const capabilities = extractor.getCapabilities();
// Extract with custom CSS selectors
const customData = extractor.extractCustomData({
  title: 'h1.product-title',
  price: '.price-amount'
});
Selector Format
Selectors define how to extract data from different structured data sources:
interface FieldSelector {
  jsonLd?: string;           // JSON-LD path (e.g., "0.offers.price")
  metaTags?: string;         // Meta tag key (e.g., "og:price:amount")
  dataLayers?: string;       // Data layer path (e.g., "dataLayer.0.product.name")
  microdata?: string;        // Microdata property name
  regex?: string;            // Post-processing regex
  transformations?: FieldTransformation[]; // Data transformations
}
Data Transformations
Apply transformations to extracted values:
const selector = {
  jsonLd: '0.offers.price',
  transformations: [
    // Clean the string first, then convert it to a number as the last step
    { type: 'trim' },
    { type: 'regex', pattern: '([0-9.]+)', flags: 'g' },
    { type: 'parseNumber' }
  ]
};
Available transformations:
- regex: Apply a regular expression
- replace: String replacement
- trim: Remove leading and trailing whitespace
- lowercase/uppercase: Case conversion
- parseNumber: Convert to a number
- parseBoolean: Convert to a boolean
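To make the idea of a transformation chain concrete, here is a minimal, self-contained sketch of how such a pipeline could be applied step by step. It does not use the library: `applyTransformations`, the `Transformation` type, the `search`/`replacement` field names, and the rule that a regex keeps its first capture group are all assumptions made for illustration, and the library's actual behavior may differ.

```typescript
// Hypothetical shapes for illustration; the library's own FieldTransformation type may differ.
type Transformation =
  | { type: "regex"; pattern: string; flags?: string }
  | { type: "replace"; search: string; replacement: string }
  | { type: "trim" | "lowercase" | "uppercase" | "parseNumber" | "parseBoolean" };

type Value = string | number | boolean | null;

// Applies each step in order; a failed step yields null and stops the chain.
function applyTransformations(input: Value, steps: Transformation[]): Value {
  let value: Value = input;
  for (const step of steps) {
    if (value === null) return null;
    const text = String(value);
    switch (step.type) {
      case "regex": {
        // Assumption: keep the first capture group, or the whole match if there is none.
        const match = new RegExp(step.pattern, step.flags).exec(text);
        value = match ? (match[1] ?? match[0]) : null;
        break;
      }
      case "replace":
        // Replace every literal occurrence of the search string.
        value = text.split(step.search).join(step.replacement);
        break;
      case "trim":
        value = text.trim();
        break;
      case "lowercase":
        value = text.toLowerCase();
        break;
      case "uppercase":
        value = text.toUpperCase();
        break;
      case "parseNumber": {
        const parsed = Number.parseFloat(text);
        value = Number.isNaN(parsed) ? null : parsed;
        break;
      }
      case "parseBoolean":
        // Assumption: only these spellings count as true.
        value = ["true", "1", "yes"].includes(text.trim().toLowerCase());
        break;
    }
  }
  return value;
}

// Usage: clean up a raw price string and convert it to a number.
const price = applyTransformations("  R 1299.50 incl. VAT ", [
  { type: "trim" },
  { type: "regex", pattern: "([0-9.]+)" },
  { type: "parseNumber" },
]);
// price is 1299.5
```

Order matters in a chain like this: string operations such as trim and regex run before parseNumber, because once the value is a number there is nothing left for them to clean.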
Path Evaluation
The library supports complex path evaluation for nested data:
import { evaluateStructuredDataPath } from '@pricepatrol/parser';
const data = {
  products: [
    { name: "Product 1", offers: [{ price: "10.99" }] }
  ]
};
// Extract nested array data
const price = evaluateStructuredDataPath(data, 'products[0].offers[0].price');
console.log(price); // "10.99"
Modules
Universal Processing (@pricepatrol/parser/processing)
Core data processing functionality that works in any JavaScript environment:
- StructuredDataProcessor
- evaluateStructuredDataPath
- Type definitions
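The examples in this README write paths in two forms: dotted (`0.offers.price`) and bracketed (`products[0].offers[0].price`). As a rough illustration of how a resolver could accept both, here is a minimal sketch. `resolvePath` is a hypothetical helper written for this example, not the implementation of `evaluateStructuredDataPath`, and it assumes that keys never contain dots.

```typescript
// Hypothetical helper: resolves "a.b[0].c" or "0.offers.price" against nested data.
function resolvePath(data: unknown, path: string): unknown {
  // Normalise "offers[0].price" to "offers.0.price", then split on dots.
  const segments = path
    .replace(/\[(\d+)\]/g, ".$1")
    .split(".")
    .filter((segment) => segment.length > 0);

  let current: unknown = data;
  for (const segment of segments) {
    if (current === null || typeof current !== "object") {
      return undefined; // Cannot descend into a primitive or a missing value.
    }
    // Works for arrays too, since array indices are ordinary property keys.
    current = (current as Record<string, unknown>)[segment];
  }
  return current;
}

// Usage: both notations resolve to the same kind of lookup.
const sample = {
  products: [{ name: "Product 1", offers: [{ price: "10.99" }] }],
};
const bracketed = resolvePath(sample, "products[0].offers[0].price"); // "10.99"
const dotted = resolvePath([{ name: "Example Product" }], "0.name"); // "Example Product"
const missing = resolvePath(sample, "products[3].name"); // undefined
```

Returning `undefined` for a missing segment, rather than throwing, keeps lookups safe on pages where part of the structured data is absent.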
 
Browser Extraction (@pricepatrol/parser/browser)
Browser-specific DOM extraction functionality:
- BrowserDataExtractor
- createBrowserExtractor
- extractJsonLdData
- extractMetaTags
- extractDataLayers
- extractMicrodata
Confidence Scoring
The library provides confidence scores for extracted data based on the source:
- JSON-LD: 0.9 (highest confidence)
 - Meta Tags: 0.8
 - Data Layers: 0.7
 - Microdata: 0.6
 - Custom Data Layers: 0.5
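One way to picture how these scores could be used: when several sources yield a value for the same field, prefer the one with the highest confidence. The sketch below illustrates that idea only. The scores are copied from the list above, while `pickMostConfident`, the `Candidate` shape, and the `customDataLayers` key name are hypothetical, and the library may resolve conflicts differently.

```typescript
// Scores copied from the list above; the key names and lookup logic are illustrative assumptions.
const SOURCE_CONFIDENCE = {
  jsonLd: 0.9,
  metaTags: 0.8,
  dataLayers: 0.7,
  microdata: 0.6,
  customDataLayers: 0.5,
} as const;

type Source = keyof typeof SOURCE_CONFIDENCE;

interface Candidate {
  source: Source;
  value: string;
}

interface ScoredField {
  value: string;
  source: Source;
  confidence: number;
}

// Returns the candidate whose source has the highest confidence, if any.
function pickMostConfident(candidates: Candidate[]): ScoredField | undefined {
  let best: ScoredField | undefined;
  for (const { source, value } of candidates) {
    const confidence = SOURCE_CONFIDENCE[source];
    if (best === undefined || confidence > best.confidence) {
      best = { value, source, confidence };
    }
  }
  return best;
}

// Usage: the JSON-LD value wins over the meta tag value.
const field = pickMostConfident([
  { source: "metaTags", value: "Example Product Page" },
  { source: "jsonLd", value: "Example Product" },
]);
// field?.value is "Example Product" and field?.confidence is 0.9
```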
 
const result = processor.processSelector(data, selector);
console.log(result?.confidence); // 0.9 for JSON-LD source
Browser Compatibility
- Modern Browsers: Chrome 80+, Firefox 75+, Safari 13+, Edge 80+
 - Node.js: 18.0.0+
 - JSDOM: Supported for server-side testing
 
Contributing
- Fork the repository
- Create a feature branch: git checkout -b feature/new-feature
- Run tests: npm test
- Commit changes: git commit -am 'Add new feature'
- Push to branch: git push origin feature/new-feature
- Submit a pull request
 
Testing
# Run all tests
npm test
# Run tests in watch mode
npm run test:watch
# Generate coverage report
npm run test:coverage
License
MIT License - see LICENSE file for details.
Changelog
1.0.0
- Initial release
 - Core structured data processing
 - Browser DOM extraction
 - Comprehensive test suite
 - TypeScript support