@syntropiq/libpcre-ts
WebAssembly wrapper for the original PCRE (Perl Compatible Regular Expressions) library with TypeScript bindings.
This package wraps the original PCRE library (not PCRE2), which is more permissive and forgiving than modern regex engines. Although the original PCRE is considered "antiquated," its leniency makes it valuable for compatibility with older systems and for patterns that stricter engines reject.
Why Use This?
- Legacy compatibility - Works with patterns that newer regex engines reject
- More permissive - The original PCRE accepts patterns that PCRE2 considers invalid
- WebAssembly performance - Near-native speed in browsers and Node.js
- Full PCRE features - Recursion, possessive quantifiers, and Python-style named groups, which JavaScript's RegExp does not accept, plus lookbehind on runtimes that predate ES2018
- TypeScript support - Complete type definitions included
Perfect for porting legacy regex patterns or when you need maximum pattern compatibility.
Installation
npm install @syntropiq/libpcre-ts
Build & Module Support (ESM & CJS)
- Dual ESM/CJS support: This package now ships with both modern ESM and legacy CommonJS builds, fully tree-shakable and optimized for all environments.
- Automatic WASM handling: The WebAssembly and its JS loader are bundled for both module formats. No manual copying or import hacks needed.
- TypeScript types: Complete type definitions are generated and published for both ESM and CJS consumers.
- Modern build system: Uses Vite for bundling/optimization and TypeScript for type safety. All build, setup, and submodule steps are automated via scripts.
Usage:
- In ESM (Node.js or browser):
import { PCRE } from '@syntropiq/libpcre-ts';
- In CommonJS (Node.js):
const { PCRE } = require('@syntropiq/libpcre-ts');
Quick Start
import { PCRE } from '@syntropiq/libpcre-ts';
const pcre = new PCRE();
await pcre.init();
// Quick pattern testing
const isMatch = pcre.test('\\d+', 'Hello 123');
console.log(isMatch); // true
// Get match details
const matches = pcre.match('(\\w+)\\s+(\\d+)', 'Hello 123');
console.log(matches); // [{ value: 'Hello 123', index: 0, length: 9 }]
Compiled Pattern Usage
For better performance with repeated use:
import { PCRE } from '@syntropiq/libpcre-ts';
const pcre = new PCRE();
await pcre.init();
// Compile pattern once
const regex = pcre.compile('(?P<word>\\w+)\\s+(?P<number>\\d+)');
// Use multiple times
console.log(regex.test('Hello 123')); // true
console.log(regex.test('No numbers here')); // false
// Get matches with named groups
const matches = regex.exec('Hello 123');
if (matches) {
console.log(matches[0].value); // 'Hello 123'
console.log(matches[1].value); // 'Hello' (first capturing group)
console.log(matches[2].value); // '123' (second capturing group)
}
// Get named group mappings
const namedGroups = regex.getNamedGroups();
console.log(namedGroups); // { word: 1, number: 2 }
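The name-to-index map from getNamedGroups() can be joined with the array returned by exec() to look captures up by name. The following is a minimal sketch that assumes the shapes shown in the example above (an array of objects with a `value` field and a plain `{ name: index }` map). `Capture`, `NamedGroupMap`, and `captureByName` are hypothetical names for illustration, not part of the package.

```typescript
// Hypothetical shapes, inferred from the README output above; the type
// definitions shipped with the package may differ.
interface Capture {
  value: string;
}
type NamedGroupMap = Record<string, number>;

// Build a { name: text } record from numbered captures plus a name map.
function captureByName(
  captures: Capture[],
  names: NamedGroupMap
): Record<string, string | undefined> {
  const out: Record<string, string | undefined> = {};
  for (const [name, index] of Object.entries(names)) {
    // A group that is missing from the array is reported as undefined.
    out[name] = captures[index]?.value;
  }
  return out;
}

// Stand-in data mirroring regex.exec('Hello 123') and regex.getNamedGroups().
const sampleCaptures: Capture[] = [
  { value: 'Hello 123' },
  { value: 'Hello' },
  { value: '123' },
];
const sampleNames: NamedGroupMap = { word: 1, number: 2 };

console.log(captureByName(sampleCaptures, sampleNames));
// { word: 'Hello', number: '123' }
```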
Advanced Features
// Case-insensitive matching with constants
const regex = pcre.compile('hello', pcre.constants.CASELESS);
console.log(regex.test('HELLO')); // true
// Global matching (find all occurrences)
const allMatches = regex.globalMatch('Hello hello HELLO');
console.log(allMatches.length); // 3
// String replacement
const result = regex.replace('Hello world', 'Hi', true);
console.log(result); // 'Hi world'
PCRE-Specific Features
Features and syntax that JavaScript's built-in RegExp lacks, or gained only in ES2018 (lookbehind):
// Named capture groups (Python-style)
const datePattern = '(?P<year>\\d{4})-(?P<month>\\d{2})-(?P<day>\\d{2})';
const regex = pcre.compile(datePattern);
const namedGroups = regex.getNamedGroups();
console.log(namedGroups); // { year: 1, month: 2, day: 3 }
// Lookbehind assertions
const pricePattern = '(?<=\\$)\\d+\\.\\d{2}';
const priceMatch = pcre.match(pricePattern, 'Price: $19.99');
// Recursive patterns (balanced parentheses)
const balancedParens = '\\((?:[^()]++|(?R))*\\)';
const isBalanced = pcre.test(balancedParens, '(a(b(c)d)e)');
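To make clear what the recursive pattern accepts, here is a hand-written equivalent in plain TypeScript. It is only an illustration of the matching rule when the pattern is tried at one position (an opening parenthesis, then any mix of non-parenthesis characters and nested balanced groups, then the closing parenthesis). `matchBalanced` is a hypothetical helper and does not use the library.

```typescript
// Returns the balanced group that starts at `start`, or null if text[start]
// is not '(' or the group is never closed.
function matchBalanced(text: string, start: number): string | null {
  if (text[start] !== '(') return null;
  let depth = 0;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (ch === '(') {
      depth++; // entering a nested group, like (?R) recursing
    } else if (ch === ')') {
      depth--;
      if (depth === 0) return text.slice(start, i + 1);
    }
  }
  return null; // ran out of input with an unclosed '('
}

console.log(matchBalanced('(a(b(c)d)e)', 0)); // (a(b(c)d)e)
console.log(matchBalanced('(a(b', 0));        // null
```

The regex version is more compact and composes with the rest of a pattern; the loop is shown only to spell out the rule.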
Constants and Options
// Common PCRE options
const options =
pcre.constants.CASELESS | // Case-insensitive
pcre.constants.MULTILINE | // ^ and $ match line boundaries
pcre.constants.DOTALL; // . matches newlines
const regex = pcre.compile('pattern', options);
Available constants:
- CASELESS - Case-insensitive matching
- MULTILINE - ^ and $ match line boundaries
- DOTALL - . matches newlines
- EXTENDED - Ignore whitespace and comments
- UTF8 - Enable UTF-8 mode
- UNGREEDY - Make quantifiers non-greedy by default
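Because the options are bit flags, they combine with bitwise OR and can be tested with bitwise AND. The sketch below shows that arithmetic in isolation. The numeric values are an assumption taken from pcre.h in PCRE 8.x; in real code, read them from pcre.constants instead of hard-coding them. `combineFlags` and `describeFlags` are hypothetical helpers.

```typescript
// Assumed bit values, mirroring pcre.h in PCRE 8.x. Prefer pcre.constants
// at runtime; these literals exist only so the sketch is self-contained.
const FLAGS = {
  CASELESS: 0x0001,
  MULTILINE: 0x0002,
  DOTALL: 0x0004,
  EXTENDED: 0x0008,
  UNGREEDY: 0x0200,
  UTF8: 0x0800,
} as const;

type FlagName = keyof typeof FLAGS;

// OR the named flags together into a single options mask.
function combineFlags(...names: FlagName[]): number {
  return names.reduce((mask, name) => mask | FLAGS[name], 0);
}

// List which named flags are set in a mask.
function describeFlags(mask: number): FlagName[] {
  return (Object.keys(FLAGS) as FlagName[]).filter(
    (name) => (mask & FLAGS[name]) !== 0
  );
}

const demoMask = combineFlags('CASELESS', 'DOTALL');
console.log(demoMask);                // 5
console.log(describeFlags(demoMask)); // [ 'CASELESS', 'DOTALL' ]
```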
Error Handling
try {
const regex = pcre.compile('[invalid pattern');
} catch (error) {
console.error('Pattern compilation failed:', error.message);
}
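When patterns come from user input, it can be convenient to turn the thrown error into a value. The following is a minimal sketch of that idea; `tryCompile` and `CompileResult` are hypothetical names, and `fakeCompile` is a stand-in so the example runs without the WASM module.

```typescript
// Result type: either a compiled object or an error message.
type CompileResult<T> =
  | { ok: true; regex: T }
  | { ok: false; message: string };

// Run any compile callback and capture a thrown error as data.
function tryCompile<T>(compile: () => T): CompileResult<T> {
  try {
    return { ok: true, regex: compile() };
  } catch (error) {
    // Thrown values are not guaranteed to be Error instances.
    const message = error instanceof Error ? error.message : String(error);
    return { ok: false, message };
  }
}

// Stand-in compiler (not the library): rejects an unclosed character class.
function fakeCompile(pattern: string): { source: string } {
  if (pattern.startsWith('[') && !pattern.includes(']')) {
    throw new Error('missing terminating ] for character class');
  }
  return { source: pattern };
}

const bad = tryCompile(() => fakeCompile('[invalid pattern'));
console.log(bad.ok ? 'compiled' : `failed: ${bad.message}`);
```

With the real library the callback would presumably be `() => pcre.compile(userPattern)`.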
Performance Tips
- Compile once, use many times: Use pcre.compile() for patterns you'll reuse
- Use quick methods for one-off tests: Use pcre.test() and pcre.match() for single use
- Consider options: Some PCRE options can significantly impact performance
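One way to apply "compile once, use many times" across a codebase is a small cache keyed by pattern and options. This is a sketch under assumptions, not part of the package: `makeCompileCache` is a hypothetical helper that wraps any compile function, and the demo uses a counting stand-in instead of the real compiler.

```typescript
// Wrap a compile function so each (pattern, options) pair is compiled once.
function makeCompileCache<T>(
  compile: (pattern: string, options: number) => T
): (pattern: string, options?: number) => T {
  const cache = new Map<string, T>();
  return (pattern, options = 0) => {
    // Options come first so the key stays unambiguous for any pattern text.
    const key = `${options}:${pattern}`;
    if (!cache.has(key)) {
      cache.set(key, compile(pattern, options));
    }
    return cache.get(key) as T;
  };
}

// Stand-in compiler that counts how often it is invoked.
let compileCalls = 0;
const cachedCompile = makeCompileCache((pattern: string, options: number) => {
  compileCalls++;
  return { pattern, options };
});

cachedCompile('\\d+');
cachedCompile('\\d+');    // served from the cache
cachedCompile('\\d+', 1); // different options, compiled separately
console.log(compileCalls); // 2
```

If compiled objects hold native memory, the cache owns them for its lifetime, so an unbounded cache fed with user-supplied patterns would grow without limit.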
Browser Support
Works in all modern browsers that support WebAssembly. For older browsers, include a WebAssembly polyfill.
Node.js Support
Works in Node.js 12+ with WebAssembly support.
Platform Support
Cloudflare Workers are not supported at this time. Feel free to submit a PR if you get it working.
Contributing
This library wraps the original PCRE C library. For bug reports or feature requests, please open an issue on GitHub.
License
BSD-3-Clause (same as PCRE)
Note: This library wraps the original PCRE, not PCRE2. While PCRE2 is the modern standard, the original PCRE can be more permissive with certain patterns, making it useful for legacy compatibility and forgiving pattern matching.
'Contact us at support@example.com',
'Send reports to admin@test.org',
'No email here!'
];
emails.forEach((text, i) => {
const result = emailRegex.exec(text, 0);
console.log(`Email ${i + 1}:`, result.success ? result.match : 'Not found');
});
// Important: Clean up memory
emailRegex.delete();
}
API Reference
Quick Functions
// Test if pattern matches (boolean result)
pcre.quickTest(pattern: string, text: string, options: number): boolean
// Get detailed match information
pcre.quickMatch(pattern: string, text: string, options: number): MatchResult
PCRERegex Class
// Compile a regex pattern
const regex = new pcre.PCRERegex(pattern: string, options: number);
// Execute against text
const result = regex.exec(text: string, startOffset: number): MatchResult;
// Clean up (important!)
regex.delete(): void;
Match Result
interface MatchResult {
success: boolean; // Whether the pattern matched
match?: string; // The full matched text
start?: number; // Start position of match
end?: number; // End position of match
groups?: string[]; // Capture groups (numbered)
}
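Since every field except success is optional, callers need to check for undefined before using them. The formatter below is a minimal sketch of that defensive handling. `describeMatch` is a hypothetical helper, and the interface is repeated only to keep the block self-contained.

```typescript
// Copied from the MatchResult interface above.
interface MatchResult {
  success: boolean;
  match?: string;
  start?: number;
  end?: number;
  groups?: string[];
}

// Render a MatchResult as one line, tolerating missing optional fields.
function describeMatch(result: MatchResult): string {
  if (!result.success || result.match === undefined) {
    return 'no match';
  }
  const span =
    result.start !== undefined && result.end !== undefined
      ? ` at ${result.start}..${result.end}`
      : '';
  const groups = result.groups?.length
    ? ` groups=${JSON.stringify(result.groups)}`
    : '';
  return `matched "${result.match}"${span}${groups}`;
}

// Hand-written sample values, not output captured from the library.
console.log(describeMatch({ success: false }));
console.log(describeMatch({ success: true, match: '123', start: 6, end: 9 }));
```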
Common Options
pcre.PCRE_CASELESS // Case insensitive matching
pcre.PCRE_MULTILINE // ^ and $ match line boundaries
pcre.PCRE_DOTALL // . matches newlines
pcre.PCRE_EXTENDED // Ignore whitespace in patterns
pcre.PCRE_UTF8 // Enable UTF-8 mode
Node.js vs Browser
Node.js
const PCRE = require('@syntropiq/libpcre-ts');
// Works out of the box
Browser (ES Modules)
import PCRE from '@syntropiq/libpcre-ts';
// Ensure your bundler supports WebAssembly
Browser (Script Tag)
<script src="https://unpkg.com/@syntropiq/libpcre-ts/dist/index.js"></script>
<script>
PCRE().then(pcre => {
// Use pcre here
});
</script>
Memory Management
Important: Always call .delete() on compiled regex objects to prevent memory leaks:
const regex = new pcre.PCRERegex('pattern', 0);
// ... use regex ...
regex.delete(); // Essential!
// Or use try/finally so cleanup also runs if an exception is thrown
const guarded = new pcre.PCRERegex('pattern', 0);
try {
  // Use guarded
} finally {
  guarded.delete();
}
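The try/finally pattern can be wrapped once so that callers cannot forget the cleanup. The sketch below assumes only that the compiled object exposes delete(), as described above. `Deletable` and `withCompiled` are hypothetical names, and the demo uses a fake object rather than a real PCRERegex.

```typescript
// Anything with a delete() method, such as a compiled regex object.
interface Deletable {
  delete(): void;
}

// Create a resource, hand it to `use`, and always delete it afterwards.
function withCompiled<T extends Deletable, R>(
  create: () => T,
  use: (regex: T) => R
): R {
  const regex = create();
  try {
    return use(regex);
  } finally {
    regex.delete(); // runs on normal return and on throw
  }
}

// Fake resource that records whether delete() was called.
const fakeRegex = {
  deleted: false,
  delete() {
    this.deleted = true;
  },
};

const outcome = withCompiled(() => fakeRegex, () => 'done');
console.log(outcome, fakeRegex.deleted); // done true
```

Note that the helper is synchronous: if `use` returns a promise, delete() runs before that promise settles, so asynchronous work would need an async variant.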
Error Handling
try {
// Invalid regex pattern
const regex = new pcre.PCRERegex('[invalid', 0);
} catch (error) {
console.error('Pattern compilation failed:', error.message);
}
// Check match results
const result = pcre.quickMatch('pattern', 'text', 0);
if (!result.success) {
console.log('No match found');
}
PCRE vs JavaScript Regex
Feature | JavaScript | PCRE |
---|---|---|
Lookbehind | ES2018+, variable length | Fixed-length alternatives only |
Named groups | ✅ | ✅ |
Recursion | ❌ | ✅ |
Unicode properties | Limited | Full |
Pattern strictness | Strict | Permissive |
Contributing
- All build and setup is automated via scripts in the scripts/ directory.
- See PLAN.md and TODO.md for current development status and workflow.
- To build and test locally, just run:
npm run build && npm test
License
- PCRE library: BSD-style license
- Wrapper code: MIT license
Note: This wraps the original PCRE library (version 8.x), not the newer PCRE2. While PCRE2 is more modern, the original PCRE's permissive nature makes it valuable for compatibility scenarios.
// Use with full type safety
const regex = new pcre.PCRERegex('\\d+', 0);
const result = regex.exec('Found 123 numbers', 0);
if (result.success) {
  console.log(`Matched: ${result.match} at position ${result.start}`);
}
regex.delete();
}
### Browser
```html
<!DOCTYPE html>
<html>
<head>
<title>PCRE WebAssembly Example</title>
</head>
<body>
<script type="module">
import PCRE from './build/libpcre.js';
async function demo() {
const pcre = await PCRE();
// Test complex regex patterns not supported by JavaScript
const result = pcre.quickMatch(
'(?P<protocol>https?)://(?P<domain>[^/]+)',
'Visit https://example.com for more info',
0
);
console.log('Parsed URL:', result);
}
demo();
</script>
</body>
</html>
```
API Reference
Quick Functions
quickTest(pattern: string, text: string, options: number): boolean
Returns true if the pattern matches the text.
quickMatch(pattern: string, text: string, options: number): MatchResult
Returns detailed match information including capture groups.
interface MatchResult {
success: boolean;
match?: string; // Full match text
start?: number; // Start position
end?: number; // End position
groups?: string[]; // Capture groups
}
PCRERegex Class
new PCRERegex(pattern: string, options: number)
Creates a compiled regex object.
exec(text: string, startOffset: number): MatchResult
Executes the regex against the text starting at the given offset.
delete(): void
Frees the compiled regex memory. Important for preventing memory leaks.
PCRE Options
Constant | Description |
---|---|
PCRE_CASELESS | Case insensitive matching |
PCRE_MULTILINE | ^ and $ match at line boundaries |
PCRE_DOTALL | . matches newlines |
PCRE_EXTENDED | Ignore whitespace and # comments |
PCRE_ANCHORED | Match only at start of subject |
PCRE_UTF8 | Enable UTF-8 mode |
PCRE_UNGREEDY | Make quantifiers non-greedy by default |
PCRE_NO_AUTO_CAPTURE | Disable automatic capturing |
Utility Functions
getVersionString(): string
Returns the PCRE version string.
getConfigInfo(): object
Returns PCRE build configuration information.
Advanced Features
Named Capture Groups
const regex = new pcre.PCRERegex(
'(?P<year>\\d{4})-(?P<month>\\d{2})-(?P<day>\\d{2})',
0
);
const result = regex.exec('Date: 2023-12-25', 6);
// result.groups will contain named captures
Look-ahead and Look-behind
// Positive lookbehind (JavaScript only gained lookbehind in ES2018)
const regex = new pcre.PCRERegex('(?<=\\$)\\d+\\.\\d{2}', 0);
const result = regex.exec('Price: $19.99', 0);
Recursive Patterns
// Match balanced parentheses (impossible with JavaScript regex)
const regex = new pcre.PCRERegex('\\((?:[^()]++|(?R))*\\)', 0);
Error Handling
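// Declare outside the try block so the compiled regex stays in scope
let regex;
try {
  regex = new pcre.PCRERegex('[invalid', 0);
} catch (error) {
  console.error('Compilation failed:', error.message);
}
// Only runs when compilation succeeded
if (regex) {
  const result = regex.exec('test', 0);
  if (!result.success) {
    console.log('No match found');
  }
  regex.delete();
}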
Performance Tips
- Reuse compiled regexes - Don't create new PCRERegex objects for each match
- Call delete() - Always clean up PCRERegex objects to prevent memory leaks
- Use quickTest() - For simple boolean tests, it's faster than creating objects
- Study patterns - PCRE automatically optimizes frequently used patterns
Building from Source
Requirements
- Emscripten SDK 3.1.6+
- CMake 3.16+
- Git
Automated Build & Setup
All setup, submodule, and build steps are automated:
npm run build
This will:
- Check/install required tools (git, cmake, emcc)
- Initialize submodules
- Build the WASM binary and loader
- Build ESM and CJS outputs (with Vite and TypeScript)
- Generate and copy type definitions
Manual build steps are no longer required.
License
This project combines:
- PCRE library: BSD-style license
- WebAssembly wrapper: MIT license
See the individual license files for details.
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
Troubleshooting
Common Issues
Module loading fails in browser: Ensure you're serving files over HTTP/HTTPS, not file:// protocol.
Memory errors: Make sure to call .delete() on PCRERegex objects when done.
Pattern compilation fails: PCRE uses slightly different syntax than JavaScript. Check the PCRE documentation.
Getting Help
Examples
See the test/* directory for more detailed usage examples:
- Basic pattern matching
- Complex regex features
- Performance benchmarks
- Browser integration
- Node.js applications