Package Exports
- tmgrammar-toolkit
- tmgrammar-toolkit/cli
- tmgrammar-toolkit/cli/commands/emit
- tmgrammar-toolkit/cli/commands/index
- tmgrammar-toolkit/cli/commands/snap
- tmgrammar-toolkit/cli/commands/test
- tmgrammar-toolkit/cli/commands/validate
- tmgrammar-toolkit/cli/index
- tmgrammar-toolkit/cli/utils/bun
- tmgrammar-toolkit/cli/utils/grammar
- tmgrammar-toolkit/emit
- tmgrammar-toolkit/errors
- tmgrammar-toolkit/factory
- tmgrammar-toolkit/helpers/index
- tmgrammar-toolkit/helpers/regex
- tmgrammar-toolkit/index
- tmgrammar-toolkit/package.json
- tmgrammar-toolkit/result
- tmgrammar-toolkit/scopes/comment
- tmgrammar-toolkit/scopes/constant
- tmgrammar-toolkit/scopes/entity
- tmgrammar-toolkit/scopes/index
- tmgrammar-toolkit/scopes/invalid
- tmgrammar-toolkit/scopes/keyword
- tmgrammar-toolkit/scopes/lib/internal
- tmgrammar-toolkit/scopes/markup
- tmgrammar-toolkit/scopes/meta
- tmgrammar-toolkit/scopes/punctuation
- tmgrammar-toolkit/scopes/storage
- tmgrammar-toolkit/scopes/string
- tmgrammar-toolkit/scopes/support
- tmgrammar-toolkit/scopes/types
- tmgrammar-toolkit/scopes/variable
- tmgrammar-toolkit/scopes/well-known/comment
- tmgrammar-toolkit/scopes/well-known/constant
- tmgrammar-toolkit/scopes/well-known/entity
- tmgrammar-toolkit/scopes/well-known/index
- tmgrammar-toolkit/scopes/well-known/invalid
- tmgrammar-toolkit/scopes/well-known/keyword
- tmgrammar-toolkit/scopes/well-known/markup
- tmgrammar-toolkit/scopes/well-known/meta
- tmgrammar-toolkit/scopes/well-known/punctuation
- tmgrammar-toolkit/scopes/well-known/storage
- tmgrammar-toolkit/scopes/well-known/string
- tmgrammar-toolkit/scopes/well-known/support
- tmgrammar-toolkit/scopes/well-known/variable
- tmgrammar-toolkit/terminals/chars
- tmgrammar-toolkit/terminals/comments
- tmgrammar-toolkit/terminals/identifiers
- tmgrammar-toolkit/terminals/index
- tmgrammar-toolkit/terminals/markers
- tmgrammar-toolkit/terminals/numbers
- tmgrammar-toolkit/terminals/operators
- tmgrammar-toolkit/terminals/strings
- tmgrammar-toolkit/terminals/whitespace
- tmgrammar-toolkit/testing/helpers
- tmgrammar-toolkit/testing/index
- tmgrammar-toolkit/testing/programmatic
- tmgrammar-toolkit/testing/types
- tmgrammar-toolkit/types
- tmgrammar-toolkit/utils/file
- tmgrammar-toolkit/validation/grammar
- tmgrammar-toolkit/validation/index
- tmgrammar-toolkit/validation/regex
- tmgrammar-toolkit/validation/scope
Readme
TextMate Toolkit
A modern, type-safe toolkit for creating TextMate grammars with TypeScript. Build syntax highlighting grammars with confidence using comprehensive validation, testing, and development tools.
Why TextMate Toolkit?
Creating TextMate grammars manually means wrestling with massive JSON files, regex patterns, and scope naming without any safety net. This toolkit transforms that experience:
❌ Before: Manual Grammar Development
{
"patterns": [
{
"match": "\\b(if|else|while)\\b",
"name": "keyword.control.mylang"
}
]
}
✅ After: Type-Safe Grammar Development
import { createGrammar, scopes, regex } from 'tmgrammar-toolkit';
const keywordRule = {
key: 'keywords',
match: regex.keywords(['if', 'else', 'while']),
scope: scopes.keyword.control('mylang')
};
export const grammar = createGrammar('MyLang', 'source.mylang', ['mylang'], [keywordRule]);
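To make the before/after comparison concrete, here is a minimal sketch of how a keyword helper could turn a word list into the hand-written pattern shown in the JSON version. The names keywordsPattern and escapeRegex are hypothetical stand-ins, not the toolkit's API; the real regex.keywords may behave differently.

```typescript
// Hypothetical stand-in for a keyword helper; not the toolkit's implementation.
function escapeRegex(text: string): string {
  // Escape characters that have special meaning inside a regular expression.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function keywordsPattern(words: readonly string[]): string {
  // Join the words into one alternation and wrap it in word boundaries.
  const alternation = words.map(escapeRegex).join("|");
  return `\\b(${alternation})\\b`;
}

const pattern = keywordsPattern(["if", "else", "while"]);
console.log(pattern); // \b(if|else|while)\b
```

The word boundaries keep the pattern from matching inside longer identifiers such as "elsewhere".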
✨ Features
- 🎯 Type-Safe Development - Full TypeScript support with comprehensive type definitions
- 🧪 Integrated Testing - Programmatic and declarative testing with snapshot support
- ✅ Built-in Validation - Catch regex errors and scope naming issues before deployment
- 🔄 Multiple Output Formats - Generate JSON or Plist grammar files
- 🛠️ Powerful CLI - Unified command-line interface for all operations
- 📚 Rich Ecosystem - Pre-built patterns, helpers, and comprehensive documentation
- 🚀 Performance Optimized - Smart repository management and efficient pattern generation
🚀 Quick Start
Installation
# Using Bun (recommended)
bun add tmgrammar-toolkit
# Using npm
npm install tmgrammar-toolkit
# Using yarn
yarn add tmgrammar-toolkit
Your First Grammar (5 minutes)
Create a simple grammar for a language called "MyLang":
// mylang-grammar.ts
import { createGrammar, scopes, regex } from 'tmgrammar-toolkit';
import { COMMENT, NUM } from 'tmgrammar-toolkit/terminals';
const keywordRule = {
key: 'keywords',
match: regex.keywords(['if', 'else', 'while', 'function']),
scope: scopes.keyword.control('mylang')
};
const commentRule = {
key: 'comment',
match: regex.concat('//', '.*$'),
scope: scopes.comment.line['double-slash']('mylang')
};
const numberRule = {
key: 'number',
match: NUM.DEC,
scope: scopes.constant.numeric('mylang')
};
export const myLangGrammar = createGrammar(
'MyLang',
'source.mylang',
['mylang'],
[keywordRule, commentRule, numberRule]
);
Generate the grammar file:
# Using CLI with Bun (works directly with TypeScript!)
bunx tmt emit mylang-grammar.ts -o mylang.tmLanguage.json
# Or with npm
npx tmt emit mylang-grammar.ts -o mylang.tmLanguage.json
Or generate it programmatically:
import { emitJSON } from 'tmgrammar-toolkit';
import { myLangGrammar } from './mylang-grammar.js';
const grammarJson = await emitJSON(myLangGrammar);
console.log(grammarJson);
🏗️ Architecture
The toolkit is organized into focused modules that work together seamlessly:
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ TypeScript │───▶│ Validation │───▶│ Testing │
│ Grammar │ │ & Helpers │ │ & Debugging │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│ │ │
▼ ▼ ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Factory & API │───▶│ Type-Safe API │───▶│ TextMate JSON │
│ Functions │ │ Scopes │ │ Grammar │
└─────────────────┘ └─────────────────┘ └─────────────────┘
Core Modules:
- 🏭 Factory - Clean APIs for creating grammars and rules (factory.ts)
- 🎯 Scopes - Type-safe scope management with autocomplete (scopes/)
- 🧩 Terminals - Pre-built patterns for common language constructs (terminals/)
- 🛠️ Helpers - Regex construction utilities (helpers/)
- ✅ Validation - Pattern and grammar validation (validation/)
- 🧪 Testing - Comprehensive testing framework (testing/)
- 📤 Emission - Multi-format grammar generation (emit.ts)
💡 Key Concepts
Type-Safe Rules
Everything in a TextMate grammar is a Rule. We support three types with full TypeScript safety:
// Match Rule - for simple patterns
const keywordRule: MatchRule = {
key: 'keywords',
match: regex.keywords(['if', 'else']),
scope: scopes.keyword.control('mylang')
};
// Begin/End Rule - for multi-line constructs
const stringRule: BeginEndRule = {
key: 'string',
begin: /"/,
end: /"/,
scope: scopes.string.quoted.double('mylang'),
patterns: [escapeRule]
};
// Include Rule - for pattern composition
const expressionRule: IncludeRule = {
key: 'expression',
patterns: [numberRule, stringRule, identifierRule]
};
Intelligent Repository Management
No more manual repository management! The toolkit automatically:
- ✅ Builds repository from rule keys
- ✅ Detects duplicate keys and circular references
- ✅ Optimizes pattern organization
- ✅ Generates clean, efficient JSON
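The list above can be pictured with a small sketch of how a repository might be derived from rule keys. Everything here (SketchRule, RepositoryEntry, buildRepository) is an assumption made for illustration and is not taken from the toolkit's source. The sketch rejects two different rules that share a key and simply stops when it reaches a rule it has already visited, which is a simplification of real circular-reference detection.

```typescript
// Hypothetical rule shape, loosely modelled on the examples in this README.
interface SketchRule {
  key: string;
  match?: string;
  scope?: string;
  patterns?: SketchRule[];
}

type RepositoryEntry =
  | { name?: string; match: string }
  | { patterns: { include: string }[] };

function buildRepository(rules: SketchRule[]): Record<string, RepositoryEntry> {
  const repository: Record<string, RepositoryEntry> = {};
  const seen = new Map<string, SketchRule>();

  const visit = (rule: SketchRule): void => {
    const existing = seen.get(rule.key);
    if (existing === rule) return; // already handled (also stops cycles)
    if (existing !== undefined) {
      throw new Error(`Duplicate rule key: ${rule.key}`);
    }
    seen.set(rule.key, rule);

    if (rule.match !== undefined) {
      repository[rule.key] =
        rule.scope !== undefined
          ? { name: rule.scope, match: rule.match }
          : { match: rule.match };
    } else {
      // Include-style rule: reference children by key, then register them.
      const children = rule.patterns ?? [];
      repository[rule.key] = {
        patterns: children.map((child) => ({ include: `#${child.key}` })),
      };
      children.forEach(visit);
    }
  };

  rules.forEach(visit);
  return repository;
}

const numberRule: SketchRule = {
  key: "number",
  match: "\\b\\d+\\b",
  scope: "constant.numeric.mylang",
};
const expressionRule: SketchRule = { key: "expression", patterns: [numberRule] };

const repository = buildRepository([expressionRule, numberRule]);
console.log(Object.keys(repository)); // [ 'expression', 'number' ]
```

Keying the repository by rule key is what lets nested rules be written as plain object references in TypeScript while the emitted JSON uses "#key" includes.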
Comprehensive Testing
Write tests that actually verify your grammar works:
import { createTesterFromContent } from 'tmgrammar-toolkit/testing';
const tester = createTesterFromContent(myGrammar, 'source.mylang');
const tokens = await tester.tokenize('if (condition) { /* comment */ }');
// Verify specific token scopes
tester.expectTokenScope(tokens, 'if', 'keyword.control.mylang');
tester.expectTokenScope(tokens, '/*', 'comment.block.mylang');
🛠️ CLI Commands
The tmt command provides unified access to all toolkit functionality. It works with Bun for direct TypeScript execution, with no build step required.
Grammar Generation
# Generate JSON (default)
tmt emit my-grammar.ts
# Generate Plist format
# Note: The CLI does not currently support a --plist flag.
# This must be done programmatically.
# See programmatic example below.
# Emit specific export
tmt emit my-grammar.ts myGrammarExport
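Since Plist output has to be produced in code, it may help to see what that format looks like. The sketch below is a generic, hand-rolled serializer for the XML property-list format, covering only strings, arrays, and dictionaries. It is not the toolkit's emitter, and the names PlistValue, toPlistNode, and toPlist are hypothetical.

```typescript
// Generic XML plist serializer sketch; the toolkit's own emitter is not shown here.
type PlistValue = string | PlistValue[] | { [key: string]: PlistValue };

function escapeXml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function toPlistNode(value: PlistValue): string {
  if (typeof value === "string") {
    return `<string>${escapeXml(value)}</string>`;
  }
  if (Array.isArray(value)) {
    return `<array>${value.map((item) => toPlistNode(item)).join("")}</array>`;
  }
  const entries = Object.entries(value).map(
    ([key, child]) => `<key>${escapeXml(key)}</key>${toPlistNode(child)}`
  );
  return `<dict>${entries.join("")}</dict>`;
}

function toPlist(value: PlistValue): string {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">',
    '<plist version="1.0">',
    toPlistNode(value),
    "</plist>",
  ].join("\n");
}

const plistText = toPlist({
  name: "MyLang",
  scopeName: "source.mylang",
  fileTypes: ["mylang"],
});
console.log(plistText);
```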
Testing & Validation
# Run declarative tests
tmt test 'tests/**/*.test.mylang' -g grammar.json
# Generate/update snapshots
tmt snap 'tests/**/*.mylang' --update
# Validate grammar files
tmt validate grammar.json
tmt validate my-grammar.ts
📚 Rich Pattern Library
Pre-built patterns for common language constructs:
import { COMMENT, NUM, ID, OP, STRING } from 'tmgrammar-toolkit/terminals';
// Comments
COMMENT.SLASHES // "//"
COMMENT.HASH // "#"
COMMENT.BLOCK.START // "/*"
// Numbers
NUM.DEC // Decimal: 123, 123.45
NUM.HEX // Hex: 0xFF, 0xABCD
NUM.BIN // Binary: 0b1010
NUM.OCT // Octal: 0o777
// Identifiers
ID // Standard: [a-zA-Z_][a-zA-Z0-9_]*
CAMEL_CASE_ID // camelCase
SNAKE_CASE_ID // snake_case
PASCAL_CASE_ID // PascalCase
// Operators
OP.ASSIGNMENT // =, +=, -=, *=, /=
OP.COMPARISON // ==, !=, <, >, <=, >=
OP.ARITHMETIC // +, -, *, /, %
OP.LOGICAL // &&, ||, !
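For a feel of what such terminals contain, the sketch below defines number patterns that match the examples in the comments above. These regular expressions are assumptions written for this illustration (SKETCH_NUM and matchesWhole are hypothetical names); the toolkit's actual NUM constants may be defined differently.

```typescript
// Hypothetical patterns mirroring the documented examples; not the toolkit's NUM.
const SKETCH_NUM = {
  DEC: /\b\d+(?:\.\d+)?\b/, // 123, 123.45
  HEX: /\b0x[0-9a-fA-F]+\b/, // 0xFF, 0xABCD
  BIN: /\b0b[01]+\b/, // 0b1010
  OCT: /\b0o[0-7]+\b/, // 0o777
} as const;

// Check that a pattern matches the entire text, not just part of it.
function matchesWhole(pattern: RegExp, text: string): boolean {
  return new RegExp(`^(?:${pattern.source})$`).test(text);
}

console.log(matchesWhole(SKETCH_NUM.DEC, "123.45")); // true
console.log(matchesWhole(SKETCH_NUM.HEX, "0xFF")); // true
console.log(matchesWhole(SKETCH_NUM.BIN, "0b1012")); // false
```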
🎯 Type-Safe Scope System
The scope system replaces hand-typed scope strings with typed objects, so a misspelled scope is caught by the TypeScript compiler and editors can autocomplete scope names:
Basic Scopes
import { scopes } from 'tmgrammar-toolkit';
// Type-safe scope access with full autocomplete
scopes.keyword.control.conditional // "keyword.control.conditional"
scopes.string.quoted.double // "string.quoted.double"
scopes.entity.name.function // "entity.name.function"
// Automatic string conversion in templates
const rule = `${scopes.comment.line.double_slash}`; // "comment.line.double-slash"
// Call with language suffix
scopes.keyword.control.conditional('js'); // "keyword.control.conditional.js"
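One way an object can be a property path, a string, and a function at once is a JavaScript Proxy wrapped around a function. The sketch below shows that technique under stated assumptions: ScopeNode and makeScope are hypothetical names, the sketch accepts any property name rather than a fixed, typed scope tree, and it says nothing about how the toolkit is actually implemented.

```typescript
// Hypothetical sketch of callable, string-convertible scope paths.
interface ScopeNode {
  (suffix: string): string;
  readonly [segment: string]: ScopeNode;
}

function makeScope(path: string[]): ScopeNode {
  const render = (suffix?: string): string =>
    [...path, ...(suffix ? [suffix] : [])].join(".");

  // The proxy target is a function, so the proxy itself stays callable.
  const target = (suffix: string): string => render(suffix);

  return new Proxy(target, {
    get(_target, prop) {
      // String conversion (template literals, String(...)) yields the path.
      if (prop === Symbol.toPrimitive || prop === "toString" || prop === "valueOf") {
        return () => render();
      }
      if (typeof prop === "symbol") return undefined;
      // Any other property access extends the path by one segment.
      return makeScope([...path, prop]);
    },
  }) as unknown as ScopeNode;
}

const sketchScopes = makeScope([]);
const plainScope = `${sketchScopes.keyword.control.conditional}`;
const suffixedScope = sketchScopes.keyword.control.conditional("js");

console.log(plainScope); // keyword.control.conditional
console.log(suffixedScope); // keyword.control.conditional.js
```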
Language-Specific Scopes
import { scopesFor } from 'tmgrammar-toolkit';
// Create language-bound scopes
const jsScopes = scopesFor('js');
jsScopes.keyword.control.conditional; // "keyword.control.conditional.js"
jsScopes.string.quoted.double; // "string.quoted.double.js"
// Still callable for additional suffixes
jsScopes.keyword.control.conditional('async'); // "keyword.control.conditional.js.async"
Extended Custom Scopes
// Advanced: Custom scopes with extensions
const rclScopes = scopesFor({
suffix: 'rcl',
allowScopeExtension: true
}, {
meta: {
section: {
agent: {
definition: 'Agent definition blocks',
name: 'Agent name identifiers'
},
message: {
definition: 'Message definition blocks',
shortcut: 'Message shortcut syntax'
}
}
},
punctuation: {
definition: {
flow: {
arrow: 'Flow control arrows (->)'
}
}
}
});
// Use both predefined and custom scopes
rclScopes.keyword.control.conditional; // "keyword.control.conditional.rcl"
rclScopes.meta.section.agent.definition; // "meta.section.agent.definition.rcl"
rclScopes.punctuation.definition.flow.arrow; // "punctuation.definition.flow.arrow.rcl"
Extension Modes
Control how custom scopes are merged with three modes:
// No extensions - only language suffix
const basicScopes = scopesFor('lang');
// Extensions everywhere
const fullExtensions = scopesFor({
suffix: 'lang',
allowScopeExtension: true
}, customScopes);
// Extensions only on terminal/leaf scopes
const leafExtensions = scopesFor({
suffix: 'lang',
allowScopeExtension: "on-leafs"
}, customScopes);
Multiple Naming Conventions
Access scopes using your preferred style:
// All equivalent - use your preferred convention:
scopes.comment.line['double-slash'] // Kebab-case (TextMate standard)
scopes.comment.line.doubleSlash // camelCase
scopes.comment.line.double_slash // snake_case
// All produce: "comment.line.double-slash"
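The equivalence of the three spellings can be sketched as a normalization step applied to each path segment. The helper names toKebabSegment and scopePath are hypothetical, and the toolkit may resolve aliases differently (for example with a precomputed lookup table).

```typescript
// Hypothetical normalizer: camelCase and snake_case segments become kebab-case.
function toKebabSegment(segment: string): string {
  return segment
    .replace(/_/g, "-") // snake_case -> kebab-case
    .replace(/([a-z0-9])([A-Z])/g, "$1-$2") // split camelCase humps
    .toLowerCase();
}

function scopePath(...segments: string[]): string {
  return segments.map(toKebabSegment).join(".");
}

console.log(scopePath("comment", "line", "double-slash")); // comment.line.double-slash
console.log(scopePath("comment", "line", "doubleSlash")); // comment.line.double-slash
console.log(scopePath("comment", "line", "double_slash")); // comment.line.double-slash
```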
🎯 Real-World Examples
The toolkit includes complete, production-ready grammar implementations:
Simple Language Example
// Using language-specific scopes for consistency
import { createGrammar, regex, scopesFor } from 'tmgrammar-toolkit';
import { NUM } from 'tmgrammar-toolkit/terminals';
const simpleScopes = scopesFor('simple');
const simpleGrammar = createGrammar('Simple', 'source.simple', ['simple'], [
{ key: 'keywords', match: regex.keywords(['if', 'else']), scope: simpleScopes.keyword.control },
{ key: 'strings', begin: /"/, end: /"/, scope: simpleScopes.string.quoted.double },
{ key: 'numbers', match: NUM.DEC, scope: simpleScopes.constant.numeric }
]);
Advanced: RCL Language with Custom Scopes
// Custom domain-specific scopes for RCL language
const rclScopes = scopesFor({ suffix: 'rcl', allowScopeExtension: true }, {
meta: {
section: {
agent: { definition: null, name: null },
message: { definition: null, shortcut: null },
flow: { rule: null, condition: null }
},
embedded: {
expression: { single: null, block: null }
}
},
punctuation: {
definition: {
flow: { arrow: null },
embedded: { marker: null }
}
}
});
const rclGrammar = createGrammar('RCL', 'source.rcl', ['rcl'], [
{
key: 'agent-section',
begin: /^(\s*)(agent)(\s*)(:)/,
beginCaptures: {
2: { name: rclScopes.keyword.declaration },
4: { name: rclScopes.punctuation.separator }
},
name: rclScopes.meta.section.agent.definition,
patterns: [
{
name: rclScopes.meta.section.agent.name,
match: /[A-Z][a-zA-Z0-9_\s-]*/
}
]
},
{
key: 'flow-arrow',
name: rclScopes.punctuation.definition.flow.arrow,
match: /->/
},
{
key: 'embedded-js',
name: rclScopes.meta.embedded.expression.single,
begin: /\$js>/,
beginCaptures: {
0: { name: rclScopes.punctuation.definition.embedded.marker }
},
end: /$/
}
]);
Find complete examples in the examples/ directory.
📖 Documentation
Comprehensive documentation covering all aspects of the toolkit:
- 📋 Getting Started Guide - Build your first grammar in 15 minutes
- 🏗️ Architecture Overview - Deep dive into toolkit design
- 📚 API Reference - Complete function and type reference
- 🎯 Scopes Guide - Master type-safe scope management
- 🔧 Troubleshooting - Solutions for common issues
- 📝 TextMate Scopes Reference - Complete scope naming guide
🧪 Testing Your Grammars
Programmatic Testing
import { createTesterFromContent } from 'tmgrammar-toolkit/testing';
const tester = createTesterFromContent(grammar, 'source.mylang');
// Test tokenization
const tokens = await tester.tokenize('function hello() { return "world"; }');
// Assertions
tester.expectTokenScope(tokens, 'function', 'keyword.declaration.mylang');
tester.expectTokenScope(tokens, 'hello', 'entity.name.function.mylang');
tester.expectTokenScope(tokens, '"world"', 'string.quoted.double.mylang');
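To show what a scope assertion of this kind checks, here is a self-contained sketch that runs against a hand-written token list instead of a real tokenizer. SketchToken and assertTokenScope are hypothetical; the shape of the toolkit's tokens and the exact behaviour of expectTokenScope are not documented in this README.

```typescript
// Hypothetical token shape: the matched text plus its scope stack.
interface SketchToken {
  text: string;
  scopes: string[];
}

function assertTokenScope(tokens: SketchToken[], text: string, scope: string): void {
  const candidates = tokens.filter((token) => token.text === text);
  if (candidates.length === 0) {
    throw new Error(`No token with text ${JSON.stringify(text)}`);
  }
  if (!candidates.some((token) => token.scopes.includes(scope))) {
    const found = candidates.map((token) => token.scopes.join(" ")).join(" | ");
    throw new Error(`Expected ${JSON.stringify(text)} to have scope ${scope}, found: ${found}`);
  }
}

// Hand-written tokens standing in for tokenizer output.
const sampleTokens: SketchToken[] = [
  { text: "function", scopes: ["source.mylang", "keyword.declaration.mylang"] },
  { text: "hello", scopes: ["source.mylang", "entity.name.function.mylang"] },
];

assertTokenScope(sampleTokens, "function", "keyword.declaration.mylang");
assertTokenScope(sampleTokens, "hello", "entity.name.function.mylang");
console.log("all scope assertions passed");
```

Including the scopes that were actually found in the error message makes a failing assertion much easier to diagnose.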
Declarative Testing
Create test files with embedded scope assertions:
// test.mylang
function hello() {
// <- keyword.declaration.mylang
// ^ entity.name.function.mylang
return "world";
// ^ string.quoted.double.mylang
}
Run tests:
npx tmt test 'tests/**/*.test.mylang' -g mylang.tmLanguage.json
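The assertion comments in a declarative test file rely on column alignment: each caret points at the same column in the nearest preceding source line. The sketch below parses only caret assertions behind a "//" comment token and ignores the "<-" form. ScopeAssertion and parseCaretAssertions are hypothetical names, and the real test runner handles more cases than this.

```typescript
// Simplified sketch: parse "// ^^^ scope" assertion lines by caret column.
interface ScopeAssertion {
  sourceLine: number; // index of the source line being checked
  from: number; // first asserted column (inclusive)
  to: number; // last asserted column (exclusive)
  scopes: string[];
}

function parseCaretAssertions(lines: string[]): ScopeAssertion[] {
  const assertions: ScopeAssertion[] = [];
  let lastSourceLine = -1;

  lines.forEach((line, index) => {
    const match = /^(\s*\/\/\s*)(\^+)\s+(\S.*)$/.exec(line);
    if (match === null) {
      lastSourceLine = index; // anything else counts as source text
      return;
    }
    if (lastSourceLine < 0) {
      throw new Error(`Assertion on line ${index + 1} has no source line above it`);
    }
    const from = match[1].length;
    assertions.push({
      sourceLine: lastSourceLine,
      from,
      to: from + match[2].length,
      scopes: match[3].trim().split(/\s+/),
    });
  });

  return assertions;
}

const sampleLines = [
  'return "world";',
  '//     ^^^^^^^ string.quoted.double.mylang',
];

const parsed = parseCaretAssertions(sampleLines);
const first = parsed[0];
console.log(sampleLines[first.sourceLine].slice(first.from, first.to)); // "world"
```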
⚡ Performance Features
- Smart Pattern Optimization - Automatic pattern combining and ordering
- Efficient Repository Generation - Minimal JSON output with optimal structure
- Regex Validation - Catch expensive patterns before they cause performance issues
- Atomic Grouping Support - Use non-backtracking patterns for better performance
- Memory-Efficient Processing - Stream-based grammar generation for large grammars
🎨 Editor Integration
Works seamlessly with all editors supporting TextMate grammars:
- VS Code - First-class support with extension development tools
- Sublime Text - Direct .tmLanguage file support
- Atom - Grammar package integration
- Vim/Neovim - Via TreeSitter and syntax plugins
- Emacs - Multiple TextMate grammar packages
- Any Editor - Standard TextMate grammar format
🚀 Migration from Manual Grammars
Already have a TextMate grammar? The toolkit makes migration straightforward:
- Convert patterns using our regex helpers
- Replace scope strings with type-safe scope API
- Add validation to catch existing issues
- Write tests to prevent regressions
- Optimize patterns using our performance tools
🗺️ Roadmap
The future of tmgrammar-toolkit is focused on enhancing the developer experience, expanding authoring capabilities, and integrating with the broader ecosystem. Here's a look at what's planned for upcoming versions.
Version 2.0: Polishing the Core Experience
The primary goal of v2.0 is to solidify the existing feature set and improve the overall authoring workflow.
- CLI Enhancements: A refactored and more maintainable CLI architecture. Full implementation of all documented flags, including --plist and other potential emission formats.
- Advanced Authoring: A fluent API for building grammars and improved repository management.
- Improved Testing: Richer test reporting and performance/complexity analysis for grammars.
- Test Runner Integration: Adapters for seamless integration with Jest and Vitest.
Version 3.0: Advanced Tooling and Ecosystem Integration
Version 3.0 will introduce advanced tools to support large-scale grammar development and deeper integration with the TextMate ecosystem.
- Developer Experience: A development server with hot-reloading, project scaffolding, and a dedicated VS Code extension.
- Advanced Composition: Support for grammar inheritance and composition to manage complex language definitions.
- Enhanced Quality Assurance: Test coverage analysis and theme compatibility checks.
- Ecosystem Integration: Tools for converting, optimizing, and sharing grammars.
For a detailed breakdown of our future plans, see the v2.0 and v3.0 roadmaps.
🤝 Contributing
We welcome contributions! Whether you're:
- 🐛 Reporting bugs - Help us improve stability
- 💡 Suggesting features - Share ideas for new functionality
- 📝 Improving docs - Make the toolkit more accessible
- 🧩 Adding patterns - Contribute to the terminal library
- 🔧 Fixing issues - Submit pull requests
See our contributing guidelines for details.
📦 Credits and Related Projects
This toolkit builds upon and integrates with excellent open-source projects:
- vscode-tmgrammar-test - Grammar testing framework
- vscode-textmate - TextMate grammar parsing
- oniguruma - Regular expression engine
- VS Code Language Extensions - Editor integration
We also vendor the TMLanguage JSON schema, originally created by Martin Ring. You can find the original project here.
We are grateful for the work of everyone that contributed to the projects above!
📄 License
MIT License - see LICENSE for details.
🌟 Why Choose TextMate Toolkit?
For Individual Developers:
- ⚡ Faster Development - Type safety and helpers accelerate grammar creation
- 🐛 Fewer Bugs - Validation catches issues early in development
- 📚 Better Documentation - Comprehensive guides and examples
- 🔧 Easier Debugging - Testing tools help isolate and fix issues
For Teams:
- 🤝 Consistent Standards - Shared patterns and conventions
- 👥 Better Collaboration - Code review-friendly TypeScript
- 📈 Scalable Approach - Reusable components and patterns
- ✅ Quality Assurance - Automated testing and validation
For the Ecosystem:
- 🌍 Open Source - MIT license encourages adoption and contribution
- 🔄 Interoperable - Standard TextMate format works everywhere
- 📊 Well-Tested - Comprehensive test suite ensures reliability
- 🚀 Modern Tooling - Built with current TypeScript and testing best practices
Ready to build better syntax highlighting?