Package Exports

  • codemodctl
  • codemodctl/codeowners
  • codemodctl/sharding

Readme

codemodctl

CLI tool and utilities for workflow engine operations, file sharding, and codeowner analysis.

Installation

npm install codemodctl

Usage

As a CLI Tool

# Analyze CODEOWNERS and generate sharding configuration
codemodctl codeowner --shard-size 20 --state-prop shards --rule ./rule.yaml

As a Library

Deterministic File Sharding

import {
  getShardForFilename,
  fitsInShard,
  distributeFilesAcrossShards,
  analyzeShardScaling
} from 'codemodctl/sharding';

// Get the shard index for a specific file - always deterministic!
const shardIndex = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });

// Same file + same shard count = same result, every time
const shard1 = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
const shard2 = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
console.log(shard1 === shard2); // always true

// Check if a file belongs to a specific shard
const belongsToShard = fitsInShard('src/components/Button.tsx', { 
  shardCount: 5, 
  shardIndex: 2 
});

// Distribute all files across shards with consistent hashing
const files = ['file1.ts', 'file2.ts', 'file3.ts'];
const distribution = distributeFilesAcrossShards(files, 5);

// Check scaling behavior - minimal reassignment when growing
const scalingAnalysis = analyzeShardScaling(files, 5, 6);
console.log(`${scalingAnalysis.stableFiles} files stay in same shard`);
console.log(`${scalingAnalysis.reassignmentPercentage}% reassignment`); // Much less than 100%

Codeowner Analysis

import { analyzeCodeowners, findCodeownersFile } from 'codemodctl/codeowners';

// Analyze codeowners and generate shard configuration
const result = await analyzeCodeowners({
  shardSize: 20,
  rulePath: './rule.yaml',
  projectRoot: process.cwd()
});

console.log(`Generated ${result.shards.length} shards for ${result.totalFiles} files`);
result.teams.forEach(team => {
  console.log(`Team "${team.team}" owns ${team.fileCount} files`);
});

Complete API

import codemodctl from 'codemodctl';

// Access all utilities through the default export
const shardIndex = codemodctl.sharding.getShardForFilename('file.ts', { shardCount: 5 });
const analysis = await codemodctl.codeowners.analyzeCodeowners({
  shardSize: 20,
  rulePath: './rule.yaml',
  projectRoot: process.cwd()
});

Key Features

Consistent File Sharding

The sharding algorithm uses consistent hashing to ensure:

  • Perfect consistency: Same file + same shard count = same result, always
  • No external state: Result depends only on filename and shard count
  • Minimal reassignment: When scaling up, only ~20-40% of files move (not 100%)
  • Stable scaling: Adding new shards leaves most existing file assignments in place
  • Simple API: No complex parameters or configuration needed
  • Team-aware sharding: Works with codeowner boundaries
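
The properties above can be illustrated with a minimal sketch. It uses rendezvous (highest-random-weight) hashing, one well-known consistent-hashing scheme, purely as a stand-in: this README does not say which scheme codemodctl uses internally, and `hash32` and `pickShard` are hypothetical names that are not part of the package.

```typescript
// Hypothetical sketch, NOT codemodctl's implementation.

// 32-bit FNV-1a over UTF-16 code units, followed by a murmur3-style
// finalizer to spread small input differences across all bits.
function hash32(input: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0; // force an unsigned 32-bit result
}

// Rendezvous hashing: score every (shard, file) pair and pick the shard
// with the highest score. The result depends only on the filename and
// the shard count. Assumes shardCount >= 1.
function pickShard(filename: string, shardCount: number): number {
  let best = 0;
  let bestScore = -1;
  for (let shard = 0; shard < shardCount; shard++) {
    const score = hash32(`${shard}:${filename}`);
    if (score > bestScore) {
      bestScore = score;
      best = shard;
    }
  }
  return best;
}

const files = Array.from({ length: 200 }, (_, i) => `src/file${i}.ts`);
const moved = files.filter((f) => pickShard(f, 5) !== pickShard(f, 6));
// With this scheme, a file that moves can only move to the new shard (index 5).
console.log(`${moved.length} of ${files.length} files moved`);
```

Because the scores for the existing shards do not change when a shard is added, a file is reassigned only if the new shard outscores its current one. The exact fraction that moves depends on the scheme and the file set.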

Codeowner Analysis

  • Automatic CODEOWNERS detection: Searches common locations (root, .github/, docs/)
  • AST-grep integration: Analyze files using custom rules
  • Team-based grouping: Groups files by their assigned teams
  • Shard generation: Creates optimal shard configuration based on team ownership
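
As a rough illustration of team-based grouping followed by shard generation, the sketch below splits each team's files into chunks of at most `shardSize`. The `Shard` shape and the `chunkByOwner` name are assumptions made for this example; the real `generateShards` may use a different output format and a hash-based assignment rather than sequential chunking.

```typescript
// Hypothetical shapes; the real codemodctl types may differ.
interface Shard {
  team: string;
  shardIndex: number; // position within the team's own shards
  files: string[];
}

// Split every team's files into shards of at most `shardSize` files.
function chunkByOwner(filesByOwner: Map<string, string[]>, shardSize: number): Shard[] {
  if (!Number.isInteger(shardSize) || shardSize < 1) {
    throw new RangeError('shardSize must be a positive integer');
  }
  const shards: Shard[] = [];
  filesByOwner.forEach((files, team) => {
    // Sort a copy so the output does not depend on discovery order.
    const sorted = files.slice().sort();
    for (let i = 0; i < sorted.length; i += shardSize) {
      shards.push({
        team,
        shardIndex: i / shardSize,
        files: sorted.slice(i, i + shardSize),
      });
    }
  });
  return shards;
}

const owners = new Map<string, string[]>([
  ['@org/frontend', ['src/b.tsx', 'src/a.tsx', 'src/c.tsx']],
  ['@org/backend', ['api/x.ts']],
]);
const shards = chunkByOwner(owners, 2);
console.log(shards.map((s) => `${s.team}#${s.shardIndex}: ${s.files.length} file(s)`));
```

Keeping shards within a single team's files means each shard can be reviewed by one owner, which is presumably the point of sharding along codeowner boundaries.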

API Reference

Sharding Functions

  • getShardForFilename(filename, { shardCount }) - Get shard index for a file
  • fitsInShard(filename, { shardCount, shardIndex }) - Check shard membership
  • distributeFilesAcrossShards(files, shardCount) - Distribute files across shards
  • calculateOptimalShardCount(totalFiles, targetShardSize) - Calculate optimal shard count
  • getFileHashPosition(filename) - Get consistent hash position for a file
  • analyzeShardScaling(files, oldCount, newCount) - Analyze reassignment when scaling

All functions are deterministic: same input always produces the same output.
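
Two of the helpers listed above are plausibly thin wrappers. The sketch below assumes that `calculateOptimalShardCount` is a ceiling division and that `fitsInShard` compares the computed shard with the requested index; neither assumption is confirmed by this README, so the functions carry a `Sketch` suffix and take the assignment function as a parameter.

```typescript
// Assumption: the smallest shard count that keeps the average shard
// at or below targetShardSize, with a minimum of one shard.
function optimalShardCountSketch(totalFiles: number, targetShardSize: number): number {
  if (targetShardSize < 1) {
    throw new RangeError('targetShardSize must be at least 1');
  }
  return Math.max(1, Math.ceil(totalFiles / targetShardSize));
}

// Assumption: membership is equality with the deterministic assignment.
function fitsInShardSketch(
  filename: string,
  opts: { shardCount: number; shardIndex: number },
  getShard: (filename: string, shardCount: number) => number,
): boolean {
  return getShard(filename, opts.shardCount) === opts.shardIndex;
}

// Toy assignment for demonstration only: filename length modulo shard count.
const byLength = (filename: string, shardCount: number): number =>
  filename.length % shardCount;

console.log(optimalShardCountSketch(95, 20)); // 5
console.log(fitsInShardSketch('abc.ts', { shardCount: 5, shardIndex: 1 }, byLength)); // true
```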

Scaling behavior: When going from N to N+1 shards, typically only 20-40% of files get reassigned to new locations, making it ideal for incremental scaling scenarios.
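
To make the reassignment figure concrete, here is a minimal sketch of how such a measurement could be computed for any assignment function. The `stableFiles` and `reassignmentPercentage` fields mirror the ones used earlier in this README; `measureReassignment`, `reassignedFiles`, and the `Assign` type are hypothetical and only illustrate the idea behind `analyzeShardScaling`.

```typescript
// Any deterministic mapping from (filename, shardCount) to a shard index.
type Assign = (filename: string, shardCount: number) => number;

interface ScalingReport {
  stableFiles: number;
  reassignedFiles: number; // hypothetical field, added for this sketch
  reassignmentPercentage: number;
}

// Count how many files keep their shard when the shard count changes.
function measureReassignment(
  files: string[],
  oldCount: number,
  newCount: number,
  assign: Assign,
): ScalingReport {
  let stableFiles = 0;
  for (const file of files) {
    if (assign(file, oldCount) === assign(file, newCount)) {
      stableFiles++;
    }
  }
  const reassignedFiles = files.length - stableFiles;
  const reassignmentPercentage =
    files.length === 0 ? 0 : (reassignedFiles / files.length) * 100;
  return { stableFiles, reassignedFiles, reassignmentPercentage };
}

// Toy assignment for demonstration only: filename length modulo shard count.
const naive: Assign = (filename, shardCount) => filename.length % shardCount;

const report = measureReassignment(
  ['a', 'bb', 'ccc', 'dddd', 'eeeee', 'ffffff'],
  5,
  6,
  naive,
);
console.log(report.stableFiles, report.reassignedFiles); // 4 2
```

Running the same measurement against the package's own assignment function on a real file list is the most direct way to check the reassignment rate for a given project.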

Codeowner Functions

  • analyzeCodeowners(options) - Complete analysis with shard generation
  • findCodeownersFile(projectRoot?, explicitPath?) - Locate CODEOWNERS file
  • loadAstGrepRule(rulePath) - Parse AST-grep rule from YAML
  • analyzeFilesByOwner(codeownersPath, rule, projectRoot?) - Group files by owner
  • generateShards(filesByOwner, shardSize) - Generate shard configuration
  • normalizeOwnerName(owner) - Normalize owner names
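
The CODEOWNERS lookup described under Key Features can be sketched as a search over a few candidate paths. This is an assumption-laden illustration: `locateCodeowners` is a hypothetical name, the search order simply follows the order given in this README (root, .github/, docs/), and the file-existence check is injected so the example runs without touching the filesystem.

```typescript
// Hypothetical sketch of the lookup; the real findCodeownersFile may differ.
function locateCodeowners(
  projectRoot: string,
  exists: (path: string) => boolean,
  explicitPath?: string,
): string | undefined {
  // An explicit path, when given, is the only location considered.
  if (explicitPath !== undefined) {
    return exists(explicitPath) ? explicitPath : undefined;
  }
  const root = projectRoot.replace(/\/+$/, ''); // drop trailing slashes
  const relativeCandidates = ['CODEOWNERS', '.github/CODEOWNERS', 'docs/CODEOWNERS'];
  for (const rel of relativeCandidates) {
    const candidate = `${root}/${rel}`;
    if (exists(candidate)) {
      return candidate;
    }
  }
  return undefined;
}

// In-memory stand-in for the filesystem.
const present = ['/repo/.github/CODEOWNERS', '/repo/docs/CODEOWNERS'];
const existsInMemory = (p: string): boolean => present.indexOf(p) !== -1;

console.log(locateCodeowners('/repo', existsInMemory)); // /repo/.github/CODEOWNERS
```

In a real script the `exists` argument would typically be `existsSync` from `node:fs`.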

Usage Examples

Simple Deterministic Sharding

import {
  getShardForFilename,
  distributeFilesAcrossShards,
  analyzeShardScaling
} from 'codemodctl/sharding';

// Get shard for a file - always deterministic
const shard = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });

// Same input always gives same output
const shard1 = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
const shard2 = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
console.log(shard1 === shard2); // always true

// Different shard counts may give different results (expected behavior)
const shard5 = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
const shard10 = getShardForFilename('src/components/Button.tsx', { shardCount: 10 });
// shard5 and shard10 will likely be different, but each is consistent

// Distribute files with consistent hashing for stable scaling
const files = ['file1.ts', 'file2.ts', 'file3.ts'];
const distribution = distributeFilesAcrossShards(files, 5);

// When you need more capacity, most files stay in place
const analysis = analyzeShardScaling(files, 5, 6);
// Only ~20-40% of files get reassigned, not all of them!

Key Benefits

  • No complex parameters: Just filename and shard count
  • Perfectly deterministic: Same input = same output, always
  • Stable scaling: When adding shards, most files stay in their original shards
  • Minimal reassignment: Only ~20-40% of files move when scaling up
  • Fast and simple: Hash-based assignment with consistent ring placement
  • Works across runs: A file gets the same shard regardless of other changes to the filesystem

CLI Commands

codeowner

Analyze CODEOWNERS file and generate sharding configuration.

codemodctl codeowner [options]

Options:
  -s, --shard-size <size>     Number of files per shard (required)
  -p, --state-prop <prop>     Property name for state output (required)  
  -c, --codeowners <path>     Path to CODEOWNERS file (optional)
  -r, --rule <path>           Path to AST-grep rule file (required)

Environment variables:

  • STATE_OUTPUTS: Path to write state output file

License

MIT