JSPM

  • Downloads 2372
  • License MIT

Semantic duplicate pattern detection for AI-generated code - finds similar implementations that waste AI context tokens

Package Exports

  • @aiready/pattern-detect

Readme

@aiready/pattern-detect

Semantic duplicate pattern detection for AI-generated code

Finds semantically similar but syntactically different code patterns that waste AI context and confuse models.

🏛️ Architecture

                    🎯 USER
                      │
                      ▼
            🎛️  CLI (orchestrator)
                      │
    ┌─────────────────┴─────────────────┐
    │                                   │
    ▼                                   ▼
┌────────┐                        ┌────────┐
│🎨 VIS- │                        │ ANALY- │
│UALIZER │                        │  SIS   │
│✅ Ready│                        │ SPOKES │
└────────┘                        └───┬────┘
    │                                 │
    │           ┌─────────────────────┼─────────────────────┐
    │           ▼                     ▼                     ▼
    │     ┌────────┐           ┌────────┐           ┌────────┐
    │     │📊 PAT- │           │📦 CON- │           │🔧 CON- │
    │     │TERN    │           │TEXT    │           │SISTENCY│
    │     │DETECT  │           │ANALYZER│           │        │
    │     │        │           │        │           │        │
    │     │✅ Ready│           │✅ Ready│           │✅ Ready│
    │     └────────┘           └────────┘           └────────┘
    │           │                     │                     │
    │           │    ← YOU ARE HERE ──┘                     │
    │           │                                           │
    └───────────┴───────────────────────────────────────────┘
                            │
                            ▼
                  🏒 HUB (@aiready/core)

🌍 Language Support

Currently Supported (64% market coverage):

  • ✅ TypeScript (.ts, .tsx) - AST-based pattern extraction
  • ✅ JavaScript (.js, .jsx) - AST-based pattern extraction
  • ✅ Python (.py) - Function/class pattern extraction, similarity scoring

Roadmap:

  • 🔜 Java (Q3 2026) - Method/class patterns, Spring annotations
  • 🔜 Go (Q4 2026) - Function patterns, interface implementations
  • 🔜 Rust (Q4 2026) - Function/trait patterns, macro detection
  • 🔜 C# (Q1 2027) - Method/class patterns, LINQ queries

🚀 Quick Start

Zero config, works out of the box:

# Run without installation (recommended)
npx @aiready/pattern-detect ./src

# Or use the unified CLI (includes all AIReady tools)
npx @aiready/cli scan ./src

# Or install globally for simpler command and faster runs
npm install -g @aiready/pattern-detect
aiready-patterns ./src

🎯 Input & Output

Input: Path to your source code directory

aiready-patterns ./src

Output: Terminal report + optional JSON file (saved to .aiready/ directory)

📊 Duplicate Pattern Analysis
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📁 Files analyzed: 47
⚠️  Duplicate patterns: 12 files with 23 issues
💰 Wasted tokens: 8,450

CRITICAL (6 files)
  src/handlers/users.ts - 4 duplicates (1,200 tokens)
  src/handlers/posts.ts - 3 duplicates (950 tokens)

✨ Smart Defaults (Zero Config)

  • ✅ Auto-excludes test files (**/*.test.*, **/*.spec.*, **/__tests__/**)
  • ✅ Auto-excludes build outputs (dist/, build/, .next/)
  • ✅ Auto-excludes dependencies (node_modules/)
  • ✅ Adaptive threshold: Adjusts similarity detection based on codebase size
  • ✅ Pattern classification: Automatically categorizes duplicates (API handlers, validators, etc.)

Override defaults with --include-tests or --exclude <patterns> as needed
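
As a rough illustration of the default exclusions listed above, the sketch below filters a file list with plain regular expressions. The helper name isExcludedByDefault, its includeTests parameter, and the regexes are assumptions made for this example; the tool's own glob matching is not documented here and may behave differently.

```typescript
// Hypothetical helper: approximates the default exclusions listed above.
// The regexes are illustrative stand-ins for the globs, not the tool's matcher.
const DEFAULT_EXCLUDES: RegExp[] = [
  /\.(test|spec)\.[^/]+$/,       // **/*.test.*, **/*.spec.*
  /(^|\/)__tests__\//,           // **/__tests__/**
  /(^|\/)(dist|build|\.next)\//, // build outputs
  /(^|\/)node_modules\//,        // dependencies
];

function isExcludedByDefault(path: string, includeTests = false): boolean {
  const normalized = path.replace(/\\/g, "/"); // tolerate Windows separators
  return DEFAULT_EXCLUDES.some((pattern, index) => {
    // The first two patterns cover test files; skip them when tests are included.
    if (includeTests && index < 2) return false;
    return pattern.test(normalized);
  });
}

const files = [
  "src/api/users.ts",
  "src/api/users.test.ts",
  "src/__tests__/helpers.ts",
  "dist/index.js",
  "node_modules/lodash/index.js",
];

// Only src/api/users.ts survives the default filter.
const analyzed = files.filter((file) => !isExcludedByDefault(file));
```

Passing true as the second argument mimics the effect of --include-tests in this sketch: test files are kept, while build outputs and dependencies stay excluded.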

🎯 What It Does

AI tools generate similar code in different ways because they lack awareness of your codebase patterns. This tool:

  • Semantic detection: Finds functionally similar code (not just copy-paste) using Jaccard similarity on AST tokens
  • Pattern classification: Groups duplicates by type (API handlers, validators, utilities, etc.)
  • Token cost analysis: Shows wasted AI context budget
  • Refactoring guidance: Suggests specific fixes per pattern type

How It Works

The tool uses Jaccard similarity to compare code semantically:

  1. Parses TypeScript/JavaScript files into Abstract Syntax Trees (AST)
  2. Extracts semantic tokens (identifiers, operators, keywords) from each function
  3. Calculates Jaccard similarity between token sets: |A ∩ B| / |A ∪ B|
  4. Groups similar functions above the similarity threshold

This approach catches duplicates even when variable names or minor logic differ.
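
The Jaccard step can be sketched in a few lines. This is a minimal illustration, not the package's implementation: the regex-based tokenize function is an assumption that stands in for the AST-based token extraction described above, and both function names are hypothetical.

```typescript
// Minimal sketch of Jaccard similarity over token sets.
// Assumption: a regex tokenizer stands in for AST-based token extraction.
function tokenize(source: string): Set<string> {
  // Identifiers/keywords, or any single punctuation character.
  const matches = source.match(/[A-Za-z_$][\w$]*|[^\s\w]/g) ?? [];
  return new Set(matches);
}

function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1; // two empty sets count as identical
  let shared = 0;
  a.forEach((token) => {
    if (b.has(token)) shared += 1;
  });
  // |A ∪ B| = |A| + |B| - |A ∩ B|
  return shared / (a.size + b.size - shared);
}

const getUser = "function getUser(id) { return db.users.find(id); }";
const getPost = "function getPost(id) { return db.posts.find(id); }";
const similarity = jaccard(tokenize(getUser), tokenize(getPost));
```

With this naive tokenizer the two handlers share 11 of 15 distinct tokens, a similarity of about 0.73, which is above the default threshold of 0.4.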

Example Output

📁 Files analyzed: 47
⚠  Duplicate patterns found: 23
💰 Token cost (wasted): 8,450

🌐 api-handler      12 patterns
✓  validator        8 patterns
🔧 utility          3 patterns

1. 87% 🌐 api-handler
   src/api/users.ts:15 ↔ src/api/posts.ts:22
   432 tokens wasted
   → Create generic handler function

⚙️ Key Options

# Basic usage
aiready patterns ./src

# Focus on obvious duplicates
aiready patterns ./src --similarity 0.9

# Include smaller patterns
aiready patterns ./src --min-lines 3

# Export results (saved to .aiready/ by default)
aiready patterns ./src --output json

# Or specify custom path
aiready patterns ./src --output json --output-file custom-report.json

📁 Output Files: By default, all output files are saved to the .aiready/ directory in your project root. You can override this with --output-file.

🎛️ Tuning Guide

Main Parameters

| Parameter | Default | Effect | Use When |
|-----------|---------|--------|----------|
| --similarity | 0.4 | Similarity threshold (0-1) | Want more/less sensitive detection |
| --min-lines | 5 | Minimum lines per pattern | Include/exclude small functions |
| --min-shared-tokens | 8 | Tokens that must match | Control comparison strictness |
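
One way to picture how the three parameters interact is as successive gates on a candidate pair. The sketch below is hypothetical: the CodeBlock and Thresholds shapes and the isDuplicate function are invented for illustration, and the order of checks is an assumption rather than the tool's documented behavior. Only the default values come from the table above.

```typescript
// Hypothetical sketch: the three parameters as successive gates on a pair.
// Type and function names are illustrative, not the package's API.
interface CodeBlock {
  lines: number;
  tokens: Set<string>;
}

interface Thresholds {
  similarity: number;      // --similarity
  minLines: number;        // --min-lines
  minSharedTokens: number; // --min-shared-tokens
}

const DEFAULTS: Thresholds = { similarity: 0.4, minLines: 5, minSharedTokens: 8 };

function isDuplicate(a: CodeBlock, b: CodeBlock, t: Thresholds = DEFAULTS): boolean {
  // Cheapest check first: drop blocks that are too small.
  if (a.lines < t.minLines || b.lines < t.minLines) return false;

  let shared = 0;
  a.tokens.forEach((token) => {
    if (b.tokens.has(token)) shared += 1;
  });
  // Skip pairs with too little overlap before computing the ratio.
  if (shared < t.minSharedTokens) return false;

  const union = a.tokens.size + b.tokens.size - shared;
  return shared / union >= t.similarity;
}

const common = ["async", "function", "req", "res", "await", "db", "find", "json", "catch", "status"];
const usersHandler: CodeBlock = { lines: 6, tokens: new Set(common.concat(["users", "userId"])) };
const postsHandler: CodeBlock = { lines: 6, tokens: new Set(common.concat(["posts", "postId"])) };

const flaggedByDefault = isDuplicate(usersHandler, postsHandler);
const flaggedWhenStrict = isDuplicate(usersHandler, postsHandler, { ...DEFAULTS, similarity: 0.8 });
```

In this example the pair shares 10 of 14 distinct tokens (about 0.71), so it passes at the default threshold but not at 0.8. Raising minLines or minSharedTokens rejects pairs before the ratio is computed, which is consistent with those flags being suggested for faster runs.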

Quick Tuning Scenarios

Want more results? (catch subtle duplicates)

# Lower similarity threshold
aiready patterns ./src --similarity 0.3

# Include smaller functions  
aiready patterns ./src --min-lines 3

# Both together
aiready patterns ./src --similarity 0.3 --min-lines 3

Want fewer but higher quality results? (focus on obvious duplicates)

# Higher similarity threshold
aiready patterns ./src --similarity 0.8

# Larger patterns only
aiready patterns ./src --min-lines 10

Analysis too slow? (optimize for speed)

# Focus on substantial functions
aiready patterns ./src --min-lines 10

# Reduce comparison candidates
aiready patterns ./src --min-shared-tokens 12

Parameter Tradeoffs

| Adjustment | More Results | Faster | Higher Quality | Tradeoff |
|------------|--------------|--------|----------------|----------|
| Lower --similarity | ✅ | ❌ | ❌ | More false positives |
| Lower --min-lines | ✅ | ❌ | ❌ | Includes trivial duplicates |
| Higher --similarity | ❌ | ✅ | ✅ | Misses subtle duplicates |
| Higher --min-lines | ❌ | ✅ | ✅ | Misses small but important patterns |

Common Workflows

First run (broad discovery):

aiready patterns ./src  # Default settings

Focus on critical issues (production ready):

aiready patterns ./src --similarity 0.8 --min-lines 8

Catch everything (comprehensive audit):

aiready patterns ./src --similarity 0.3 --min-lines 3

Performance optimization (large codebases):

aiready patterns ./src --min-lines 10 --min-shared-tokens 10

📝 Configuration File

Create an aiready.json or aiready.config.json file in your project root:

{
  "scan": {
    "include": ["src/**/*.{ts,tsx,js,jsx}"],
    "exclude": ["**/*.test.*", "**/dist/**"]
  },
  "tools": {
    "pattern-detect": {
      "minSimilarity": 0.6,
      "minLines": 8,
      "maxResults": 20,
      "minSharedTokens": 10,
      "maxCandidatesPerBlock": 100
    }
  },
  "output": {
    "format": "console",
    "file": ".aiready/pattern-report.json"
  }
}

Configuration Options:

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| minSimilarity | number | 0.4 | Similarity threshold (0-1) |
| minLines | number | 5 | Minimum lines to consider |
| maxResults | number | 10 | Max results to display in console |
| minSharedTokens | number | 8 | Min tokens that must match |
| maxCandidatesPerBlock | number | 100 | Performance tuning limit |
| approx | boolean | true | Use approximate candidate selection |
| severity | string | 'all' | Filter: 'critical', 'high', 'medium', 'all' |
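
For readers who load the config file from their own scripts, the options above can be written down as a type with defaults. This is a sketch only: the option names and default values are copied from the table, but the PatternDetectOptions type and the resolveOptions helper are assumptions for illustration and are not the package's published type definitions.

```typescript
// Illustrative types for the "pattern-detect" section of the config file.
// Assumption: names and defaults mirror the options table; this is not the
// package's own type definition.
type Severity = "critical" | "high" | "medium" | "all";

interface PatternDetectOptions {
  minSimilarity: number;
  minLines: number;
  maxResults: number;
  minSharedTokens: number;
  maxCandidatesPerBlock: number;
  approx: boolean;
  severity: Severity;
}

const DEFAULT_OPTIONS: PatternDetectOptions = {
  minSimilarity: 0.4,
  minLines: 5,
  maxResults: 10,
  minSharedTokens: 8,
  maxCandidatesPerBlock: 100,
  approx: true,
  severity: "all",
};

// Hypothetical helper: overlays user settings on the defaults and checks the range.
function resolveOptions(user: Partial<PatternDetectOptions>): PatternDetectOptions {
  const merged: PatternDetectOptions = { ...DEFAULT_OPTIONS, ...user };
  if (merged.minSimilarity < 0 || merged.minSimilarity > 1) {
    throw new RangeError("minSimilarity must be between 0 and 1");
  }
  return merged;
}

// Settings left out of the config keep their defaults.
const options = resolveOptions({ minSimilarity: 0.6, minLines: 8, maxResults: 20 });
```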

Use the unified CLI for all AIReady tools:

npm install -g @aiready/cli

# Pattern detection
aiready patterns ./src

# Context analysis (token costs, fragmentation)
aiready context ./src

# Consistency checking (naming, patterns)
aiready consistency ./src

# Full codebase analysis
aiready scan ./src

🌐 Visit Our Website

Try AIReady tools online and optimize your codebase: getaiready.dev


Made with 💙 by the AIReady team | Website