JSPM

  • Downloads 1535
  • License MIT

CLI + MCP sentinel for engineering standards — SOLID, testing, architecture, CI/CD — auto-tailored to your stack. Minimal MCP footprint (~200 tokens) via CLI-first design.

Package Exports

  • forgecraft-mcp
  • forgecraft-mcp/dist/index.js

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (forgecraft-mcp) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

ForgeCraft

The quality contract your AI coding assistant works within.

npm version license downloads


You hired an AI engineer. It's brilliant. It also installed the same 14 VS Code extensions twice today, spun up 6 Docker containers it will never clean up, and your disk went from 12 GB free to 0 KB in one session.

A full disk doesn't fail gracefully. It kills VS Code, the terminal, Docker, and the database simultaneously.

ForgeCraft is the quality contract your AI coding assistant works within — so it builds fast and doesn't burn down the house.

npx forgecraft-mcp setup .

Supports: Claude (CLAUDE.md) · Cursor (.cursor/rules/) · GitHub Copilot (.github/copilot-instructions.md) · Windsurf (.windsurfrules) · Cline (.clinerules) · Aider (CONVENTIONS.md)


A quality framework for AI-assisted software development

Every session, every project, every AI assistant — measured against the same 7-property Generative Specification model. Not vibes. Not a linter score. A score out of 14 that tells you exactly where the gap is and why.

$ npx forgecraft-mcp verify .

| Property        | Score | Evidence                                        |
|-----------------|-------|-------------------------------------------------|
| Self-Describing | ✅ 2/2 | CLAUDE.md — 352 non-empty lines                |
| Bounded         | ✅ 2/2 | No direct DB calls in route files              |
| Verifiable      | ✅ 2/2 | 64 test files — 87% coverage                   |
| Defended        | ✅ 2/2 | Pre-commit hook + lint config present           |
| Auditable       | ✅ 2/2 | 11 ADRs in docs/adrs/ + Status.md              |
| Composable      | ✅ 2/2 | Service layer + repository layer detected       |
| Executable      | ✅ 2/2 | Tests passed + CI pipeline configured           |

Total: 14/14 ✅ PASS · Threshold 11/14

| Property        | What it checks                                           |
|-----------------|----------------------------------------------------------|
| Self-Describing | Does the codebase explain itself without you?            |
| Bounded         | Is business logic leaking into your routes?              |
| Verifiable      | Are there tests, and did they pass in a real runtime?    |
| Defended        | Are hooks blocking bad commits before they land?         |
| Auditable       | Is every architectural decision recorded and findable?   |
| Composable      | Can you swap the database without touching the domain?   |
| Executable      | Is there CI evidence this thing actually ran?            |

Dev environment hygiene — enforced by convention

ForgeCraft injects enforceable rules into every project's AI instruction files, making environment pollution a convention violation, not an incident.

VS Code extensions Before installing: code --list-extensions | grep -i <name>. Only install if no version in the required major range is already present. The same extension doesn't get downloaded twice in the same day.
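The check-before-install rule can be sketched as a small shell guard. The `needs_install` helper and the extension ID are illustrative, not part of ForgeCraft:

```shell
# Illustrative helper: decide whether an extension needs installing, given
# the output of `code --list-extensions`. The extension ID is an example.
needs_install() {
  ext="$1"; installed="$2"
  # -i: case-insensitive, -x: match the whole line
  if printf '%s\n' "$installed" | grep -qix "$ext"; then
    return 1   # already present: skip the install
  fi
  return 0     # not present: safe to install
}

installed=$(code --list-extensions 2>/dev/null || true)
if needs_install "dbaeumer.vscode-eslint" "$installed"; then
  echo "not installed yet: code --install-extension dbaeumer.vscode-eslint"
fi
```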

Docker containers Check before creating: docker ps -a --filter name=<service>. If it exists, start it — don't create it. Prefer docker compose up (reuse) over bare docker run (always creates new). Logs capped at 500 MB. docker system prune -f is documented as a periodic maintenance step, not an emergency.

Exception: Multiple containers of the same service are permitted when they differ meaningfully in plugin set or major version — for example, a postgres-pgvector container alongside a standard postgres container. Name containers to reflect the variant (e.g., db-pgvector, db-timescale); otherwise the deduplication rule applies.
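The reuse-over-recreate rule reduces to one decision. A sketch, with the container name list passed in as a parameter so the logic is visible (and testable) without a Docker daemon; the service name is illustrative:

```shell
# Illustrative decision helper for the reuse rule. The real check shells out
# to `docker ps -a --filter name=<service>`; here the name list is an argument.
container_action() {
  service="$1"; existing="$2"   # existing = newline-separated container names
  if printf '%s\n' "$existing" | grep -qx "$service"; then
    echo "docker start $service"              # it exists: reuse it
  else
    echo "docker compose up -d $service"      # nothing matches: create once
  fi
}

existing=$(docker ps -a --format '{{.Names}}' 2>/dev/null || true)
container_action "db-pgvector" "$existing"
```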

Python virtual environments One .venv per project root. Reuse if the Python major.minor version matches. Never create a venv in a subdirectory unless it's a standalone installable package. Unused dependencies flagged by pip list --not-required.
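The version-match check might look like this minimal sketch; the messages are illustrative:

```shell
# Sketch: reuse .venv when its interpreter's major.minor matches python3 on
# PATH; otherwise flag it for recreation.
want=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
if [ -x .venv/bin/python ]; then
  have=$(.venv/bin/python -c 'import sys; print("%d.%d" % sys.version_info[:2])')
  if [ "$have" = "$want" ]; then
    echo "reuse .venv (python $have)"
  else
    echo "mismatch ($have vs $want): recreate with python3 -m venv .venv"
  fi
else
  echo "no .venv yet: python3 -m venv .venv"
fi
```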

Synthetic and time-series data Before writing more than 100 MB of generated data, the AI asks: retain raw, condense statistically, or delete after the run? Synthetic datasets older than 7 days with no code reference: ask to delete.
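Surfacing stale datasets is a one-liner with `find`. The `data/synthetic/` location is an assumption; adjust it to your layout:

```shell
# Sketch: list synthetic datasets older than 7 days as deletion candidates.
# The data/synthetic/ path is an assumption, not a ForgeCraft convention.
DATA_DIR="data/synthetic"
if [ -d "$DATA_DIR" ]; then
  find "$DATA_DIR" -type f -mtime +7 -print
else
  echo "no $DATA_DIR directory; nothing to review"
fi
```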

General If the workspace grows beyond 2 GB outside of known build artifacts (node_modules/, .venv/, dist/), surface a warning and stop. Never silently grow the workspace.
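The 2 GB guard can be sketched with `du`. GNU du's `--exclude` is assumed (Linux); the excluded directories are the known build artifacts:

```shell
# Sketch of the workspace-size guard. Threshold and exclusions per the rule.
LIMIT_KB=$((2 * 1024 * 1024))   # 2 GB expressed in KB
used_kb=$(du -sk --exclude=node_modules --exclude=.venv --exclude=dist . | cut -f1)
if [ "$used_kb" -gt "$LIMIT_KB" ]; then
  echo "WARNING: workspace at $((used_kb / 1024)) MB outside build artifacts; stopping"
else
  echo "workspace OK ($((used_kb / 1024)) MB)"
fi
```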


Project setup in one sentence

Read the spec in docs/specs/, set up this project with ForgeCraft,
scaffold it with the right tags, recommend the tech stack, start building.

That's the entire onboarding prompt. ForgeCraft reads the spec, the AI assigns the tags, and ForgeCraft writes the instruction file, emits Status.md, docs/adrs/, docs/PRD.md, docs/TechSpec.md, hooks, and skills. The AI has full context. You start building.

ForgeCraft scans your project, auto-detects your stack, and generates tailored instruction files from 116 curated blocks — SOLID, hexagonal architecture, testing pyramids, CI/CD, and 24 domain-specific rule sets — in seconds.


Quality gates

Quality gates are structured pass/fail checks your AI assistant runs at defined moments — before a commit, before a release, after a deployment. They're not linter rules. Each gate has a condition, an evidence requirement, and a flag for whether human review is mandatory.

Gates are organized by release phase so you're not running pre-release chaos tests on day one of a greenfield project:

| Phase                 | Example gates                                                                     |
|-----------------------|-----------------------------------------------------------------------------------|
| development           | Unit tests pass · lint clean · no layer violations · no hardcoded secrets          |
| pre-release hardening | Mutation testing ≥80% · DAST scan · 2× peak load · chaos (Toxiproxy)               |
| release candidate     | OWASP Top 10 pentest · full mutation audit · compatibility matrix · accessibility  |
| deployment            | Canary config verified · smoke tests pass · observability confirmed               |
| post-deployment       | Synthetic probes live · 30-min error window monitored · incident runbook reviewed  |

Gates tagged requires_human_review: true cannot be auto-passed — some checks require a human.
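A gate entry might look like the following sketch. The field names here are illustrative, not the official schema; the gate repository defines the real one:

```yaml
# Hypothetical gate definition -- field names are illustrative only.
id: no-hardcoded-secrets
phase: development
condition: "secret scan over the diff returns no matches"
evidence: "scan output attached to the commit or CI run"
requires_human_review: false
```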

The full gate library, contribution guide, and schema are in the quality gates repository →


ADRs, automatically sequenced

Every non-obvious architectural decision gets recorded. ForgeCraft auto-sequences docs/adrs/NNNN-slug.md in MADR format — context, decision, alternatives, consequences. Your AI assistant reasons about past choices. Your team stops re-litigating them.

npx forgecraft-mcp generate_adr . --title "Use event sourcing for order history" \
  --status Accepted \
  --context "Order mutations need full audit trail for compliance" \
  --decision "Append-only event log, project current state on read"
# → docs/adrs/0004-use-event-sourcing-for-order-history.md

AI assistant setup vs ForgeCraft

claude init, Cursor's workspace rules, or Copilot's instructions file get you started. ForgeCraft gets you to production standards — across every AI assistant, every session, every engineer on the team.

| Aspect             | Default AI setup           | ForgeCraft                                       |
|--------------------|----------------------------|--------------------------------------------------|
| Instruction file   | Generic, one-size-fits-all | 116 curated blocks matched to your stack         |
| AI assistants      | Varies by tool             | Claude, Cursor, Copilot, Windsurf, Cline, Aider  |
| Architecture       | None                       | SOLID, hexagonal, clean code, DDD                |
| Testing            | Basic mention              | Testing pyramid, coverage targets, mutation gates |
| Domain rules       | None                       | 24 domains (fintech, healthcare, gaming…)        |
| Quality score      | None                       | GS score out of 14 — know exactly where the gap is |
| Release phases     | None                       | 7 phases from development through post-deployment |
| Dev hygiene        | None                       | VS Code, Docker, Python venv, disk guard         |
| ADRs               | None                       | Auto-sequenced, MADR format                      |
| Session continuity | None                       | Status.md + forgecraft.yaml persist context      |
| Drift detection    | None                       | refresh detects scope changes                    |

Workflow Playbook

After setup, your AI has the context. These prompts direct the work. Copy, paste, run.

| Situation                                  | Prompt                          |
|--------------------------------------------|---------------------------------|
| New project — scaffold structure           | Greenfield Setup                |
| Existing project — integrate ForgeCraft    | Brownfield Integration          |
| Audit shows file_length failures           | Decompose by responsibility     |
| Audit shows hardcoded_url failures         | Extract to env vars             |
| Audit shows hardcoded_credential failures  | Remove secrets — do this first  |
| Audit shows layer_violation failures       | Fix route → DB direct calls     |
| Audit shows mock_in_source failures        | Move mocks out of production    |
| Audit shows missing_prd failures           | Reverse-engineer spec docs      |
| Audit shows stale_status failures          | Update Status.md                |
| Score ≥ 80 and preparing to ship           | Pre-release hardening           |
| Just deployed to production                | Post-deployment checklist       |
| Project scope changed                      | Drift detection                 |

Full Workflow Playbook · Online version


How It Works

# First-time setup — auto-detects your stack
npx forgecraft-mcp setup .
flowchart TD
    A["<b>setup .</b><br/>npx forgecraft-mcp setup ."] --> B["Phase 1 — Analyze<br/>Reads spec · infers tags"]
    B --> C{"AI assistant<br/>in the loop?"}
    C -->|"Yes (MCP)"| D["Phase 2 — Calibrate<br/>LLM corrects tags from spec<br/>Writes forgecraft.yaml · CLAUDE.md<br/>PRD.md · hooks · ADR-000"]
    C -->|"No (CLI only)"| E["⚠️ CLI-only mode<br/>Directory heuristics only<br/>→ configure an AI assistant"]
    D --> F["<b>check_cascade</b><br/>5-step readiness gate<br/>1 · Functional spec<br/>2 · Architecture + C4<br/>3 · Constitution<br/>4 · ADRs<br/>5 · Use cases"]
    F --> G{All 5 passing?}
    G -->|"Stubs / missing"| H["Fill artifacts<br/>docs/PRD.md · docs/adrs/<br/>docs/use-cases.md"]
    H --> F
    G -->|"✅ All pass"| I["<b>generate_session_prompt</b><br/>Bound context for next task"]
    I --> J["Implement with TDD<br/>RED → GREEN → REFACTOR<br/>+ Documentation Cascade"]
    J --> K["<b>audit_project</b><br/>Score 0 – 100"]
    K --> L{Score ≥ 90?}
    L -->|"Violations found"| M["WORKFLOWS.md remediation<br/>file_length · layer_violation<br/>hardcoded_url · missing_prd"]
    M --> J
    L -->|"✅ Score ≥ 90"| N["<b>close_cycle</b><br/>Re-check cascade · assess gates<br/>promote to registry · bump version"]
    N --> O{"Roadmap<br/>complete?"}
    O -->|"More features"| I
    O -->|"All done"| P["<b>start_hardening</b><br/>Mutation tests · OWASP · load test"]
    P --> Q["🚢 Ship"]

    style A fill:#1a2e1a,color:#90ee90,stroke:#3a6e3a
    style Q fill:#1a2a3e,color:#87ceeb,stroke:#3a5a8e
    style E fill:#2e1a1a,color:#ffaa88,stroke:#6e3a3a
    style M fill:#2e2a00,color:#ffd700,stroke:#6e6000

ForgeCraft is a setup-time CLI tool. Run it once to configure your project, then remove it — it has no runtime footprint.

Optionally add the MCP sentinel to let your AI assistant diagnose and recommend commands:

claude mcp add forgecraft -- npx -y forgecraft-mcp

The sentinel is a single tool (~200 tokens). It reads three artifacts — forgecraft.yaml, CLAUDE.md, .claude/hooks — derives the correct next CLI command, and returns it. Nothing more. This is the methodology's core principle expressed as tool design: a stateless reader, a finite artifact set, a derived action. Remove it after initial setup to reclaim token budget.

What You Get

After npx forgecraft-mcp setup, your project has:

your-project/
├── forgecraft.yaml        ← Your config (tags, tier, customizations)
├── CLAUDE.md              ← Engineering standards (Claude)
├── .cursor/rules/         ← Engineering standards (Cursor)
├── .github/copilot-instructions.md  ← Engineering standards (Copilot)
├── Status.md              ← Session continuity tracker
├── .claude/hooks/         ← Pre-commit quality gates
├── docs/
│   ├── PRD.md             ← Requirements skeleton
│   └── TechSpec.md        ← Architecture + NFR sections
└── src/shared/            ← Config, errors, logger starters

The Instruction Files

This is the core value. Assembled from curated blocks covering:

  • SOLID principles — concrete rules, not platitudes
  • Hexagonal architecture — ports, adapters, DTOs, layer boundaries
  • Testing pyramid — unit/integration/E2E targets, test doubles taxonomy
  • Clean code — CQS, guard clauses, immutability, pure functions
  • CI/CD & deployment — pipeline stages, environments, preview deploys
  • Domain patterns — DDD, CQRS, event sourcing (when your project needs it)
  • 12-Factor ops — config, statelessness, disposability, logging

Every block is sourced from established engineering literature (Martin, Evans, Wiggins) and adapted for AI-assisted development.

24 Tags — AI-detected, user-adjustable

Tags tell ForgeCraft what your project is. On first setup, the AI analyzes your spec and codebase and assigns them. You can review and override in forgecraft.yaml. Blocks merge without conflicts — add or remove tags as the project evolves.

The full tag list and contribution guide live in the quality gates repository →

| Tag                    | What it adds                                                          |
|------------------------|-----------------------------------------------------------------------|
| UNIVERSAL              | SOLID, testing, commits, error handling (always on)                   |
| API                    | REST/GraphQL contracts, auth, rate limiting, versioning               |
| WEB-REACT              | Component arch, state management, a11y, perf budgets                  |
| WEB-STATIC             | Build optimization, SEO, CDN, static deploy                           |
| CLI                    | Arg parsing, output formatting, exit codes                            |
| LIBRARY                | API design, semver, backwards compatibility                           |
| INFRA                  | Terraform/CDK, Kubernetes, secrets management                         |
| DATA-PIPELINE          | ETL, idempotency, checkpointing, schema evolution                     |
| ML                     | Experiment tracking, model versioning, reproducibility                |
| FINTECH                | Double-entry accounting, decimal precision, compliance                |
| HEALTHCARE             | HIPAA, PHI handling, audit logs, encryption                           |
| MOBILE                 | React Native/Flutter, offline-first, native APIs                      |
| REALTIME               | WebSockets, presence, conflict resolution                             |
| GAME                   | Game loop, ECS, Phaser 3, PixiJS, Three.js/WebGL, performance budgets |
| SOCIAL                 | Feeds, connections, messaging, moderation                             |
| ANALYTICS              | Event tracking, dashboards, data warehousing                          |
| STATE-MACHINE          | Transitions, guards, event-driven workflows                           |
| WEB3                   | Smart contracts, gas optimization, wallet security                    |
| HIPAA                  | PII masking, encryption checks, audit logging                         |
| SOC2                   | Access control, change management, incident response                  |
| DATA-LINEAGE           | 100% field coverage, lineage tracking decorators                      |
| OBSERVABILITY-XRAY     | Auto X-Ray instrumentation for Lambdas                                |
| MEDALLION-ARCHITECTURE | Bronze=immutable, Silver=validated, Gold=aggregated                   |
| ZERO-TRUST             | Deny-by-default IAM, explicit allow rules                             |

Content depth tiers

Not every project needs DDD on day one.

| Tier        | Includes                                        | Best for                       |
|-------------|--------------------------------------------------|--------------------------------|
| core        | Code standards, testing, commit protocol         | New/small projects             |
| recommended | + architecture, CI/CD, clean code, deploy        | Most projects (default)        |
| optional    | + DDD, CQRS, event sourcing, design patterns     | Mature teams, complex domains  |

Set in forgecraft.yaml:

projectName: my-api
tags: [UNIVERSAL, API]
tier: recommended

CLI Commands

npx forgecraft-mcp <command> [dir] [flags]

| Command                 | Purpose                                                                        |
|-------------------------|--------------------------------------------------------------------------------|
| setup <dir>             | Start here. Analyze → auto-detect stack → generate instruction files + hooks   |
| refresh <dir>           | Re-scan after project changes. Detects new tags, shows before/after diff       |
| refresh <dir> --apply   | Apply the refresh (default is preview-only)                                    |
| audit <dir>             | Score compliance (0-100). Reads tags from forgecraft.yaml                      |
| scaffold <dir> --tags ... | Generate full folder structure + instruction files                           |
| review [dir] --tags ... | Structured code review checklist (4 dimensions)                                |
| list tags               | Show all 24 available tags                                                     |
| list hooks --tags ...   | Show quality-gate hooks for given tags                                         |
| list skills --tags ...  | Show skill files for given tags                                                |
| classify [dir]          | Analyze code to suggest tags                                                   |
| generate <dir>          | Regenerate instruction files only                                              |
| convert <dir>           | Phased migration plan for legacy code                                          |
| add-hook <name> <dir>   | Add a quality-gate hook                                                        |
| add-module <name> <dir> | Scaffold a feature module                                                      |

Common flags

--tags UNIVERSAL API     Project classification tags (or read from forgecraft.yaml)
--tier core|recommended  Content depth (default: recommended)
--targets claude cursor  AI assistant targets (default: claude)
--dry-run                Preview without writing files
--compact                Strip explanatory bullet tails and deduplicate lines (~20-40% smaller output)
--apply                  Apply changes (for refresh)
--language typescript    typescript | python (default: typescript)
--scope focused          comprehensive | focused (for review)

MCP Sentinel

Optionally add the ForgeCraft MCP sentinel to let your AI assistant diagnose your project and suggest the right CLI command:

The sentinel is a single minimal tool (~200 tokens per request, vs ~1,500 for a full tool suite). It checks whether forgecraft.yaml, your AI instruction file, and your hooks exist, then returns the targeted CLI command for the project's current state.

The design is intentional. The full ForgeCraft command surface — 21 actions — lives in the CLI, not the MCP server. The MCP server exposes exactly one tool that reads three artifacts and returns one recommendation. This is the Generative Specification principle in the tool's own architecture: a stateless reader, a bounded artifact set, a derived action. The tool practices what it writes into your instruction files.

A note on cost: every declared MCP tool is read by the model on every turn, whether invoked or not. One tool costs ~200 tokens; twenty-one tools cost ~1,500. The sentinel keeps the methodology's recommended MCP budget (≤3 active servers) by design.
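The stateless-reader shape is simple enough to sketch in a few lines of shell. This is not the sentinel's actual source, and which command maps to which missing artifact is an assumption:

```shell
# Sketch of the stateless-reader pattern: inspect three artifacts, derive
# one command. The command-to-state mapping here is illustrative.
recommend() {
  dir="$1"
  if [ ! -f "$dir/forgecraft.yaml" ]; then
    echo "npx forgecraft-mcp setup ."            # nothing configured yet
  elif [ ! -f "$dir/CLAUDE.md" ]; then
    echo "npx forgecraft-mcp generate ."         # config present, instructions missing
  elif [ ! -d "$dir/.claude/hooks" ]; then
    echo "npx forgecraft-mcp refresh . --apply"  # instructions present, hooks missing
  else
    echo "npx forgecraft-mcp audit ."            # all three artifacts exist: score it
  fi
}

recommend .
```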

Recommended workflow:

  1. Add the sentinel to your AI assistant (see config examples below)
  2. Let your AI assistant run npx forgecraft-mcp setup .
  3. Remove the sentinel from your active MCP config
  4. Re-add it when you need to refresh or audit
Manual MCP config — Claude

Add to .claude/settings.json:

{
  "mcpServers": {
    "forgecraft": {
      "command": "npx",
      "args": ["-y", "forgecraft-mcp"]
    }
  }
}
Manual MCP config — GitHub Copilot (VS Code)

Add to .vscode/mcp.json in your project root (create it if it doesn't exist):

{
  "servers": {
    "forgecraft": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "forgecraft-mcp"]
    }
  }
}

Then open the Copilot Chat panel, switch to Agent mode, and the forgecraft sentinel will appear in the tools list.

Manual MCP config — Cursor

Add to .cursor/mcp.json:

{
  "mcpServers": {
    "forgecraft": {
      "command": "npx",
      "args": ["-y", "forgecraft-mcp"]
    }
  }
}

No MCP client? That's fine — you don't need it. Run npx forgecraft-mcp setup . directly in your terminal. The MCP sentinel is optional; the CLI does everything.

Already ran claude init? Use npx forgecraft-mcp generate . --merge to merge with your existing CLAUDE.md, keeping your custom sections while adding production standards.


Free and open source

ForgeCraft is free. No limits, no tiers, no API keys.

The quality gate library grows through community contribution. If you propose a gate that gets accepted, your name goes in CONTRIBUTORS.md, and you'll have helped raise the floor for everyone building with AI.

Open a gate proposal → · See contributors →

Running this with a team? → forgeworkshop.dev


Theoretical foundation

ForgeCraft implements the Generative Specification model — a formal 7-property framework for evaluating AI-generated code quality. The model, the S_realized convergence formula, and the release phase framework are documented in the white paper.

Generative Specification White Paper — the academic foundation behind the verify score

The white paper is the theory. ForgeCraft is the toolchain. Quality gates proposed for the library that generalize into theoretical insights may be incorporated into future white paper revisions.


Configuration

Fine-tune what your AI assistant sees

# forgecraft.yaml
projectName: my-api
tags: [UNIVERSAL, API, FINTECH]
tier: recommended
outputTargets: [claude, cursor, copilot]  # Generate for multiple assistants
compact: true                             # Slim output (~20-40% fewer tokens)

exclude:
  - cqrs-event-patterns    # Don't need this yet

variables:
  coverage_minimum: 90      # Override defaults
  max_file_length: 400

Community template packs

templateDirs:
  - ./my-company-standards
  - node_modules/@my-org/forgecraft-flutter/templates

Keeping Standards Fresh

Audit (run anytime, or in CI)

Score: 72/100  Grade: C

✅ Instruction files exist
✅ Hooks installed (3/3)
✅ Test script configured
🔴 hardcoded_url: src/auth/service.ts
🔴 status_md_current: not updated in 12 days
🟡 lock_file: not committed
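To run the audit in CI, a step like the following may work (GitHub Actions syntax). Whether audit's exit code fails the build on a low score is an assumption; verify against the CLI before relying on it as a gate:

```yaml
# Hypothetical CI step -- the surrounding job definition is assumed.
- name: ForgeCraft audit
  run: npx forgecraft-mcp audit .
```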

Refresh (project scope changed?)

npx forgecraft-mcp refresh . --apply

Or in preview mode first (default):

npx forgecraft-mcp refresh .   # shows before/after diff without writing

Contributing

Templates are YAML, not code. You can add patterns without writing TypeScript.

templates/your-tag/
├── instructions.yaml   # Instruction file blocks (with tier metadata)
├── structure.yaml      # Folder structure
├── nfr.yaml            # Non-functional requirements
├── hooks.yaml          # Quality gate scripts
├── review.yaml         # Code review checklists
└── mcp-servers.yaml    # Recommended MCP servers for this tag

PRs welcome. See templates/universal/ for the format.

MCP Server Discovery

npx forgecraft-mcp configure-mcp dynamically discovers recommended MCP servers matching your project tags. Servers are curated in mcp-servers.yaml per tag — community-contributable via PRs.

Built-in recommendations include Context7 (docs), Playwright (testing), Chrome DevTools (debugging), Stripe (fintech), Docker/K8s (infra), and more across all 24 tags.

Optionally fetch from a remote registry at setup time:

# In forgecraft.yaml or via tool parameter
include_remote: true
remote_registry_url: https://your-org.com/mcp-registry.json

Development

git clone https://github.com/jghiringhelli/forgecraft-mcp.git
cd forgecraft-mcp
npm install
npm run build
npm test   # 610 tests, 42 suites

License

MIT