@dzhechkov/harness-cli
The dz CLI — the main entry point to the DZ Harness Hub. Install AI skills for Claude Code, Codex, OpenCode, Hermes, OpenClaude from a single command.
Install
npm install -g @dzhechkov/harness-cli
User Journey: from install to mastery
The commands below are mapped to a real workflow:
DISCOVER → INSTALL → USE → CREATE → MAINTAIN → SHARE
Phase 1: Discover (what's available?)
npm install -g @dzhechkov/harness-cli # install the CLI
dz help # see all commands
dz pretrain # analyze project → auto-recommend skills/presets
dz recommend "build API and deploy to K8s" # task advisor — decomposes your task into skills
dz stats # 28 packages, 59 skills, 5 targets, 10 presets
dz dashboard # visual panel — packages, adapters, skill packs
dz registry # browse all 59 skills by category
dz registry search kubernetes # find specific skills
dz registry --category devops # filter by domain
dz downloads # npm weekly download stats
Phase 2: Install (set up your workspace)
# Full setup with self-learning (recommended):
dz setup --target claude-code --preset devops # pretrain + hooks + memory + skills
# Or just install skills (no learning):
dz init --target claude-code --preset devops # 24 DevOps skills
dz init --target openclaude --preset web3 # 12 DeFi skills for OpenClaude
dz init --target codex --preset mcp # 10 MCP skills for Codex
# Or pick individual skills:
dz init --target claude-code --select terraform,kubernetes,docker-compose
# Or install from any npm package:
dz install @dzhechkov/skills-devops # npm install + copy skills
# Verify everything is correct:
dz verify # structural validation
dz doctor # 7 health checks
dz list # show installed skills
dz info --id terraform # detailed info about a skill
Phase 3: Use (work with your agent)
# Now use Claude Code / Codex / OpenCode / Hermes normally.
# Skills are auto-discovered from the platform's skills directory.
# Example in Claude Code:
# "Review this PR" → pr-review skill activates
# "Design an API" → api-design skill activates
# "Fix this CI" → ci-fix skill activates
Phase 4: Create (build your own skills)
# Scaffold a new skill:
dz create-skill --name my-skill --description "What it does" --tier 2
# With BTO-compatible eval templates:
dz create-skill --name my-skill --bto
# Benchmark your skill (aim for Grade A):
dz benchmark .claude/skills/my-skill # single skill — 19 L0 checks
dz benchmark packages/@dzhechkov/skills-devops --all # batch all
dz benchmark skill-a --compare skill-b # A/B compare
# Find skills to canonicalize from the ecosystem:
dz scout # scan 9 sources (GitHub, npm, HN, ...)
dz scout --deep # deep analysis with SKILL.md parsing
dz auto-canonicalize --source github.com/user/repo --pack packages/@dzhechkov/skills-devops
Phase 5: Maintain (keep skills fresh)
# Check for upstream changes (canonicalized skills):
dz sync-upstream --list # which packages have external sources?
dz sync-upstream --all # check all against upstream
dz sync-upstream --package packages/@dzhechkov/skills-devops # check one
# Check installed skills vs canonical:
dz upgrade # shows which skills need update
dz upgrade --target openclaude # check specific platform
# Sync canonical to legacy layout:
dz sync # canonical → project skills
dz migrate # detect legacy installations
# Orchestrate dynamic workflows:
dz workflow --task coverage-lift # parallel coverage improvement
dz workflow --task security-audit # adversarial security scan
# Cross-host state sync:
dz roam --apply # sync agent state across machines
Phase 6: Share (publish to the world)
# Publish updated packages to npm:
dz publish --dry-run # preview
dz publish --filter skills-devops # publish specific package
dz publish # publish all changed packages
Presets vs Individual Skills
| Approach | When to use | Example |
|---|---|---|
| Preset | Want a curated set of skills that work together | dz init --target claude-code --preset keysarium |
| --select | Want specific skills by name | dz init --target claude-code --select explore,feature-adr |
| Standalone npx | Want a full toolkit with its own CLI | npx @dzhechkov/keysarium init |
Available Presets (10)
| Preset | Skills | Description |
|---|---|---|
| meta | 3 | Development process (explore, feature-adr, knowledge-extractor) |
| qe-engineer | 8 | Quality engineering (test-gen, coverage, chaos, defect, ...) |
| bto | 1 | Build-Benchmark-Test-Optimize pipeline |
| health | 8 | Medical AI (diagnostics, drugs, labs, clinical decisions) |
| keysarium | 9 | Full research toolkit (feature-adr, presentation, reverse-eng) |
| p-replicator | 10 | AI product development (/replicate, SPARC PRD, pipeline-forge) |
| feature-adr | 5 | Feature pipeline (feature-adr, explore, frontend-design) |
| devops | 24 | DevOps skills (terraform, kubernetes, nginx, redis, graphql, playwright, ...) |
| web3 | 12 | Web3/DeFi (quicknode, zerion, symbiosis, bankr, veil, neynar, ...) |
| mcp | 13 | MCP servers (brave-search, exa, gmail, notion, obsidian, git-mcp, ...) |
Standalone Packages (install via npx, no dz CLI needed)
npx @dzhechkov/keysarium init # full research toolkit
npx @dzhechkov/p-replicator init # AI product development
npx @dzhechkov/health-advisor init # medical AI (25 skills)
npx @dzhechkov/skills-bto init # BTO benchmarking
npx @dzhechkov/skills-feature-adr init # 11-step feature pipeline
npx @dzhechkov/skills-edu-site init # gamified edu site generator
npx @dzhechkov/skills-transcript-site init # transcript → interactive site
npx @dzhechkov/skills-analyst-manual init # 3-phase analyst composite
Difference: dz init --preset installs individual skills from .claude/skills/ source into a target platform tree. Standalone npx packages have their own CLI and install a complete toolkit with commands, rules, shards, and agents: a richer but self-contained experience.
All Commands (28)
dz setup --target <name> [--preset <name>] [--no-hooks] [--no-memory] [--force]
dz init --target <name> [--preset <name>] [--select id,id,...] [--force]
dz install <npm-pkg> [--target <name>] [--project <dir>]
dz pretrain [--project <dir>]
dz recommend "<task description>"
dz compose <preset1+preset2+...> [--target <name>]
dz diff <skill-dir>
dz upgrade [--target <name>] [--project <dir>]
dz verify [--skills-dir <dir>] [--target <name>]
dz sync [--canonical <dir>] [--project <dir>] [--dry-run] [--force]
dz update (alias for sync)
dz list [--skills-dir <dir>]
dz info --id <skill-id> [--skills-dir <dir>]
dz create-skill --name <id> [--description <text>] [--tier 1|2|3] [--bto]
dz registry [search <query>] [--category <cat>]
dz benchmark <skill-dir> [--compare <dir>] [--all]
dz publish [--filter <name>] [--dry-run] [--bump-only]
dz auto-canonicalize --source <github-url> --pack <skills-pack>
dz sync-upstream [--package <dir>] [--list] [--all]
dz scout [--topics <list>] [--since <date>] [--deep]
dz workflow --task <name> [--dry-run]
dz downloads
dz migrate [--project <dir>]
dz stats
dz dashboard
dz doctor [--project <dir>]
dz roam [--apply] [--slug <slug>]
dz help
Targets (5 platforms)
| Target | Skills directory |
|---|---|
| claude-code | .claude/skills/ |
| codex | .agents/skills/ |
| opencode | .opencode/skills/ |
| hermes | .hermes/skills/ |
| openclaude | .openclaude/skills/ |
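The target table can be read as a plain lookup from target name to skills directory. The following is a minimal Python sketch of that lookup; the helper name `skills_dir` and its `project` parameter are hypothetical and are not part of the dz CLI.

```python
from pathlib import Path

# Target -> skills directory, copied from the table above.
TARGET_SKILLS_DIRS = {
    "claude-code": ".claude/skills/",
    "codex": ".agents/skills/",
    "opencode": ".opencode/skills/",
    "hermes": ".hermes/skills/",
    "openclaude": ".openclaude/skills/",
}


def skills_dir(target, project="."):
    """Resolve the skills directory for a target inside a project (hypothetical helper)."""
    if target not in TARGET_SKILLS_DIRS:
        raise ValueError(f"unknown target: {target!r}")
    return Path(project) / TARGET_SKILLS_DIRS[target]
```

For example, `skills_dir("codex", "my-project")` resolves to `my-project/.agents/skills`.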
Workflows (Opus 4.8+ dynamic workflows)
dz workflow --task coverage-lift # parallel coverage improvement
dz workflow --task mutation-kill # kill surviving mutants
dz workflow --task canonicalize # canonicalize new packages
dz workflow --task security-audit # adversarial security scan
Scout (ecosystem intelligence)
dz scout # quick scan — radar mode
dz scout --deep # deep analysis — AI analyst mode
dz scout --topics mcp-server,ai-agent # custom topics
dz scout --since 2026-05-01 # only recent repos
Radar mode (dz scout) scans 9 sources in parallel (GitHub + npm + HN + MCP Registry + Glama + OSSInsight + Smithery + Semantic Scholar + arXiv):
- Detects skill format — SKILL.md, plugin.json, .claude/skills/, MCP manifests
- Scores relevance — format (40%) + stars (30%) + recency (20%) + novelty (10%)
- Compares against our 24 packages — finds skills we don't have
- Recommends — integrate (score ≥70) / monitor (40-69 + ≥50 stars) / skip
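The relevance weights and recommendation thresholds listed above can be sketched as a small scoring function. This is an illustration only, not the scout implementation: it assumes each component is already normalized to a 0-100 scale, and it reads "40-69 + ≥50 stars" as requiring both conditions. The function names are hypothetical.

```python
# Weights from the list above; component scores are assumed to be on a 0-100 scale.
WEIGHTS = {"format": 0.40, "stars": 0.30, "recency": 0.20, "novelty": 0.10}


def relevance_score(components):
    """Weighted sum of the four component scores (hypothetical helper)."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


def recommend(score, stars):
    """Map a score and a star count to integrate / monitor / skip."""
    if score >= 70:
        return "integrate"
    if score >= 40 and stars >= 50:  # assumed: both conditions must hold
        return "monitor"
    return "skip"
```

A repo with a mid-range score but few stars falls through to skip under this reading, which keeps low-traction projects out of the monitor list.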
Deep analyst mode (dz scout --deep) goes further for top-scored repos:
- Downloads SKILL.md from each repo, parses frontmatter + body
- Finds closest match in our inventory by keyword overlap
- Explains the delta — what the found skill adds that ours doesn't
- Recommends integration path:
  - canonicalize: high-signal novel skill → new @dzhechkov/skills-* pack
  - merge: similar to existing skill → add unique features to ours
  - new-preset: novel skill → add to preset or create new pack
  - skip: already in our inventory
- Gap analysis — identifies trending categories across the ecosystem that our harness lacks
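The "closest match by keyword overlap" step can be illustrated with a Jaccard similarity over word tokens. The README does not say which overlap measure scout uses, so Jaccard, the tokenizer, and the function names here are all assumptions made for the sake of a concrete sketch.

```python
import re


def keywords(text):
    """Lowercase alphanumeric tokens of a description (assumed tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def closest_match(description, inventory):
    """Return (skill_id, overlap) for the inventory entry with the highest
    Jaccard overlap, or (None, 0.0) if nothing overlaps."""
    found = keywords(description)
    best_id, best_overlap = None, 0.0
    for skill_id, skill_description in inventory.items():
        ours = keywords(skill_description)
        union = found | ours
        overlap = len(found & ours) / len(union) if union else 0.0
        if overlap > best_overlap:
            best_id, best_overlap = skill_id, overlap
    return best_id, best_overlap
```

A high overlap would point toward the merge path and no overlap toward canonicalize or new-preset; the cut-off values between those paths are not documented and are left out of the sketch.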
Example deep analysis output:
## 🔬 Deep Analysis
### cool/agent-toolkit (★500)
2/3 skills are novel
| Skill | Description | Closest match | Integration | Rationale |
|-------|------------|---------------|-------------|-----------|
| code-review | Automated OWASP-focused review | brutal-honesty-review | **merge** | Similar to ours — merge OWASP checklist |
| deploy-check | Pre-deploy validation gates | — | **canonicalize** | High-signal novel skill (500 stars) |
## 📊 Harness Gap Analysis
| Category | Frequency | Recommendation |
|----------|-----------|---------------|
| deploy-automation | 12 repos | Create @dzhechkov/skills-devops — high demand |
| data-pipeline | 5 repos | Monitor — emerging trend |
Powered by @dzhechkov/scout.
BTO integration (create-skill --bto)
# Scaffold a new skill with BTO-compatible 3-layer evaluation:
dz create-skill --name my-skill --bto
# What you get:
# evals/my-skill.yaml — BTO eval with L0/L1/L2 layers
# references/judge-rubrics.md: scoring rubrics for 3-judge panel
The --bto flag generates eval templates compatible with /bto-test:
| Layer | What | Gate |
|---|---|---|
| L0 | Deterministic checks (U1-U5 universal + S1-S10 skill-specific) | Pass rate >= 80% |
| L1 | Single LLM judge (Haiku) — 5 dimensions: Clarity, Completeness, Actionability, Quality, Anti-patterns | Average >= 7.0 |
| L2 | 3-judge panel (Sonnet) — Expert (0.40), Critic (0.30), Auditor (0.30) — 5 dimensions: Methodology, Depth, Correctness, Usability, Robustness | Weighted avg >= 7.0 |
After scaffolding, fill in the SKILL.md protocol and run /bto-test .claude/skills/my-skill to evaluate.
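The three gates in the table reduce to simple arithmetic. The sketch below shows that arithmetic only; it assumes scores are on a 0-10 scale and that each L2 judge contributes a single score (for example the mean of its five dimensions), neither of which the README states. The function names are hypothetical and this is not the /bto-test implementation.

```python
# Judge weights and gate thresholds taken from the table above.
L2_WEIGHTS = {"expert": 0.40, "critic": 0.30, "auditor": 0.30}


def l0_passes(passed, total):
    """L0 gate: deterministic check pass rate >= 80%."""
    return total > 0 and passed / total >= 0.80


def l1_passes(dimension_scores):
    """L1 gate: average of the single judge's dimension scores >= 7.0."""
    return sum(dimension_scores) / len(dimension_scores) >= 7.0


def l2_weighted_average(judge_scores):
    """Weighted average of the three judges' scores (one score per judge assumed)."""
    return sum(L2_WEIGHTS[judge] * judge_scores[judge] for judge in L2_WEIGHTS)


def l2_passes(judge_scores):
    """L2 gate: weighted panel average >= 7.0."""
    return l2_weighted_average(judge_scores) >= 7.0
```

With Expert 8.0, Critic 6.0, and Auditor 7.0 the weighted average is about 7.1, so the panel gate passes even though the Critic alone is below 7.0.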
dz install — install skills from any npm package
# Install skills from any npm package directly
dz install @dzhechkov/skills-devops
dz install @dzhechkov/skills-web3 --target openclaude
dz install @lythos/skill-curator --target claude-code
Runs npm install, discovers SKILL.md files in the package, copies them to the target platform directory. Works with any agentskills.io-compatible npm package.
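The discovery step described above amounts to finding every directory in the installed package that holds a SKILL.md file. A minimal sketch of that idea follows; it assumes one skill per directory, and the function name `discover_skills` is hypothetical rather than taken from the dz CLI.

```python
from pathlib import Path


def discover_skills(package_dir):
    """Return the sorted list of directories under package_dir that contain
    a SKILL.md file (assumed layout: one skill per directory)."""
    return sorted(path.parent for path in Path(package_dir).rglob("SKILL.md"))
```

Each returned directory name can then serve as the skill id when copying into the target platform's skills directory.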
dz sync-upstream — check for upstream updates
dz sync-upstream --list # show packages with external sources
dz sync-upstream --all # check ALL packages against upstream
dz sync-upstream --package packages/@dzhechkov/skills-devops # check one package
Discovers all skill packs with sources.json, fetches SKILL.md from origin repos, reports which skills have upstream changes.
dz upgrade — check installed skills for updates
dz upgrade # check .claude/skills/ against canonical
dz upgrade --target openclaude # check .openclaude/skills/
Compares installed skills with canonical source, reports which need dz init --force to update.
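One way to picture the comparison is a content hash of each installed SKILL.md checked against its canonical counterpart. The README does not describe how dz upgrade detects differences, so the hashing approach, the directory layout, and the function names below are assumptions.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def outdated_skills(installed_dir, canonical_dir):
    """Names of installed skills whose SKILL.md differs from the canonical copy.
    Skills with no canonical counterpart are ignored in this sketch."""
    outdated = []
    for installed in sorted(Path(installed_dir).glob("*/SKILL.md")):
        canonical = Path(canonical_dir) / installed.parent.name / "SKILL.md"
        if canonical.exists() and file_digest(installed) != file_digest(canonical):
            outdated.append(installed.parent.name)
    return outdated
```

The skills it returns would be the ones to refresh with dz init --force.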
dz downloads — npm weekly download stats
dz downloads # fetch weekly downloads for all 28 packages
dz benchmark: L0 quality gate
dz benchmark packages/@dzhechkov/skills-devops/terraform # single skill
dz benchmark packages/@dzhechkov/skills-devops --all # batch all
dz benchmark skill-a --compare skill-b # A/B compare
19 deterministic checks (U1-U5 universal + S1-S14 skill-specific). Grade A = 95%+. For L1/L2 LLM judges, use /bto-test inside Claude Code.
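To see what the Grade A threshold implies, the sketch below assumes the grade is a plain pass rate over the 19 checks with equal weight per check. That weighting is an assumption; the README gives only the check count and the 95% threshold. The function names are hypothetical.

```python
TOTAL_CHECKS = 19        # U1-U5 + S1-S14, as stated above
GRADE_A_PERCENT = 95     # Grade A = 95%+


def pass_rate(passed, total=TOTAL_CHECKS):
    """Fraction of checks that passed (equal weight per check assumed)."""
    return passed / total


def min_passes_for_grade_a(total=TOTAL_CHECKS):
    """Smallest number of passing checks that reaches the Grade A threshold.
    Integer arithmetic avoids floating-point rounding at the boundary."""
    return (GRADE_A_PERCENT * total + 99) // 100
```

Under that assumption 18 of 19 is roughly 94.7%, so a skill would need all 19 checks to pass to reach Grade A.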
dz publish — automated npm publish
dz publish --dry-run # preview what would publish
dz publish --filter skills-devops # publish specific package
dz publish --filter skills-devops --bump-only # bump version only, no publish
dz auto-canonicalize: discover skills in GitHub repos
dz auto-canonicalize --source github.com/user/repo --pack packages/@dzhechkov/skills-devops
Scans a GitHub repo for SKILL.md files, generates dz create-skill commands.
dz registry — searchable skill index
dz registry # visual panel: 59 skills in 5 categories
dz registry search security # fuzzy search
dz registry --category mcp # filter by category
dz stats + dz dashboard
dz stats # Quick metrics: packages, skills, targets, presets
dz dashboard # Visual panel with all packages, adapters, skill packs
How it works
- dz init compiles canonical skills from the agentskills.io standard into the target platform's layout
- Writing is additive: existing files are never overwritten without --force
- All 5 platform adapters produce byte-identical output (ADR-005)
- dz doctor runs 7 health checks (node version, adapters, config, SQLite, skills)
- dz migrate detects legacy keysarium/bto installations and recommends migration path
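The additive-write rule (existing files survive unless --force is given) can be sketched as a guarded write. This is a minimal illustration of the rule as described, not the adapter code; the function name `write_skill_file` and its `force` parameter are hypothetical stand-ins for the CLI flag.

```python
from pathlib import Path


def write_skill_file(dest, content, force=False):
    """Write content to dest, keeping an existing file unless force is set.
    Returns True if the file was written, False if it was left untouched."""
    dest = Path(dest)
    if dest.exists() and not force:
        return False
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_text(content, encoding="utf-8")
    return True
```

Returning a flag instead of raising lets a caller report skipped files at the end of a run, which is one plausible way to surface what a later --force would change.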
Use Cases
1. Short-term product research (one-off study)
Goal: Quickly research a product idea, competitors, market — get a structured report.
# Option A: via dz CLI
dz init --target claude-code --preset meta
# Then in Claude Code:
# /explore "Research the market for AI-powered code review tools"
# /feature-adr "Summarize findings into an ADR"
# Option B: via keysarium (full 7-phase pipeline)
npx @dzhechkov/keysarium init
# Then in Claude Code:
# /casarium "AI-powered code review tools — market analysis"
# → Phase 0: Discovery → Phase 1: Exploration → Phase 2: Paranoid Research
# → Phase 3: Solution Design → Phase 4: Architecture → Phase 5: Presentation
What you get:
- meta preset: /explore clarifies the problem → /feature-adr structures findings as ADR decisions
- keysarium: full 7-phase pipeline with dream cycles, background workers, and presentation generation
Best for: Quick study (hours), competitive analysis, technology evaluation.
2. Long-term product research (evolving over time)
Goal: Continuously gather data, add new sources, and "recalculate" the product vision as insights accumulate.
# Install keysarium (research pipeline) + evidence-wiki (knowledge base)
npx @dzhechkov/keysarium init
# Copy evidence-wiki plugin into your project:
npx @dzhechkov/evidence-wiki # or git clone https://github.com/djd1m/evidence-wiki
npm install -g @dzhechkov/harness-cli
dz init --target claude-code --preset meta
Workflow (iterative research cycles with evidence wiki):
Week 1: /casarium "Product X — initial research"
→ researches/ directory created with findings
→ .keysarium/memory/ stores patterns + reward scores
/wiki-generate ← evidence-wiki
→ Scans researches/, ADRs, docs
→ Generates wiki/concepts/*.md (atomic pages with inline sources)
→ Builds wiki/graph.json (knowledge graph)
→ wiki/INDEX.md links everything
Week 2: Add new data → /casarium "Product X — update with Q2 metrics"
→ Memory recalls Week 1 patterns (reward-calibrated learning)
→ New findings merged with existing, conflicts resolved
/wiki-generate --check ← re-generates wiki
→ New concepts added, existing updated
→ Every claim verified: triple-pillar protocol requires N independent
typed sources (ADR + methodology + research)
→ Stale concepts flagged, broken evidence links detected
/triple-check wiki/concepts/pricing-model.md ← verify specific page
→ Checks that every factual claim has inline source citations
→ Flags unsupported statements
Week N: /casarium "Product X — pivot analysis after customer feedback"
→ Full history in memory layer + evidence wiki
→ /harvest extracts reusable knowledge patterns
→ /wiki-generate rebuilds the entire knowledge graph
→ Product vision "recalculated": the wiki IS the living product model
The evidence-wiki advantage:
| Without evidence-wiki | With evidence-wiki |
|---|---|
| Research in markdown files | Atomic concept pages with inline sources |
| Findings scattered across researches/ | Interlinked knowledge graph (graph.json) |
| "I think we decided X" | Every claim has a cited source (triple-pillar) |
| Hard to see what changed | /wiki-generate --check diffs the knowledge base |
| No verification | /triple-check enforces evidence discipline |
Key features for long-term research:
- Evidence wiki (@dzhechkov/evidence-wiki): atomic concept pages where every factual claim carries inline sources; knowledge graph for cross-referencing; triple-pillar protocol (N independent typed sources per claim)
- Reward-calibrated memory (@dzhechkov/memory Reflexion): each checkpoint response trains the system: "ок" = excellent (1.0), feedback = good (0.7), rework = needs_work (0.3)
- Agent SDK Dreaming: between sessions, patterns are consolidated and distilled
- /harvest (knowledge-extractor skill): extracts reusable patterns from completed research into lib/templates
- SQLite + FTS5 backend: scales to 100k+ records with full-text search across all research sessions
Best for: Product strategy over months, continuous market monitoring, evolving product vision with evidence-backed decisions.
3. Product research + working prototype
Goal: Research the product AND build a functional prototype.
Option A: Sequential — research first, then code
# Step 1: Install research + development presets
npx @dzhechkov/keysarium init
# OR:
dz init --target claude-code --preset keysarium
# Step 2: Research phase
# /casarium "SaaS platform for team retrospectives"
# → Phase 0-2: Discovery, Exploration, Paranoid Research
# → Phase 3: Solution Design (with CJM prototype)
# → Result: researches/<slug>/ with full analysis
# Step 3: Switch to development
dz init --target claude-code --preset feature-adr
# Step 4: Build using research outputs
# /feature-adr "Build the retrospective platform based on research in researches/<slug>/"
# → Step 0: Router classifies as L/XL
# → Step 1-5: Requirements, ADRs, DDD, Architecture (informed by research)
# → Step 6: Implementation plan
# → Step 7: Code generation (with /frontend-design for UI)
# → Step 8-9: QE review + fleet assessment
What you get: Research artifacts in researches/, then code in features/<slug>/ + actual repository changes. Research directly feeds into ADR decisions.
Option B: Parallel — research and code simultaneously with p-replicator
# Install the full product development toolkit
npx @dzhechkov/p-replicator init
# Single pipeline: research → requirements → prototype
# /replicate "SaaS platform for team retrospectives"
# → Reverse-engineers similar products (reverse-engineering-unicorn)
# → Generates SPARC PRD (sparc-prd-mini)
# → Validates requirements (requirements-validator)
# → Creates the project structure (pipeline-forge)
# → Builds the prototype (cc-toolkit-generator-enhanced)
# → Reviews with brutal honesty (brutal-honesty-review)
What you get: A working prototype generated from research in a single /replicate pipeline run. Faster but less deep than Option A.
Comparison
| Aspect | Option A (Sequential) | Option B (p-replicator) |
|---|---|---|
| Research depth | Deep (7-phase keysarium) | Moderate (reverse-engineering) |
| Code quality | High (11-step feature-adr + QE) | Good (pipeline-forge + review) |
| Time | Days to weeks | Hours to days |
| Best for | Complex products, regulated domains | MVPs, hackathons, quick validation |
| Packages | keysarium + feature-adr preset | p-replicator |
| Research artifacts | researches/ directory | Embedded in PRD |
| Code artifacts | features/<slug>/ + repo changes | Generated project |
Tip: For maximum rigor, combine both — use p-replicator for a quick prototype, then run /feature-adr --full-qe-extended on the generated code for production-grade quality engineering.
Status
v0.3.24 — published on npm. Part of DZ Harness Hub.