@dzhechkov/harness-cli

The dz CLI is the main entry point to the DZ Harness Hub. It installs AI skills for Claude Code, Codex, OpenCode, and Hermes with a single command.

Install

npm install -g @dzhechkov/harness-cli

Quick Start

# Install a preset (curated skill set) for your platform:
dz init --target claude-code --preset meta

# That's it — Claude Code will auto-discover the installed skills.

Presets vs Individual Skills

Approach	When to use	Example
Preset	Want a curated set of skills that work together	`dz init --target claude-code --preset keysarium`
--select	Want specific skills by name	`dz init --target claude-code --select explore,feature-adr`
Standalone npx	Want a full toolkit with its own CLI	`npx @dzhechkov/keysarium init`

Available Presets (7)

Preset	Skills	Description
`meta`	3	Development process (explore, feature-adr, knowledge-extractor)
`qe-engineer`	8	Quality engineering (test-gen, coverage, chaos, defect, ...)
`bto`	1	Build-Benchmark-Test-Optimize pipeline
`health`	8	Medical AI (diagnostics, drugs, labs, clinical decisions)
`keysarium`	9	Full research toolkit (feature-adr, presentation, reverse-eng)
`p-replicator`	10	AI product development (/replicate, SPARC PRD, pipeline-forge)
`feature-adr`	5	Feature pipeline (feature-adr, explore, frontend-design)

Standalone Packages (install via npx, no dz CLI needed)

npx @dzhechkov/keysarium init              # full research toolkit
npx @dzhechkov/p-replicator init           # AI product development
npx @dzhechkov/health-advisor init         # medical AI (25 skills)
npx @dzhechkov/skills-bto init             # BTO benchmarking
npx @dzhechkov/skills-feature-adr init     # 11-step feature pipeline
npx @dzhechkov/skills-edu-site init        # gamified edu site generator
npx @dzhechkov/skills-transcript-site init # transcript → interactive site
npx @dzhechkov/skills-analyst-manual init  # 3-phase analyst composite

Difference: dz init --preset installs individual skills from the .claude/skills/ source tree into the target platform's skills directory. Standalone npx packages ship their own CLI and install a complete toolkit (commands, rules, shards, and agents), which is a richer but self-contained experience.

All Commands (11)

dz init      --target <name> [--preset <name>] [--select id,id,...] [--force]
dz verify    [--skills-dir <dir>] [--target <name>]
dz sync      [--canonical <dir>] [--project <dir>] [--dry-run] [--force]
dz update    (alias for sync)
dz list      [--skills-dir <dir>]
dz info      --id <skill-id> [--skills-dir <dir>]
dz migrate   [--project <dir>]
dz scout     [--topics <list>] [--since <date>]
dz workflow  --task <name> [--dry-run]
dz doctor    [--project <dir>]
dz roam      [--apply] [--slug <slug>]
dz help
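
The synopsis above lists a --dry-run flag for dz sync. A minimal sketch of a preview-then-apply wrapper follows. The function name preview_then_sync and the DZ variable are hypothetical, introduced only for this example, and the sketch assumes dz exits non-zero on failure, which this README does not state.

```shell
# Preview a sync, then apply it only if the preview succeeded.
# Assumption: dz returns a non-zero exit status when something goes wrong.
# DZ is a hypothetical override: set DZ=echo to print the commands instead of running them.
preview_then_sync() {
  project_dir="${1:-.}"
  "${DZ:-dz}" sync --project "$project_dir" --dry-run || return 1
  "${DZ:-dz}" sync --project "$project_dir"
}
```

Calling preview_then_sync with no argument targets the current directory.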

Targets (4 platforms)

Target	Skills directory
`claude-code`	`.claude/skills/`
`codex`	`.agents/skills/`
`opencode`	`.opencode/skills/`
`hermes`	`.hermes/skills/`
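
The table above can be expressed as a small lookup, which is handy when a script needs to inspect what was installed. This is a sketch, not part of the dz CLI: skills_dir_for, init_all_targets, and the DZ variable are hypothetical names, and it assumes that running dz init once per target in the same project is acceptable.

```shell
# Hypothetical helper: map a documented target name to its skills directory.
skills_dir_for() {
  case "$1" in
    claude-code) echo ".claude/skills/" ;;
    codex)       echo ".agents/skills/" ;;
    opencode)    echo ".opencode/skills/" ;;
    hermes)      echo ".hermes/skills/" ;;
    *) echo "unknown target: $1" >&2; return 1 ;;
  esac
}

# Run `dz init` with one preset for each of the four targets.
# DZ is a hypothetical override: set DZ=echo to print the commands instead of running them.
init_all_targets() {
  preset="$1"
  for target in claude-code codex opencode hermes; do
    "${DZ:-dz}" init --target "$target" --preset "$preset" || return 1
  done
}
```

For example, DZ=echo init_all_targets meta prints the four init commands without touching the project.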

Workflows (Opus 4.8+ dynamic workflows)

dz workflow --task coverage-lift     # parallel coverage improvement
dz workflow --task mutation-kill     # kill surviving mutants
dz workflow --task canonicalize      # canonicalize new packages
dz workflow --task security-audit    # adversarial security scan

Scout (ecosystem intelligence)

dz scout                              # scan GitHub for agent-skill repos
dz scout --topics mcp-server,ai-agent # custom topics
dz scout --since 2026-05-01           # only recent repos

Scout scans GitHub for repos tagged with agent-skills, claude-code-skills, mcp-server, etc. For each repo it:

Detects skill format — SKILL.md, plugin.json, .claude/skills/, MCP manifests
Scores relevance — format (40%) + stars (30%) + recency (20%) + novelty (10%)
Compares against our 24 packages — finds skills we don't have
Recommends — integrate (score ≥70) / monitor (40-69 + ≥50 stars) / skip

Output: a markdown intelligence report with repo table, scores, and novel skills.
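
The weighting and thresholds above can be sketched as follows. This is an illustration under assumptions, not Scout's actual code: it assumes each of the four components is already normalized to a 0-100 integer (how Scout derives them, for example from a raw star count, is not described here), and the function names are hypothetical.

```shell
# Weighted relevance score from four component scores, each assumed to be 0-100.
# Argument order: format, stars, recency, novelty (weights 40/30/20/10).
# Integer arithmetic truncates any fractional part.
scout_score() {
  echo $(( ($1 * 40 + $2 * 30 + $3 * 20 + $4 * 10) / 100 ))
}

# Map a score and a raw star count to a recommendation:
# integrate (score >= 70), monitor (40-69 with >= 50 stars), otherwise skip.
scout_recommend() {
  score="$1"; stars="$2"
  if [ "$score" -ge 70 ]; then
    echo integrate
  elif [ "$score" -ge 40 ] && [ "$stars" -ge 50 ]; then
    echo monitor
  else
    echo skip
  fi
}
```

With these definitions, scout_score 100 50 80 0 prints 71, and scout_recommend 71 10 prints integrate.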

How it works

dz init compiles canonical skills from the agentskills.io standard into the target platform's layout
Writing is additive — existing files are never overwritten without --force
All 4 platform adapters produce byte-identical output (ADR-005)
dz doctor runs 7 health checks (node version, adapters, config, SQLite, skills)
dz migrate detects legacy keysarium/bto installations and recommends a migration path
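
The additive-write rule above (never overwrite without --force) can be illustrated with a short sketch. It shows the general technique only and is not taken from the dz source; write_additive is a hypothetical name.

```shell
# Copy src to dest only when dest does not exist yet,
# unless the third argument is --force.
# Prints what happened so a caller can log it.
write_additive() {
  src="$1"; dest="$2"; force="${3:-}"
  if [ -e "$dest" ] && [ "$force" != "--force" ]; then
    echo "skipped $dest"
    return 0
  fi
  mkdir -p "$(dirname "$dest")" && cp "$src" "$dest" && echo "wrote $dest"
}
```

A locally edited skill file therefore survives a repeated run, and passing --force restores the canonical copy.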

Use Cases

1. Short-term product research (one-off study)

Goal: Quickly research a product idea, competitors, market — get a structured report.

# Option A: via dz CLI
dz init --target claude-code --preset meta
# Then in Claude Code:
#   /explore "Research the market for AI-powered code review tools"
#   /feature-adr "Summarize findings into an ADR"

# Option B: via keysarium (full 7-phase pipeline)
npx @dzhechkov/keysarium init
# Then in Claude Code:
#   /casarium "AI-powered code review tools — market analysis"
#   → Phase 0: Discovery → Phase 1: Exploration → Phase 2: Paranoid Research
#   → Phase 3: Solution Design → Phase 4: Architecture → Phase 5: Presentation

What you get:

meta preset: /explore clarifies the problem → /feature-adr structures findings as ADR decisions
keysarium: full 7-phase pipeline with dream cycles, background workers, and presentation generation

Best for: Quick study (hours), competitive analysis, technology evaluation.

2. Long-term product research (evolving over time)

Goal: Continuously gather data, add new sources, and "recalculate" the product vision as insights accumulate.

# Install keysarium (research pipeline) + evidence-wiki (knowledge base)
npx @dzhechkov/keysarium init
# Copy evidence-wiki plugin into your project:
npx @dzhechkov/evidence-wiki   # or git clone https://github.com/djd1m/evidence-wiki

npm install -g @dzhechkov/harness-cli
dz init --target claude-code --preset meta

Workflow — iterative research cycles with evidence wiki:

Week 1:  /casarium "Product X — initial research"
         → researches/ directory created with findings
         → .keysarium/memory/ stores patterns + reward scores

         /wiki-generate                              ← evidence-wiki
         → Scans researches/, ADRs, docs
         → Generates wiki/concepts/*.md (atomic pages with inline sources)
         → Builds wiki/graph.json (knowledge graph)
         → wiki/INDEX.md links everything

Week 2:  Add new data → /casarium "Product X — update with Q2 metrics"
         → Memory recalls Week 1 patterns (reward-calibrated learning)
         → New findings merged with existing, conflicts resolved

         /wiki-generate --check                      ← re-generates wiki
         → New concepts added, existing updated
         → Every claim verified: triple-pillar protocol requires N independent
           typed sources (ADR + methodology + research)
         → Stale concepts flagged, broken evidence links detected

         /triple-check wiki/concepts/pricing-model.md ← verify specific page
         → Checks that every factual claim has inline source citations
         → Flags unsupported statements

Week N:  /casarium "Product X — pivot analysis after customer feedback"
         → Full history in memory layer + evidence wiki
         → /harvest extracts reusable knowledge patterns
         → /wiki-generate rebuilds the entire knowledge graph
         → Product vision "recalculated" — the wiki IS the living product model

The evidence-wiki advantage:

Without evidence-wiki	With evidence-wiki
Research in markdown files	Atomic concept pages with inline sources
Findings scattered across `researches/`	Interlinked knowledge graph (`graph.json`)
"I think we decided X"	Every claim has a cited source (triple-pillar)
Hard to see what changed	`/wiki-generate --check` diffs the knowledge base
No verification	`/triple-check` enforces evidence discipline

Key features for long-term research:

Evidence wiki (@dzhechkov/evidence-wiki): atomic concept pages where every factual claim carries inline sources; knowledge graph for cross-referencing; triple-pillar protocol (N independent typed sources per claim)
Reward-calibrated memory (@dzhechkov/memory Reflexion): each checkpoint response trains the system, where "ок" ("ok") = excellent (1.0), feedback = good (0.7), and rework = needs_work (0.3)
Agent SDK Dreaming: between sessions, patterns are consolidated and distilled
/harvest (knowledge-extractor skill): extracts reusable patterns from completed research into lib/ templates
SQLite + FTS5 backend: scales to 100k+ records with full-text search across all research sessions
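
The reward values listed above for the memory layer can be written as a simple lookup. This sketch only restates the three documented values. How @dzhechkov/memory classifies a response or aggregates rewards is not described here, so the averaging step and both function names are assumptions made for illustration.

```shell
# Map a checkpoint outcome label to its documented reward value.
reward_for() {
  case "$1" in
    excellent)  echo 1.0 ;;
    good)       echo 0.7 ;;
    needs_work) echo 0.3 ;;
    *) echo "unknown outcome: $1" >&2; return 1 ;;
  esac
}

# Hypothetical aggregation: the mean reward over a sequence of outcomes,
# printed with two decimal places.
mean_reward() {
  for outcome in "$@"; do
    reward_for "$outcome"
  done | awk '{ sum += $1 } END { if (NR) printf "%.2f\n", sum / NR }'
}
```

For example, mean_reward excellent good needs_work prints 0.67.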

Best for: Product strategy over months, continuous market monitoring, evolving product vision with evidence-backed decisions.

3. Product research + working prototype

Goal: Research the product AND build a functional prototype.

Option A: Sequential — research first, then code

# Step 1: Install research + development presets
npx @dzhechkov/keysarium init
# OR:
dz init --target claude-code --preset keysarium

# Step 2: Research phase
#   /casarium "SaaS platform for team retrospectives"
#   → Phase 0-2: Discovery, Exploration, Paranoid Research
#   → Phase 3: Solution Design (with CJM prototype)
#   → Result: researches/<slug>/ with full analysis

# Step 3: Switch to development
dz init --target claude-code --preset feature-adr

# Step 4: Build using research outputs
#   /feature-adr "Build the retrospective platform based on research in researches/<slug>/"
#   → Step 0: Router classifies as L/XL
#   → Step 1-5: Requirements, ADRs, DDD, Architecture (informed by research)
#   → Step 6: Implementation plan
#   → Step 7: Code generation (with /frontend-design for UI)
#   → Step 8-9: QE review + fleet assessment

What you get: Research artifacts in researches/, then code in features/<slug>/ plus actual repository changes. The research feeds directly into the ADR decisions.

Option B: Parallel — research and code simultaneously with p-replicator

# Install the full product development toolkit
npx @dzhechkov/p-replicator init

# Single pipeline: research → requirements → prototype
#   /replicate "SaaS platform for team retrospectives"
#   → Reverse-engineers similar products (reverse-engineering-unicorn)
#   → Generates SPARC PRD (sparc-prd-mini)
#   → Validates requirements (requirements-validator)
#   → Creates the project structure (pipeline-forge)
#   → Builds the prototype (cc-toolkit-generator-enhanced)
#   → Reviews with brutal honesty (brutal-honesty-review)

What you get: A working prototype generated from research in a single /replicate pipeline run. This is faster than Option A, but the research is shallower.

Comparison

Aspect	Option A (Sequential)	Option B (p-replicator)
Research depth	Deep (7-phase keysarium)	Moderate (reverse-engineering)
Code quality	High (11-step feature-adr + QE)	Good (pipeline-forge + review)
Time	Days to weeks	Hours to days
Best for	Complex products, regulated domains	MVPs, hackathons, quick validation
Packages	`keysarium` + `feature-adr` preset	`p-replicator`
Research artifacts	`researches/` directory	Embedded in PRD
Code artifacts	`features/<slug>/` + repo changes	Generated project

Tip: For maximum rigor, combine both — use p-replicator for a quick prototype, then run /feature-adr --full-qe-extended on the generated code for production-grade quality engineering.

Status

v0.2.1 — published on npm. Part of DZ Harness Hub.