Package Exports

@dzhechkov/harness-cli

Readme

@dzhechkov/harness-cli

The dz CLI — the main entry point to the DZ Harness Hub. Install AI skills for Claude Code, Codex, OpenCode, Hermes, OpenClaude from a single command.

Install

npm install -g @dzhechkov/harness-cli

# Update to latest version (run from outside any workspace project):
cd /tmp && npm install -g @dzhechkov/harness-cli@latest

Note: If you get EUNSUPPORTEDPROTOCOL workspace:*, you're inside a pnpm/yarn workspace. Run the install from /tmp or ~ instead.

User Journey — from install to mastery

All 31 commands mapped to a real workflow:

DISCOVER → INSTALL → USE → CREATE → MAINTAIN → SHARE

Phase 1: Discover (what's available?)

npm install -g @dzhechkov/harness-cli    # install the CLI

dz help                                   # see all commands
dz pretrain                                # analyze project files → recommend by tech stack
dz recommend "build API and deploy to K8s" # keyword match → skills + toolkits
dz recommend "work on this project"        # generic? → auto-runs pretrain → recommends by stack
dz stats                                  # 31 packages, 115 skills, 5 targets, 11 presets
dz dashboard                              # visual panel — packages, adapters, skill packs
dz registry                               # browse all 115 skills by category
dz registry search kubernetes             # find specific skills
dz registry --category devops             # filter by domain
dz downloads                              # npm weekly download stats

Phase 2: Install (set up your workspace)

# Full setup with self-learning (recommended):
dz setup --target claude-code --preset devops  # pretrain + hooks + JSONL memory

# With AgentDB vector memory (semantic search + self-learning):
dz setup --target claude-code --preset devops --memory agentdb  # .rvf + 41 MCP tools

# Or just install skills (no learning):
dz init --target claude-code --preset devops   # 27 DevOps skills
dz init --target openclaude --preset web3      # 12 DeFi skills for OpenClaude
dz init --target codex --preset mcp            # 16 MCP skills for Codex

# Or pick individual skills:
dz init --target claude-code --select terraform,kubernetes,docker-compose

# Or install from any npm package:
dz install @dzhechkov/skills-devops            # npm install + copy skills

# Verify everything is correct:
dz verify                                       # structural validation
dz doctor                                       # 7 health checks
dz list                                         # show installed skills
dz info --id terraform                          # detailed info about a skill

Phase 3: Use (work with your agent)

# Now use Claude Code / Codex / OpenCode / Hermes normally.
# Skills are auto-discovered from the platform's skills directory.
# Example in Claude Code:
#   "Review this PR" → pr-review skill activates
#   "Design an API" → api-design skill activates
#   "Fix this CI" → ci-fix skill activates

Phase 4: Create (build your own skills)

# Scaffold a new skill:
dz create-skill --name my-skill --description "What it does" --tier 2

# With BTO-compatible eval templates:
dz create-skill --name my-skill --bto

# Benchmark your skill (aim for Grade A):
dz benchmark .claude/skills/my-skill           # single skill — 19 L0 checks
dz benchmark packages/@dzhechkov/skills-devops --all   # batch all
dz benchmark skill-a --compare skill-b          # A/B compare

# Find skills to canonicalize from the ecosystem:
dz scout                                        # scan 9 sources (GitHub, npm+plugins, HN, ...)
dz scout --deep                                 # deep analysis with SKILL.md parsing
dz auto-canonicalize --source github.com/user/repo --pack packages/@dzhechkov/skills-devops

Phase 5: Maintain (keep skills fresh)

# Check for upstream changes (canonicalized skills):
dz sync-upstream --list                                 # which packages have external sources?
dz sync-upstream --all                                  # check all against upstream
dz sync-upstream --package packages/@dzhechkov/skills-devops  # check one

# Check installed skills vs canonical:
dz upgrade                                      # shows which skills need update
dz upgrade --target openclaude                  # check specific platform

# Sync canonical to legacy layout:
dz sync                                         # canonical → project skills
dz migrate                                      # detect legacy installations

# Orchestrate dynamic workflows:
dz workflow --task coverage-lift                 # parallel coverage improvement
dz workflow --task security-audit               # adversarial security scan

# Cross-host state sync:
dz roam --apply                                 # sync agent state across machines

# Publish updated packages to npm:
dz publish --dry-run                            # preview
dz publish --filter skills-devops               # publish specific package
dz publish                                      # publish all changed packages

Three Ways to Install Skills

	Individual Skill	Preset	npx Package
What	1 SKILL.md file	Curated list of skill names	Full toolkit with orchestration
Contains	Instructions for 1 task	N skill references	Skills + commands + rules + shards + agents + memory
Pipeline	No	No	Yes (phases, checkpoints, governance)
Self-learning	No	`dz setup` adds it	Built-in
Install	`dz init --select X`	`dz setup --preset X`	`npx @dzhechkov/X init`
Example	`terraform`	`devops` (27 skills)	`keysarium` (7-phase research)

# One skill:
dz init --target claude-code --select design-thinking

# Curated set by topic (recommended):
dz setup --target claude-code --preset meta          # 15 development skills + self-learning

# Full toolkit with orchestrated pipeline:
npx @dzhechkov/keysarium init                        # 7-phase research + commands + memory

When to use which:

Need 1 specific capability → --select
Need a themed set that works together → --preset
Need a full pipeline with commands and governance → npx

Available Presets (11)

Preset	Skills	Description
`meta`	15	Development process (explore, goap-research, problem-solver, design-thinking, feature-adr, knowledge-extractor, understand-anything-bridge, agentshield-scan, adversarial-verifier)
`qe-engineer`	20	Quality engineering (test-gen, coverage, chaos, defect, ...)
`bto`	1	Build-Benchmark-Test-Optimize pipeline
`health`	8	Medical AI (diagnostics, drugs, labs, clinical decisions)
`keysarium`	9	Full research toolkit (feature-adr, presentation, reverse-eng)
`p-replicator`	10	AI product development (/replicate, SPARC PRD, pipeline-forge)
`feature-adr`	5	Feature pipeline (feature-adr, explore, frontend-design)
`devops`	27	DevOps skills (terraform, kubernetes, c4-architecture, risk-assessment, ...)
`web3`	12	Web3/DeFi (quicknode, zerion, symbiosis, bankr, veil, neynar, ...)
`mcp`	16	MCP servers (agentdb, brave-search, gmail, gitlab, comfyui, notion, ...)
`academic`	5	Thesis defense (review, questions, doc-check, live defense + answer eval)

Standalone Packages (install via npx, no dz CLI needed)

Package	Install	What it does
@dzhechkov/keysarium	`npx @dzhechkov/keysarium init`	Full 7-phase research toolkit
@dzhechkov/design-thinking	`npx @dzhechkov/design-thinking init`	d.school 6-phase Design Thinking (8 skills)
@dzhechkov/p-replicator	`npx @dzhechkov/p-replicator init`	AI product development (/replicate pipeline)
@dzhechkov/health-advisor	`npx @dzhechkov/health-advisor init`	Medical AI (25 skills)
@dzhechkov/skills-bto	`npx @dzhechkov/skills-bto init`	BTO benchmarking (Build-Test-Optimize)
@dzhechkov/skills-feature-adr	`npx @dzhechkov/skills-feature-adr init`	11-step feature pipeline
@dzhechkov/skills-edu-site	`npx @dzhechkov/skills-edu-site init`	Gamified edu site generator
@dzhechkov/skills-transcript-site	`npx @dzhechkov/skills-transcript-site init`	Transcript → interactive site
@dzhechkov/skills-analyst-manual	`npx @dzhechkov/skills-analyst-manual init`	3-phase analyst composite

Difference: dz init --preset installs individual skills from .claude/skills/ source into a target platform tree. Standalone npx packages have their own CLI and install a complete toolkit with commands, rules, shards, and agents — a richer but self-contained experience.

A skill and its npx toolkit are not duplicates — they're a graduation. Several skills (e.g. feature-adr, design-thinking) exist BOTH as a skill inside a dz preset AND as a standalone npx package. The preset's SKILL.md is fully functional on its own (the whole methodology — modules + references — travels with it, and it auto-activates by description), and it's the only way to compile that capability to the non-Claude platforms (Codex/OpenCode/Hermes/OpenClaude) via dz. The npx package adds project-level runtime governance around the same skill: a slash command, governance rules, a context shard, and (for feature-adr) reward-learning + /harvest. So: pick the skill/preset for a working capability across platforms; pick the npx toolkit when you want it as a governed, command-driven fixture of one project.

All Commands (31)

dz setup             --target <name> [--preset <name>] [--memory agentdb] [--no-hooks] [--install-driver] [--force]
dz init              --target <name> [--preset <name>] [--select id,id,...] [--force]
dz install           <npm-pkg> [--target <name>] [--project <dir>]
dz teach             "<pattern>" [--reward <0-1>] [--domain <name>]
dz pretrain          [--project <dir>]
dz recommend         "<task description>"
dz compose           <preset1+preset2+...> [--target <name>]
dz diff              <skill-dir>
dz upgrade           [--target <name>] [--project <dir>]
dz verify            [--skills-dir <dir>] [--target <name>]
dz sync              [--canonical <dir>] [--project <dir>] [--dry-run] [--force]
dz update            (alias for sync)
dz list              [--skills-dir <dir>]
dz info              --id <skill-id> [--skills-dir <dir>]
dz create-skill      --name <id> [--description <text>] [--tier 1|2|3] [--bto]
dz registry          [search <query>] [--category <cat>]
dz benchmark         <skill-dir> [--compare <dir>] [--all]
dz publish           [--filter <name>] [--dry-run] [--bump-only]
dz auto-canonicalize --source <github-url> --pack <skills-pack>
dz sync-upstream     [--package <dir>] [--list] [--all]
dz scout             [--topics <list>] [--since <date>] [--deep]
dz workflow          --task <name> [--dry-run]
dz plugin            [--version <ver>]
dz downloads
dz migrate           [--project <dir>]
dz stats
dz dashboard
dz doctor            [--project <dir>]
dz roam              [--apply] [--slug <slug>]
dz import-ecc       [--local-path <dir>] [--select id,id,...] [--limit N] [--output <dir>] [--force]
dz help

Targets (5 platforms)

All 5 platforms natively support the agentskills.io SKILL.md format:

Target	Skills directory	Native SKILL.md?
`claude-code`	`.claude/skills/`	Yes
`codex`	`.agents/skills/`	Yes (docs)
`opencode`	`.opencode/skills/`	Yes (also scans `.claude/skills/`)
`hermes`	`.hermes/skills/`	Yes
`openclaude`	`.openclaude/skills/`	Yes

Same SKILL.md file, different directory — no format conversion needed.

Optional platform enrichment (skills work without these):

Platform	Optional extra	What it adds
Codex	`agents/openai.yaml`	UI metadata (icons, display_name, MCP deps)
OpenCode	`opencode.json` + `.opencode/agents/*.md`	Config, custom agents
Hermes	`cli-config.yaml`	Agent config, persona, memory

Workflows (Opus 4.8+ dynamic workflows)

dz workflow --task coverage-lift     # parallel coverage improvement
dz workflow --task mutation-kill     # kill surviving mutants
dz workflow --task canonicalize      # canonicalize new packages
dz workflow --task security-audit    # adversarial security scan

Scout (ecosystem intelligence)

dz scout                              # quick scan — radar mode
dz scout --deep                       # deep analysis — AI analyst mode
dz scout --topics mcp-server,ai-agent # custom topics
dz scout --since 2026-05-01           # only recent repos

Radar mode (dz scout) scans 9 sources in parallel (GitHub + npm + HN + MCP Registry + Glama + OSSInsight + Smithery + Semantic Scholar + arXiv):

Detects skill format — SKILL.md, plugin.json, .claude/skills/, .claude-plugin/, MCP manifests
Scores relevance — format (40%) + stars (30%) + recency (20%) + novelty (10%)
Compares against our 31 packages — finds skills we don't have
Recommends — integrate (score ≥70) / monitor (40-69 + ≥50 stars) / skip

Deep analyst mode (dz scout --deep) goes further for top-scored repos:

Downloads SKILL.md from each repo, parses frontmatter + body
Finds closest match in our inventory by keyword overlap
Explains the delta — what the found skill adds that ours doesn't
Recommends integration path:
- canonicalize — high-signal novel skill → new @dzhechkov/skills-* pack
- merge — similar to existing skill → add unique features to ours
- new-preset — novel skill → add to preset or create new pack
- skip — already in our inventory
Gap analysis — identifies trending categories across the ecosystem that our harness lacks

Example deep analysis output:

## 🔬 Deep Analysis

### cool/agent-toolkit (★500)
2/3 skills are novel

| Skill | Description | Closest match | Integration | Rationale |
|-------|------------|---------------|-------------|-----------|
| code-review | Automated OWASP-focused review | brutal-honesty-review | **merge** | Similar to ours — merge OWASP checklist |
| deploy-check | Pre-deploy validation gates | — | **canonicalize** | High-signal novel skill (500 stars) |

## 📊 Harness Gap Analysis

| Category | Frequency | Recommendation |
|----------|-----------|---------------|
| deploy-automation | 12 repos | Create @dzhechkov/skills-devops — high demand |
| data-pipeline | 5 repos | Monitor — emerging trend |

BTO integration (create-skill --bto)

# Scaffold a new skill with BTO-compatible 3-layer evaluation:
dz create-skill --name my-skill --bto

# What you get:
#   evals/my-skill.yaml       — BTO eval with L0/L1/L2 layers
#   references/judge-rubrics.md — scoring rubrics for 3-judge panel

The --bto flag generates eval templates compatible with /bto-test:

Layer	What	Gate
L0	Deterministic checks (U1-U5 universal + S1-S14 skill-specific)	Pass rate >= 80%
L1	Single LLM judge (Haiku) — 5 dimensions: Clarity, Completeness, Actionability, Quality, Anti-patterns	Average >= 7.0
L2	3-judge panel (Sonnet) — Expert (0.40), Critic (0.30), Auditor (0.30) — 5 dimensions: Methodology, Depth, Correctness, Usability, Robustness	Weighted avg >= 7.0

After scaffolding, fill in the SKILL.md protocol and run /bto-test .claude/skills/my-skill to evaluate.

dz install — install skills from any npm package

# Install skills from any npm package directly
dz install @dzhechkov/skills-devops
dz install @dzhechkov/skills-web3 --target openclaude
dz install @lythos/skill-curator --target claude-code

Runs npm install, discovers SKILL.md files in the package, copies them to the target platform directory. Works with any agentskills.io-compatible npm package.

dz sync-upstream — check for upstream updates

dz sync-upstream --list                                    # show packages with external sources
dz sync-upstream --all                                     # check ALL packages against upstream
dz sync-upstream --package packages/@dzhechkov/skills-devops  # check one package

Discovers all skill packs with sources.json, fetches SKILL.md from origin repos, reports which skills have upstream changes.

dz upgrade — check installed skills for updates

dz upgrade                           # check .claude/skills/ against canonical
dz upgrade --target openclaude       # check .openclaude/skills/

Compares installed skills with canonical source, reports which need dz init --force to update.

dz downloads — npm weekly download stats

dz downloads     # fetch weekly downloads for all 30 packages

dz benchmark — L0 quality gate

dz benchmark packages/@dzhechkov/skills-devops/terraform     # single skill
dz benchmark packages/@dzhechkov/skills-devops --all          # batch all
dz benchmark skill-a --compare skill-b                        # A/B compare

19 deterministic checks (U1-U5 universal + S1-S14 skill-specific). Grade A = 95%+. For L1/L2 LLM judges, use /bto-test inside Claude Code.

dz publish — automated npm publish

dz publish --dry-run                          # preview what would publish
dz publish --filter skills-devops             # publish specific package
dz publish --filter skills-devops --bump-only # bump version only, no publish

dz auto-canonicalize — discover skills in GitHub repos

dz auto-canonicalize --source github.com/user/repo --pack packages/@dzhechkov/skills-devops

Scans a GitHub repo for SKILL.md files, generates dz create-skill commands.

dz registry — searchable skill index

dz registry                    # visual panel: 115 skills in 6 categories
dz registry search security    # fuzzy search
dz registry --category mcp     # filter by category

dz stats + dz dashboard

dz stats        # Quick metrics: packages, skills, targets, presets
dz dashboard    # Visual panel with all packages, adapters, skill packs

Example: Thesis Defense Preparation (Academic Preset)

# Install with AgentDB (remembers patterns across students):
dz setup --target claude-code --preset academic --memory agentdb

# Or lightweight:
# dz init --target claude-code --preset academic

Prepare: Create a folder per student with thesis.pdf + review.pdf + external-review.pdf + antiplagiat.pdf.

Pre-defense (open Claude Code in student folder):

"Check document package completeness"     → document-checker
"Analyze this thesis"                     → dissertation-review (format, criteria, grade)
"Generate 6 defense questions"            → question-generator (basic → critical, page refs)

During defense (feed live transcript via Whisper + VB-Cable):

"Analyze this defense transcript"         → defense-evaluator (structure, coverage, delivery)
"Evaluate the student's answers"          → answer-assessor (completeness, depth, reviewer alignment)

When	Skill	What it does
Before	`document-checker`	Package completeness: thesis, reviews, antiplagiat
Before	`dissertation-review`	ГЭК criteria, research/project format, grade 1-10, team project check
Before	`question-generator`	4-6 questions with page refs and expected keywords
During	`defense-evaluator`	Live transcript → structure, coverage, delivery quality
During	`answer-assessor`	Q&A evaluation → completeness, depth, reviewer remarks

Key features: Grade corridor, per-criterion 1-10 scoring, TO BE vs data detection, LTV/CAC > 10 warning, reviewer divergence, raise/lower conditions, compact mode (1-page справка: "компактная справка"), summary table across all students. With AgentDB, patterns persist.

Skills contain only evaluation criteria and methodology — no student data.

Batch mode: S3 archive → agent swarm

# Download and extract: each student = subfolder with .zip
curl -o students.zip "https://s3.example.com/bucket/students.zip"
mkdir students && cd students && 7z x ../students.zip
for f in *.zip; do mkdir -p "${f%.zip}" && cd "${f%.zip}" && 7z x "../$f" && cd ..; done

Then in Claude Code:

"For each student folder: run document-checker → dissertation-review → question-generator.
 Save справка.md per student with clickable inline links to pages (стр. 45, разд. 2.3)
 and external sources ([JTBD](https://hbr.org/...)). Run all students in parallel."

With AgentDB, patterns persist across students — grading calibration improves with each analysis.

Example: Product Discovery with Design Thinking

# With self-learning (recommended — remembers HADI patterns, JTBD insights across sessions):
dz setup --target claude-code --preset meta
dz setup --target claude-code --preset meta --memory agentdb  # + semantic search

# Or without self-learning:
dz init --target claude-code --preset meta
# Or individually:
dz setup --target claude-code --select design-thinking

Then in Claude Code:

"Design a mobile app for booking coworking spaces"
→ design-thinking skill activates
→ 6-phase protocol runs with complexity tier auto-selection

6-Phase Protocol

Phase 1: EMPATHIZE  → STOP gate: request user interview data + goap-research for market data
Phase 2: DEFINE     → JTBD Canvas + CJM AS IS + Ishikawa root cause analysis
Phase 3: IDEATE     → HADI hypotheses + Lean Canvas / Osterwalder BMC + GTM + Unit Economics
Phase 4: PROTOTYPE  → MVP (fidelity spectrum) + CJM/VSM TO BE (labeled as hypotheses)
Phase 5: TEST       → STOP gate: request usability test data + risk analysis + HADI validation
Phase 6: VALIDATE   → Pilot with variance analysis: projected vs actual → Scale/Iterate/Pivot/Kill

Complexity Tiers (auto-selected)

Tier	When	Phases	Integrations
S	Quick user insight	1→2→5	explore + goap-research
M	New feature	1→2→3→4→5	+ frontend-design + six-thinking-hats
L	New product	1→2→3→4→5→6	+ qcsd-swarm + reverse-engineering-unicorn
XL	Platform / ecosystem	All	All optional integrations (aqe init recommended)

Key Safeguards

Never fabricates data — STOP gates pause for real interview/survey/test data
TO BE ≠ data — projections labeled as hypotheses, validated via pilot (Phase 6)
LTV/CAC > 10 flagged as suspicious (Skok 2013)
Loop-back protocol — Phase 5 can invalidate Phase 2 and return upstream
22 methodologies with academic validation tiers (Strong/Moderate/Practitioner/Weak)
12 validation rules (DT-001 through DT-012) enforce quality per tier

What's included vs what's optional

Core DT — the meta preset includes all required dependencies (15 skills):

dz setup --target claude-code --preset meta
# → explore, goap-research-ed25519, problem-solver-enhanced,
#   design-thinking, feature-adr, knowledge-extractor,
#   understand-anything-bridge, ... (15 total)

Full DT — for ALL optional integrations, install agentic-qe:

npm install -g agentic-qe && aqe init --auto
# → 94 QE skills + 55 agents in .claude/skills/ and .claude/agents/
# → six-thinking-hats, qcsd-ideation-swarm, frontend-design, brutal-honesty-review

Or cherry-pick: dz compose meta+keysarium for competitive analysis.

Optional Skill	Source	What it adds
`frontend-design`	`aqe init` / `keysarium`	HTML/React prototypes (Phase 4)
`six-thinking-hats`	`aqe init`	Team ideation (Phase 3)
`qcsd-ideation-swarm`	`aqe init`	9-agent quality risk (Phase 2-3)
`reverse-engineering-unicorn`	`keysarium`	Competitor CJM+JTBD (Phase 1)

Without optional skills, design-thinking uses built-in fallbacks.

BTO benchmark: L0 Grade A (100%), L2 Opus weighted 7.58/10.

Example: Import Skills from ECC

dz install @dzhechkov/skills-ecc                 # 20 curated ECC skills
dz import-ecc --limit 50                         # import 50 from GitHub
dz import-ecc --local-path /path/to/ECC          # from local clone (fast)
dz import-ecc --select docker-patterns,tdd       # cherry-pick

Example: Security Scan with AgentShield

# In Claude Code: "scan my agent config for security issues"
# → agentshield-scan skill activates (170 rules, 10 categories)
npx ecc-agentshield scan --format sarif           # SARIF for GitHub Code Scanning

Example: 4-Axis Risk Scoring

dz init --target codex --preset meta --enrich
# → agents/openai.yaml includes risk_level per skill
# Axes: base_tool + file_sensitivity + blast_radius + irreversibility

Example: Understand & Develop an Existing Project

# 1. Analyze project → get recommendations
dz pretrain                                     # detects stack, recommends presets
dz recommend "work on this Node.js API"         # suggests skills + toolkits

# 2. Install skills (choose your level)
dz setup --target claude-code --preset meta --memory agentdb  # 15 skills (includes feature-adr)
dz setup --target claude-code --preset qe-engineer             # + 20 QE skills

# Want the full feature-adr toolkit with /feature-adr command + governance?
npx @dzhechkov/skills-feature-adr init                         # adds slash command + rules + shards
# See: https://www.npmjs.com/package/@dzhechkov/skills-feature-adr

# preset = SKILL.md only (auto-activates on matching tasks)
# npx = full toolkit (slash command + governance + rules)

Install Understand-Anything plugin, then in Claude Code:

# 3. Map the codebase
/understand                                      # builds knowledge graph
# → understand-anything-bridge feeds architecture context to all skills

# 4. Develop with full context
"Add a payment module"
# → feature-adr runs with architecture awareness (layers, hot spots, dependencies)
# → see: https://www.npmjs.com/package/@dzhechkov/skills-feature-adr
# → code generation informed by real dependency graph
# → QE review targets tests at high-impact files
# → agentshield-scan checks new configs for security

# 5. Verify impact
"What files are affected by my changes?"
# → blast radius calculation → targeted test generation

Architecture-aware development: every skill knows the codebase structure.

Example: AI-Assisted Reasoning & Self-Improvement

# Auto-select reasoning strategy:
"Compare 3 architectures"      → structured-reasoning: Tree-of-Thought (branches + scoring)
"Debug this test"              → structured-reasoning: Chain-of-Thought (linear trace)
"We've been looping"           → structured-reasoning: Reflection-Suppression (break loop)

# Self-review before delivering:
"Write a migration and verify" → reflection-loop: draft → critique → revise (max 3 rounds)

# Manage long sessions:
"Context is getting long"      → context-window-management: checkpoint + prune + continue

# Learn from success:
"Extract this as a skill"      → skill-crystallizer: trace → reusable SKILL.md

All included in meta preset.

Self-Learning: JSONL vs AgentDB

DZ Harness supports two memory backends for self-learning:

dz setup --target claude-code --preset devops                    # JSONL (default, lightweight)
dz setup --target claude-code --preset devops --memory agentdb   # AgentDB (vector memory)

Capability	JSONL (default)	AgentDB (`--memory agentdb`)
Session tracking	Append-only JSONL log	HNSW vector store (.rvf)
Pattern storage	`dz teach` → patterns.jsonl	`dz teach` → .rvf + agentdb_pattern_store
Search	Keyword (grep)	Semantic (HNSW nearest-neighbor, cosine similarity)
Retrieval	Sequential scan	O(log n) approximate nearest neighbor
Self-learning	Frequency-based	9 RL algorithms + Thompson Sampling bandit
Memory tiers	Flat file	3-tier (working → short-term → long-term)
Reflexion	Reward scores (0-1)	Episodic memory (task + outcome + self-critique)
Causal reasoning	No	Cypher-like graph queries (X caused Y)
Skill composition	Manual (presets)	Bandit-picked skill chains (A→B→C)
Audit trail	No	Cryptographic attestation log
Size	~0 KB	4.6 MB (agentdb)
MCP tools	0	41 tools (pattern, reflexion, causal, skill, hierarchy)
Dependencies	None	agentdb (optional, via npx)

AgentDB self-learning algorithms

When using --memory agentdb, the following algorithms automatically tune search quality:

Thompson Sampling — multi-armed bandit for ranking search results
UCB1 (Upper Confidence Bound) — exploration-exploitation balancing
EXP3 — adversarial bandit for non-stationary environments
Softmax — temperature-based action selection
Epsilon-Greedy — simple exploration with decay
Gradient Bandit — preference-based action selection
Contextual Bandit — context-aware ranking using features
REINFORCE — policy gradient for complex reward landscapes
PPO-lite — proximal policy optimization for stable learning

The bandit automatically selects the best algorithm for your usage pattern — no manual tuning needed.

How to enable AgentDB

# One command — everything is set up:
dz setup --target claude-code --preset devops --memory agentdb

This creates .dz/memory.rvf, registers the agentdb MCP server (41 tools), and configures session hooks. The agent can immediately use agentdb_pattern_store, agentdb_reflexion_recall, etc. — no additional dz init needed.

Command	When to use
`dz setup --memory agentdb`	Recommended — full setup in one step
`dz init --select agentdb-memory`	Lightweight — only the SKILL.md guide (see below)

What does `dz init --select agentdb-memory` actually do?

This is the lightweight path — it installs only the skill documentation, without configuring the backend:

Step 1: Auto-discovers agentdb-memory/ in skills-mcp package
Step 2: Copies to .claude/skills/agentdb-memory/
          ├── SKILL.md              ← instructions for the agent
          ├── schemas/output.json
          ├── scripts/validate-config.json
          └── evals/agentdb-memory.yaml

Step 3: Claude Code auto-discovers the skill from .claude/skills/
Step 4: When agent encounters a matching task, it reads SKILL.md
Step 5: SKILL.md teaches the agent WHICH tools to call and WHEN

What it does NOT do (unlike dz setup --memory agentdb):

Does NOT create .dz/memory.rvf
Does NOT register agentdb MCP server
Does NOT configure session hooks

After dz init --select agentdb-memory, the user must manually add the MCP server:

claude mcp add agentdb -- npx agentdb@latest mcp start

When this is useful:

You already have agentdb installed separately and just want the skill guide
You want to teach the agent about agentdb tools without committing to the full .dz/ infrastructure
You're in a team where agentdb is managed centrally but each developer needs the skill docs

How it works

dz init compiles canonical skills from the agentskills.io standard into the target platform's layout
Writing is additive — existing files are never overwritten without --force
All 5 platform adapters produce byte-identical output (ADR-005)
dz doctor runs 7 health checks (node version, adapters, config, SQLite, skills)
dz migrate detects legacy keysarium/bto installations and recommends migration path

Use Cases

1. Short-term product research (one-off study)

Goal: Quickly research a product idea, competitors, market — get a structured report.

# Option A: via dz CLI
dz init --target claude-code --preset meta
# Then in Claude Code:
#   /explore "Research the market for AI-powered code review tools"
#   /feature-adr "Summarize findings into an ADR"

# Option B: via keysarium (full 7-phase pipeline)
npx @dzhechkov/keysarium init
# Then in Claude Code:
#   /casarium "AI-powered code review tools — market analysis"
#   → Phase 0: Discovery → Phase 1: Exploration → Phase 2: Paranoid Research
#   → Phase 3: Solution Design → Phase 4: Architecture → Phase 5: Presentation

What you get:

meta preset: /explore clarifies the problem → /feature-adr structures findings as ADR decisions
keysarium: full 7-phase pipeline with dream cycles, background workers, and presentation generation

Best for: Quick study (hours), competitive analysis, technology evaluation.

2. Long-term product research (evolving over time)

Goal: Continuously gather data, add new sources, and "recalculate" the product vision as insights accumulate.

# Install keysarium (research pipeline) + evidence-wiki (knowledge base)
npx @dzhechkov/keysarium init
# Copy evidence-wiki plugin into your project:
npx @dzhechkov/evidence-wiki   # or git clone https://github.com/djd1m/evidence-wiki

npm install -g @dzhechkov/harness-cli
dz init --target claude-code --preset meta

Workflow — iterative research cycles with evidence wiki:

Week 1:  /casarium "Product X — initial research"
         → researches/ directory created with findings
         → .keysarium/memory/ stores patterns + reward scores

         /wiki-generate                              ← evidence-wiki
         → Scans researches/, ADRs, docs
         → Generates wiki/concepts/*.md (atomic pages with inline sources)
         → Builds wiki/graph.json (knowledge graph)
         → wiki/INDEX.md links everything

Week 2:  Add new data → /casarium "Product X — update with Q2 metrics"
         → Memory recalls Week 1 patterns (reward-calibrated learning)
         → New findings merged with existing, conflicts resolved

         /wiki-generate --check                      ← re-generates wiki
         → New concepts added, existing updated
         → Every claim verified: triple-pillar protocol requires N independent
           typed sources (ADR + methodology + research)
         → Stale concepts flagged, broken evidence links detected

         /triple-check wiki/concepts/pricing-model.md ← verify specific page
         → Checks that every factual claim has inline source citations
         → Flags unsupported statements

Week N:  /casarium "Product X — pivot analysis after customer feedback"
         → Full history in memory layer + evidence wiki
         → /harvest extracts reusable knowledge patterns
         → /wiki-generate rebuilds the entire knowledge graph
         → Product vision "recalculated" — the wiki IS the living product model

The evidence-wiki advantage:

Without evidence-wiki	With evidence-wiki
Research in markdown files	Atomic concept pages with inline sources
Findings scattered across `researches/`	Interlinked knowledge graph (`graph.json`)
"I think we decided X"	Every claim has a cited source (triple-pillar)
Hard to see what changed	`/wiki-generate --check` diffs the knowledge base
No verification	`/triple-check` enforces evidence discipline

Key features for long-term research:

Evidence wiki (@dzhechkov/evidence-wiki): atomic concept pages where every factual claim carries inline sources; knowledge graph for cross-referencing; triple-pillar protocol (N independent typed sources per claim)
Reward-calibrated memory (@dzhechkov/memory Reflexion): each checkpoint response trains the system — "ок" = excellent (1.0), feedback = good (0.7), rework = needs_work (0.3)
Agent SDK Dreaming: between sessions, patterns are consolidated and distilled
/harvest (knowledge-extractor skill): extracts reusable patterns from completed research into lib/ templates
SQLite + FTS5 backend: scales to 100k+ records with full-text search across all research sessions

Best for: Product strategy over months, continuous market monitoring, evolving product vision with evidence-backed decisions.

3. Product research + working prototype

Goal: Research the product AND build a functional prototype.

Option A: Sequential — research first, then code

# Step 1: Install research + development presets
npx @dzhechkov/keysarium init
# OR:
dz init --target claude-code --preset keysarium

# Step 2: Research phase
#   /casarium "SaaS platform for team retrospectives"
#   → Phase 0-2: Discovery, Exploration, Paranoid Research
#   → Phase 3: Solution Design (with CJM prototype)
#   → Result: researches/<slug>/ with full analysis

# Step 3: Switch to development
dz init --target claude-code --preset feature-adr

# Step 4: Build using research outputs
#   /feature-adr "Build the retrospective platform based on research in researches/<slug>/"
#   → Step 0: Router classifies as L/XL
#   → Step 1-5: Requirements, ADRs, DDD, Architecture (informed by research)
#   → Step 6: Implementation plan
#   → Step 7: Code generation (with /frontend-design for UI)
#   → Step 8-9: QE review + fleet assessment

What you get: Research artifacts in researches/, then code in features/<slug>/ + actual repository changes. Research directly feeds into ADR decisions.

Option B: Parallel — research and code simultaneously with p-replicator

# Install the full product development toolkit
npx @dzhechkov/p-replicator init

# Single pipeline: research → requirements → prototype
#   /replicate "SaaS platform for team retrospectives"
#   → Reverse-engineers similar products (reverse-engineering-unicorn)
#   → Generates SPARC PRD (sparc-prd-mini)
#   → Validates requirements (requirements-validator)
#   → Creates the project structure (pipeline-forge)
#   → Builds the prototype (cc-toolkit-generator-enhanced)
#   → Reviews with brutal honesty (brutal-honesty-review)

What you get: A working prototype generated from research in a single /replicate pipeline run. Faster but less deep than Option A.

Comparison

Aspect	Option A (Sequential)	Option B (p-replicator)
Research depth	Deep (7-phase keysarium)	Moderate (reverse-engineering)
Code quality	High (11-step feature-adr + QE)	Good (pipeline-forge + review)
Time	Days to weeks	Hours to days
Best for	Complex products, regulated domains	MVPs, hackathons, quick validation
Packages	`keysarium` + `feature-adr` preset	`p-replicator`
Research artifacts	`researches/` directory	Embedded in PRD
Code artifacts	`features/<slug>/` + repo changes	Generated project

Tip: For maximum rigor, combine both — use p-replicator for a quick prototype, then run /feature-adr --full-qe-extended on the generated code for production-grade quality engineering.

Status

v0.3.85 — published on npm. Also available as Claude Plugin. Part of DZ Harness Hub.

Claude Plugin

DZ Harness Hub is available as a Claude Code plugin:

# Via marketplace (when published):
claude plugin marketplace add djd1m/dz-harness-hub
claude plugin install dz-harness-hub@dz-harness-hub

# Or test locally:
claude --plugin-dir /path/to/dz-harness-hub

# Generate plugin manifest from current inventory:
dz plugin --version 0.3.85

The .claude-plugin/ directory contains plugin.json + marketplace.json compatible with pi-claude-marketplace and skill-hub.

Skill sources

agentic-qe — 20 QE skills + 55 agents (test generation, coverage, chaos, QCSD swarms)
ECC — 20 curated skills (agent patterns, autonomous loops, docker, git workflows)
AgentShield — Security scanning (170 rules for .claude/ configs)
Understand-Anything — Codebase knowledge graph → architecture context

Platform & infrastructure

AgentDB — Self-learning vector memory (--memory agentdb, 41 MCP tools)
agentskills.io — Open standard for SKILL.md format (adopted by all 5 platforms)
OpenAI Codex — 2nd target platform
OpenCode — 3rd target platform (160K+ stars)
Hermes Agent — 4th target platform
OpenClaude — 5th target platform (28K+ stars)