  • License MIT

The dz CLI — install AI skills for Claude Code, Codex, OpenCode, Hermes. 11 commands, 7 presets, 4 platform adapters.

@dzhechkov/harness-cli

The dz CLI — the main entry point to the DZ Harness Hub. Install AI skills for Claude Code, Codex, OpenCode, Hermes, OpenClaude from a single command.

Install

npm install -g @dzhechkov/harness-cli

# Update to latest version (run from outside any workspace project):
cd /tmp && npm install -g @dzhechkov/harness-cli@latest

Note: If you get EUNSUPPORTEDPROTOCOL workspace:*, you're inside a pnpm/yarn workspace. Run the install from /tmp or ~ instead.
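The workspace check can be scripted before a global update. The following is a minimal sketch, not part of dz: the helper name in_workspace is hypothetical, and it assumes a workspace root is marked by a pnpm-workspace.yaml file or by a package.json containing a "workspaces" key.

```shell
# Hypothetical helper, not part of dz: succeeds when the given directory
# sits inside a pnpm/yarn workspace.
# Assumption: a workspace root contains pnpm-workspace.yaml, or a
# package.json with a "workspaces" key.
in_workspace() {
  # Resolve to an absolute path so the upward walk always terminates.
  dir=$(cd "$1" 2>/dev/null && pwd) || return 1
  while [ -n "$dir" ] && [ "$dir" != "/" ]; do
    if [ -f "$dir/pnpm-workspace.yaml" ]; then
      return 0
    fi
    if [ -f "$dir/package.json" ] && grep -q '"workspaces"' "$dir/package.json"; then
      return 0
    fi
    dir=$(dirname "$dir")
  done
  return 1
}

if in_workspace "$PWD"; then
  echo "inside a workspace: run the install from /tmp or ~"
else
  echo "not in a workspace: npm install -g @dzhechkov/harness-cli@latest is safe here"
fi
```

The check is only a heuristic; if the install still fails with EUNSUPPORTEDPROTOCOL, running it from /tmp as described above remains the reliable fix.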

User Journey — from install to mastery

All 31 commands mapped to a real workflow:

DISCOVER → INSTALL → USE → CREATE → MAINTAIN → SHARE

Phase 1: Discover (what's available?)

npm install -g @dzhechkov/harness-cli    # install the CLI

dz help                                   # see all commands
dz pretrain                                # analyze project files → recommend by tech stack
dz recommend "build API and deploy to K8s" # keyword match → skills + toolkits
dz recommend "work on this project"        # generic? → auto-runs pretrain → recommends by stack
dz stats                                  # 31 packages, 115 skills, 5 targets, 11 presets
dz dashboard                              # visual panel — packages, adapters, skill packs
dz registry                               # browse all 115 skills by category
dz registry search kubernetes             # find specific skills
dz registry --category devops             # filter by domain
dz downloads                              # npm weekly download stats

Phase 2: Install (set up your workspace)

# Full setup with self-learning (recommended):
dz setup --target claude-code --preset devops  # pretrain + hooks + JSONL memory

# With AgentDB vector memory (semantic search + self-learning):
dz setup --target claude-code --preset devops --memory agentdb  # .rvf + 41 MCP tools

# Or just install skills (no learning):
dz init --target claude-code --preset devops   # 27 DevOps skills
dz init --target openclaude --preset web3      # 12 DeFi skills for OpenClaude
dz init --target codex --preset mcp            # 16 MCP skills for Codex

# Or pick individual skills:
dz init --target claude-code --select terraform,kubernetes,docker-compose

# Or install from any npm package:
dz install @dzhechkov/skills-devops            # npm install + copy skills

# Verify everything is correct:
dz verify                                       # structural validation
dz doctor                                       # 7 health checks
dz list                                         # show installed skills
dz info --id terraform                          # detailed info about a skill

Phase 3: Use (work with your agent)

# Now use Claude Code / Codex / OpenCode / Hermes normally.
# Skills are auto-discovered from the platform's skills directory.
# Example in Claude Code:
#   "Review this PR" → pr-review skill activates
#   "Design an API" → api-design skill activates
#   "Fix this CI" → ci-fix skill activates

Phase 4: Create (build your own skills)

# Scaffold a new skill:
dz create-skill --name my-skill --description "What it does" --tier 2

# With BTO-compatible eval templates:
dz create-skill --name my-skill --bto

# Benchmark your skill (aim for Grade A):
dz benchmark .claude/skills/my-skill           # single skill — 19 L0 checks
dz benchmark packages/@dzhechkov/skills-devops --all   # batch all
dz benchmark skill-a --compare skill-b          # A/B compare

# Find skills to canonicalize from the ecosystem:
dz scout                                        # scan 9 sources (GitHub, npm+plugins, HN, ...)
dz scout --deep                                 # deep analysis with SKILL.md parsing
dz auto-canonicalize --source github.com/user/repo --pack packages/@dzhechkov/skills-devops

Phase 5: Maintain (keep skills fresh)

# Check for upstream changes (canonicalized skills):
dz sync-upstream --list                                 # which packages have external sources?
dz sync-upstream --all                                  # check all against upstream
dz sync-upstream --package packages/@dzhechkov/skills-devops  # check one

# Check installed skills vs canonical:
dz upgrade                                      # shows which skills need update
dz upgrade --target openclaude                  # check specific platform

# Sync canonical to legacy layout:
dz sync                                         # canonical → project skills
dz migrate                                      # detect legacy installations

# Orchestrate dynamic workflows:
dz workflow --task coverage-lift                 # parallel coverage improvement
dz workflow --task security-audit               # adversarial security scan

# Cross-host state sync:
dz roam --apply                                 # sync agent state across machines

Phase 6: Share (publish to the world)

# Publish updated packages to npm:
dz publish --dry-run                            # preview
dz publish --filter skills-devops               # publish specific package
dz publish                                      # publish all changed packages

Three Ways to Install Skills

| | Individual Skill | Preset | npx Package |
|---|---|---|---|
| What | 1 SKILL.md file | Curated list of skill names | Full toolkit with orchestration |
| Contains | Instructions for 1 task | N skill references | Skills + commands + rules + shards + agents + memory |
| Pipeline | No | No | Yes (phases, checkpoints, governance) |
| Self-learning | No | dz setup adds it | Built-in |
| Install | dz init --select X | dz setup --preset X | npx @dzhechkov/X init |
| Example | terraform | devops (27 skills) | keysarium (7-phase research) |

# One skill:
dz init --target claude-code --select design-thinking

# Curated set by topic (recommended):
dz setup --target claude-code --preset meta          # 15 development skills + self-learning

# Full toolkit with orchestrated pipeline:
npx @dzhechkov/keysarium init                        # 7-phase research + commands + memory

When to use which:

  • Need 1 specific capability → --select
  • Need a themed set that works together → --preset
  • Need a full pipeline with commands and governance → npx

Available Presets (11)

| Preset | Skills | Description |
|--------|--------|-------------|
| meta | 15 | Development process (explore, goap-research, problem-solver, design-thinking, feature-adr, knowledge-extractor, understand-anything-bridge, agentshield-scan, adversarial-verifier) |
| qe-engineer | 20 | Quality engineering (test-gen, coverage, chaos, defect, ...) |
| bto | 1 | Build-Benchmark-Test-Optimize pipeline |
| health | 8 | Medical AI (diagnostics, drugs, labs, clinical decisions) |
| keysarium | 9 | Full research toolkit (feature-adr, presentation, reverse-eng) |
| p-replicator | 10 | AI product development (/replicate, SPARC PRD, pipeline-forge) |
| feature-adr | 5 | Feature pipeline (feature-adr, explore, frontend-design) |
| devops | 27 | DevOps skills (terraform, kubernetes, c4-architecture, risk-assessment, ...) |
| web3 | 12 | Web3/DeFi (quicknode, zerion, symbiosis, bankr, veil, neynar, ...) |
| mcp | 16 | MCP servers (agentdb, brave-search, gmail, gitlab, comfyui, notion, ...) |
| academic | 5 | Thesis defense (review, questions, doc-check, live defense + answer eval) |

Standalone Packages (install via npx, no dz CLI needed)

| Package | Install | What it does |
|---------|---------|--------------|
| @dzhechkov/keysarium | npx @dzhechkov/keysarium init | Full 7-phase research toolkit |
| @dzhechkov/design-thinking | npx @dzhechkov/design-thinking init | d.school 6-phase Design Thinking (8 skills) |
| @dzhechkov/p-replicator | npx @dzhechkov/p-replicator init | AI product development (/replicate pipeline) |
| @dzhechkov/health-advisor | npx @dzhechkov/health-advisor init | Medical AI (25 skills) |
| @dzhechkov/skills-bto | npx @dzhechkov/skills-bto init | BTO benchmarking (Build-Test-Optimize) |
| @dzhechkov/skills-feature-adr | npx @dzhechkov/skills-feature-adr init | 11-step feature pipeline |
| @dzhechkov/skills-edu-site | npx @dzhechkov/skills-edu-site init | Gamified edu site generator |
| @dzhechkov/skills-transcript-site | npx @dzhechkov/skills-transcript-site init | Transcript → interactive site |
| @dzhechkov/skills-analyst-manual | npx @dzhechkov/skills-analyst-manual init | 3-phase analyst composite |

Difference: dz init --preset installs individual skills from .claude/skills/ source into a target platform tree. Standalone npx packages have their own CLI and install a complete toolkit with commands, rules, shards, and agents — a richer but self-contained experience.

A skill and its npx toolkit are not duplicates — they're a graduation. Several skills (e.g. feature-adr, design-thinking) exist BOTH as a skill inside a dz preset AND as a standalone npx package. The preset's SKILL.md is fully functional on its own (the whole methodology — modules + references — travels with it, and it auto-activates by description), and it's the only way to compile that capability to the non-Claude platforms (Codex/OpenCode/Hermes/OpenClaude) via dz. The npx package adds project-level runtime governance around the same skill: a slash command, governance rules, a context shard, and (for feature-adr) reward-learning + /harvest. So: pick the skill/preset for a working capability across platforms; pick the npx toolkit when you want it as a governed, command-driven fixture of one project.

All Commands (31)

dz setup             --target <name> [--preset <name>] [--memory agentdb] [--no-hooks] [--install-driver] [--force]
dz init              --target <name> [--preset <name>] [--select id,id,...] [--force]
dz install           <npm-pkg> [--target <name>] [--project <dir>]
dz teach             "<pattern>" [--reward <0-1>] [--domain <name>]
dz pretrain          [--project <dir>]
dz recommend         "<task description>"
dz compose           <preset1+preset2+...> [--target <name>]
dz diff              <skill-dir>
dz upgrade           [--target <name>] [--project <dir>]
dz verify            [--skills-dir <dir>] [--target <name>]
dz sync              [--canonical <dir>] [--project <dir>] [--dry-run] [--force]
dz update            (alias for sync)
dz list              [--skills-dir <dir>]
dz info              --id <skill-id> [--skills-dir <dir>]
dz create-skill      --name <id> [--description <text>] [--tier 1|2|3] [--bto]
dz registry          [search <query>] [--category <cat>]
dz benchmark         <skill-dir> [--compare <dir>] [--all]
dz publish           [--filter <name>] [--dry-run] [--bump-only]
dz auto-canonicalize --source <github-url> --pack <skills-pack>
dz sync-upstream     [--package <dir>] [--list] [--all]
dz scout             [--topics <list>] [--since <date>] [--deep]
dz workflow          --task <name> [--dry-run]
dz plugin            [--version <ver>]
dz downloads
dz migrate           [--project <dir>]
dz stats
dz dashboard
dz doctor            [--project <dir>]
dz roam              [--apply] [--slug <slug>]
dz import-ecc        [--local-path <dir>] [--select id,id,...] [--limit N] [--output <dir>] [--force]
dz help

Targets (5 platforms)

All 5 platforms natively support the agentskills.io SKILL.md format:

| Target | Skills directory | Native SKILL.md? |
|--------|------------------|------------------|
| claude-code | .claude/skills/ | Yes |
| codex | .agents/skills/ | Yes (docs) |
| opencode | .opencode/skills/ | Yes (also scans .claude/skills/) |
| hermes | .hermes/skills/ | Yes |
| openclaude | .openclaude/skills/ | Yes |

Same SKILL.md file, different directory — no format conversion needed.
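Because only the directory differs per target, the mapping reduces to a lookup. A minimal sketch follows; the function name skills_dir is hypothetical and not a dz command, while the paths are taken from the table above.

```shell
# Hypothetical lookup, not part of dz: print the skills directory
# for a given --target value (paths copied from the targets table).
skills_dir() {
  case "$1" in
    claude-code) echo ".claude/skills/" ;;
    codex)       echo ".agents/skills/" ;;
    opencode)    echo ".opencode/skills/" ;;
    hermes)      echo ".hermes/skills/" ;;
    openclaude)  echo ".openclaude/skills/" ;;
    *) echo "unknown target: $1" >&2; return 1 ;;
  esac
}

# Show where the same SKILL.md would land on each platform.
for target in claude-code codex opencode hermes openclaude; do
  printf '%-12s %s<skill-id>/SKILL.md\n' "$target" "$(skills_dir "$target")"
done
```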

Optional platform enrichment (skills work without these):

| Platform | Optional extra | What it adds |
|----------|----------------|--------------|
| Codex | agents/openai.yaml | UI metadata (icons, display_name, MCP deps) |
| OpenCode | opencode.json + .opencode/agents/*.md | Config, custom agents |
| Hermes | cli-config.yaml | Agent config, persona, memory |

Workflows (Opus 4.8+ dynamic workflows)

dz workflow --task coverage-lift     # parallel coverage improvement
dz workflow --task mutation-kill     # kill surviving mutants
dz workflow --task canonicalize      # canonicalize new packages
dz workflow --task security-audit    # adversarial security scan

Scout (ecosystem intelligence)

dz scout                              # quick scan — radar mode
dz scout --deep                       # deep analysis — AI analyst mode
dz scout --topics mcp-server,ai-agent # custom topics
dz scout --since 2026-05-01           # only recent repos

Radar mode (dz scout) scans 9 sources in parallel (GitHub + npm + HN + MCP Registry + Glama + OSSInsight + Smithery + Semantic Scholar + arXiv):

  1. Detects skill format — SKILL.md, plugin.json, .claude/skills/, .claude-plugin/, MCP manifests
  2. Scores relevance — format (40%) + stars (30%) + recency (20%) + novelty (10%)
  3. Compares against our 31 packages — finds skills we don't have
  4. Recommends — integrate (score ≥70) / monitor (40-69 + ≥50 stars) / skip
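The weighting and thresholds in steps 2 and 4 can be sketched as integer arithmetic. This is an illustration under stated assumptions, not the @dzhechkov/scout implementation: the function names are hypothetical, and each sub-score (format, stars, recency, novelty) is assumed to be already normalized to an integer from 0 to 100. How scout normalizes stars or recency is not documented here.

```shell
# Hypothetical sketch of the radar scoring; sub-scores are assumed
# to be integers in the range 0-100.
radar_score() {
  # $1 format, $2 stars (normalized), $3 recency, $4 novelty
  # Weights: 40% + 30% + 20% + 10%. Integer division truncates.
  echo $(( ($1 * 40 + $2 * 30 + $3 * 20 + $4 * 10) / 100 ))
}

radar_verdict() {
  # $1 total score (0-100), $2 raw GitHub star count
  if [ "$1" -ge 70 ]; then
    echo integrate
  elif [ "$1" -ge 40 ] && [ "$2" -ge 50 ]; then
    echo monitor
  else
    echo skip
  fi
}

score=$(radar_score 100 80 50 20)
echo "$score $(radar_verdict "$score" 500)"   # prints "76 integrate"
```

A repo scoring 47 with 120 stars would land in monitor, while the same score with 10 stars would be skipped, because the monitor band also requires at least 50 stars.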

Deep analyst mode (dz scout --deep) goes further for top-scored repos:

  1. Downloads SKILL.md from each repo, parses frontmatter + body
  2. Finds closest match in our inventory by keyword overlap
  3. Explains the delta — what the found skill adds that ours doesn't
  4. Recommends integration path:
    • canonicalize — high-signal novel skill → new @dzhechkov/skills-* pack
    • merge — similar to existing skill → add unique features to ours
    • new-preset — novel skill → add to preset or create new pack
    • skip — already in our inventory
  5. Gap analysis — identifies trending categories across the ecosystem that our harness lacks

Example deep analysis output:

## 🔬 Deep Analysis

### cool/agent-toolkit (★500)
2/3 skills are novel

| Skill | Description | Closest match | Integration | Rationale |
|-------|------------|---------------|-------------|-----------|
| code-review | Automated OWASP-focused review | brutal-honesty-review | **merge** | Similar to ours — merge OWASP checklist |
| deploy-check | Pre-deploy validation gates | — | **canonicalize** | High-signal novel skill (500 stars) |

## 📊 Harness Gap Analysis

| Category | Frequency | Recommendation |
|----------|-----------|---------------|
| deploy-automation | 12 repos | Create @dzhechkov/skills-devops — high demand |
| data-pipeline | 5 repos | Monitor — emerging trend |

Powered by @dzhechkov/scout.

BTO integration (create-skill --bto)

# Scaffold a new skill with BTO-compatible 3-layer evaluation:
dz create-skill --name my-skill --bto

# What you get:
#   evals/my-skill.yaml       — BTO eval with L0/L1/L2 layers
#   references/judge-rubrics.md — scoring rubrics for 3-judge panel

The --bto flag generates eval templates compatible with /bto-test:

| Layer | What | Gate |
|-------|------|------|
| L0 | Deterministic checks (U1-U5 universal + S1-S14 skill-specific) | Pass rate >= 80% |
| L1 | Single LLM judge (Haiku), 5 dimensions: Clarity, Completeness, Actionability, Quality, Anti-patterns | Average >= 7.0 |
| L2 | 3-judge panel (Sonnet): Expert (0.40), Critic (0.30), Auditor (0.30); 5 dimensions: Methodology, Depth, Correctness, Usability, Robustness | Weighted avg >= 7.0 |

After scaffolding, fill in the SKILL.md protocol and run /bto-test .claude/skills/my-skill to evaluate.
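To make the L2 gate concrete, here is a small worked sketch of the weighted average. It is not the /bto-test code: the function names are hypothetical, and each judge's input is assumed to be a single 0-10 score (for example, that judge's mean across the five dimensions). The result is rounded to two decimals before it is compared with 7.0.

```shell
# Hypothetical sketch of the L2 gate: Expert 0.40, Critic 0.30, Auditor 0.30.
l2_weighted() {
  # $1 expert, $2 critic, $3 auditor (each assumed to be a 0-10 score)
  awk -v e="$1" -v c="$2" -v a="$3" \
    'BEGIN { printf "%.2f\n", e * 0.40 + c * 0.30 + a * 0.30 }'
}

l2_gate() {
  # Succeeds when the rounded weighted average is >= 7.0
  awk -v s="$(l2_weighted "$1" "$2" "$3")" 'BEGIN { exit !(s >= 7.0) }'
}

l2_weighted 8 7 6                        # prints 7.10
if l2_gate 8 7 6; then echo "L2 gate passed"; fi
```

With scores of 6, 7 and 7 the weighted average is 6.60, so the gate fails even though two of the three judges scored 7: the Expert's 0.40 weight dominates.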

dz install — install skills from any npm package

# Install skills from any npm package directly
dz install @dzhechkov/skills-devops
dz install @dzhechkov/skills-web3 --target openclaude
dz install @lythos/skill-curator --target claude-code

Runs npm install, discovers SKILL.md files in the package, copies them to the target platform directory. Works with any agentskills.io-compatible npm package.
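The discover-and-copy step can be approximated with find and cp. This is a sketch, not the dz implementation: copy_skills is a hypothetical name, each directory containing a SKILL.md is treated as one skill, and skipping skills that already exist is an assumption borrowed from the additive-write rule described for dz init.

```shell
# Hypothetical sketch of "discover SKILL.md files, copy them to the target".
# Usage: copy_skills <package-dir> <skills-dir>
copy_skills() {
  src=$1
  dest=$2
  mkdir -p "$dest"
  find "$src" -type f -name SKILL.md | while IFS= read -r skill_md; do
    skill_dir=$(dirname "$skill_md")     # the folder that holds SKILL.md
    name=$(basename "$skill_dir")        # used as the skill id
    if [ -e "$dest/$name" ]; then
      echo "skipped $name (already installed)"
    else
      cp -R "$skill_dir" "$dest/$name"
      echo "installed $name"
    fi
  done
}

# Example call (paths are illustrative):
# copy_skills node_modules/@dzhechkov/skills-devops .claude/skills
```

Copying the whole skill folder rather than only SKILL.md keeps any schemas, scripts, and evals that travel with the skill.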

dz sync-upstream — check for upstream updates

dz sync-upstream --list                                    # show packages with external sources
dz sync-upstream --all                                     # check ALL packages against upstream
dz sync-upstream --package packages/@dzhechkov/skills-devops  # check one package

Discovers all skill packs with sources.json, fetches SKILL.md from origin repos, reports which skills have upstream changes.

dz upgrade — check installed skills for updates

dz upgrade                           # check .claude/skills/ against canonical
dz upgrade --target openclaude       # check .openclaude/skills/

Compares installed skills with canonical source, reports which need dz init --force to update.

dz downloads — npm weekly download stats

dz downloads     # fetch weekly downloads for all 30 packages

dz benchmark — L0 quality gate

dz benchmark packages/@dzhechkov/skills-devops/terraform     # single skill
dz benchmark packages/@dzhechkov/skills-devops --all          # batch all
dz benchmark skill-a --compare skill-b                        # A/B compare

19 deterministic checks (U1-U5 universal + S1-S14 skill-specific). Grade A = 95%+. For L1/L2 LLM judges, use /bto-test inside Claude Code.
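A quick worked calculation shows what the 95% threshold means for 19 checks. The sketch assumes every check counts equally, which the text above does not confirm; the function names are hypothetical. The 80% figure is the L0 gate from the BTO layer table.

```shell
# Assumption: all 19 checks carry equal weight.
pass_percent() {
  # $1 passed, $2 total; integer percent, truncated
  echo $(( $1 * 100 / $2 ))
}

is_grade_a() {
  # Grade A = 95%+; compared as integers to avoid rounding errors
  [ $(( $1 * 100 )) -ge $(( $2 * 95 )) ]
}

meets_l0_gate() {
  # L0 gate: pass rate >= 80%
  [ $(( $1 * 100 )) -ge $(( $2 * 80 )) ]
}

for passed in 19 18 16 15; do
  grade="below A"
  if is_grade_a "$passed" 19; then grade="Grade A"; fi
  gate="fails the 80% gate"
  if meets_l0_gate "$passed" 19; then gate="passes the 80% gate"; fi
  echo "$passed/19 = $(pass_percent "$passed" 19)%: $grade, $gate"
done
```

Under the equal-weight assumption, 18 of 19 is about 94.7%, so a single failed check already drops a skill below Grade A, and the 80% gate needs at least 16 passing checks.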

dz publish — automated npm publish

dz publish --dry-run                          # preview what would publish
dz publish --filter skills-devops             # publish specific package
dz publish --filter skills-devops --bump-only # bump version only, no publish

dz auto-canonicalize — discover skills in GitHub repos

dz auto-canonicalize --source github.com/user/repo --pack packages/@dzhechkov/skills-devops

Scans a GitHub repo for SKILL.md files, generates dz create-skill commands.

dz registry — searchable skill index

dz registry                    # visual panel: 115 skills in 6 categories
dz registry search security    # fuzzy search
dz registry --category mcp     # filter by category

dz stats + dz dashboard

dz stats        # Quick metrics: packages, skills, targets, presets
dz dashboard    # Visual panel with all packages, adapters, skill packs

Example: Thesis Defense Preparation (Academic Preset)

# Install with AgentDB (remembers patterns across students):
dz setup --target claude-code --preset academic --memory agentdb

# Or lightweight:
# dz init --target claude-code --preset academic

Prepare: Create a folder per student with thesis.pdf + review.pdf + external-review.pdf + antiplagiat.pdf.

Pre-defense (open Claude Code in student folder):

"Check document package completeness"     → document-checker
"Analyze this thesis"                     → dissertation-review (format, criteria, grade)
"Generate 6 defense questions"            → question-generator (basic → critical, page refs)

During defense (feed live transcript via Whisper + VB-Cable):

"Analyze this defense transcript"         → defense-evaluator (structure, coverage, delivery)
"Evaluate the student's answers"          → answer-assessor (completeness, depth, reviewer alignment)

| When | Skill | What it does |
|------|-------|--------------|
| Before | document-checker | Package completeness: thesis, reviews, antiplagiat |
| Before | dissertation-review | State Examination Commission (ГЭК) criteria, research/project format, grade 1-10, team project check |
| Before | question-generator | 4-6 questions with page refs and expected keywords |
| During | defense-evaluator | Live transcript → structure, coverage, delivery quality |
| During | answer-assessor | Q&A evaluation → completeness, depth, reviewer remarks |

Key features: Grade corridor, per-criterion 1-10 scoring, TO BE vs data detection, LTV/CAC > 10 warning, reviewer divergence, raise/lower conditions, compact mode (1-page summary report; Russian: "компактная справка"), summary table across all students. With AgentDB, patterns persist.

Skills contain only evaluation criteria and methodology — no student data.

Batch mode: S3 archive → agent swarm

# Download and extract: each student = subfolder with .zip
curl -o students.zip "https://s3.example.com/bucket/students.zip"
mkdir -p students && cd students && 7z x ../students.zip
# Extract each archive into a folder named after it; -o sets the output directory (no space after -o)
for f in *.zip; do 7z x "$f" -o"${f%.zip}"; done

Then in Claude Code:

"For each student folder: run document-checker → dissertation-review → question-generator.
 Save справка.md per student with clickable inline links to pages (стр. 45, разд. 2.3)
 and external sources ([JTBD](https://hbr.org/...)). Run all students in parallel."

With AgentDB, patterns persist across students — grading calibration improves with each analysis.


Example: Product Discovery with Design Thinking

# With self-learning (recommended — remembers HADI patterns, JTBD insights across sessions):
dz setup --target claude-code --preset meta
dz setup --target claude-code --preset meta --memory agentdb  # + semantic search

# Or without self-learning:
dz init --target claude-code --preset meta
# Or individually:
dz init --target claude-code --select design-thinking

Then in Claude Code:

"Design a mobile app for booking coworking spaces"
→ design-thinking skill activates
→ 6-phase protocol runs with complexity tier auto-selection

6-Phase Protocol

Phase 1: EMPATHIZE  → STOP gate: request user interview data + goap-research for market data
Phase 2: DEFINE     → JTBD Canvas + CJM AS IS + Ishikawa root cause analysis
Phase 3: IDEATE     → HADI hypotheses + Lean Canvas / Osterwalder BMC + GTM + Unit Economics
Phase 4: PROTOTYPE  → MVP (fidelity spectrum) + CJM/VSM TO BE (labeled as hypotheses)
Phase 5: TEST       → STOP gate: request usability test data + risk analysis + HADI validation
Phase 6: VALIDATE   → Pilot with variance analysis: projected vs actual → Scale/Iterate/Pivot/Kill

Complexity Tiers (auto-selected)

| Tier | When | Phases | Integrations |
|------|------|--------|--------------|
| S | Quick user insight | 1→2→5 | explore + goap-research |
| M | New feature | 1→2→3→4→5 | + frontend-design + six-thinking-hats |
| L | New product | 1→2→3→4→5→6 | + qcsd-swarm + reverse-engineering-unicorn |
| XL | Platform / ecosystem | All | All optional integrations (aqe init recommended) |

Key Safeguards

  • Never fabricates data — STOP gates pause for real interview/survey/test data
  • TO BE ≠ data — projections labeled as hypotheses, validated via pilot (Phase 6)
  • LTV/CAC > 10 flagged as suspicious (Skok 2013)
  • Loop-back protocol — Phase 5 can invalidate Phase 2 and return upstream
  • 22 methodologies with academic validation tiers (Strong/Moderate/Practitioner/Weak)
  • 12 validation rules (DT-001 through DT-012) enforce quality per tier
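The LTV/CAC safeguard is a plain ratio check. A minimal sketch follows; the function name ltv_cac_check is hypothetical and the skill's actual wording or output format is not shown here. Only the threshold of 10 comes from the list above.

```shell
# Hypothetical sketch: flag unit economics where LTV/CAC exceeds 10.
ltv_cac_check() {
  # $1 lifetime value, $2 customer acquisition cost (same currency)
  awk -v ltv="$1" -v cac="$2" 'BEGIN {
    if (cac <= 0) { print "invalid: CAC must be positive"; exit 1 }
    ratio = ltv / cac
    printf "%.1f %s\n", ratio, (ratio > 10 ? "suspicious" : "ok")
  }'
}

ltv_cac_check 1200 100   # prints "12.0 suspicious"
ltv_cac_check 900 300    # prints "3.0 ok"
```

A ratio of exactly 10 is not flagged, since the rule is stated as strictly greater than 10.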

What's included vs what's optional

Core DT — the meta preset includes all required dependencies (15 skills):

dz setup --target claude-code --preset meta
# → explore, goap-research-ed25519, problem-solver-enhanced,
#   design-thinking, feature-adr, knowledge-extractor,
#   understand-anything-bridge, ... (15 total)

Full DT — for ALL optional integrations, install agentic-qe:

npm install -g agentic-qe && aqe init --auto
# → 94 QE skills + 55 agents in .claude/skills/ and .claude/agents/
# → six-thinking-hats, qcsd-ideation-swarm, frontend-design, brutal-honesty-review

Or cherry-pick: dz compose meta+keysarium for competitive analysis.

| Optional Skill | Source | What it adds |
|----------------|--------|--------------|
| frontend-design | aqe init / keysarium | HTML/React prototypes (Phase 4) |
| six-thinking-hats | aqe init | Team ideation (Phase 3) |
| qcsd-ideation-swarm | aqe init | 9-agent quality risk (Phase 2-3) |
| reverse-engineering-unicorn | keysarium | Competitor CJM+JTBD (Phase 1) |

Without optional skills, design-thinking uses built-in fallbacks.

BTO benchmark: L0 Grade A (100%), L2 Opus weighted 7.58/10.


Example: Import Skills from ECC

dz install @dzhechkov/skills-ecc                 # 20 curated ECC skills
dz import-ecc --limit 50                         # import 50 from GitHub
dz import-ecc --local-path /path/to/ECC          # from local clone (fast)
dz import-ecc --select docker-patterns,tdd       # cherry-pick

Example: Security Scan with AgentShield

# In Claude Code: "scan my agent config for security issues"
# → agentshield-scan skill activates (170 rules, 10 categories)
npx ecc-agentshield scan --format sarif           # SARIF for GitHub Code Scanning

Example: 4-Axis Risk Scoring

dz init --target codex --preset meta --enrich
# → agents/openai.yaml includes risk_level per skill
# Axes: base_tool + file_sensitivity + blast_radius + irreversibility

Example: Understand & Develop an Existing Project

# 1. Analyze project → get recommendations
dz pretrain                                     # detects stack, recommends presets
dz recommend "work on this Node.js API"         # suggests skills + toolkits

# 2. Install skills (choose your level)
dz setup --target claude-code --preset meta --memory agentdb  # 15 skills (includes feature-adr)
dz setup --target claude-code --preset qe-engineer             # + 20 QE skills

# Want the full feature-adr toolkit with /feature-adr command + governance?
npx @dzhechkov/skills-feature-adr init                         # adds slash command + rules + shards
# See: https://www.npmjs.com/package/@dzhechkov/skills-feature-adr

# preset = SKILL.md only (auto-activates on matching tasks)
# npx = full toolkit (slash command + governance + rules)

Install the Understand-Anything plugin, then in Claude Code:

# 3. Map the codebase
/understand                                      # builds knowledge graph
# → understand-anything-bridge feeds architecture context to all skills

# 4. Develop with full context
"Add a payment module"
# → feature-adr runs with architecture awareness (layers, hot spots, dependencies)
# → see: https://www.npmjs.com/package/@dzhechkov/skills-feature-adr
# → code generation informed by real dependency graph
# → QE review targets tests at high-impact files
# → agentshield-scan checks new configs for security

# 5. Verify impact
"What files are affected by my changes?"
# → blast radius calculation → targeted test generation

Architecture-aware development: every skill knows the codebase structure.


Example: AI-Assisted Reasoning & Self-Improvement

# Auto-select reasoning strategy:
"Compare 3 architectures"      → structured-reasoning: Tree-of-Thought (branches + scoring)
"Debug this test"              → structured-reasoning: Chain-of-Thought (linear trace)
"We've been looping"           → structured-reasoning: Reflection-Suppression (break loop)

# Self-review before delivering:
"Write a migration and verify" → reflection-loop: draft → critique → revise (max 3 rounds)

# Manage long sessions:
"Context is getting long"      → context-window-management: checkpoint + prune + continue

# Learn from success:
"Extract this as a skill"      → skill-crystallizer: trace → reusable SKILL.md

All of these are included in the meta preset.


Self-Learning: JSONL vs AgentDB

DZ Harness supports two memory backends for self-learning:

dz setup --target claude-code --preset devops                    # JSONL (default, lightweight)
dz setup --target claude-code --preset devops --memory agentdb   # AgentDB (vector memory)

| Capability | JSONL (default) | AgentDB (--memory agentdb) |
|------------|-----------------|----------------------------|
| Session tracking | Append-only JSONL log | HNSW vector store (.rvf) |
| Pattern storage | dz teach → patterns.jsonl | dz teach → .rvf + agentdb_pattern_store |
| Search | Keyword (grep) | Semantic (HNSW nearest-neighbor, cosine similarity) |
| Retrieval | Sequential scan | O(log n) approximate nearest neighbor |
| Self-learning | Frequency-based | 9 RL algorithms + Thompson Sampling bandit |
| Memory tiers | Flat file | 3-tier (working → short-term → long-term) |
| Reflexion | Reward scores (0-1) | Episodic memory (task + outcome + self-critique) |
| Causal reasoning | No | Cypher-like graph queries (X caused Y) |
| Skill composition | Manual (presets) | Bandit-picked skill chains (A→B→C) |
| Audit trail | No | Cryptographic attestation log |
| Size | ~0 KB | 4.6 MB (agentdb) |
| MCP tools | 0 | 41 tools (pattern, reflexion, causal, skill, hierarchy) |
| Dependencies | None | agentdb (optional, via npx) |
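To illustrate the JSONL side of this comparison, here is a sketch of an append-only log with keyword search. The record shape is hypothetical (the real patterns.jsonl schema is not documented here); the three fields simply mirror the arguments of dz teach. The teach and recall functions are illustrative names, not dz internals.

```shell
# Hypothetical record shape; the actual patterns.jsonl schema may differ.
store=$(mktemp)

teach() {
  # $1 pattern (assumed to contain no double quotes or backslashes),
  # $2 reward between 0 and 1, $3 domain
  printf '{"pattern":"%s","reward":%s,"domain":"%s"}\n' "$1" "$2" "$3" >> "$store"
}

recall() {
  # Keyword search: case-insensitive fixed-string match, a sequential scan
  grep -i -F -- "$1" "$store"
}

teach "pin terraform provider versions" 0.9 devops
teach "run helm lint before deploy" 0.7 devops

recall terraform
```

A keyword search finds the first record for "terraform" but returns nothing for a related term such as "IaC", which is the gap that semantic search in the AgentDB column is meant to close.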

AgentDB self-learning algorithms

When using --memory agentdb, the following algorithms automatically tune search quality:

  1. Thompson Sampling — multi-armed bandit for ranking search results
  2. UCB1 (Upper Confidence Bound) — exploration-exploitation balancing
  3. EXP3 — adversarial bandit for non-stationary environments
  4. Softmax — temperature-based action selection
  5. Epsilon-Greedy — simple exploration with decay
  6. Gradient Bandit — preference-based action selection
  7. Contextual Bandit — context-aware ranking using features
  8. REINFORCE — policy gradient for complex reward landscapes
  9. PPO-lite — proximal policy optimization for stable learning

The bandit automatically selects the best algorithm for your usage pattern — no manual tuning needed.
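For readers unfamiliar with bandit algorithms, the following sketch shows textbook UCB1 selection (item 2 in the list). It is a generic illustration, not AgentDB's code: the function name and the input format (one line per arm with its pull count and total reward) are assumptions made for this example.

```shell
# Textbook UCB1, written as an illustration only.
# stdin: one line per arm: "<name> <pulls> <total_reward>"
ucb1_pick() {
  awk '
    { name[NR] = $1; n[NR] = $2; r[NR] = $3; total += $2 }
    END {
      best = ""; best_score = -1
      for (i = 1; i <= NR; i++) {
        # An arm that has never been tried is always explored first.
        if (n[i] == 0) { print name[i]; exit }
        # Mean reward plus an exploration bonus that shrinks with more pulls.
        score = r[i] / n[i] + sqrt(2 * log(total) / n[i])
        if (score > best_score) { best_score = score; best = name[i] }
      }
      print best
    }'
}

printf 'a 10 8\nb 10 5\nc 2 1\n' | ucb1_pick   # prints "c"
```

Arm c wins despite a lower mean reward than a (0.5 vs 0.8) because it has only been tried twice, so its exploration bonus is much larger.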

How to enable AgentDB

# One command — everything is set up:
dz setup --target claude-code --preset devops --memory agentdb

This creates .dz/memory.rvf, registers the agentdb MCP server (41 tools), and configures session hooks. The agent can immediately use agentdb_pattern_store, agentdb_reflexion_recall, etc. — no additional dz init needed.

| Command | When to use |
|---------|-------------|
| dz setup --memory agentdb | Recommended: full setup in one step |
| dz init --select agentdb-memory | Lightweight: only the SKILL.md guide (see below) |

What does dz init --select agentdb-memory actually do?

This is the lightweight path — it installs only the skill documentation, without configuring the backend:

Step 1: Auto-discovers agentdb-memory/ in skills-mcp package
Step 2: Copies to .claude/skills/agentdb-memory/
          ├── SKILL.md              ← instructions for the agent
          ├── schemas/output.json
          ├── scripts/validate-config.json
          └── evals/agentdb-memory.yaml

Step 3: Claude Code auto-discovers the skill from .claude/skills/
Step 4: When agent encounters a matching task, it reads SKILL.md
Step 5: SKILL.md teaches the agent WHICH tools to call and WHEN

What it does NOT do (unlike dz setup --memory agentdb):

  • Does NOT create .dz/memory.rvf
  • Does NOT register agentdb MCP server
  • Does NOT configure session hooks

After dz init --select agentdb-memory, the user must manually add the MCP server:

claude mcp add agentdb -- npx agentdb@latest mcp start

When this is useful:

  • You already have agentdb installed separately and just want the skill guide
  • You want to teach the agent about agentdb tools without committing to the full .dz/ infrastructure
  • You're in a team where agentdb is managed centrally but each developer needs the skill docs

How it works

  • dz init compiles canonical skills from the agentskills.io standard into the target platform's layout
  • Writing is additive — existing files are never overwritten without --force
  • All 5 platform adapters produce byte-identical output (ADR-005)
  • dz doctor runs 7 health checks (node version, adapters, config, SQLite, skills)
  • dz migrate detects legacy keysarium/bto installations and recommends migration path

Use Cases

1. Short-term product research (one-off study)

Goal: Quickly research a product idea, competitors, market — get a structured report.

# Option A: via dz CLI
dz init --target claude-code --preset meta
# Then in Claude Code:
#   /explore "Research the market for AI-powered code review tools"
#   /feature-adr "Summarize findings into an ADR"

# Option B: via keysarium (full 7-phase pipeline)
npx @dzhechkov/keysarium init
# Then in Claude Code:
#   /casarium "AI-powered code review tools — market analysis"
#   → Phase 0: Discovery → Phase 1: Exploration → Phase 2: Paranoid Research
#   → Phase 3: Solution Design → Phase 4: Architecture → Phase 5: Presentation

What you get:

  • meta preset: /explore clarifies the problem → /feature-adr structures findings as ADR decisions
  • keysarium: full 7-phase pipeline with dream cycles, background workers, and presentation generation

Best for: Quick study (hours), competitive analysis, technology evaluation.


2. Long-term product research (evolving over time)

Goal: Continuously gather data, add new sources, and "recalculate" the product vision as insights accumulate.

# Install keysarium (research pipeline) + evidence-wiki (knowledge base)
npx @dzhechkov/keysarium init
# Copy evidence-wiki plugin into your project:
npx @dzhechkov/evidence-wiki   # or git clone https://github.com/djd1m/evidence-wiki

npm install -g @dzhechkov/harness-cli
dz init --target claude-code --preset meta

Workflow — iterative research cycles with evidence wiki:

Week 1:  /casarium "Product X — initial research"
         → researches/ directory created with findings
         → .keysarium/memory/ stores patterns + reward scores

         /wiki-generate                              ← evidence-wiki
         → Scans researches/, ADRs, docs
         → Generates wiki/concepts/*.md (atomic pages with inline sources)
         → Builds wiki/graph.json (knowledge graph)
         → wiki/INDEX.md links everything

Week 2:  Add new data → /casarium "Product X — update with Q2 metrics"
         → Memory recalls Week 1 patterns (reward-calibrated learning)
         → New findings merged with existing, conflicts resolved

         /wiki-generate --check                      ← re-generates wiki
         → New concepts added, existing updated
         → Every claim verified: triple-pillar protocol requires N independent
           typed sources (ADR + methodology + research)
         → Stale concepts flagged, broken evidence links detected

         /triple-check wiki/concepts/pricing-model.md ← verify specific page
         → Checks that every factual claim has inline source citations
         → Flags unsupported statements

Week N:  /casarium "Product X — pivot analysis after customer feedback"
         → Full history in memory layer + evidence wiki
         → /harvest extracts reusable knowledge patterns
         → /wiki-generate rebuilds the entire knowledge graph
         → Product vision "recalculated" — the wiki IS the living product model
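
The kind of check that /triple-check performs (every factual claim carries an inline source, unsupported statements get flagged) can be illustrated with a rough shell sketch. This is not the skill's actual implementation: the `[source: ...]` citation marker, the helper name `flag_unsupported`, and the sample page are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: lists statements in a concept page that carry no inline source.
# ASSUMPTION: citations look like "[source: ...]"; the real evidence-wiki
# marker may differ, and flag_unsupported is a hypothetical helper name.
flag_unsupported() {
  awk '
    /^[[:space:]]*$/ { next }   # skip blank lines
    /^#/             { next }   # skip markdown headings
    !/\[source:/     { printf "%d:%s\n", NR, $0; missing = 1 }
    END              { exit (missing ? 1 : 0) }
  ' "$1"
}

# Demo on a throwaway page (contents are made up for illustration).
page=$(mktemp)
cat > "$page" <<'EOF'
# Pricing model

Tier A is priced per seat [source: ADR-012]
Customers prefer annual billing
EOF

flag_unsupported "$page" || echo "unsupported statements found"
rm -f "$page"
```

The function prints each uncited line with its line number and returns a non-zero status when any are found, so it could gate a commit hook or CI step in the same spirit as the evidence discipline described above.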

The evidence-wiki advantage:

| Without evidence-wiki | With evidence-wiki |
| --- | --- |
| Research in markdown files | Atomic concept pages with inline sources |
| Findings scattered across researches/ | Interlinked knowledge graph (graph.json) |
| "I think we decided X" | Every claim has a cited source (triple-pillar) |
| Hard to see what changed | /wiki-generate --check diffs the knowledge base |
| No verification | /triple-check enforces evidence discipline |

Key features for long-term research:

  • Evidence wiki (@dzhechkov/evidence-wiki): atomic concept pages where every factual claim carries inline sources; knowledge graph for cross-referencing; triple-pillar protocol (N independent typed sources per claim)
  • Reward-calibrated memory (@dzhechkov/memory Reflexion): each checkpoint response trains the system — "ок" = excellent (1.0), feedback = good (0.7), rework = needs_work (0.3)
  • Agent SDK Dreaming: between sessions, patterns are consolidated and distilled
  • /harvest (knowledge-extractor skill): extracts reusable patterns from completed research into lib/ templates
  • SQLite + FTS5 backend: scales to 100k+ records with full-text search across all research sessions
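
The reward calibration in the list above can be pictured as a simple lookup. The sketch below is illustrative only: the function name `reward_for` and the category labels `ok`, `feedback`, and `rework` are hypothetical, and `@dzhechkov/memory` may classify checkpoint responses differently. Only the three scores come from the description above.

```shell
#!/bin/sh
# Hypothetical helper: maps a checkpoint response category to a reward score.
# Only the scores (1.0, 0.7, 0.3) come from the README; all names are assumptions.
reward_for() {
  case "$1" in
    ok)       echo "1.0" ;;  # approval response: excellent
    feedback) echo "0.7" ;;  # user gave feedback: good
    rework)   echo "0.3" ;;  # user asked for rework: needs_work
    *)        echo "unknown response: $1" >&2; return 1 ;;
  esac
}

reward_for ok
reward_for feedback
reward_for rework
```

Run as-is, the three calls print the three scores in order; an unrecognized category returns a non-zero status instead of guessing a score.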

Best for: Product strategy over months, continuous market monitoring, evolving product vision with evidence-backed decisions.


3. Product research + working prototype

Goal: Research the product AND build a functional prototype.

Option A: Sequential — research first, then code

# Step 1: Install research + development presets
npx @dzhechkov/keysarium init
# OR:
dz init --target claude-code --preset keysarium

# Step 2: Research phase
#   /casarium "SaaS platform for team retrospectives"
#   → Phase 0-2: Discovery, Exploration, Paranoid Research
#   → Phase 3: Solution Design (with CJM prototype)
#   → Result: researches/<slug>/ with full analysis

# Step 3: Switch to development
dz init --target claude-code --preset feature-adr

# Step 4: Build using research outputs
#   /feature-adr "Build the retrospective platform based on research in researches/<slug>/"
#   → Step 0: Router classifies as L/XL
#   → Step 1-5: Requirements, ADRs, DDD, Architecture (informed by research)
#   → Step 6: Implementation plan
#   → Step 7: Code generation (with /frontend-design for UI)
#   → Step 8-9: QE review + fleet assessment

What you get: Research artifacts in researches/, then code in features/<slug>/ + actual repository changes. Research directly feeds into ADR decisions.

Option B: Parallel — research and code simultaneously with p-replicator

# Install the full product development toolkit
npx @dzhechkov/p-replicator init

# Single pipeline: research → requirements → prototype
#   /replicate "SaaS platform for team retrospectives"
#   → Reverse-engineers similar products (reverse-engineering-unicorn)
#   → Generates SPARC PRD (sparc-prd-mini)
#   → Validates requirements (requirements-validator)
#   → Creates the project structure (pipeline-forge)
#   → Builds the prototype (cc-toolkit-generator-enhanced)
#   → Reviews with brutal honesty (brutal-honesty-review)

What you get: A working prototype generated from research in a single /replicate pipeline run. Faster but less deep than Option A.

Comparison

| Aspect | Option A (Sequential) | Option B (p-replicator) |
| --- | --- | --- |
| Research depth | Deep (7-phase keysarium) | Moderate (reverse-engineering) |
| Code quality | High (11-step feature-adr + QE) | Good (pipeline-forge + review) |
| Time | Days to weeks | Hours to days |
| Best for | Complex products, regulated domains | MVPs, hackathons, quick validation |
| Packages | keysarium + feature-adr preset | p-replicator |
| Research artifacts | researches/ directory | Embedded in PRD |
| Code artifacts | features/<slug>/ + repo changes | Generated project |

Tip: For maximum rigor, combine both — use p-replicator for a quick prototype, then run /feature-adr --full-qe-extended on the generated code for production-grade quality engineering.


Status

v0.3.85 — published on npm. Also available as a Claude Code plugin. Part of DZ Harness Hub.

Claude Plugin

DZ Harness Hub is available as a Claude Code plugin:

# Via marketplace (when published):
claude plugin marketplace add djd1m/dz-harness-hub
claude plugin install dz-harness-hub@dz-harness-hub

# Or test locally:
claude --plugin-dir /path/to/dz-harness-hub

# Generate plugin manifest from current inventory:
dz plugin --version 0.3.85

The .claude-plugin/ directory contains plugin.json + marketplace.json compatible with pi-claude-marketplace and skill-hub.
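
Before testing locally with --plugin-dir, it can help to confirm that both manifest files are in place. The following is a minimal sketch, assuming the layout described above (`.claude-plugin/plugin.json` and `.claude-plugin/marketplace.json` under the plugin root); the helper name `check_plugin_dir` is hypothetical and not part of the dz CLI.

```shell
#!/bin/sh
# Hypothetical helper (not a dz command): reports whether the two manifest
# files named in the README exist under <root>/.claude-plugin/.
check_plugin_dir() {
  root="${1:-.}"
  status=0
  for f in plugin.json marketplace.json; do
    if [ -f "$root/.claude-plugin/$f" ]; then
      echo "found:   $f"
    else
      echo "missing: $f"
      status=1
    fi
  done
  return "$status"
}

# Check the current directory; prints a hint if a manifest is absent.
check_plugin_dir . || echo "plugin manifests incomplete"
```

The function only checks for file presence; it does not validate the JSON contents against any plugin schema.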

Skill sources

  • agentic-qe — 20 QE skills + 55 agents (test generation, coverage, chaos, QCSD swarms)
  • ECC — 20 curated skills (agent patterns, autonomous loops, docker, git workflows)
  • AgentShield — Security scanning (170 rules for .claude/ configs)
  • Understand-Anything — Codebase knowledge graph → architecture context

Platform & infrastructure

  • AgentDB — Self-learning vector memory (--memory agentdb, 41 MCP tools)
  • agentskills.io — Open standard for SKILL.md format (adopted by all 5 platforms)
  • OpenAI Codex — 2nd target platform
  • OpenCode — 3rd target platform (160K+ stars)
  • Hermes Agent — 4th target platform
  • OpenClaude — 5th target platform (28K+ stars)