JSPM – @vinaes/succ@1.5.42

Package Exports

@vinaes/succ
@vinaes/succ/api
@vinaes/succ/mcp

Semantic Understanding for Code Contexts

Persistent semantic memory for any MCP-compatible AI editor. Remember decisions, learn from mistakes, never lose context.

Works with

Editor	Setup
Claude Code	`succ init` (auto-configured)
Cursor	`succ setup cursor`
Windsurf	`succ setup windsurf`
Continue.dev	`succ setup continue`
Codex	`succ setup codex`, then always launch via `succ codex`

See Editor Guides for detailed setup.

Quick Start

npm install -g @vinaes/succ

cd your-project
succ init
succ index
succ index-code
succ analyze

That's it. Claude Code now has persistent memory for your project.

Features

Feature	Description
Hybrid Search	Semantic embeddings + BM25 keyword matching with cross-encoder reranking and AST symbol boost
AST Code Indexing	Tree-sitter parsing for 21 languages — 13 with full symbol extraction, 8 grammar-only
Code Scanning	Recursive code discovery and indexing via `succ_index action="scan"` with .succignore support
Brain Vault	Obsidian-compatible markdown knowledge base with hierarchical summaries
Persistent Memory	Decisions, learnings, patterns across sessions with auto-extraction
Cross-Project	Global memories shared between all projects; cross-repo search
Knowledge Graph	Directed graph with PPR, SCC, articulation points, bridge edges (code↔memory), LLM-enriched relations
Graph Algorithms	Personalized PageRank, Louvain communities, Dijkstra shortest path, betweenness centrality, co-change analysis
MCP Native	15 consolidated tools — Claude uses succ tools directly
Reranker	ONNX cross-encoder (ms-marco-MiniLM-L6-v2) for search result post-processing
HyDE	Hypothetical Document Embeddings — LLM generates code snippets for NL queries to bridge embedding gap
Late Chunking	Long-context embedding with per-AST-chunk pooling for context-aware chunks
Web Search	Real-time web search via Perplexity Sonar (quick, quality, deep research)
Web Fetch	Fetch any URL as clean Markdown via md.succ.ai (Readability + Playwright)
LSP Integration	Language Server Protocol client for definition, references, hover queries
Working Memory	Priority scoring, validity filtering, diversity, pinned memories
Dynamic Hook Rules	Save memories that auto-fire as pre-tool rules — inject context, block, ask confirmation, or auto-approve permissions
File-Linked Memories	Link memories to files; auto-recalled when editing those files
Dead-End Tracking	Record failed approaches to prevent retrying
Debug Sessions	Structured debugging with hypothesis testing, 13-language instrumentation
Session Surgeon	Auto compact stats, trim tool content/thinking/images, manual compact with chain integrity
Observability	Search latency, embedding times, LLM call metrics; retrieval feedback loop
PRD Pipeline	Generate PRDs, parse into tasks, execute with quality gates
Team Mode	Parallel task execution with git worktrees
Multi-Backend Storage	SQLite, PostgreSQL, Qdrant — scale from laptop to cloud
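
The Knowledge Graph and Graph Algorithms rows mention Personalized PageRank (PPR). As a rough illustration of the technique only, and not succ's actual implementation, the sketch below runs power iteration over a small directed graph. The node names, the `damping` value, and the choice to return dangling-node mass to the seeds are all assumptions made for this example.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Power-iteration PPR over a directed graph given as (src, dst) pairs."""
    seeds = set(seeds)
    if not seeds:
        raise ValueError("at least one seed node is required")

    out_links = defaultdict(list)
    nodes = set(seeds)
    for src, dst in edges:
        out_links[src].append(dst)
        nodes.update((src, dst))

    # Restart distribution: all teleport mass goes to the seed nodes.
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)

    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            targets = out_links.get(n)
            if targets:
                share = damping * rank[n] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Dangling node (assumption): hand its mass back to the seeds.
                for s in seeds:
                    nxt[s] += damping * rank[n] / len(seeds)
        rank = nxt
    return rank
```

Seeding with the file or memory being worked on ranks nodes by how reachable they are from that starting point. Nodes in unrelated parts of the graph score zero.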

All features

AST Code Indexing — Tree-sitter parsing for 21 languages (13 with full symbol extraction + 8 grammar-only); symbol-aware BM25 tokenization boosts function/class names in search results
Web Search — Real-time search via Perplexity Sonar through OpenRouter (quick $1/MTok, quality $3-15/MTok, deep research); search history tracking with cost auditing
PRD Pipeline — Generate PRDs from feature descriptions, parse into executable tasks, run with Claude Code agent, export workflow to Obsidian (Mermaid Gantt + dependency DAG)
Team Mode — Parallel task execution using git worktrees; each worker gets an isolated checkout, results merge via cherry-pick
Quality Gates — Auto-detected (TypeScript, Go, Python, Rust) or custom; run after each task to verify code quality
Graph Algorithms — Personalized PageRank (PPR) retrieval, Tarjan's SCC, articulation points, Dijkstra shortest path, betweenness centrality, Louvain communities with LLM summaries, bridge edges (code↔memory graph)
Cross-encoder Reranker — ONNX ms-marco-MiniLM-L6-v2 rescores (query, document) pairs; configurable weight, topK clamping, graceful degradation
HyDE — Hypothetical Document Embeddings via LLM for natural language → code search; tree-sitter AST code detection
Late Chunking — Long-context embedding (jina 8192 tokens) with per-AST-chunk pooling for context-aware embeddings
Hierarchical Summaries (RAPTOR-style) — bottom-up LLM summarization at file → directory → module → repo zoom levels with query routing
Code Scanning — succ_index action="scan" recursively discovers and indexes code files via git ls-files / directory walk with .succignore, size filtering, symlink rejection
Observability — search latency, embedding times, LLM call metrics; retrieval feedback loop for ranking adjustment
Auto-memory Extraction — session-end fact extraction via LLM with quality gate + periodic dimension-bucketed consolidation
Cross-repo Search — search across multiple succ-indexed repositories
Diff-brain Analysis — LLM-powered diff analysis for brain vault document changes
LSP Integration — language server protocol client, installer, and server registry (Kotlin + Swift added)
MCP Review Tool — succ_review for code review with blast-radius estimation
Co-change Analysis — git log mining to detect files frequently changed together
Brain Vault Export — structured export with metadata
API Versioning — /v1/ route prefix aliases for all daemon endpoints
Graph Enrichment — LLM-classified relations (implements, leads_to, contradicts...), contextual proximity, Label Propagation communities, degree centrality with recall boost
Dead-End Tracking — Record failed approaches; auto-boosted in recall to prevent retrying
AGENTS.md Auto-Export — Auto-generate editor instructions from decisions, patterns, dead-ends
Learning Delta — Track knowledge growth per session (memories added, types, quality)
Confidence Retention — Time-decay scoring with auto-cleanup of low-value memories
Safe Consolidation — Soft-delete with undo support; no data loss on merge
Skill Discovery — Auto-suggest relevant skills based on user prompt (opt-in, disabled by default)
Skyll Integration — Access community skills from Skyll registry (requires skills.enabled = true)
Soul Document — Define AI personality and values
Dynamic Hook Rules — Memories tagged hook-rule auto-fire before matching tool calls. Filter by tool:{Name} and match:{regex} tags. The memory type decides the action: error blocks the call, pattern asks for confirmation, allow auto-approves permission dialogs (requires Claude Code v2.1.63+), and any other type is injected as context
PermissionRequest — Auto-approve or deny Claude Code permission dialogs based on memory rules (requires Claude Code v2.1.63+)
HTTP Hooks — Direct HTTP hooks to daemon (no process spawn) for faster, more reliable hook execution (requires Claude Code v2.1.63+, auto-detected at succ init)
File-Linked Memories — Attach memories to files via files parameter; pre-tool hook auto-recalls related memories when editing those files
Auto-Hooks — Context injection at session start/end
Idle Reflections — AI generates insights during idle time
Session Context — Auto-generated briefings for next session
Security Hardening — 3-tier prompt injection detection (structural + multilingual regex + embedding semantic), content sanitization for 13 entry points, Bell-LaPadula IFC with compartments, file operation guards, exfiltration detection, post-tool secret scanning
LLM Guardrails — Optional Tier 3 LLM classification (Llama Guard, safeguard-20b) for sensitivity, code policy (OWASP SC2-SC7), and injection detection with LRU caching
Sensitive Filter — Detect and redact PII, API keys, secrets
Quality Scoring — Local ONNX classification to filter noise
Token Savings — Track RAG efficiency vs full files
Temporal Awareness — Time decay, validity periods, point-in-time queries
Unified Daemon — Single background process for watch, analyze, idle tracking
Watch Mode — Auto-reindex on file changes via @parcel/watcher
Fast Analyze — --fast mode with fewer agents and smaller context for quick onboarding
Incremental Analyze — Git-based change detection, skip unchanged agents
Local LLM — Ollama, LM Studio, llama.cpp support
Sleep Agent — Offload heavy operations to local LLM
Checkpoints — Backup and restore full succ state
AI-Readiness Score — Measure project readiness for AI collaboration
Multiple LLM Backends — Local (Ollama), OpenRouter, or Claude CLI
Storage Backends — SQLite (default), PostgreSQL + pgvector, Qdrant
Data Migration — Export/import JSON, migrate between backends
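
The Dynamic Hook Rules entry above describes how tagged memories map to actions. The sketch below shows that matching logic in a minimal form. The `Memory` class, the `matching_rules` function, and the action labels are hypothetical names for this example. It also assumes that a rule without a tool: tag applies to every tool, and that match: patterns are tested against the tool input as a string.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Memory:
    content: str
    type: str  # e.g. "error", "pattern", "allow", "decision"
    tags: list = field(default_factory=list)


# Memory type -> action, following the feature description; other types inject context.
ACTIONS = {"error": "block", "pattern": "ask", "allow": "approve"}


def matching_rules(memories, tool_name, tool_input):
    """Return (action, memory) pairs for hook rules that apply to a tool call."""
    results = []
    for mem in memories:
        if "hook-rule" not in mem.tags:
            continue
        tools = [t[len("tool:"):] for t in mem.tags if t.startswith("tool:")]
        patterns = [t[len("match:"):] for t in mem.tags if t.startswith("match:")]
        if tools and tool_name not in tools:
            continue
        if patterns and not any(re.search(p, tool_input) for p in patterns):
            continue
        results.append((ACTIONS.get(mem.type, "inject"), mem))
    return results
```

For example, an error memory tagged hook-rule, tool:Bash, and match:git push.*--force would yield a block action for a forced push and nothing for an Edit call.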

Claude Code Agents

succ ships with 20 specialized agents in .claude/agents/ that run as subagents inside Claude Code:

Agent	What it does
`succ-explore`	Codebase exploration powered by semantic search
`succ-plan`	TDD-enforced implementation planning with red-green-refactor cycles
`succ-code-reviewer`	Full code review with OWASP Top 10 checklist — works with any language
`succ-diff-reviewer`	Fast pre-commit diff review for security, bugs, and regressions
`succ-deep-search`	Cross-search memories, brain vault, and code
`succ-memory-curator`	Consolidate, deduplicate, and clean up memories
`succ-memory-health-monitor`	Detect decayed, stale, or low-quality memories
`succ-pattern-detective`	Surface recurring patterns and anti-patterns from sessions
`succ-session-handoff-orchestrator`	Extract summary and briefing at session end
`succ-session-reviewer`	Review past sessions, extract missed learnings
`succ-decision-auditor`	Find contradictions and reversals in architectural decisions
`succ-knowledge-indexer`	Index documentation and code into the knowledge base
`succ-knowledge-mapper`	Maintain knowledge graph, find orphaned memories
`succ-checkpoint-manager`	Create and manage state backups
`succ-context-optimizer`	Optimize what gets preloaded at session start
`succ-quality-improvement-coach`	Analyze memory quality, suggest improvements
`succ-readiness-improver`	Actionable steps to improve AI-readiness score
`succ-general`	General-purpose agent with semantic search, web search, and all tools
`succ-debug`	Structured debugging — hypothesize, instrument, reproduce, fix with dead-end tracking
`succ-style-tracker`	Track communication style changes, update soul.md and brain vault

Agents are auto-discovered by Claude Code from .claude/agents/ and can be launched via the Task tool with subagent_type.

Commands

Command	Description
`succ init`	Interactive setup wizard
`succ setup <editor>`	Configure MCP for any editor
`succ codex-chat`	Launch Codex chat with succ briefing/hooks
`succ analyze`	Generate brain vault with Claude agents
`succ index [path]`	Index files for semantic search
`succ scan-code [path]`	Recursive code discovery and indexing
`succ search <query>`	Semantic search in brain vault
`succ remember <content>`	Save to memory
`succ memories`	List and search memories
`succ watch`	Watch for changes and auto-reindex
`succ daemon <action>`	Manage unified daemon
`succ prd generate`	Generate PRD from feature description
`succ prd run`	Execute PRD tasks with quality gates
`succ session analyze`	Token breakdown by type and tool name
`succ session trim`	Trim tool content from session transcript
`succ session compact`	Manual compact with dialogue summary
`succ status`	Show index statistics

All commands

Command	Description
`succ index-code [path]`	Index source code (AST chunking via tree-sitter)
`succ index --memories`	Re-embed all memories with current embedding model
`succ reindex`	Detect and fix stale/deleted index entries
`succ chat <query>`	RAG chat with context
`succ train-bpe`	Train BPE vocabulary from indexed code
`succ forget`	Delete memories
`succ graph <action>`	Knowledge graph: stats, auto-link, enrich, proximity, communities, centrality
`succ consolidate`	Merge duplicate memories (soft-delete with undo)
`succ agents-md`	Generate .claude/AGENTS.md from memories
`succ progress`	Show learning delta history
`succ retention`	Memory retention analysis and cleanup
`succ soul`	Generate personalized soul.md
`succ config`	Interactive configuration
`succ stats`	Show token savings statistics
`succ checkpoint <action>`	Create, restore, or list checkpoints
`succ score`	Show AI-readiness score
`succ prd parse <file>`	Parse PRD markdown into tasks
`succ prd list`	List all PRDs
`succ prd status [id]`	Show PRD status and tasks
`succ prd archive [id]`	Archive a PRD
`succ prd export [id]`	Export PRD workflow to Obsidian (Mermaid diagrams)
`succ session trim`	Trim tool content from session (`--tools`, `--only-inputs`, `--only-results`)
`succ session trim-thinking`	Trim thinking blocks only
`succ session trim-all`	Trim all strippable content (tools, thinking, images)
`succ session compact`	Manual compact with dialogue summary and chain integrity
`succ clear`	Clear index and/or memories
`succ benchmark`	Run performance benchmarks
`succ migrate`	Migrate data between storage backends

succ init

succ init                # Interactive mode
succ init --yes          # Non-interactive (defaults)
succ init --force        # Reinitialize existing project

Creates the .succ/ structure, configures the MCP server, and sets up hooks.

succ analyze

succ analyze             # Run via Claude CLI (recommended)
succ analyze --fast      # Fast mode (fewer agents, smaller context)
succ analyze --force     # Force full re-analysis (skip incremental)
succ analyze --local     # Use local LLM (Ollama, LM Studio)
succ analyze --openrouter # Use OpenRouter API
succ analyze --background # Run in background

Generates brain vault structure:

.succ/brain/
├── CLAUDE.md              # Navigation hub
├── project/               # Project knowledge
│   ├── technical/         # Architecture, API, Conventions
│   ├── systems/           # Core systems/modules
│   ├── strategy/          # Project goals
│   └── features/          # Implemented features
├── knowledge/             # Research notes
└── archive/               # Old/superseded

succ watch

succ watch               # Start watch service (via daemon)
succ watch --ignore-code # Watch only docs
succ watch --status      # Check watch service status
succ watch --stop        # Stop watch service

succ daemon

succ daemon status       # Show daemon status
succ daemon sessions     # List active Claude Code sessions
succ daemon start        # Start daemon manually
succ daemon stop         # Stop daemon
succ daemon logs         # Show recent logs

succ prd

succ prd generate "Add JWT authentication"   # Generate PRD + parse tasks
succ prd run                                  # Execute sequentially (default)
succ prd run --mode team                      # Execute in parallel (git worktrees)
succ prd run --mode team --concurrency 5      # Parallel with 5 workers
succ prd run --resume                         # Resume interrupted run
succ prd run --dry-run                        # Preview execution plan
succ prd status                               # Show latest PRD status
succ prd list                                 # List all PRDs
succ prd export                               # Export latest PRD to Obsidian
succ prd export --all                         # Export all PRDs
succ prd export prd_abc123                    # Export specific PRD

Team mode runs independent tasks in parallel using git worktrees for isolation. Each worker gets its own checkout; results merge via cherry-pick. Quality gates (typecheck, test, lint, build) run automatically after each task.
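
How succ schedules team-mode tasks is not documented here. As a sketch of the general idea only, the snippet below groups tasks into batches whose dependencies are already satisfied and caps each batch at a worker limit, similar in spirit to `--concurrency`. The `parallel_batches` function and the task names are hypothetical.

```python
from graphlib import TopologicalSorter


def parallel_batches(dependencies, concurrency):
    """Yield lists of task ids that could run at the same time.

    `dependencies` maps each task id to the set of tasks it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        # Split the ready set into chunks no larger than the worker limit.
        for i in range(0, len(ready), concurrency):
            yield ready[i:i + concurrency]
        sorter.done(*ready)
```

With tasks `api` and `ui` both depending on `schema`, and `tests` depending on both, a limit of 5 gives three batches: `schema`, then `api` and `ui` together, then `tests`.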

Export generates Obsidian-compatible markdown with Mermaid diagrams (Gantt timeline, dependency DAG), per-task detail pages with gate results, and wiki-links between pages. Output goes to .succ/brain/prd/.

Configuration

No API key required. Uses local embeddings by default.

{
  "llm": {
    "embeddings": {
      "mode": "local",
      "model": "Xenova/all-MiniLM-L6-v2"
    }
  },
  "chunk_size": 500,
  "chunk_overlap": 50
}

Embedding modes

Local (default):

{
  "llm": { "embeddings": { "mode": "local" } }
}

Ollama (unified namespace):

{
  "llm": {
    "embeddings": {
      "mode": "api",
      "model": "nomic-embed-text",
      "api_url": "http://localhost:11434/v1/embeddings"
    }
  }
}

OpenRouter:

{
  "embedding_mode": "openrouter",
  "openrouter_api_key": "sk-or-..."
}

MRL dimension override (Matryoshka models):

{
  "llm": {
    "embeddings": {
      "mode": "api",
      "model": "nomic-embed-text-v1.5",
      "api_url": "http://localhost:11434/v1/embeddings",
      "dimensions": 256
    }
  }
}
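
Matryoshka (MRL) models are trained so that a prefix of the embedding is still a usable embedding. Whether succ truncates vectors itself or asks the API for fewer dimensions is not stated here. The sketch below only illustrates the client-side variant: keep the first components and re-normalize. `truncate_embedding` is a hypothetical helper.

```python
import math


def truncate_embedding(vector, dimensions):
    """Keep the first `dimensions` components and rescale to unit length."""
    if not 0 < dimensions <= len(vector):
        raise ValueError("dimensions must be between 1 and len(vector)")
    head = vector[:dimensions]
    norm = math.sqrt(sum(x * x for x in head))
    # A zero vector cannot be normalized; return it unchanged.
    return [x / norm for x in head] if norm else list(head)
```

Re-normalizing matters when similarity is computed as a dot product, because a truncated vector is no longer unit length.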

GPU acceleration

succ uses native ONNX Runtime for embedding inference with automatic GPU detection:

Platform	Backend	GPUs
Windows	DirectML	AMD, Intel, NVIDIA
Linux	CUDA	NVIDIA
macOS	CoreML	Apple Silicon
Fallback	CPU	Any

GPU is enabled by default. No manual configuration needed — the best available backend is auto-detected.

{
  "gpu_enabled": true,
  "gpu_device": "directml"
}

Set gpu_device to override auto-detection: cuda, directml, coreml, or cpu.
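
The table and the two config keys above suggest a simple resolution order: an explicit `gpu_device` wins, otherwise the platform default applies, with CPU as the fallback. The sketch below is an assumption-laden illustration of that order, not succ's detection code. A real implementation would also need to probe whether the backend is actually usable. `resolve_gpu_device` and the platform keys are hypothetical.

```python
import sys

# Platform defaults taken from the table above; keys are Python's sys.platform values.
DEFAULT_BACKENDS = {"win32": "directml", "linux": "cuda", "darwin": "coreml"}
VALID_DEVICES = {"cuda", "directml", "coreml", "cpu"}


def resolve_gpu_device(config, platform=sys.platform):
    """Pick an inference backend from config, falling back to the platform default."""
    if not config.get("gpu_enabled", True):
        return "cpu"
    override = config.get("gpu_device")
    if override is not None:
        if override not in VALID_DEVICES:
            raise ValueError(f"unknown gpu_device: {override!r}")
        return override
    return DEFAULT_BACKENDS.get(platform, "cpu")
```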

Idle watcher

{
  "idle_watcher": {
    "enabled": true,
    "idle_minutes": 2,
    "check_interval": 30,
    "min_conversation_length": 5
  }
}

Pre-commit review

Automatically run the succ-diff-reviewer agent before every git commit to catch security issues, bugs, and regressions:

{
  "preCommitReview": true
}

When enabled, Claude will run a diff review before each commit. Critical findings block the commit; high findings trigger a warning.

Disabled by default. Set via succ_config(action="set", key="preCommitReview", value="true").

Autonomous mode (bypass security guards)

When Claude Code runs with --dangerously-skip-permissions, succ's security guards can still block autonomous operations. Enable trustAgentPermissions to downgrade deny/ask decisions to context warnings:

{
  "security": {
    "trustAgentPermissions": true
  }
}

Injection detection stays ON (protects the agent). See Security Hardening for details.

Sleep agent

Offload heavy operations to local LLM:

{
  "idle_reflection": {
    "sleep_agent": {
      "enabled": true,
      "mode": "local",
      "model": "qwen2.5-coder:14b",
      "api_url": "http://localhost:11434/v1"
    }
  }
}

Storage backends

succ supports multiple storage backends for different deployment scenarios:

Setup	Use Case	Requirements
SQLite + sqlite-vec	Local development (default)	None
PostgreSQL + pgvector	Production/cloud	PostgreSQL 15+ with pgvector
SQLite + Qdrant	Local + powerful vector search	Qdrant server
PostgreSQL + Qdrant	Full production scale	PostgreSQL + Qdrant

Example: PostgreSQL + pgvector

{
  "storage": {
    "backend": "postgresql",
    "postgresql": {
      "connection_string": "postgresql://user:pass@localhost:5432/succ"
    }
  }
}

Example: PostgreSQL + Qdrant

{
  "storage": {
    "backend": "postgresql",
    "vector": "qdrant",
    "postgresql": { "connection_string": "postgresql://..." },
    "qdrant": { "url": "http://localhost:6333" }
  }
}

See Storage Configuration for all options.

LLM Backend Configuration

succ supports multiple LLM backends for operations like analyze, idle reflection, and skill suggestions:

{
  "llm": {
    "type": "local",
    "model": "qwen2.5:7b",
    "local": {
      "endpoint": "http://localhost:11434/v1/chat/completions"
    },
    "openrouter": {
      "model": "anthropic/claude-3-haiku"
    }
  }
}

Key	Values	Default	Description
`llm.type`	`local` / `openrouter` / `claude`	`local`	LLM provider
`llm.model`	string	per-type	Model name for the active type
`llm.transport`	`process` / `ws` / `http`	auto	How to talk to the backend

Transport auto-selects based on type: claude uses process (or ws for persistent WebSocket), local/openrouter use http.

WebSocket transport (transport: "ws") keeps a persistent connection to Claude CLI, avoiding process spawn overhead on repeated calls:

{
  "llm": {
    "type": "claude",
    "model": "sonnet",
    "transport": "ws"
  }
}

Per-backend model overrides for the fallback chain:

{
  "llm": {
    "type": "claude",
    "model": "sonnet",
    "transport": "ws",
    "local": { "endpoint": "http://localhost:11434/v1/chat/completions", "model": "qwen2.5:7b" },
    "openrouter": { "model": "anthropic/claude-3-haiku" }
  }
}

Claude backend usage

The claude backend integrates with an existing, locally running Claude Code session and is intended only for in-session developer assistance by the same user, including tasks such as file analysis, documentation, indexing, and session summarization.

It is not supported for unattended background processing, cloud deployments, or multi-user scenarios. For automated, long-running, or cloud workloads, use the local or openrouter backends instead.

Retention policies

{
  "retention": {
    "enabled": true,
    "decay_rate": 0.01,
    "access_weight": 0.1,
    "keep_threshold": 0.3,
    "delete_threshold": 0.15
  }
}
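
The exact scoring formula behind these keys is not given here. The sketch below shows one plausible shape, offered as an assumption: exponential time decay of a quality score plus a logarithmic bonus for accesses, with the two thresholds splitting memories into keep, review, and delete bands. The function names, the day-based age unit, and the formula itself are hypothetical.

```python
import math


def retention_score(quality, age_days, access_count, decay_rate=0.01, access_weight=0.1):
    """Hypothetical score: exponential time decay plus a capped access bonus."""
    decayed = quality * math.exp(-decay_rate * age_days)
    bonus = access_weight * math.log1p(access_count)
    return min(1.0, decayed + bonus)


def classify(score, keep_threshold=0.3, delete_threshold=0.15):
    """Map a score onto keep / review / delete bands."""
    if score >= keep_threshold:
        return "keep"
    if score < delete_threshold:
        return "delete"
    return "review"
```

Under these assumptions, a memory with quality 0.5 that is 100 days old and never accessed falls into the review band, while the same memory accessed three times stays in the keep band.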

Hybrid Search

Combines semantic embeddings with BM25 keyword search. Code search includes AST symbol boost, regex post-filtering, and symbol type filtering (function, method, class, interface, type_alias). Three output modes: full (code blocks), lean (file+lines), signatures (symbol names only).

Aspect	Documents	Code
Tokenizer	Markdown-aware + stemming	Naming convention splitter + AST symbol boost
Stemming	Yes	No
Stop words	Filtered	Kept
Segmentation	Standard	Ronin + BPE
Symbol metadata	N/A	function, class, interface names via tree-sitter
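
How succ merges the semantic and BM25 result lists is not specified in this section. One common way to combine two rankings is Reciprocal Rank Fusion (RRF), sketched below as an illustration of hybrid retrieval in general. The function name, the constant `k=60`, and the file names are assumptions for this example.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of ids; items ranked high in several lists score best."""
    scores = {}
    for ranking in rankings:
        for position, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + position)
    # Highest fused score first; ties are broken by id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

RRF uses only rank positions, so it avoids having to normalize cosine similarities against BM25 scores, which live on different scales.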

The code tokenizer splits identifiers written in the common naming conventions:

Convention	Example	Tokens
camelCase	`getUserName`	get, user, name
PascalCase	`UserService`	user, service
snake_case	`get_user_name`	get, user, name
SCREAMING_SNAKE	`MAX_RETRY_COUNT`	max, retry, count
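
The splits in the table can be reproduced with a single regular expression. This is a minimal sketch of a naming-convention splitter, not succ's tokenizer, which also applies Ronin and BPE segmentation. `split_identifier` is a hypothetical name.

```python
import re

# Order matters: an acronym followed by a capitalized word, then a normal word,
# then a remaining run of capitals, then digits.
_WORD = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|[0-9]+")


def split_identifier(name):
    """Split an identifier into lowercase word tokens."""
    return [m.group(0).lower() for m in _WORD.finditer(name)]
```

Underscores and other separators are skipped because no alternative matches them. The first alternative keeps acronyms intact, so `parseHTTPResponse` yields parse, http, response.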

Memory System

Local memory — stored in .succ/succ.db, project-specific.

Global memory — stored in ~/.succ/global.db, shared across projects.

succ remember "User prefers TypeScript" --global
succ memories --global

Architecture

your-project/
├── .claude/
│   └── settings.json      # Claude Code hooks config
└── .succ/
    ├── brain/             # Obsidian-compatible vault
    ├── hooks/             # Hook scripts
    ├── config.json        # Project configuration
    ├── soul.md            # AI personality
    └── succ.db            # Vector database

~/.succ/
├── global.db              # Global memories
└── config.json            # Global configuration
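
The layout above has both a global and a project config.json. The precedence between them is not stated here. The sketch below assumes that project settings override global ones, merged key by key. `deep_merge` and `load_config` are hypothetical helpers for illustration.

```python
import json
from pathlib import Path


def deep_merge(base, override):
    """Return a new dict where nested keys in `override` win over `base`."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_config(project_dir, home_dir=None):
    """Merge ~/.succ/config.json with <project>/.succ/config.json (assumed order)."""
    home = Path(home_dir) if home_dir else Path.home()
    config = {}
    # Global first, project second, so project values take precedence.
    for path in (home / ".succ" / "config.json", Path(project_dir) / ".succ" / "config.json"):
        if path.is_file():
            config = deep_merge(config, json.loads(path.read_text(encoding="utf-8")))
    return config
```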

Documentation

Configuration Reference — All config options with examples
PRD Pipeline — Generate, execute, and verify tasks with quality gates
Storage Backends — SQLite, PostgreSQL, Qdrant setup and benchmarks
Benchmarks — Performance and accuracy metrics
Temporal Awareness — Time decay, validity periods
Ollama Setup — Recommended local LLM setup
llama.cpp GPU — GPU-accelerated embeddings
MCP Integration — Claude Code tools and resources
Security Hardening — Injection detection, IFC, guardrails, content sanitization
Troubleshooting — Common issues and fixes
Development — Contributing and testing

License

FSL-1.1-Apache-2.0 — Free to use, modify, and self-host. Commercial cloud hosting is restricted until the license converts to Apache 2.0.