Semantic Understanding for Code Contexts — persistent memory for AI coding assistants (Claude Code, Cursor, Windsurf, Continue.dev)

Package Exports

  • @vinaes/succ
  • @vinaes/succ/api
  • @vinaes/succ/mcp


succ

Semantic Understanding for Code Contexts


Quick Start · Features · Commands · Configuration · Docs


Persistent semantic memory for any MCP-compatible AI editor. Remember decisions, learn from mistakes, never lose context.

Works with

| Editor | Setup |
| --- | --- |
| Claude Code | `succ init` (auto-configured) |
| Cursor | `succ setup cursor` |
| Windsurf | `succ setup windsurf` |
| Continue.dev | `succ setup continue` |
| Codex | `succ setup codex`, then always launch via `succ codex` |

See Editor Guides for detailed setup.

Quick Start

```sh
npm install -g @vinaes/succ
cd your-project
succ init
succ index
succ index-code
succ analyze
```

That's it. Claude Code now has persistent memory for your project.

Features

| Feature | Description |
| --- | --- |
| Hybrid Search | Semantic embeddings + BM25 keyword matching with cross-encoder reranking and AST symbol boost |
| AST Code Indexing | Tree-sitter parsing for 21 languages — 13 with full symbol extraction, 8 grammar-only |
| Code Scanning | Recursive code discovery and indexing via `succ_index action="scan"` with .succignore support |
| Brain Vault | Obsidian-compatible markdown knowledge base with hierarchical summaries |
| Persistent Memory | Decisions, learnings, patterns across sessions with auto-extraction |
| Cross-Project | Global memories shared between all projects; cross-repo search |
| Knowledge Graph | Directed graph with PPR, SCC, articulation points, bridge edges (code↔memory), LLM-enriched relations |
| Graph Algorithms | Personalized PageRank, Louvain communities, Dijkstra shortest path, betweenness centrality, co-change analysis |
| MCP Native | 15 consolidated tools — Claude uses succ tools directly |
| Reranker | ONNX cross-encoder (ms-marco-MiniLM-L6-v2) for search result post-processing |
| HyDE | Hypothetical Document Embeddings — LLM generates code snippets for NL queries to bridge the embedding gap |
| Late Chunking | Long-context embedding with per-AST-chunk pooling for context-aware chunks |
| Web Search | Real-time web search via Perplexity Sonar (quick, quality, deep research) |
| Web Fetch | Fetch any URL as clean Markdown via md.succ.ai (Readability + Playwright) |
| LSP Integration | Language Server Protocol client for definition, references, and hover queries |
| Working Memory | Priority scoring, validity filtering, diversity, pinned memories |
| Dynamic Hook Rules | Save memories that auto-fire as pre-tool rules — inject context, block, ask confirmation, or auto-approve permissions |
| File-Linked Memories | Link memories to files; auto-recalled when editing those files |
| Dead-End Tracking | Record failed approaches to prevent retrying |
| Debug Sessions | Structured debugging with hypothesis testing, 13-language instrumentation |
| Session Surgeon | Auto compact stats, trim tool content/thinking/images, manual compact with chain integrity |
| Observability | Search latency, embedding times, LLM call metrics; retrieval feedback loop |
| PRD Pipeline | Generate PRDs, parse into tasks, execute with quality gates |
| Team Mode | Parallel task execution with git worktrees |
| Multi-Backend Storage | SQLite, PostgreSQL, Qdrant — scale from laptop to cloud |
All features
  • AST Code Indexing — Tree-sitter parsing for 21 languages (13 with full symbol extraction + 8 grammar-only); symbol-aware BM25 tokenization boosts function/class names in search results
  • Web Search — Real-time search via Perplexity Sonar through OpenRouter (quick $1/MTok, quality $3-15/MTok, deep research); search history tracking with cost auditing
  • PRD Pipeline — Generate PRDs from feature descriptions, parse into executable tasks, run with Claude Code agent, export workflow to Obsidian (Mermaid Gantt + dependency DAG)
  • Team Mode — Parallel task execution using git worktrees; each worker gets an isolated checkout, results merge via cherry-pick
  • Quality Gates — Auto-detected (TypeScript, Go, Python, Rust) or custom; run after each task to verify code quality
  • Graph Algorithms — Personalized PageRank (PPR) retrieval, Tarjan's SCC, articulation points, Dijkstra shortest path, betweenness centrality, Louvain communities with LLM summaries, bridge edges (code↔memory graph)
  • Cross-encoder Reranker — ONNX ms-marco-MiniLM-L6-v2 rescores (query, document) pairs; configurable weight, topK clamping, graceful degradation
  • HyDE — Hypothetical Document Embeddings via LLM for natural language → code search; tree-sitter AST code detection
  • Late Chunking — Long-context embedding (jina 8192 tokens) with per-AST-chunk pooling for context-aware embeddings
  • Hierarchical Summaries (RAPTOR-style) — bottom-up LLM summarization at file → directory → module → repo zoom levels with query routing
  • Code Scanning — succ_index action="scan" recursively discovers and indexes code files via git ls-files / directory walk with .succignore, size filtering, symlink rejection
  • Observability — search latency, embedding times, LLM call metrics; retrieval feedback loop for ranking adjustment
  • Auto-memory Extraction — session-end fact extraction via LLM with quality gate + periodic dimension-bucketed consolidation
  • Cross-repo Search — search across multiple succ-indexed repositories
  • Diff-brain Analysis — LLM-powered diff analysis for brain vault document changes
  • LSP Integration — language server protocol client, installer, and server registry (Kotlin + Swift added)
  • MCP Review Tool — succ_review for code review with blast-radius estimation
  • Co-change Analysis — git log mining to detect files frequently changed together
  • Brain Vault Export — structured export with metadata
  • API Versioning — /v1/ route prefix aliases for all daemon endpoints
  • Graph Enrichment — LLM-classified relations (implements, leads_to, contradicts...), contextual proximity, Label Propagation communities, degree centrality with recall boost
  • Dead-End Tracking — Record failed approaches; auto-boosted in recall to prevent retrying
  • AGENTS.md Auto-Export — Auto-generate editor instructions from decisions, patterns, dead-ends
  • Learning Delta — Track knowledge growth per session (memories added, types, quality)
  • Confidence Retention — Time-decay scoring with auto-cleanup of low-value memories
  • Safe Consolidation — Soft-delete with undo support; no data loss on merge
  • Skill Discovery — Auto-suggest relevant skills based on user prompt (opt-in, disabled by default)
  • Skyll Integration — Access community skills from Skyll registry (requires skills.enabled = true)
  • Soul Document — Define AI personality and values
  • Dynamic Hook Rules — Memories tagged hook-rule auto-fire before matching tool calls; filter by tool:{Name} and match:{regex} tags; error type blocks, pattern asks confirmation, allow type auto-approves permission dialogs (v2.1.63+), others inject as context
  • PermissionRequest — Auto-approve or deny Claude Code permission dialogs based on memory rules (requires Claude Code v2.1.63+)
  • HTTP Hooks — Direct HTTP hooks to daemon (no process spawn) for faster, more reliable hook execution (requires Claude Code v2.1.63+, auto-detected at succ init)
  • File-Linked Memories — Attach memories to files via files parameter; pre-tool hook auto-recalls related memories when editing those files
  • Auto-Hooks — Context injection at session start/end
  • Idle Reflections — AI generates insights during idle time
  • Session Context — Auto-generated briefings for next session
  • Security Hardening — 3-tier prompt injection detection (structural + multilingual regex + embedding semantic), content sanitization for 13 entry points, Bell-LaPadula IFC with compartments, file operation guards, exfiltration detection, post-tool secret scanning
  • LLM Guardrails — Optional Tier 3 LLM classification (Llama Guard, safeguard-20b) for sensitivity, code policy (OWASP SC2-SC7), and injection detection with LRU caching
  • Sensitive Filter — Detect and redact PII, API keys, secrets
  • Quality Scoring — Local ONNX classification to filter noise
  • Token Savings — Track RAG efficiency vs full files
  • Temporal Awareness — Time decay, validity periods, point-in-time queries
  • Unified Daemon — Single background process for watch, analyze, idle tracking
  • Watch Mode — Auto-reindex on file changes via @parcel/watcher
  • Fast Analyze — --fast mode with fewer agents and smaller context for quick onboarding
  • Incremental Analyze — Git-based change detection, skip unchanged agents
  • Local LLM — Ollama, LM Studio, llama.cpp support
  • Sleep Agent — Offload heavy operations to local LLM
  • Checkpoints — Backup and restore full succ state
  • AI-Readiness Score — Measure project readiness for AI collaboration
  • Multiple LLM Backends — Local (Ollama), OpenRouter, or Claude CLI
  • Storage Backends — SQLite (default), PostgreSQL + pgvector, Qdrant
  • Data Migration — Export/import JSON, migrate between backends
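
Several of the graph features above build on Personalized PageRank: random-walk scores that teleport back to the query's seed nodes rather than to all nodes uniformly, so relevance spreads outward from matched memories and code. A minimal power-iteration sketch (the node names, damping factor, and iteration count are illustrative, not succ's internals):

```python
def personalized_pagerank(edges: dict[str, list[str]], seeds: set[str],
                          damping: float = 0.85, iters: int = 50) -> dict[str, float]:
    """Power iteration where teleportation returns to the seed nodes, not all nodes."""
    nodes = set(edges) | {v for vs in edges.values() for v in vs} | seeds
    # Start all probability mass on the seeds (the query's entry points).
    rank = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    for _ in range(iters):
        # Teleport mass goes only to seeds; walk mass follows out-edges.
        nxt = {n: ((1 - damping) / len(seeds) if n in seeds else 0.0) for n in nodes}
        for u, outs in edges.items():
            if outs:
                share = damping * rank[u] / len(outs)
                for v in outs:
                    nxt[v] += share
        rank = nxt
    return rank
```

Nodes closer to the seeds end up with higher scores, which is what makes PPR useful as a retrieval signal over a code↔memory graph.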

Claude Code Agents

succ ships with 20 specialized agents in .claude/agents/ that run as subagents inside Claude Code:

| Agent | What it does |
| --- | --- |
| succ-explore | Codebase exploration powered by semantic search |
| succ-plan | TDD-enforced implementation planning with red-green-refactor cycles |
| succ-code-reviewer | Full code review with OWASP Top 10 checklist — works with any language |
| succ-diff-reviewer | Fast pre-commit diff review for security, bugs, and regressions |
| succ-deep-search | Cross-search memories, brain vault, and code |
| succ-memory-curator | Consolidate, deduplicate, and clean up memories |
| succ-memory-health-monitor | Detect decayed, stale, or low-quality memories |
| succ-pattern-detective | Surface recurring patterns and anti-patterns from sessions |
| succ-session-handoff-orchestrator | Extract summary and briefing at session end |
| succ-session-reviewer | Review past sessions, extract missed learnings |
| succ-decision-auditor | Find contradictions and reversals in architectural decisions |
| succ-knowledge-indexer | Index documentation and code into the knowledge base |
| succ-knowledge-mapper | Maintain knowledge graph, find orphaned memories |
| succ-checkpoint-manager | Create and manage state backups |
| succ-context-optimizer | Optimize what gets preloaded at session start |
| succ-quality-improvement-coach | Analyze memory quality, suggest improvements |
| succ-readiness-improver | Actionable steps to improve AI-readiness score |
| succ-general | General-purpose agent with semantic search, web search, and all tools |
| succ-debug | Structured debugging — hypothesize, instrument, reproduce, fix with dead-end tracking |
| succ-style-tracker | Track communication style changes, update soul.md and brain vault |

Agents are auto-discovered by Claude Code from .claude/agents/ and can be launched via the Task tool with subagent_type.

Commands

| Command | Description |
| --- | --- |
| `succ init` | Interactive setup wizard |
| `succ setup <editor>` | Configure MCP for any editor |
| `succ codex-chat` | Launch Codex chat with succ briefing/hooks |
| `succ analyze` | Generate brain vault with Claude agents |
| `succ index [path]` | Index files for semantic search |
| `succ scan-code [path]` | Recursive code discovery and indexing |
| `succ search <query>` | Semantic search in brain vault |
| `succ remember <content>` | Save to memory |
| `succ memories` | List and search memories |
| `succ watch` | Watch for changes and auto-reindex |
| `succ daemon <action>` | Manage unified daemon |
| `succ prd generate` | Generate PRD from feature description |
| `succ prd run` | Execute PRD tasks with quality gates |
| `succ session analyze` | Token breakdown by type and tool name |
| `succ session trim` | Trim tool content from session transcript |
| `succ session compact` | Manual compact with dialogue summary |
| `succ status` | Show index statistics |
All commands
| Command | Description |
| --- | --- |
| `succ index-code [path]` | Index source code (AST chunking via tree-sitter) |
| `succ index --memories` | Re-embed all memories with current embedding model |
| `succ reindex` | Detect and fix stale/deleted index entries |
| `succ chat <query>` | RAG chat with context |
| `succ train-bpe` | Train BPE vocabulary from indexed code |
| `succ forget` | Delete memories |
| `succ graph <action>` | Knowledge graph: stats, auto-link, enrich, proximity, communities, centrality |
| `succ consolidate` | Merge duplicate memories (soft-delete with undo) |
| `succ agents-md` | Generate .claude/AGENTS.md from memories |
| `succ progress` | Show learning delta history |
| `succ retention` | Memory retention analysis and cleanup |
| `succ soul` | Generate personalized soul.md |
| `succ config` | Interactive configuration |
| `succ stats` | Show token savings statistics |
| `succ checkpoint <action>` | Create, restore, or list checkpoints |
| `succ score` | Show AI-readiness score |
| `succ prd parse <file>` | Parse PRD markdown into tasks |
| `succ prd list` | List all PRDs |
| `succ prd status [id]` | Show PRD status and tasks |
| `succ prd archive [id]` | Archive a PRD |
| `succ prd export [id]` | Export PRD workflow to Obsidian (Mermaid diagrams) |
| `succ session trim` | Trim tool content from session (--tools, --only-inputs, --only-results) |
| `succ session trim-thinking` | Trim thinking blocks only |
| `succ session trim-all` | Trim all strippable content (tools, thinking, images) |
| `succ session compact` | Manual compact with dialogue summary and chain integrity |
| `succ clear` | Clear index and/or memories |
| `succ benchmark` | Run performance benchmarks |
| `succ migrate` | Migrate data between storage backends |

succ init

```sh
succ init                # Interactive mode
succ init --yes          # Non-interactive (defaults)
succ init --force        # Reinitialize existing project
```

Creates .succ/ structure, configures MCP server, sets up hooks.

succ analyze

```sh
succ analyze              # Run via Claude CLI (recommended)
succ analyze --fast       # Fast mode (fewer agents, smaller context)
succ analyze --force      # Force full re-analysis (skip incremental)
succ analyze --local      # Use local LLM (Ollama, LM Studio)
succ analyze --openrouter # Use OpenRouter API
succ analyze --background # Run in background
```

Generates brain vault structure:

```
.succ/brain/
├── CLAUDE.md              # Navigation hub
├── project/               # Project knowledge
│   ├── technical/         # Architecture, API, Conventions
│   ├── systems/           # Core systems/modules
│   ├── strategy/          # Project goals
│   └── features/          # Implemented features
├── knowledge/             # Research notes
└── archive/               # Old/superseded
```

succ watch

```sh
succ watch               # Start watch service (via daemon)
succ watch --ignore-code # Watch only docs
succ watch --status      # Check watch service status
succ watch --stop        # Stop watch service
```

succ daemon

```sh
succ daemon status       # Show daemon status
succ daemon sessions     # List active Claude Code sessions
succ daemon start        # Start daemon manually
succ daemon stop         # Stop daemon
succ daemon logs         # Show recent logs
```

succ prd

```sh
succ prd generate "Add JWT authentication"   # Generate PRD + parse tasks
succ prd run                                 # Execute sequentially (default)
succ prd run --mode team                     # Execute in parallel (git worktrees)
succ prd run --mode team --concurrency 5     # Parallel with 5 workers
succ prd run --resume                        # Resume interrupted run
succ prd run --dry-run                       # Preview execution plan
succ prd status                              # Show latest PRD status
succ prd list                                # List all PRDs
succ prd export                              # Export latest PRD to Obsidian
succ prd export --all                        # Export all PRDs
succ prd export prd_abc123                   # Export specific PRD
```

Team mode runs independent tasks in parallel using git worktrees for isolation. Each worker gets its own checkout; results merge via cherry-pick. Quality gates (typecheck, test, lint, build) run automatically after each task.

Export generates Obsidian-compatible markdown with Mermaid diagrams (Gantt timeline, dependency DAG), per-task detail pages with gate results, and wiki-links between pages. Output goes to .succ/brain/prd/.

Configuration

No API key required. Uses local embeddings by default.

```json
{
  "llm": {
    "embeddings": {
      "mode": "local",
      "model": "Xenova/all-MiniLM-L6-v2"
    }
  },
  "chunk_size": 500,
  "chunk_overlap": 50
}
```
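
chunk_size and chunk_overlap control how documents are windowed before embedding. As a rough sketch of what these two settings mean (succ's real indexer is AST-aware for code; this shows only plain sliding-window chunking, and measuring in characters is an assumption):

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size units that overlap by chunk_overlap."""
    if chunk_size <= chunk_overlap:
        raise ValueError("chunk_size must exceed chunk_overlap")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # last window already covers the tail
    return chunks
```

The overlap means each chunk repeats the tail of the previous one, so a sentence straddling a boundary is still embedded intact at least once.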
Embedding modes

Local (default):

```json
{
  "llm": { "embeddings": { "mode": "local" } }
}
```

Ollama (unified namespace):

```json
{
  "llm": {
    "embeddings": {
      "mode": "api",
      "model": "nomic-embed-text",
      "api_url": "http://localhost:11434/v1/embeddings"
    }
  }
}
```

OpenRouter:

```json
{
  "embedding_mode": "openrouter",
  "openrouter_api_key": "sk-or-..."
}
```

MRL dimension override (Matryoshka models):

```json
{
  "llm": {
    "embeddings": {
      "mode": "api",
      "model": "nomic-embed-text-v1.5",
      "api_url": "http://localhost:11434/v1/embeddings",
      "dimensions": 256
    }
  }
}
```
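
Matryoshka-trained models pack the most information into the leading dimensions, so truncating the vector and renormalizing yields a valid lower-dimensional embedding. A sketch of what a dimensions override like this amounts to client-side (assuming the server returns full-length vectors):

```python
import math

def truncate_mrl(embedding: list[float], dimensions: int) -> list[float]:
    """Keep the first `dimensions` components and re-normalize to unit length."""
    prefix = embedding[:dimensions]
    norm = math.sqrt(sum(x * x for x in prefix))
    if norm == 0:
        return prefix
    return [x / norm for x in prefix]
```

Renormalizing matters because cosine similarity assumes unit vectors; without it, truncated embeddings would be systematically shorter.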
GPU acceleration

succ uses native ONNX Runtime for embedding inference with automatic GPU detection:

| Platform | Backend | GPUs |
| --- | --- | --- |
| Windows | DirectML | AMD, Intel, NVIDIA |
| Linux | CUDA | NVIDIA |
| macOS | CoreML | Apple Silicon |
| Fallback | CPU | Any |

GPU is enabled by default. No manual configuration needed — the best available backend is auto-detected.

```json
{
  "gpu_enabled": true,
  "gpu_device": "directml"
}
```

Set gpu_device to override auto-detection: cuda, directml, coreml, or cpu.

Idle watcher
```json
{
  "idle_watcher": {
    "enabled": true,
    "idle_minutes": 2,
    "check_interval": 30,
    "min_conversation_length": 5
  }
}
```
Pre-commit review

Automatically run the succ-diff-reviewer agent before every git commit to catch security issues, bugs, and regressions:

```json
{
  "preCommitReview": true
}
```

When enabled, Claude will run a diff review before each commit. Critical findings block the commit; high findings trigger a warning.

Disabled by default. Set via succ_config(action="set", key="preCommitReview", value="true").

Autonomous mode (bypass security guards)

When running with --dangerously-skip-permissions, succ's security guards can block autonomous operations. Enable trustAgentPermissions to downgrade deny/ask to context warnings:

```json
{
  "security": {
    "trustAgentPermissions": true
  }
}
```

Injection detection stays ON (protects the agent). See Security Hardening for details.

Sleep agent

Offload heavy operations to local LLM:

```json
{
  "idle_reflection": {
    "sleep_agent": {
      "enabled": true,
      "mode": "local",
      "model": "qwen2.5-coder:14b",
      "api_url": "http://localhost:11434/v1"
    }
  }
}
```
Storage backends

succ supports multiple storage backends for different deployment scenarios:

| Setup | Use Case | Requirements |
| --- | --- | --- |
| SQLite + sqlite-vec | Local development (default) | None |
| PostgreSQL + pgvector | Production/cloud | PostgreSQL 15+ with pgvector |
| SQLite + Qdrant | Local + powerful vector search | Qdrant server |
| PostgreSQL + Qdrant | Full production scale | PostgreSQL + Qdrant |

Example: PostgreSQL + pgvector

```json
{
  "storage": {
    "backend": "postgresql",
    "postgresql": {
      "connection_string": "postgresql://user:pass@localhost:5432/succ"
    }
  }
}
```

Example: PostgreSQL + Qdrant

```json
{
  "storage": {
    "backend": "postgresql",
    "vector": "qdrant",
    "postgresql": { "connection_string": "postgresql://..." },
    "qdrant": { "url": "http://localhost:6333" }
  }
}
```

See Storage Configuration for all options.

LLM Backend Configuration

succ supports multiple LLM backends for operations like analyze, idle reflection, and skill suggestions:

```json
{
  "llm": {
    "type": "local",
    "model": "qwen2.5:7b",
    "local": {
      "endpoint": "http://localhost:11434/v1/chat/completions"
    },
    "openrouter": {
      "model": "anthropic/claude-3-haiku"
    }
  }
}
```
| Key | Values | Default | Description |
| --- | --- | --- | --- |
| llm.type | local / openrouter / claude | local | LLM provider |
| llm.model | string | per-type | Model name for the active type |
| llm.transport | process / ws / http | auto | How to talk to the backend |

Transport auto-selects based on type: claude uses process (or ws for persistent WebSocket), local/openrouter use http.

WebSocket transport (transport: "ws") keeps a persistent connection to Claude CLI, avoiding process spawn overhead on repeated calls:

```json
{
  "llm": {
    "type": "claude",
    "model": "sonnet",
    "transport": "ws"
  }
}
```

Per-backend model overrides for the fallback chain:

```json
{
  "llm": {
    "type": "claude",
    "model": "sonnet",
    "transport": "ws",
    "local": { "endpoint": "http://localhost:11434/v1/chat/completions", "model": "qwen2.5:7b" },
    "openrouter": { "model": "anthropic/claude-3-haiku" }
  }
}
```

Claude backend usage

The claude backend integrates with an existing, locally running Claude Code session and is intended only for in-session developer assistance by the same user, including tasks such as file analysis, documentation, indexing, and session summarization.

It is not supported for unattended background processing, cloud deployments, or multi-user scenarios. For automated, long-running, or cloud workloads, use the local or openrouter backends instead.

Retention policies
```json
{
  "retention": {
    "enabled": true,
    "decay_rate": 0.01,
    "access_weight": 0.1,
    "keep_threshold": 0.3,
    "delete_threshold": 0.15
  }
}
```
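
One plausible way these knobs combine, shown purely for intuition (the exact scoring formula is internal to succ; the exponential decay and logarithmic access bonus below are assumptions):

```python
import math

def retention_score(confidence: float, age_days: float, access_count: int,
                    decay_rate: float = 0.01, access_weight: float = 0.1) -> float:
    """Decay confidence over time, with a bonus for frequently recalled memories."""
    decayed = confidence * math.exp(-decay_rate * age_days)
    bonus = access_weight * math.log1p(access_count)
    return decayed + bonus

def retention_action(score: float, keep_threshold: float = 0.3,
                     delete_threshold: float = 0.15) -> str:
    """Map a score onto keep / review / delete using the two thresholds."""
    if score >= keep_threshold:
        return "keep"
    if score < delete_threshold:
        return "delete"
    return "review"
```

Under this sketch, a never-accessed memory with confidence 0.5 drifts below the delete threshold after roughly a year, while regular recalls keep its score topped up.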

Hybrid Search

Combines semantic embeddings with BM25 keyword search. Code search includes AST symbol boost, regex post-filtering, and symbol type filtering (function, method, class, interface, type_alias). Three output modes: full (code blocks), lean (file+lines), signatures (symbol names only).

| Aspect | Documents | Code |
| --- | --- | --- |
| Tokenizer | Markdown-aware + stemming | Naming convention splitter + AST symbol boost |
| Stemming | Yes | No |
| Stop words | Filtered | Kept |
| Segmentation | Standard | Ronin + BPE |
| Symbol metadata | N/A | function, class, interface names via tree-sitter |
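
The fusion step of a hybrid search like the one described above can be as simple as a weighted sum over normalized signals. A sketch with illustrative weights (not succ's actual defaults), assuming both inputs are already scaled to [0, 1]:

```python
def hybrid_score(semantic: float, bm25: float, symbol_match: bool,
                 semantic_weight: float = 0.7, symbol_boost: float = 0.1) -> float:
    """Blend a cosine similarity and a normalized BM25 score, then boost symbol hits."""
    base = semantic_weight * semantic + (1.0 - semantic_weight) * bm25
    # A query term matching a function/class name gets a flat additive boost.
    return base + (symbol_boost if symbol_match else 0.0)
```

In a real pipeline the blended list would then go through the cross-encoder reranker; this sketch only shows why a result can rank well on either signal alone.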

Code tokenizer handles all naming conventions:

| Convention | Example | Tokens |
| --- | --- | --- |
| camelCase | getUserName | get, user, name |
| PascalCase | UserService | user, service |
| snake_case | get_user_name | get, user, name |
| SCREAMING_SNAKE | MAX_RETRY_COUNT | max, retry, count |
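
The splits in this table can be reproduced with a short regex-based splitter. A sketch covering the four conventions (the real tokenizer additionally applies Ronin and BPE segmentation, which this omits):

```python
import re

def split_identifier(name: str) -> list[str]:
    """Split camelCase, PascalCase, snake_case, and SCREAMING_SNAKE into lowercase tokens."""
    tokens: list[str] = []
    # Break on underscores first, then on case and acronym boundaries within each part.
    for part in re.split(r"_+", name):
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z][a-z]*|[a-z]+|\d+", part))
    return [t.lower() for t in tokens if t]
```

The `[A-Z]+(?![a-z])` alternative keeps acronym runs together, so an identifier like HTTPServer splits into http, server rather than single letters.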

Memory System

Local memory — stored in .succ/succ.db, project-specific.

Global memory — stored in ~/.succ/global.db, shared across projects.

```sh
succ remember "User prefers TypeScript" --global
succ memories --global
```

Architecture

```
your-project/
├── .claude/
│   └── settings.json      # Claude Code hooks config
└── .succ/
    ├── brain/             # Obsidian-compatible vault
    ├── hooks/             # Hook scripts
    ├── config.json        # Project configuration
    ├── soul.md            # AI personality
    └── succ.db            # Vector database

~/.succ/
├── global.db              # Global memories
└── config.json            # Global configuration
```

Documentation

License

FSL-1.1-Apache-2.0 — Free to use, modify, self-host. Commercial cloud hosting restricted until Apache 2.0 date.