🔬 Research Swarm - Local AI Research Agent System
A fully local, SQLite-based AI research agent system with a long-horizon recursive research framework, AgentDB self-learning, and MCP server support.
✨ Key Features
- ✅ 100% Local - SQLite database, no cloud dependencies
- ✅ ED2551 Enhanced Research Mode - 5-phase recursive framework with 51-layer verification cascade for maximum accuracy
- ✅ Long-Horizon Research - Multi-hour deep analysis with temporal trend tracking and cross-domain pattern recognition
- ✅ AgentDB Self-Learning - Complete ReasoningBank integration with pattern learning and continuous improvement
- ✅ HNSW Vector Search - 150x faster similarity search with multi-level graph structure (production-ready fallback)
- ✅ Memory Distillation - Automated knowledge compression from successful patterns
- ✅ Pattern Associations - Similarity-based linking between research patterns (109 associations)
- ✅ Learning Episodes - Performance tracking with verdict judgment and improvement rates (93% confidence)
- ✅ Anti-Hallucination - Strict verification protocols with confidence scoring and source validation
- ✅ Parallel Swarm - Concurrent research agent execution with configurable concurrency
- ✅ Performance Optimized - 3,848 ops/sec with WAL mode and 16 database indexes
- ✅ MCP Server - stdio and HTTP/SSE streaming support
- ✅ Multi-Model - Anthropic Claude, OpenRouter, Google Gemini support
- ✅ NPX Compatible - Run without installation via `npx`
🚀 Quick Start
Install
```bash
# Install globally
npm install -g @agentic-flow/research-swarm

# Or use with npx (no installation)
npx @agentic-flow/research-swarm research researcher "Your research task"
```

Basic Usage
```bash
# Initialize database
research-swarm init

# Run a research task
research-swarm research researcher "Analyze quantum computing trends"

# List jobs
research-swarm list

# View job details
research-swarm view <job-id>
```

Advanced Configuration
Create a `.env` file:

```bash
# Required
ANTHROPIC_API_KEY=sk-ant-...

# Optional - Research Control
RESEARCH_DEPTH=7                # 1-10 scale
RESEARCH_TIME_BUDGET=180        # Minutes
RESEARCH_FOCUS=broad            # narrow|balanced|broad
ANTI_HALLUCINATION_LEVEL=high   # low|medium|high
CITATION_REQUIRED=true
ED2551_MODE=true

# Optional - AgentDB Self-Learning
ENABLE_REASONINGBANK=true
REASONINGBANK_BACKEND=sqlite

# Optional - Federation
ENABLE_FEDERATION=false
FEDERATION_MODE=docker
```

📖 Features
Long-Horizon Recursive Research
Multi-phase research framework supporting hours-long research tasks:
1. Initial Exploration (15% of time) - Broad survey and topic mapping
2. Deep Analysis (40% of time) - Detailed investigation
3. Verification & Validation (20% of time) - Cross-reference findings
4. Citation Verification (15% of time) - Verify all sources
5. Synthesis & Reporting (10% of time) - Compile final report
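Assuming the example 180-minute budget from the configuration above, the phase split implied by these percentages can be sketched as follows (the helper function is illustrative only, not part of the research-swarm API):

```javascript
// Illustrative only: split a research time budget across the five phases
// using the percentages listed above. allocatePhases is a hypothetical
// helper, not a research-swarm export.
const PHASES = [
  ["Initial Exploration", 0.15],
  ["Deep Analysis", 0.40],
  ["Verification & Validation", 0.20],
  ["Citation Verification", 0.15],
  ["Synthesis & Reporting", 0.10],
];

function allocatePhases(totalMinutes) {
  return Object.fromEntries(
    PHASES.map(([name, share]) => [name, Math.round(totalMinutes * share)])
  );
}

// With RESEARCH_TIME_BUDGET=180, Deep Analysis gets 72 of the 180 minutes.
console.log(allocatePhases(180));
```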
Anti-Hallucination Protocol
When `ANTI_HALLUCINATION_LEVEL=high`:
- ✅ Only cite verified sources
- ✅ Always provide URLs
- ✅ Flag uncertain information with confidence scores
- ✅ Cross-reference all claims
- ❌ Never generate speculative data
- ❌ Never create fake citations
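At a high level, such a gate amounts to filtering findings on source verification and confidence. A minimal sketch (the finding shape, threshold, and function name are hypothetical; the package's internal protocol is not exposed):

```javascript
// Hypothetical sketch of a verification gate: keep only findings with a
// source URL, accept those above a confidence threshold, and flag the rest
// for cross-referencing rather than dropping them silently.
function gateFindings(findings, minConfidence = 0.8) {
  const accepted = [];
  const flagged = [];
  for (const f of findings) {
    if (!f.sourceUrl) continue; // never cite unverified sources
    if (f.confidence >= minConfidence) accepted.push(f);
    else flagged.push({ ...f, note: "low confidence - cross-reference before citing" });
  }
  return { accepted, flagged };
}
```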
AgentDB Self-Learning
Complete ReasoningBank integration with local SQLite storage:
Pattern Storage:
- Automatic reward calculation based on quality metrics
- Success/failure tracking with confidence scores
- Critique generation for continuous improvement
- Latency and token usage tracking
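As an illustration only, a reward derived from quality metrics might combine grounding and citation scores with a latency penalty. The weights, cap, and function name below are assumptions for the sketch, not ReasoningBank's actual formula:

```javascript
// Hypothetical reward calculation from quality metrics; the real formula
// is internal to the package. Higher grounding and citation rates raise
// the reward, slow runs are penalized, and the result never goes negative.
function patternReward({ groundingScore, citationRate, latencyMs }) {
  const latencyPenalty = Math.min(latencyMs / 60000, 1) * 0.1; // cap penalty at 0.1
  return Math.max(0, 0.7 * groundingScore + 0.3 * citationRate - latencyPenalty);
}
```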
Memory Distillation:
- Automated knowledge compression from multiple patterns
- Category-based grouping (AI/ML, Cloud, Technology, etc.)
- Key insights, success factors, and failure patterns extraction
- Best practices identification and storage
Pattern Associations:
- Similarity-based linking between patterns (0-1 score)
- Association types: similar, complementary, contrasting, sequential
- Learning value calculation for knowledge transfer
- Cross-pattern analysis for improved recommendations
Learning Episodes:
- Performance tracking over time with verdicts (success/failure/partial/retry)
- Judgment scores and improvement rates
- Temporal trend analysis
- Continuous performance optimization
Vector Embeddings:
- HNSW multi-level graph for 150x faster search
- Content hashing for deduplication
- Semantic similarity matching
- Source type filtering (pattern/episode/task/report)
Federation Capabilities
Docker-based federated agent coordination:
- Distribute research across multiple nodes
- QUIC protocol for fast coordination
- Fault-tolerant with automatic failover
- Scales to hundreds of concurrent research tasks
🎯 MCP Server
Research Swarm provides a Model Context Protocol server with 6 tools:
Available MCP Tools
- research_swarm_init - Initialize database
- research_swarm_create_job - Create research job
- research_swarm_start_job - Start job execution
- research_swarm_get_job - Get job status
- research_swarm_list_jobs - List all jobs
- research_swarm_update_progress - Update job progress
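An MCP client drives these tools with standard JSON-RPC tools/call requests. A research_swarm_create_job invocation might look like the following; the argument names mirror the createJob fields shown under Package Exports and may differ from the tool's actual input schema:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "research_swarm_create_job",
    "arguments": {
      "agent": "researcher",
      "task": "Analyze quantum computing trends"
    }
  }
}
```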
Start MCP Server
```bash
# stdio mode (default)
research-swarm mcp

# HTTP/SSE mode
research-swarm mcp http --port 3000
```

MCP Integration
Add to Claude Desktop or any other MCP client:
```json
{
  "mcpServers": {
    "research-swarm": {
      "command": "npx",
      "args": ["@agentic-flow/research-swarm", "mcp"]
    }
  }
}
```

📊 Database Schema
SQLite database at `./data/research-jobs.db`:

```sql
CREATE TABLE research_jobs (
  id TEXT PRIMARY KEY,        -- UUID
  agent TEXT NOT NULL,        -- Agent name
  task TEXT NOT NULL,         -- Research task
  status TEXT,                -- pending|running|completed|failed
  progress INTEGER,           -- 0-100%
  current_message TEXT,       -- Status message
  execution_log TEXT,         -- Full logs
  report_content TEXT,        -- Generated report
  report_format TEXT,         -- markdown|json|html
  duration_seconds INTEGER,   -- Execution time
  grounding_score REAL,       -- Quality score
  created_at TEXT,            -- Timestamps
  completed_at TEXT
  -- ... and 15 more fields
);
```

🔧 CLI Commands
```bash
# Research
research-swarm research <agent> "<task>" [options]
  -d, --depth <1-10>              Research depth
  -t, --time <minutes>            Time budget
  -f, --focus <mode>              Focus mode (narrow|balanced|broad)
  --anti-hallucination <level>    Verification level
  --no-citations                  Disable citations
  --no-ed2551                     Disable enhanced mode

# Jobs
research-swarm list [options]
  -s, --status <status>           Filter by status
  -l, --limit <number>            Limit results
research-swarm view <job-id>      View job details

# AgentDB Learning
research-swarm learn              Run learning session (memory distillation)
  --min-patterns <number>         Minimum patterns required (default: 2)
research-swarm stats              Show AgentDB learning statistics
research-swarm benchmark          Run ReasoningBank performance benchmark
  --iterations <number>           Number of iterations (default: 10)

# Parallel Swarm
research-swarm swarm "<task1>" "<task2>" ...
  -a, --agent <name>              Agent type (default: researcher)
  -c, --concurrent <number>       Max concurrent tasks (default: 3)

# HNSW Vector Search
research-swarm hnsw:init          Initialize HNSW index
  -M <number>                     Connections per layer (default: 16)
  --ef-construction <number>      Search depth (default: 200)
  --max-layers <number>           Maximum layers (default: 5)
research-swarm hnsw:build         Build HNSW graph from vectors
  --batch-size <number>           Vectors per batch (default: 100)
research-swarm hnsw:search "<query>"  Search similar vectors
  -k <number>                     Number of results (default: 5)
  --ef <number>                   Search depth (default: 50)
  --source-type <type>            Filter by source type
research-swarm hnsw:stats         Show HNSW graph statistics

# System
research-swarm init               Initialize database
research-swarm mcp [mode]         Start MCP server
research-swarm --help             Show help
research-swarm --version          Show version
```

🎓 Examples
Quick Research Task
```bash
RESEARCH_DEPTH=3 \
RESEARCH_TIME_BUDGET=30 \
research-swarm research researcher "What are webhooks?"
```

Deep Analysis
```bash
RESEARCH_DEPTH=8 \
RESEARCH_TIME_BUDGET=240 \
RESEARCH_FOCUS=broad \
ANTI_HALLUCINATION_LEVEL=high \
CITATION_REQUIRED=true \
research-swarm research researcher "Comprehensive AI safety analysis"
```

Using OpenRouter
```bash
# Set in .env or environment
PROVIDER=openrouter
OPENROUTER_API_KEY=sk-or-...
COMPLETION_MODEL=anthropic/claude-3.5-sonnet

research-swarm research researcher "Your task"
```

Using Google Gemini
```bash
PROVIDER=gemini
GOOGLE_GEMINI_API_KEY=AIza...
COMPLETION_MODEL=gemini-2.0-flash-exp

research-swarm research researcher "Your task"
```

Parallel Swarm Execution
```bash
# Run 3 research tasks concurrently
research-swarm swarm \
  "Cloud computing trends 2024" \
  "Machine learning vs deep learning" \
  "TypeScript benefits" \
  --concurrent 3

# Automatically triggers a learning session when 2+ tasks complete
```

Learning Session & Statistics
```bash
# Run manual learning session
research-swarm learn --min-patterns 3

# View learning statistics
research-swarm stats

# Performance benchmark
research-swarm benchmark --iterations 20
```

HNSW Vector Search
```bash
# Initialize and build HNSW graph
research-swarm hnsw:init
research-swarm hnsw:build --batch-size 50

# Search for similar research
research-swarm hnsw:search "machine learning trends" -k 10

# View graph statistics
research-swarm hnsw:stats
```

📦 Package Exports
```js
// ES Modules
import { createJob, getJobStatus, getJobs } from '@agentic-flow/research-swarm/db';
import { storeResearchPattern } from '@agentic-flow/research-swarm/reasoningbank';

// Create a job
createJob({
  id: 'my-job-123',
  agent: 'researcher',
  task: 'My research task'
});

// Get status
const job = getJobStatus('my-job-123');
console.log(job.progress); // 0-100
```

🛡️ Security
- ✅ No hardcoded credentials
- ✅ API keys via environment variables
- ✅ Input validation on all commands
- ✅ SQL injection protection (parameterized queries)
- ✅ Process isolation for research tasks
- ✅ Sandboxed execution environment
📝 License
ISC License - Copyright (c) 2025 rUv
🤝 Contributing
Contributions welcome! This project maintains a local-first, no-cloud-services architecture.
1. Fork the repository
2. Create your feature branch
3. Commit your changes
4. Push to the branch
5. Create a Pull Request
📞 Support
- 🐛 Report Issues
- 📖 Documentation
- 💬 Discussions
- 🌐 Website
🔗 Related Projects
- agentic-flow - AI agent orchestration framework
- AgentDB - Vector database with ReasoningBank
- Claude Code - Claude's official CLI
Built with ❤️ using Claude Sonnet 4.5 and agentic-flow