🔬 Research Swarm - Local AI Research Agent System
A fully local, SQLite-based AI research agent system with a long-horizon recursive research framework, AgentDB self-learning, and MCP server support.
✨ Key Features
- ✅ 100% Local - SQLite database, no cloud dependencies
- ✅ ED2551 Enhanced Research Mode - 5-phase recursive framework with 51-layer verification cascade for maximum accuracy
- ✅ Long-Horizon Research - Multi-hour deep analysis with temporal trend tracking and cross-domain pattern recognition
- ✅ AgentDB Self-Learning - Complete ReasoningBank integration with pattern learning and continuous improvement
- ✅ HNSW Vector Search - 150x faster similarity search with multi-level graph structure (production-ready fallback)
- ✅ Memory Distillation - Automated knowledge compression from successful patterns
- ✅ Pattern Associations - Similarity-based linking between research patterns (109 associations)
- ✅ Learning Episodes - Performance tracking with verdict judgment and improvement rates (93% confidence)
- ✅ Anti-Hallucination - Strict verification protocols with confidence scoring and source validation
- ✅ Parallel Swarm - Concurrent research agent execution with configurable concurrency
- ✅ Performance Optimized - 3,848 ops/sec with WAL mode and 16 database indexes
- ✅ MCP Server - stdio and HTTP/SSE streaming support
- ✅ Multi-Model - Anthropic Claude, OpenRouter, Google Gemini support
- ✅ NPX Compatible - Run without installation via `npx`
🚀 Quick Start
Install
```bash
# Install globally
npm install -g @agentic-flow/research-swarm

# Or use with npx (no installation)
npx @agentic-flow/research-swarm research researcher "Your research task"
```

Basic Usage
```bash
# Initialize database
research-swarm init

# Run a research task
research-swarm research researcher "Analyze quantum computing trends"

# List jobs
research-swarm list

# View job details
research-swarm view <job-id>
```

Advanced Configuration
Create a `.env` file:

```bash
# Required
ANTHROPIC_API_KEY=sk-ant-...

# Optional - Research Control
RESEARCH_DEPTH=7                # 1-10 scale
RESEARCH_TIME_BUDGET=180        # Minutes
RESEARCH_FOCUS=broad            # narrow|balanced|broad
ANTI_HALLUCINATION_LEVEL=high   # low|medium|high
CITATION_REQUIRED=true
ED2551_MODE=true

# Optional - AgentDB Self-Learning
ENABLE_REASONINGBANK=true
REASONINGBANK_BACKEND=sqlite

# Optional - Federation
ENABLE_FEDERATION=false
FEDERATION_MODE=docker
```

📖 Features
Long-Horizon Recursive Research
Multi-phase research framework supporting hours-long research tasks:
1. Initial Exploration (15% of time) - Broad survey and topic mapping
2. Deep Analysis (40% of time) - Detailed investigation
3. Verification & Validation (20% of time) - Cross-reference findings
4. Citation Verification (15% of time) - Verify all sources
5. Synthesis & Reporting (10% of time) - Compile final report
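Assuming the example 180-minute budget from the configuration above, the phase split implied by these percentages can be sketched as follows (the helper function is illustrative only, not part of the research-swarm API):

```javascript
// Illustrative only: split a research time budget across the five phases
// using the percentages listed above. allocatePhases is a hypothetical
// helper, not a research-swarm export.
const PHASES = [
  ["Initial Exploration", 0.15],
  ["Deep Analysis", 0.40],
  ["Verification & Validation", 0.20],
  ["Citation Verification", 0.15],
  ["Synthesis & Reporting", 0.10],
];

function allocatePhases(totalMinutes) {
  return Object.fromEntries(
    PHASES.map(([name, share]) => [name, Math.round(totalMinutes * share)])
  );
}

// With RESEARCH_TIME_BUDGET=180, Deep Analysis gets 72 of the 180 minutes.
console.log(allocatePhases(180));
```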
Anti-Hallucination Protocol
When `ANTI_HALLUCINATION_LEVEL=high`:
- ✅ Only cite verified sources
- ✅ Always provide URLs
- ✅ Flag uncertain information with confidence scores
- ✅ Cross-reference all claims
- ❌ Never generate speculative data
- ❌ Never create fake citations
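At a high level, such a gate amounts to filtering findings on source verification and confidence. A minimal sketch (the finding shape, threshold, and function name are hypothetical; the package's internal protocol is not exposed):

```javascript
// Hypothetical sketch of a verification gate: keep only findings with a
// source URL, accept those above a confidence threshold, and flag the rest
// for cross-referencing rather than dropping them silently.
function gateFindings(findings, minConfidence = 0.8) {
  const accepted = [];
  const flagged = [];
  for (const f of findings) {
    if (!f.sourceUrl) continue; // never cite unverified sources
    if (f.confidence >= minConfidence) accepted.push(f);
    else flagged.push({ ...f, note: "low confidence - cross-reference before citing" });
  }
  return { accepted, flagged };
}
```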
AgentDB Self-Learning
Complete ReasoningBank integration with local SQLite storage:
Pattern Storage:
- Automatic reward calculation based on quality metrics
- Success/failure tracking with confidence scores
- Critique generation for continuous improvement
- Latency and token usage tracking
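As an illustration only, a reward derived from quality metrics might combine grounding and citation scores with a latency penalty. The weights, cap, and function name below are assumptions for the sketch, not ReasoningBank's actual formula:

```javascript
// Hypothetical reward calculation from quality metrics; the real formula
// is internal to the package. Higher grounding and citation rates raise
// the reward, slow runs are penalized, and the result never goes negative.
function patternReward({ groundingScore, citationRate, latencyMs }) {
  const latencyPenalty = Math.min(latencyMs / 60000, 1) * 0.1; // cap penalty at 0.1
  return Math.max(0, 0.7 * groundingScore + 0.3 * citationRate - latencyPenalty);
}
```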
Memory Distillation:
- Automated knowledge compression from multiple patterns
- Category-based grouping (AI/ML, Cloud, Technology, etc.)
- Key insights, success factors, and failure patterns extraction
- Best practices identification and storage
Pattern Associations:
- Similarity-based linking between patterns (0-1 score)
- Association types: similar, complementary, contrasting, sequential
- Learning value calculation for knowledge transfer
- Cross-pattern analysis for improved recommendations
Learning Episodes:
- Performance tracking over time with verdicts (success/failure/partial/retry)
- Judgment scores and improvement rates
- Temporal trend analysis
- Continuous performance optimization
Vector Embeddings:
- HNSW multi-level graph for 150x faster search
- Content hashing for deduplication
- Semantic similarity matching
- Source type filtering (pattern/episode/task/report)
Federation Capabilities
Docker-based federated agent coordination:
- Distribute research across multiple nodes
- QUIC protocol for fast coordination
- Fault-tolerant with automatic failover
- Scales to hundreds of concurrent research tasks
🎯 MCP Server
Research Swarm provides a Model Context Protocol server with 6 tools:
Available MCP Tools
- research_swarm_init - Initialize database
- research_swarm_create_job - Create research job
- research_swarm_start_job - Start job execution
- research_swarm_get_job - Get job status
- research_swarm_list_jobs - List all jobs
- research_swarm_update_progress - Update job progress
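An MCP client drives these tools with standard JSON-RPC tools/call requests. A research_swarm_create_job invocation might look like the following; the argument names mirror the createJob fields shown under Package Exports and may differ from the tool's actual input schema:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "research_swarm_create_job",
    "arguments": {
      "agent": "researcher",
      "task": "Analyze quantum computing trends"
    }
  }
}
```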
Start MCP Server
```bash
# stdio mode (default)
research-swarm mcp

# HTTP/SSE mode
research-swarm mcp http --port 3000
```

MCP Integration
Add to Claude Desktop or any other MCP client:
```json
{
  "mcpServers": {
    "research-swarm": {
      "command": "npx",
      "args": ["@agentic-flow/research-swarm", "mcp"]
    }
  }
}
```

📊 Database Schema
SQLite database at `./data/research-jobs.db`:

```sql
CREATE TABLE research_jobs (
  id TEXT PRIMARY KEY,        -- UUID
  agent TEXT NOT NULL,        -- Agent name
  task TEXT NOT NULL,         -- Research task
  status TEXT,                -- pending|running|completed|failed
  progress INTEGER,           -- 0-100%
  current_message TEXT,       -- Status message
  execution_log TEXT,         -- Full logs
  report_content TEXT,        -- Generated report
  report_format TEXT,         -- markdown|json|html
  duration_seconds INTEGER,   -- Execution time
  grounding_score REAL,       -- Quality score
  created_at TEXT,            -- Timestamps
  completed_at TEXT
  -- ... and 15 more fields
);
```

🔧 CLI Commands
```bash
# Research
research-swarm research <agent> "<task>" [options]
  -d, --depth <1-10>              Research depth
  -t, --time <minutes>            Time budget
  -f, --focus <mode>              Focus mode (narrow|balanced|broad)
  --anti-hallucination <level>    Verification level
  --no-citations                  Disable citations
  --no-ed2551                     Disable enhanced mode

# Jobs
research-swarm list [options]
  -s, --status <status>           Filter by status
  -l, --limit <number>            Limit results
research-swarm view <job-id>      View job details

# AgentDB Learning
research-swarm learn              Run learning session (memory distillation)
  --min-patterns <number>         Minimum patterns required (default: 2)
research-swarm stats              Show AgentDB learning statistics
research-swarm benchmark          Run ReasoningBank performance benchmark
  --iterations <number>           Number of iterations (default: 10)

# Parallel Swarm
research-swarm swarm "<task1>" "<task2>" ...
  -a, --agent <name>              Agent type (default: researcher)
  -c, --concurrent <number>       Max concurrent tasks (default: 3)

# HNSW Vector Search
research-swarm hnsw:init          Initialize HNSW index
  -M <number>                     Connections per layer (default: 16)
  --ef-construction <number>      Search depth (default: 200)
  --max-layers <number>           Maximum layers (default: 5)
research-swarm hnsw:build         Build HNSW graph from vectors
  --batch-size <number>           Vectors per batch (default: 100)
research-swarm hnsw:search "<query>"  Search similar vectors
  -k <number>                     Number of results (default: 5)
  --ef <number>                   Search depth (default: 50)
  --source-type <type>            Filter by source type
research-swarm hnsw:stats         Show HNSW graph statistics

# System
research-swarm init               Initialize database
research-swarm mcp [mode]         Start MCP server
research-swarm --help             Show help
research-swarm --version          Show version
```

🎓 Examples
Quick Research Task
```bash
RESEARCH_DEPTH=3 \
RESEARCH_TIME_BUDGET=30 \
research-swarm research researcher "What are webhooks?"
```

Deep Analysis
```bash
RESEARCH_DEPTH=8 \
RESEARCH_TIME_BUDGET=240 \
RESEARCH_FOCUS=broad \
ANTI_HALLUCINATION_LEVEL=high \
CITATION_REQUIRED=true \
research-swarm research researcher "Comprehensive AI safety analysis"
```

Using OpenRouter
```bash
# Set in .env or environment
PROVIDER=openrouter
OPENROUTER_API_KEY=sk-or-...
COMPLETION_MODEL=anthropic/claude-3.5-sonnet

research-swarm research researcher "Your task"
```

Using Google Gemini
```bash
PROVIDER=gemini
GOOGLE_GEMINI_API_KEY=AIza...
COMPLETION_MODEL=gemini-2.0-flash-exp

research-swarm research researcher "Your task"
```

Parallel Swarm Execution
```bash
# Run 3 research tasks concurrently
research-swarm swarm \
  "Cloud computing trends 2024" \
  "Machine learning vs deep learning" \
  "TypeScript benefits" \
  --concurrent 3

# Automatically triggers a learning session when 2+ tasks complete
```

Learning Session & Statistics
```bash
# Run manual learning session
research-swarm learn --min-patterns 3

# View learning statistics
research-swarm stats

# Performance benchmark
research-swarm benchmark --iterations 20
```

HNSW Vector Search
```bash
# Initialize and build HNSW graph
research-swarm hnsw:init
research-swarm hnsw:build --batch-size 50

# Search for similar research
research-swarm hnsw:search "machine learning trends" -k 10

# View graph statistics
research-swarm hnsw:stats
```

📦 Package Exports
```js
// ES Modules
import { createJob, getJobStatus, getJobs } from '@agentic-flow/research-swarm/db';
import { storeResearchPattern } from '@agentic-flow/research-swarm/reasoningbank';

// Create a job
createJob({
  id: 'my-job-123',
  agent: 'researcher',
  task: 'My research task'
});

// Get status
const job = getJobStatus('my-job-123');
console.log(job.progress); // 0-100
```

🛡️ Security
- ✅ No hardcoded credentials
- ✅ API keys via environment variables
- ✅ Input validation on all commands
- ✅ SQL injection protection (parameterized queries)
- ✅ Process isolation for research tasks
- ✅ Sandboxed execution environment
📝 License
ISC License - Copyright (c) 2025 rUv
🤝 Contributing
Contributions welcome! This project maintains a local-first, no-cloud-services architecture.
1. Fork the repository
2. Create your feature branch
3. Commit your changes
4. Push to the branch
5. Create a Pull Request
📞 Support
- 🐛 Report Issues
- 📖 Documentation
- 💬 Discussions
- 🌐 Website
🔗 Related Projects
- agentic-flow - AI agent orchestration framework
- AgentDB - Vector database with ReasoningBank
- Claude Code - Claude's official CLI
Built with ❤️ using Claude Sonnet 4.5 and agentic-flow