ThoughtLayer — Memory Infrastructure for AI Agents
Remember everything. Retrieve what matters.
ThoughtLayer is a local-first, cloud-optional memory layer for AI agents. It replaces naive "dump everything into LLM context" approaches with a proper retrieval pipeline: vector search + keyword search + freshness decay + importance scoring.
Why ThoughtLayer?
- Every AI agent has amnesia. Context windows are finite. Sessions end. Knowledge vanishes.
- Existing solutions are cloud-locked, expensive, or don't actually search.
- ThoughtLayer gives you real retrieval at constant cost, regardless of corpus size.
Quick Start
```sh
# Install
npm install -g thoughtlayer

# Initialise in your project
cd your-project
thoughtlayer init

# Ingest your docs
thoughtlayer ingest ./docs

# Query (works immediately — no API keys needed)
thoughtlayer query "what database are we using"

# Check what's indexed
thoughtlayer status
thoughtlayer health
```

That's it. No API keys, no external services, no Ollama. ThoughtLayer works out of the box with a hybrid keyword search engine that hits 92.5% Recall@1.
Want better semantic search? Add an embedding provider:
```sh
# Option A: Local embeddings (free, private)
ollama pull nomic-embed-text
thoughtlayer init --embedding-provider ollama

# Option B: Cloud embeddings ($0.02/1M tokens)
export OPENAI_API_KEY=sk-...
thoughtlayer init --embedding-provider openai
```

Want LLM-powered knowledge extraction from raw text?

```sh
export ANTHROPIC_API_KEY=sk-ant-...
thoughtlayer curate "We decided to use PostgreSQL because of pgvector support."
```

How It Works
The Retrieval Pipeline (The Moat)
Most tools dump ALL files into LLM context. At 50 files, that's 10K+ tokens per query. At 200 files, it breaks.
ThoughtLayer finds the 5 most relevant entries out of 10,000 in <5ms (local) or ~230ms (with embeddings):
```
Query → ┬→ Keyword Search (FTS5 + BM25, stopword-filtered OR queries)
        ├→ Vector Search (cosine similarity — optional, if embeddings configured)
        ├→ Query Term Overlap (lightweight cross-domain signal)
        └→ Metadata Filter (domain, tags, importance)
                 ↓
        Reciprocal Rank Fusion (combines rankings, multi-list bonus)
                 ↓
        Freshness Decay + Importance Weighting
                 ↓
        Top-K Results (scored, with source breakdown)
```

Works without any API keys. The keyword engine alone hits 92.5% Recall@1 on our benchmark. Add embeddings for the last few percent.
Cost comparison:
| Approach | 10 files | 100 files | 1,000 files | 10,000 files |
|---|---|---|---|---|
| Dump all (naive) | ~$0.004 | ~$0.04 | ~$0.40 | Breaks |
| ThoughtLayer (retrieve) | ~$0.002 | ~$0.002 | ~$0.002 | ~$0.002 |
Storage
Everything is stored locally by default:
```
your-project/
└── .thoughtlayer/
    ├── config.json              # Project config
    ├── knowledge/               # Markdown files (human-readable source of truth)
    │   ├── authentication/
    │   │   └── jwt_strategy.md
    │   └── database/
    │       └── postgresql_choice.md
    └── index/
        └── metadata.db          # SQLite (FTS5 + metadata + embeddings)
```

- Markdown files are the source of truth. Human-readable, git-friendly, with YAML frontmatter.
- SQLite handles search indexes (FTS5 for keywords, embeddings for vectors).
- No external database required. Everything runs locally.
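Because entries are plain Markdown with YAML frontmatter, they can be read without ThoughtLayer at all. A minimal sketch of splitting the header from the body (a hypothetical helper, not part of the SDK; a real parser would use a YAML library):

```typescript
// Split a knowledge file into its frontmatter fields and Markdown body.
// Handles only flat `key: value` lines; values keep their own colons
// (e.g. ISO timestamps), and inline `#` comments are stripped.
function splitFrontmatter(raw: string): {
  meta: Record<string, string>;
  body: string;
} {
  const match = raw.match(/^---\n([\s\S]*?)\n---\n?/);
  if (!match) return { meta: {}, body: raw };
  const meta: Record<string, string> = {};
  for (const line of match[1].split("\n")) {
    const i = line.indexOf(":");
    if (i === -1) continue;
    meta[line.slice(0, i).trim()] = line.slice(i + 1).split("#")[0].trim();
  }
  return { meta, body: raw.slice(match[0].length) };
}

const parsed = splitFrontmatter(
  '---\ntitle: "JWT Refresh Token Strategy"\nimportance: 0.8\n---\n# JWT Refresh Token Strategy\n',
);
// parsed.meta.importance === "0.8"; parsed.body starts with "# JWT"
```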
Knowledge Entries
Each entry has:
```md
---
id: 019ce0d7-fd11-7158-b2d7-913c150d8828
title: "JWT Refresh Token Strategy"
domain: authentication
topic: jwt
importance: 0.8   # 0.0 (trivia) to 1.0 (critical)
confidence: 0.9   # 0.0 (speculation) to 1.0 (verified)
tags: ["security", "auth"]
keywords: ["jwt", "refresh", "token"]
status: active
version: 1
freshness_at: 2026-03-12T07:00:00.000Z
---

# JWT Refresh Token Strategy

Refresh tokens expire after 7 days. Use rotating refresh tokens for security.

## Facts

- Refresh token TTL is 7 days
- Rotating refresh tokens invalidate previous tokens on use
```

CLI Reference
| Command | Description |
|---|---|
| `thoughtlayer init` | Initialise a new project |
| `thoughtlayer ingest <dir>` | Scan and ingest files (dedup, change detection) |
| `thoughtlayer ingest <dir> --watch` | Watch directory for changes |
| `thoughtlayer query <query>` | Hybrid search (keyword + vector if configured) |
| `thoughtlayer search <term>` | Keyword-only search (FTS5) |
| `thoughtlayer add <content>` | Add a manual entry |
| `thoughtlayer curate <text>` | LLM-powered knowledge extraction |
| `thoughtlayer list` | List entries with filters |
| `thoughtlayer status` | Ingestion status, tracked files |
| `thoughtlayer health` | Knowledge health metrics |
Common Options
```sh
thoughtlayer query "auth flow" --top-k 3      # Limit results
thoughtlayer query "auth flow" --domain auth  # Filter by domain
thoughtlayer query "auth flow" --json         # JSON output
thoughtlayer list --domain health --limit 10  # List with filters
thoughtlayer add --domain ops --title "Deploy Process" --importance 0.9 --tags "devops,ci" "Content here"
```

TypeScript SDK
```ts
import { ThoughtLayer } from 'thoughtlayer';

const memory = new ThoughtLayer({
  projectRoot: '/path/to/project',
  embedding: {
    provider: 'openai',
    apiKey: process.env.OPENAI_API_KEY!,
  },
  curate: {
    provider: 'anthropic',
    apiKey: process.env.ANTHROPIC_API_KEY!,
  },
});

// Add knowledge
await memory.add({
  domain: 'architecture',
  title: 'Database Choice',
  content: 'Using PostgreSQL with pgvector for embeddings.',
  importance: 0.8,
  tags: ['database', 'architecture'],
});

// Curate from text (LLM extracts structured knowledge)
const { entries } = await memory.curate(
  'We switched from REST to GraphQL for the mobile API because of bandwidth constraints.'
);

// Query (vector + keyword + freshness)
const results = await memory.query('what API do we use for mobile');
for (const r of results) {
  console.log(`${r.entry.title} (score: ${r.score})`);
  console.log(`  ${r.entry.content.slice(0, 100)}...`);
}

// Keyword search (no embeddings needed)
const ftsResults = memory.search('graphql mobile');

// Health check
const health = memory.health();
// { total: 42, active: 40, stale: 2, domains: { api: 5, auth: 8, ... } }
```

Configuration
Initialisation
```sh
thoughtlayer init                                                      # Defaults (OpenAI embeddings)
thoughtlayer init --embedding-provider openai                          # Explicit provider
thoughtlayer init --curate-provider anthropic                          # Claude for curate
thoughtlayer init --curate-provider openai --curate-model gpt-4o-mini  # GPT for curate
```

Environment Variables
| Variable | Required | Description |
|---|---|---|
| `OPENAI_API_KEY` | For embeddings | OpenAI API key (text-embedding-3-small) |
| `ANTHROPIC_API_KEY` | For curate (if using Claude) | Anthropic API key |
Config File (.thoughtlayer/config.json)
```json
{
  "version": 1,
  "embedding": {
    "provider": "openai"
  },
  "curate": {
    "provider": "anthropic",
    "model": "claude-sonnet-4-20250514"
  }
}
```

API keys are loaded from environment variables, never stored in config.
BYOLLM (Bring Your Own LLM)
ThoughtLayer never locks you into a specific provider. The curate operation (LLM-powered knowledge extraction) works with:
| Provider | Config | Notes |
|---|---|---|
| Anthropic | `provider: "anthropic"` | Claude Sonnet default, best quality |
| OpenAI | `provider: "openai"` | GPT-4o-mini default, cheapest |
| OpenRouter | `provider: "openrouter"` | Any model via OpenRouter |
| AWS Bedrock | Coming in Phase 2 | For AWS-native deployments |
| Local (Ollama) | Coming in Phase 1 | For fully offline curate |
Embeddings currently use OpenAI text-embedding-3-small ($0.02/1M tokens). Local embedding support (Nomic) coming in Phase 1.
Scoring & Retrieval
Results are scored using a weighted combination:
| Signal | Weight | Description |
|---|---|---|
| Vector similarity | 0.35 | Cosine similarity between query and entry embeddings |
| Keyword match (BM25) | 0.35 | FTS5 full-text search ranking |
| Freshness | 0.10 | Exponential decay (half-life: 30 days) |
| Importance | 0.20 | Entry importance score (0.0-1.0) |
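As a worked sketch of how these weights combine (hypothetical helper names; the inputs are assumed to already be normalised to [0, 1] before weighting):

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;
const HALF_LIFE_DAYS = 30; // half-life from the freshness row above

// Exponential decay: an entry loses half its freshness every 30 days.
function freshness(freshnessAt: Date, now: Date): number {
  const ageDays = Math.max(0, (now.getTime() - freshnessAt.getTime()) / DAY_MS);
  return Math.pow(0.5, ageDays / HALF_LIFE_DAYS);
}

function combinedScore(
  s: { vector: number; fts: number; freshnessAt: Date; importance: number },
  now: Date,
): number {
  return (
    0.35 * s.vector +                      // cosine similarity
    0.35 * s.fts +                         // BM25 keyword match
    0.10 * freshness(s.freshnessAt, now) + // freshness decay
    0.20 * s.importance                    // entry importance (0.0-1.0)
  );
}
```

Since the weights sum to 1.0, a perfect match on every signal scores exactly 1.0, and an entry exactly 30 days old contributes half of the 0.10 freshness weight (0.05) to its final score.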
Weights are configurable per query:
```ts
const results = await memory.query('auth flow', {
  weights: { vector: 0.5, fts: 0.2, freshness: 0.1, importance: 0.2 },
  topK: 5,
  domain: 'authentication',
});
```

Architecture
```
┌──────────────────────────────────────────────────────┐
│                     Interfaces                       │
│   CLI   MCP Server   REST API   TS SDK   Py SDK      │
├──────────────────────────────────────────────────────┤
│                    Core Engine                       │
│ ┌───────────┐ ┌──────────────┐ ┌─────────────────┐   │
│ │  Ingest   │ │   Retrieve   │ │    Lifecycle    │   │
│ │  Engine   │ │   Pipeline   │ │     Engine      │   │
│ └─────┬─────┘ └──────┬───────┘ └────────┬────────┘   │
├───────┼──────────────┼──────────────────┼────────────┤
│       │        Storage Layer            │            │
│ ┌─────▼─────┐ ┌──────▼───────┐ ┌────────▼────────┐   │
│ │ Knowledge │ │  Embedding   │ │    Metadata     │   │
│ │   Store   │ │    Index     │ │     Index       │   │
│ │(Markdown) │ │  (vectors)   │ │  (SQLite/FTS5)  │   │
│ └───────────┘ └──────────────┘ └─────────────────┘   │
├──────────────────────────────────────────────────────┤
│                Sync Layer (Optional)                 │
│   Local ←→ Supabase/S3 ←→ Other Devices/Team         │
└──────────────────────────────────────────────────────┘
```

Development
```sh
# Clone
git clone https://github.com/prasants/thoughtlayer.git
cd thoughtlayer

# Install
npm install --include=dev

# Build
npx tsc

# Test
npx vitest run

# Run retrieval quality tests
npx tsx scripts/test-retrieval.ts
```

Project Structure
```
thoughtlayer/
├── src/
│   ├── index.ts            # Public exports
│   ├── thoughtlayer.ts     # High-level API
│   ├── cli/
│   │   └── index.ts        # CLI (commander)
│   ├── storage/
│   │   ├── database.ts     # SQLite + FTS5
│   │   └── schema.ts       # Table definitions
│   ├── ingest/
│   │   └── curate.ts       # LLM knowledge extraction
│   └── retrieve/
│       ├── pipeline.ts     # Retrieval pipeline (the moat)
│       ├── vector.ts       # Cosine similarity search
│       └── embeddings.ts   # Embedding providers
├── tests/
│   ├── storage.test.ts     # 9 storage tests
│   └── vector.test.ts      # 4 vector tests
├── scripts/
│   ├── seed-knowledge.ts   # Seed example knowledge
│   └── test-retrieval.ts   # 27 retrieval quality tests (96.3% pass rate)
├── package.json
└── tsconfig.json
```

Roadmap
- Core engine — storage, retrieval pipeline, CLI, MCP server
- npm package, GitHub launch
- File ingestion with dedup, change detection, watch mode
- Local embedding support (Ollama/Nomic) + auto-detect
- Docs site (thoughtlayer.sh/docs)
- 92.5% Recall@1 without embeddings, 96.5% MRR with embeddings
- Cloud sync (Supabase), web dashboard, billing
- Enterprise (SSO, audit log, self-hosted Docker/Helm)
Open Core
The core engine is MIT-licensed and free forever. Run it locally, embed it in your agents, ship it in your products.
ThoughtLayer Cloud (coming soon) adds team features: shared knowledge bases, hosted embeddings, dashboard, analytics, and managed sync. Details at thoughtlayer.sh.
License
MIT. See LICENSE.
Built by Prasant Sudhakaran.