AI-embed-search: Lightweight AI Semantic Search Engine
Smart. Simple. Local.
AI-powered semantic search in TypeScript using transformer embeddings. No cloud, no API keys, 100% offline.
Features
- AI-powered semantic understanding
- Super simple API: init, embed, search, clear
- Fast cosine similarity-based retrieval
- In-memory vector store (no DB required)
- Save/load vectors to JSON file
- Search filters, caching, batch embed & probabilistic softmax ranking
- CLI-ready architecture
- Fully offline via @xenova/transformers (WASM/Node)
Installation
```bash
npm install ai-embed-search
```
or
```bash
yarn add ai-embed-search
```
Requires Node.js ≥ 18 or a modern browser for WASM.
Quick Start
```ts
import { embed, search, createEmbedder, initEmbedder } from 'ai-embed-search';

const embedder = await createEmbedder();
await initEmbedder({ embedder });

await embed([
  { id: '1', text: 'iPhone 15 Pro Max' },
  { id: '2', text: 'Samsung Galaxy S24 Ultra' },
  { id: '3', text: 'Apple MacBook Pro' }
]);

const results = await search('apple phone', 2).exec();
console.log(results);
```
Result:
```js
[
  { id: '1', text: 'iPhone 15 Pro Max', score: 0.95 },
  { id: '3', text: 'Apple MacBook Pro', score: 0.85 }
]
```
1. Initialize the Embedding Model
```ts
import { createEmbedder, initEmbedder } from 'ai-embed-search';

const embedder = await createEmbedder();
await initEmbedder({ embedder });
```
Loads the MiniLM model via @xenova/transformers. Required once at startup.
2. Add Items to the Vector Store
```ts
import { embed } from 'ai-embed-search';

await embed([
  { id: 'a1', text: 'Tesla Model S' },
  { id: 'a2', text: 'Electric Vehicle by Tesla' }
]);
```
Embeds and stores vector representations of the given items.
3. Perform Semantic Search
```ts
import { search } from 'ai-embed-search';

const results = await search('fast electric car', 3).exec();
```
Returns:
```js
[
  { id: 'a1', text: 'Tesla Model S', score: 0.95 },
  { id: 'a2', text: 'Electric Vehicle by Tesla', score: 0.85 }
]
```
4. Search with Metadata
You can attach metadata to each item when embedding it, then filter on that metadata at search time.
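For example, a minimal sketch of storing items with a meta field (an assumption: embed is taken to pass extra fields through to the stored items, as the r.meta filter below implies):
```ts
import { embed } from 'ai-embed-search';

// Assumption: 'meta' is carried along with each stored item.
await embed([
  { id: 'p1', text: 'MacBook Air M3', meta: { type: 'laptop' } },
  { id: 'p2', text: 'iPhone 15 Pro', meta: { type: 'phone' } }
]);
```
Filtering on that metadata at search time: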
```ts
const laptops = await search('computer', 5)
  .filter(r => r.meta?.type === 'laptop')
  .exec();
```
5. Search with Cached Embeddings (Advanced)
You can store precomputed embeddings in your own DB or file:
```ts
const precomputed = {
  id: 'x1',
  text: 'Apple Watch Series 9',
  vector: [0.11, 0.32, ...] // 384-dim array
};
```
Then use cosine similarity to search across them, or build your own vector store using ai-embed-search functions.
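As a sketch of that approach (plain TypeScript helpers, not part of the ai-embed-search API), cosine-similarity search over precomputed vectors looks like this:
```ts
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored items against a query vector, best match first.
function rank(queryVec: number[], items: { id: string; text: string; vector: number[] }[]) {
  return items
    .map(item => ({ id: item.id, text: item.text, score: cosineSimilarity(queryVec, item.vector) }))
    .sort((a, b) => b.score - a.score);
}
```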
6. Clear the Vector Store
```ts
import { removeVector, clearVectors } from 'ai-embed-search';

removeVector('a1'); // Remove by ID
clearVectors();     // Clear all vectors
```
7. Find Similar Items
You can retrieve the most semantically similar items to an existing one in the vector store:
```ts
import { getSimilarItems } from 'ai-embed-search';

const similar = await getSimilarItems('1', 3);
console.log(similar);
```
Result:
```js
[
  { id: '2', text: 'Samsung Galaxy S24 Ultra smartphone', score: 0.93 },
  { id: '3', text: 'Apple MacBook Air M3 laptop', score: 0.87 },
  { id: '5', text: 'Dell XPS 13 ultrabook', score: 0.85 }
]
```
This is useful for recommendation systems, "related items" features, or clustering.
8. Probabilistic Search with Softmax Ranking
You can rank search results probabilistically using a temperature-scaled softmax over cosine similarity:
```ts
import { searchWithSoftmax } from 'ai-embed-search';

const results = await searchWithSoftmax('apple wearable', 5, 0.7);
console.log(results);
```
Result:
```js
[
  {
    id: '9',
    text: 'Apple Watch Ultra 2',
    score: 0.812,
    probability: 0.39,
    confidence: 0.82
  },
  {
    id: '3',
    text: 'Apple Vision Pro',
    score: 0.772,
    probability: 0.31,
    confidence: 0.82
  },
  {
    id: '1',
    text: 'iPhone 15 Pro Max',
    score: 0.695,
    probability: 0.18,
    confidence: 0.82
  },
  ...
]
```
How It Works:
1. Cosine similarities between the query and each item are computed.
2. The scores are scaled by a temperature T and passed through the softmax function:

   softmax(sᵢ) = exp(sᵢ / T) / Σⱼ exp(sⱼ / T)

   where sᵢ is the similarity score for item i, and T is the temperature parameter.
3. The entropy H(p) of the resulting probability distribution is computed:

   H(p) = -Σᵢ pᵢ log(pᵢ)

   This measures the uncertainty in the result:
   - Low entropy → confident, peaked distribution
   - High entropy → uncertain, flat distribution
4. Entropy is normalized to get a confidence score between 0 and 1:

   confidence = 1 - (H(p) / log(N))

   where N is the number of candidates (the maximum entropy is log(N)).
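A minimal sketch of these steps in plain TypeScript (the helper names here are illustrative, not the package's internals):
```ts
// Temperature-scaled softmax over similarity scores.
function softmax(scores: number[], temperature: number): number[] {
  const max = Math.max(...scores); // subtracting the max keeps exp() numerically stable
  const exps = scores.map(s => Math.exp((s - max) / temperature));
  const sum = exps.reduce((acc, e) => acc + e, 0);
  return exps.map(e => e / sum);
}

// confidence = 1 - H(p) / log(N): near 1 for a peaked distribution, near 0 for a flat one.
function confidence(probs: number[]): number {
  if (probs.length < 2) return 1;
  const entropy = -probs.reduce((acc, p) => acc + (p > 0 ? p * Math.log(p) : 0), 0);
  return 1 - entropy / Math.log(probs.length);
}

const scores = [0.812, 0.772, 0.695]; // cosine similarities, as in the example above
const probs = softmax(scores, 0.7);
console.log(probs.map(p => p.toFixed(2)), confidence(probs).toFixed(2));
```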
Temperature Intuition
| Temperature | Behavior | Use Case |
|---|---|---|
| 0.1-0.5 | Very sharp, top-1 dominates | Deterministic ranking |
| 1.0 | Balanced | Ranked probabilities |
| 1.5+ | Softer, more diverse | Random sampling / fallback |
Use Cases
- Probabilistic ranking: get soft scores for relevance
- Sampling: return one of the top-k randomly with smart weighting (see the sketch below)
- Uncertainty estimation: use entropy/confidence to inform users
- Hybrid search: combine softmax scores with metadata (e.g., tags, categories, prices)
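For the sampling use case, a hedged sketch of drawing one result weighted by its softmax probability (assuming results carry the probability field shown above):
```ts
// Pick one result at random, weighted by its softmax probability.
function sampleWeighted<T extends { probability: number }>(results: T[]): T {
  const r = Math.random();
  let cumulative = 0;
  for (const item of results) {
    cumulative += item.probability;
    if (r <= cumulative) return item;
  }
  return results[results.length - 1]; // fallback if probabilities don't sum exactly to 1
}

// Usage: const pick = sampleWeighted(await searchWithSoftmax('apple wearable', 5, 1.5));
```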
9. Query Expansion via Embedding Neighbors
Query Expansion improves recall and relevance by augmenting the query with its nearest semantic neighbors. Instead of matching only the raw query embedding, we blend it with the embeddings of the top-k most similar items to form an expanded query vector.
```ts
import { searchWithExpansion } from 'ai-embed-search';

const results = await searchWithExpansion('ai car', 5, 3);
console.log(results);
```
Example output:
```js
[
  { id: '1', text: 'Tesla Model S', score: 0.88 },
  { id: '2', text: 'Electric Vehicle by Tesla', score: 0.85 },
  { id: '3', text: 'Nissan Leaf EV', score: 0.80 }
]
```
How It Works:
1. Embed the query: v_q = embed(query)
2. Find the top-k nearest items in the vector store (based on cosine similarity).
3. Average their vectors with the query vector:

   v_expanded = (v_q + Σᵢ vᵢ) / (1 + k)

4. Perform the final search using v_expanded.
This process makes vague queries like "ai car" match "Tesla", "EV", or "autopilot" even if those words are not directly in the query.
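A minimal sketch of the blending step in plain TypeScript (the store shape is an assumption, and this is not the package's internal implementation):
```ts
// Cosine similarity (same helper as sketched in section 5).
const cosineSimilarity = (a: number[], b: number[]): number => {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]; }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
};

// Blend the query vector with its k nearest neighbors:
// v_expanded = (v_q + Σᵢ vᵢ) / (1 + k)
function expandQuery(
  queryVec: number[],
  store: { id: string; vector: number[] }[],
  k: number
): number[] {
  const neighbors = store
    .map(item => ({ vector: item.vector, score: cosineSimilarity(queryVec, item.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);

  return queryVec.map((v, dim) => {
    const sum = neighbors.reduce((acc, n) => acc + n.vector[dim], v);
    return sum / (1 + neighbors.length);
  });
}
```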
API Reference
initEmbedder()
Initializes the embedding model. Must be called once before using embed or search.
embed(items: { id: string, text: string }[])
Embeds and stores the provided items in the vector store. Each item must have a unique id and text.
search(query: string, limit: number)
Performs a semantic search for the given query. Returns up to limit results (default 5), sorted by similarity score.
getSimilarItems(id: string, limit: number)
Finds the most similar items to the one with the given id. Returns up to limit results sorted by similarity score.
cacheFor(limit: number)
Caches the embeddings for the next limit search queries. This is useful for optimizing performance when you know you'll be searching multiple times.
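A hedged usage sketch (assuming cacheFor is a top-level export like the other functions):
```ts
import { cacheFor, search } from 'ai-embed-search';

cacheFor(10); // cache embeddings for the next 10 search queries
const first = await search('apple phone', 3).exec();
const second = await search('apple phone', 3).exec(); // repeated query can reuse the cached embedding
```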
clearStore()
Clears all embedded data from the vector store, freeing up memory.
searchWithSoftmax(query: string, limit: number, temperature: number)
Performs a probabilistic search using softmax ranking. The temperature parameter controls the sharpness of the probability distribution; see the temperature table above.
searchWithExpansion(query: string, limit: number, neighbors: number)
Performs a search using an expanded query vector: the query embedding is blended with the neighbors most similar vectors in the store. Useful for handling vague or underspecified queries.
Development
- Model: MiniLM via @xenova/transformers
- Vector type: 384-dim float32 array
- Similarity: Cosine similarity
- Storage: In-memory vector store (no database required)
- On-premises: Fully offline, no cloud dependencies
SEO Keywords
ai search, semantic search, local ai search, vector search, transformer embeddings, cosine similarity, open source search engine, text embeddings, in-memory search, local search engine, typescript search engine, fast npm search, embeddings in JS, ai search npm package
License
MIT © 2025 Peter Sibirtsev
Contributing
Contributions are welcome! Please open an issue or submit a pull request.