AI-embed-search: Lightweight AI Semantic Search Engine
Smart. Simple. Local.
AI-powered semantic search in TypeScript using transformer embeddings. No cloud, no API keys, 100% offline.
Features
- AI-powered semantic understanding
- Super simple API: init, embed, search, clear
- Fast cosine similarity-based retrieval
- In-memory vector store (no DB required)
- Save/load vectors to JSON file
- Search filters, caching, batch embed & probabilistic softmax ranking
- CLI-ready architecture
- Fully offline via @xenova/transformers (WASM/Node)
Installation
```bash
npm install ai-embed-search
```
or
```bash
yarn add ai-embed-search
```
Requires Node.js ≥ 18 or a modern browser for WASM.
Quick Start
```ts
import { embed, search, createEmbedder, initEmbedder } from 'ai-embed-search';

const embedder = await createEmbedder();
await initEmbedder({ embedder });

await embed([
  { id: '1', text: 'iPhone 15 Pro Max' },
  { id: '2', text: 'Samsung Galaxy S24 Ultra' },
  { id: '3', text: 'Apple MacBook Pro' }
]);

const results = await search('apple phone', 2).exec();
console.log(results);
```
Result:
```js
[
  { id: '1', text: 'iPhone 15 Pro Max', score: 0.95 },
  { id: '3', text: 'Apple MacBook Pro', score: 0.85 }
]
```
1. Initialize the Embedding Model
```ts
import { createEmbedder, initEmbedder } from 'ai-embed-search';

const embedder = await createEmbedder();
await initEmbedder({ embedder });
```
Loads the MiniLM model via @xenova/transformers. Required once at startup.
2. Add Items to the Vector Store
```ts
import { embed } from 'ai-embed-search';

await embed([
  { id: 'a1', text: 'Tesla Model S' },
  { id: 'a2', text: 'Electric Vehicle by Tesla' }
]);
```
Embeds and stores vector representations of the given items.
3. Perform Semantic Search
```ts
import { search } from 'ai-embed-search';

const results = await search('fast electric car', 3).exec();
```
Returns:
```js
[
  { id: 'a1', text: 'Tesla Model S', score: 0.95 },
  { id: 'a2', text: 'Electric Vehicle by Tesla', score: 0.85 }
]
```
4. Search with Metadata
You can attach metadata to each item when embedding it, then filter on that metadata at search time.
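For example, a minimal sketch of storing items with a meta field (an assumption: embed is taken to pass extra fields through to the stored items, as the r.meta filter below implies):
```ts
import { embed } from 'ai-embed-search';

// Assumption: 'meta' is carried along with each stored item.
await embed([
  { id: 'p1', text: 'MacBook Air M3', meta: { type: 'laptop' } },
  { id: 'p2', text: 'iPhone 15 Pro', meta: { type: 'phone' } }
]);
```
Filtering on that metadata at search time: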
```ts
const laptops = await search('computer', 5)
  .filter(r => r.meta?.type === 'laptop')
  .exec();
```
5. Search with Cached Embeddings (Advanced)
You can store precomputed embeddings in your own DB or file:
```ts
const precomputed = {
  id: 'x1',
  text: 'Apple Watch Series 9',
  vector: [0.11, 0.32, ...] // 384-dim array
};
```
Then use cosine similarity to search across them, or build your own vector store using ai-embed-search functions.
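As a sketch of that approach (plain TypeScript helpers, not part of the ai-embed-search API), cosine-similarity search over precomputed vectors looks like this:
```ts
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored items against a query vector, best match first.
function rank(queryVec: number[], items: { id: string; text: string; vector: number[] }[]) {
  return items
    .map(item => ({ id: item.id, text: item.text, score: cosineSimilarity(queryVec, item.vector) }))
    .sort((a, b) => b.score - a.score);
}
```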
6. Clear the Vector Store
```ts
import { removeVector, clearVectors } from 'ai-embed-search';

removeVector('a1'); // Remove by ID
clearVectors();     // Clear all vectors
```
7. Find Similar Items
You can retrieve the most semantically similar items to an existing one in the vector store:
```ts
import { getSimilarItems } from 'ai-embed-search';

const similar = await getSimilarItems('1', 3);
console.log(similar);
```
Result:
```js
[
  { id: '2', text: 'Samsung Galaxy S24 Ultra smartphone', score: 0.93 },
  { id: '3', text: 'Apple MacBook Air M3 laptop', score: 0.87 },
  { id: '5', text: 'Dell XPS 13 ultrabook', score: 0.85 }
]
```
This is useful for recommendation systems, "related items" features, or clustering.
8. Probabilistic Search with Softmax Ranking
You can rank search results probabilistically using a temperature-scaled softmax over cosine similarity:
```ts
import { searchWithSoftmax } from 'ai-embed-search';

const results = await searchWithSoftmax('apple wearable', 5, 0.7);
console.log(results);
```
Result:
```js
[
  {
    id: '9',
    text: 'Apple Watch Ultra 2',
    score: 0.812,
    probability: 0.39,
    confidence: 0.82
  },
  {
    id: '3',
    text: 'Apple Vision Pro',
    score: 0.772,
    probability: 0.31,
    confidence: 0.82
  },
  {
    id: '1',
    text: 'iPhone 15 Pro Max',
    score: 0.695,
    probability: 0.18,
    confidence: 0.82
  },
  ...
]
```
How It Works:
1. Cosine similarities between the query and each item are computed.
2. The scores are scaled by a temperature T and passed through the softmax function:

   softmax(sᵢ) = exp(sᵢ / T) / Σⱼ exp(sⱼ / T)

   where sᵢ is the similarity score for item i, and T is the temperature parameter.
3. The entropy H(p) of the resulting probability distribution is computed:

   H(p) = -Σᵢ pᵢ log(pᵢ)

   This measures the uncertainty in the result:
   - Low entropy → confident, peaked distribution
   - High entropy → uncertain, flat distribution
4. Entropy is normalized to get a confidence score between 0 and 1:

   confidence = 1 - (H(p) / log(N))

   where N is the number of candidates (the maximum entropy is log(N)).
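A minimal sketch of these steps in plain TypeScript (the helper names here are illustrative, not the package's internals):
```ts
// Temperature-scaled softmax over similarity scores.
function softmax(scores: number[], temperature: number): number[] {
  const max = Math.max(...scores); // subtracting the max keeps exp() numerically stable
  const exps = scores.map(s => Math.exp((s - max) / temperature));
  const sum = exps.reduce((acc, e) => acc + e, 0);
  return exps.map(e => e / sum);
}

// confidence = 1 - H(p) / log(N): near 1 for a peaked distribution, near 0 for a flat one.
function confidence(probs: number[]): number {
  if (probs.length < 2) return 1;
  const entropy = -probs.reduce((acc, p) => acc + (p > 0 ? p * Math.log(p) : 0), 0);
  return 1 - entropy / Math.log(probs.length);
}

const scores = [0.812, 0.772, 0.695]; // cosine similarities, as in the example above
const probs = softmax(scores, 0.7);
console.log(probs.map(p => p.toFixed(2)), confidence(probs).toFixed(2));
```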
Temperature Intuition
| Temperature | Behavior | Use Case |
|---|---|---|
| 0.1-0.5 | Very sharp, top-1 dominates | Deterministic ranking |
| 1.0 | Balanced | Ranked probabilities |
| 1.5+ | Softer, more diverse | Random sampling / fallback |
Use Cases
- Probabilistic ranking: get soft scores for relevance
- Sampling: return one of the top-k randomly with smart weighting (see the sketch below)
- Uncertainty estimation: use entropy/confidence to inform users
- Hybrid search: combine softmax scores with metadata (e.g., tags, categories, prices)
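For the sampling use case, a hedged sketch of drawing one result weighted by its softmax probability (assuming results carry the probability field shown above):
```ts
// Pick one result at random, weighted by its softmax probability.
function sampleWeighted<T extends { probability: number }>(results: T[]): T {
  const r = Math.random();
  let cumulative = 0;
  for (const item of results) {
    cumulative += item.probability;
    if (r <= cumulative) return item;
  }
  return results[results.length - 1]; // fallback if probabilities don't sum exactly to 1
}

// Usage: const pick = sampleWeighted(await searchWithSoftmax('apple wearable', 5, 1.5));
```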
9. Query Expansion via Embedding Neighbors
Query Expansion improves recall and relevance by augmenting the query with its nearest semantic neighbors. Instead of matching only the raw query embedding, we blend it with the embeddings of the top-k most similar items to form an expanded query vector.
```ts
import { searchWithExpansion } from 'ai-embed-search';

const results = await searchWithExpansion('ai car', 5, 3);
console.log(results);
```
Example output:
```js
[
  { id: '1', text: 'Tesla Model S', score: 0.88 },
  { id: '2', text: 'Electric Vehicle by Tesla', score: 0.85 },
  { id: '3', text: 'Nissan Leaf EV', score: 0.80 }
]
```
How It Works:
1. Embed the query: v_q = embed(query)
2. Find the top-k nearest items in the vector store (based on cosine similarity).
3. Average their vectors with the query vector:

   v_expanded = (v_q + Σᵢ vᵢ) / (1 + k)

4. Perform the final search using v_expanded.
This process makes vague queries like "ai car" match "Tesla", "EV", or "autopilot" even if those words are not directly in the query.
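A minimal sketch of the blending step in plain TypeScript (the store shape is an assumption, and this is not the package's internal implementation):
```ts
// Cosine similarity (same helper as sketched in section 5).
const cosineSimilarity = (a: number[], b: number[]): number => {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]; }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
};

// Blend the query vector with its k nearest neighbors:
// v_expanded = (v_q + Σᵢ vᵢ) / (1 + k)
function expandQuery(
  queryVec: number[],
  store: { id: string; vector: number[] }[],
  k: number
): number[] {
  const neighbors = store
    .map(item => ({ vector: item.vector, score: cosineSimilarity(queryVec, item.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);

  return queryVec.map((v, dim) => {
    const sum = neighbors.reduce((acc, n) => acc + n.vector[dim], v);
    return sum / (1 + neighbors.length);
  });
}
```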
API Reference
initEmbedder()
Initializes the embedding model. Must be called once before using embed or search.
embed(items: { id: string, text: string }[])
Embeds and stores the provided items in the vector store. Each item must have a unique id and text.
search(query: string, limit: number)
Performs a semantic search for the given query. Returns up to limit results (default 5), sorted by similarity score.
getSimilarItems(id: string, limit: number)
Finds the most similar items to the one with the given id. Returns up to limit results sorted by similarity score.
cacheFor(limit: number)
Caches the embeddings for the next limit search queries. This is useful for optimizing performance when you know you'll be searching multiple times.
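A hedged usage sketch (assuming cacheFor is a top-level export like the other functions):
```ts
import { cacheFor, search } from 'ai-embed-search';

cacheFor(10); // cache embeddings for the next 10 search queries
const first = await search('apple phone', 3).exec();
const second = await search('apple phone', 3).exec(); // repeated query can reuse the cached embedding
```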
clearStore()
Clears all embedded data from the vector store, freeing up memory.
searchWithSoftmax(query: string, limit: number, temperature: number)
Performs a probabilistic search using softmax ranking. The temperature parameter controls the sharpness of the probability distribution; see the temperature table above.
searchWithExpansion(query: string, limit: number, neighbors: number)
Performs a search using an expanded query vector: the query embedding is blended with the neighbors most similar vectors in the store. Useful for handling vague or underspecified queries.
Development
- Model: MiniLM via @xenova/transformers
- Vector type: 384-dim float32 array
- Similarity: Cosine similarity
- Storage: In-memory vector store (no database required)
- On-premises: Fully offline, no cloud dependencies
SEO Keywords
ai search, semantic search, local ai search, vector search, transformer embeddings, cosine similarity, open source search engine, text embeddings, in-memory search, local search engine, typescript search engine, fast npm search, embeddings in JS, ai search npm package
License
MIT © 2025 Peter Sibirtsev
Contributing
Contributions are welcome! Please open an issue or submit a pull request.