🚀 AI-embed-search: Lightweight AI Semantic Search Engine
Smart. Simple. Local.
AI-powered semantic search in TypeScript using transformer embeddings. No cloud, no API keys, 100% offline.
🚀 Features
- 🧠 AI-powered semantic understanding
- ✨ Super simple API: init, embed, search, clear
- ⚡️ Fast cosine similarity-based retrieval
- 📦 In-memory vector store (no DB required)
- 🌐 Fully offline via @xenova/transformers (WASM/Node)
📦 Installation
```bash
npm install ai-embed-search
```
Requires Node.js ≥ 18 or a modern browser for WASM.
⚡ Quick Start
```ts
import { initEmbedder, embed, search } from 'ai-embed-search';

await initEmbedder();

await embed([
  { id: '1', text: 'iPhone 15 Pro Max' },
  { id: '2', text: 'Samsung Galaxy S24 Ultra' },
  { id: '3', text: 'Apple MacBook Pro' }
]);

const results = await search('apple phone', 2);
console.log(results);
/*
[
  { id: '1', text: 'iPhone 15 Pro Max', score: 0.92 },
  { id: '3', text: 'Apple MacBook Pro', score: 0.75 }
]
*/
```
🧠 1. Initialize the Embedding Model
```ts
import { initEmbedder } from 'ai-embed-search';

await initEmbedder();
```
Loads the MiniLM model via @xenova/transformers. Required once at startup.
📥 2. Add Items to the Vector Store
```ts
import { embed } from 'ai-embed-search';

await embed([
  { id: 'a1', text: 'Tesla Model S' },
  { id: 'a2', text: 'Electric Vehicle by Tesla' }
]);
```
Embeds and stores vector representations of the given items.
🔍 3. Perform Semantic Search
```ts
import { search } from 'ai-embed-search';

const results = await search('fast electric car', 3);
```
Returns:
```ts
[
  { id: 'a1', text: 'Tesla Model S', score: 0.95 },
  { id: 'a2', text: 'Electric Vehicle by Tesla', score: 0.85 }
]
```
🎯 4. Create Embeddings Manually (Optional)
```ts
const embedder = await initEmbedder();
const output = await embedder('custom text', { pooling: 'mean', normalize: true });
const vector = Array.from(output.data); // number[]
```
🧮 5. Use Cosine Similarity Between Vectors
```ts
import { cosineSimilarity } from 'ai-embed-search';

const score = cosineSimilarity(vectorA, vectorB);
```
💾 6. Search with Cached Embeddings (Advanced)
You can store precomputed embeddings in your own DB or file:
```ts
const precomputed = {
  id: 'x1',
  text: 'Apple Watch Series 9',
  vector: [0.11, 0.32, ...] // 384-dim array
};
```
Then use cosine similarity to search across them, or build your own vector store on top of the ai-embed-search functions, as sketched below.
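For example, here is a minimal sketch of searching over precomputed records (the CachedItem shape, topK helper, and sample query are illustrative, not part of the package):
```ts
import { initEmbedder, cosineSimilarity } from 'ai-embed-search';

// Illustrative shape for records persisted in your own DB or file.
interface CachedItem {
  id: string;
  text: string;
  vector: number[]; // 384-dim embedding
}

// Rank cached items against a query vector by cosine similarity.
function topK(queryVector: number[], items: CachedItem[], k = 5) {
  return items
    .map((item) => ({ ...item, score: cosineSimilarity(queryVector, item.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Embed the query manually (see section 4), then rank your cached items.
const cachedItems: CachedItem[] = [/* loaded from your DB or file */];
const embedder = await initEmbedder();
const output = await embedder('smart watch', { pooling: 'mean', normalize: true });
const results = topK(Array.from(output.data), cachedItems, 3);
```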
🧹 7. Clear the Vector Store
```ts
import { clearStore } from 'ai-embed-search';

clearStore(); // Removes all embedded data from memory
```
📘 API Reference
initEmbedder()
Initializes the embedding model. Must be called once before using embed or search.
embed(items: { id: string, text: string }[])
Embeds and stores the provided items in the vector store. Each item must have a unique id and text.
search(query: string, limit?: number)
Performs a semantic search for the given query. Returns up to limit results sorted by similarity score (default: 5).
cacheFor(limit: number)
Caches embeddings for the next limit search queries. Useful for optimizing performance when you know you'll be running multiple searches.
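A minimal usage sketch (the queries shown are illustrative):
```ts
import { cacheFor, search } from 'ai-embed-search';

// Cache embeddings for the next 10 search queries.
cacheFor(10);

const phones = await search('smartphone');
const laptops = await search('laptop');
```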
clearStore()
Clears all embedded data from the vector store, freeing up memory.
🔧 Development
- Model: MiniLM via @xenova/transformers
- Vector type: 384-dim float32 array
- Similarity: Cosine similarity (see the sketch after this list)
- Storage: In-memory vector store (no database required)
- On-premises: Fully offline, no cloud dependencies
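For reference, cosine similarity is the dot product of two vectors divided by the product of their magnitudes. A minimal implementation (an illustration, not the package's internal code) looks like this:
```ts
// Cosine similarity: dot(a, b) / (|a| * |b|).
// Returns a value in [-1, 1]; closer to 1 means more similar.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```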
🔍 SEO Keywords
semantic search, ai search, local ai search, open source search engine, transformer embeddings, cosine similarity, vector search, text embeddings, typescript search engine
License
MIT © 2025 Peter Sibirtsev
Contributing
Contributions are welcome! Please open an issue or submit a pull request.