JSPM

🔍 A local semantic caching library for Node.js.

Package Exports

  • seekmix
  • seekmix/index.js

This package does not declare an exports field, so the exports above were detected and optimized automatically by JSPM instead. If a package subpath is missing, consider filing an issue with the original package (seekmix) asking it to declare the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.
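For reference, a minimal exports field covering the two detected subpaths could look like this in the package's package.json (a sketch only; the actual entry-point layout is up to the package author):

{
  "name": "seekmix",
  "exports": {
    ".": "./index.js",
    "./index.js": "./index.js"
  }
}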

Readme

SeekMix

SeekMix is a powerful semantic caching library for Node.js that leverages vector embeddings to cache and retrieve semantically similar queries, significantly reducing API calls to expensive LLM services.

Features

  • Semantic Caching: Cache results based on the semantic meaning of queries, not just exact matches (see the conceptual sketch after this list)
  • Configurable Similarity Threshold: Fine-tune how semantically similar queries need to be for a cache hit
  • Local Embedding Models: By default, SeekMix uses Hugging Face embedding models locally, reducing external API dependencies
  • Multiple Embedding Providers: Support for OpenAI and Hugging Face embedding models
  • Redis Vector Database: Leverages Redis as a vector database for efficient similarity search
  • Time-based Invalidation: Easily invalidate old cache entries based on time criteria
  • TTL Support: Configure time-to-live for all cache entries
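
Conceptually, a semantic cache lookup works like this (a sketch of the general technique, not SeekMix's internals; embed and knnSearch are hypothetical helpers):

// Embed the query, find its nearest stored neighbour, and accept the match
// only if it clears the similarity threshold.
async function conceptualLookup(query, embed, knnSearch, similarityThreshold) {
    const vector = await embed(query);          // query -> embedding vector
    const nearest = await knnSearch(vector, 1); // top-1 neighbour in the vector store
    if (nearest && nearest.similarity >= similarityThreshold) {
        return nearest.cachedResult;            // semantic cache hit
    }
    return null;                                // miss: compute and cache the result
}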

Benefits

  • Cost Reduction: Minimize expensive API calls to Large Language Models
  • Improved Response Times: Retrieve cached results for semantically similar queries instantly
  • Perfect for RAG Applications: Ideal for Retrieval-Augmented Generation systems
  • Flexible Configuration: Adapt to your specific use case with multiple configuration options
  • Multi-model Support: Use with OpenAI or open-source Hugging Face models

Requirements

  • Node.js (>= 14.x)
  • Redis with RediSearch and RedisJSON modules enabled (Redis Stack recommended)
  • Disk space for locally downloaded Hugging Face embedding models

Installation

First, start Redis Stack with Docker:

docker run -d --name redis-stack-server -p 6379:6379 redis/redis-stack-server:latest
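
You can confirm the container is accepting connections before continuing (this should print PONG):

docker exec -it redis-stack-server redis-cli ping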

Then install the package:

npm install seekmix

Basic Usage

const { SeekMix, OpenAIEmbeddingProvider } = require('seekmix');

// Function that simulates an expensive API call (e.g., to an LLM)
async function expensiveApiCall(query) {
    console.log(`Making expensive API call for: "${query}"`);
    // Simulate processing time
    await new Promise(resolve => setTimeout(resolve, 1000));

    // In a real-world scenario, this would be a call to an API like GPT-X
    return `Response for: ${query} - ${new Date().toISOString()}`;
}

// Create the semantic cache
const cache = new SeekMix({
    similarityThreshold: 0.9, // Semantic similarity threshold
    ttl: 60 * 60, // 1 hour TTL
    // embeddingProvider: new OpenAIEmbeddingProvider()
});

// Top-level await is not available in CommonJS, so drive the example
// from an async function
async function main() {
    await cache.connect();
    console.log('Semantic cache connected successfully');

    // Examples of semantically similar queries
    const queries = [
        'What are the best restaurants in New York',
        'Recommend places to eat in New York',
        'I need information about restaurants in Chicago',
        'Looking for good dining spots in New York',
        'Tell me about hiking trails'
    ];

    // Process queries, using the cache when possible
    for (const query of queries) {
        console.log(`\nProcessing query: "${query}"`);

        // Try to get from cache
        const cachedResult = await cache.get(query);

        if (cachedResult) {
            console.log(`✅ CACHE HIT - Similarity: ${(1 - cachedResult.score).toFixed(4)}`);
            console.log(`Original query: "${cachedResult.query}"`);
            console.log(`Result: ${cachedResult.result}`);
            console.log(`Stored: ${Math.round((Date.now() - cachedResult.timestamp) / 1000)} seconds ago`);
        } else {
            console.log('❌ CACHE MISS - Making API call');

            // Make the expensive call
            const result = await expensiveApiCall(query);

            // Save to cache for future similar queries
            await cache.set(query, result);
            console.log(`Result: ${result}`);
            console.log('Saved to cache for future similar queries');
        }
    }

    await cache.disconnect();
}

main().catch(console.error);

Advanced Configuration

const { SeekMix, OpenAIEmbeddingProvider } = require('seekmix');

// Create a semantic cache with OpenAI embeddings and custom settings
const cache = new SeekMix({
  redisUrl: 'redis://username:password@your-redis-host:6379',
  indexName: 'my-app:semantic-cache',
  keyPrefix: 'my-app:cache:',
  ttl: 60 * 60 * 24 * 7, // 1 week
  similarityThreshold: 0.85,
  dropIndex: false, // Set to true to recreate the index on connect
  dropKeys: false, // Set to true to clear all cache entries on connect
  embeddingProvider: new OpenAIEmbeddingProvider({
    model: 'text-embedding-ada-002',
    apiKey: process.env.OPENAI_API_KEY
  })
});
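
The Basic Usage example sets only two of these options, so the rest appear to have sensible defaults; you only need to override what you want to change. For deployments, one common pattern is to read connection and tuning values from the environment (a sketch; the variable names here are illustrative, not part of seekmix):

const cache = new SeekMix({
    redisUrl: process.env.REDIS_URL || 'redis://localhost:6379',
    similarityThreshold: Number(process.env.CACHE_SIMILARITY || 0.85),
    ttl: Number(process.env.CACHE_TTL_SECONDS || 60 * 60 * 24) // 1 day default
});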

Using with RAG Applications

SeekMix is perfect for Retrieval-Augmented Generation applications, as it can cache both the retrieval and generation steps:

// Caching the retrieval step
const retrievalCache = new SeekMix({ keyPrefix: 'rag:retrieval:' });
await retrievalCache.connect();

// Caching the generation step
const generationCache = new SeekMix({ keyPrefix: 'rag:generation:' });
await generationCache.connect();

// retrieveDocuments() and generateAnswer() below are placeholders for your
// own vector-store retrieval and LLM generation calls
async function queryRAG(userQuestion) {
  // 1. Try to get the final answer from generation cache
  const cachedAnswer = await generationCache.get(userQuestion);
  if (cachedAnswer) return cachedAnswer.result;

  // 2. Try to get retrieved context from retrieval cache
  let context;
  const cachedRetrieval = await retrievalCache.get(userQuestion);
  
  if (cachedRetrieval) {
    context = cachedRetrieval.result;
  } else {
    // Perform actual retrieval from vector DB
    context = await retrieveDocuments(userQuestion);
    // Cache the retrieval results
    await retrievalCache.set(userQuestion, context);
  }

  // 3. Generate answer using LLM
  const answer = await generateAnswer(context, userQuestion);
  
  // 4. Cache the final answer
  await generationCache.set(userQuestion, answer);
  
  return answer;
}
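
Splitting the caches by keyPrefix keeps retrieval and generation entries in separate keyspaces, so each can be tuned independently; for example, you might give the generation cache a stricter similarityThreshold than the retrieval cache, since a false hit there returns a wrong answer directly to the user.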

Invalidating Old Cache Entries

You can manually invalidate old cache entries:

// Invalidate entries older than 1 hour
const invalidated = await cache.invalidateOld(60 * 60);
console.log(`Invalidated ${invalidated} old cache entries`);
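
To run the sweep on a schedule rather than manually, a minimal sketch (assuming cache is a connected SeekMix instance, as in Basic Usage):

// Every 10 minutes, drop entries older than 1 hour
const SWEEP_INTERVAL_MS = 10 * 60 * 1000;

setInterval(async () => {
    try {
        const invalidated = await cache.invalidateOld(60 * 60);
        if (invalidated > 0) {
            console.log(`Invalidated ${invalidated} old cache entries`);
        }
    } catch (err) {
        console.error('Cache sweep failed:', err);
    }
}, SWEEP_INTERVAL_MS);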

License

MIT