Package Exports
- @asktext/core
- @asktext/core/dist/index.js
This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (@asktext/core) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.
Readme
@asktext/core
TypeScript-first embedding and retrieval engine for voice-enabled Q&A on articles.
What it does
- Text processing: Splits HTML/Markdown into semantic chunks with configurable overlap
- Embeddings: Generates OpenAI embeddings for each chunk
- Storage: Saves chunks + embeddings to your database (Prisma JSON, pgvector, or custom)
- Retrieval: Semantic search to find relevant passages for user questions
Installation
npm install @asktext/core openai @prisma/client
Quick Start
1. Database Schema
Add to your schema.prisma:
model ArticleChunk {
  id         String @id @default(cuid())
  postId     String
  chunkIndex Int
  content    String @db.Text
  startChar  Int
  endChar    Int
  embedding  String @db.Text // JSON-encoded float[]
  @@index([postId, chunkIndex])
}
Run npx prisma db push.
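The embedding column is plain text: each vector is stored as a JSON-encoded float array and decoded in application code at search time. A quick illustration of reading one back (this is only a sketch of what the stored data looks like, not part of the package API; the postId value is the placeholder used later in this README):
import { PrismaClient } from '@prisma/client';
const prisma = new PrismaClient();
// Illustration only: the Prisma JSON store keeps vectors as JSON text.
const row = await prisma.articleChunk.findFirst({ where: { postId: 'article-123' } });
if (row) {
  const vector: number[] = JSON.parse(row.embedding); // e.g. [0.0123, -0.0456, ...]
}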
2. Embed Articles
import { PrismaClient } from '@prisma/client';
import { OpenAIEmbedder, embedAndStore } from '@asktext/core';
const prisma = new PrismaClient();
const store = embedAndStore.createPrismaJsonStore(prisma);
const embedder = new OpenAIEmbedder({
  apiKey: process.env.OPENAI_API_KEY!
});
// Call this when publishing/updating articles
export async function saveEmbeddings(postId: string, htmlContent: string) {
  await embedAndStore({
    articleId: postId,
    htmlOrMarkdown: htmlContent,
    embedder,
    store
  });
}
3. Retrieve Passages
import { retrievePassages } from '@asktext/core';
const passages = await retrievePassages({
  query: "How does binary search work?",
  store,
  embedder,
  filter: { postId: "article-123" },
  limit: 5
});
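What you do with the passages is up to you; a common pattern is to stitch them into a prompt for a chat model and answer from that context. A minimal sketch, assuming each retrieved passage exposes a content string (check the actual ChunkWithScore shape shipped with the package) and using the official openai client the package already depends on:
import OpenAI from 'openai';
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
// Assumes each passage has a content field; verify against ChunkWithScore.
const context = passages.map((p, i) => `[${i + 1}] ${p.content}`).join('\n\n');
const answer = await openai.chat.completions.create({
  model: 'gpt-4o-mini', // any chat-capable model works here
  messages: [
    { role: 'system', content: 'Answer using only the provided passages.' },
    { role: 'user', content: `Passages:\n${context}\n\nQuestion: How does binary search work?` }
  ]
});
console.log(answer.choices[0].message.content);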
Configuration
Text Splitting
import { TextSplitter } from '@asktext/core';
const splitter = new TextSplitter({
  chunkSize: 1500,                       // characters per chunk
  chunkOverlap: 200,                     // overlap between chunks
  separators: ['\n\n', '\n', '. ', ' ']  // split priorities
});
Custom Vector Store
Implement the VectorStore interface for your database:
interface VectorStore {
  saveChunks(chunks: ChunkWithEmbedding[]): Promise<void>;
  searchSimilar(embedding: number[], limit: number, filter?: any): Promise<ChunkWithScore[]>;
  deleteByArticleId(articleId: string): Promise<void>;
}
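For example, a brute-force in-memory store (handy for tests or small corpora) could look like the sketch below. The ChunkLike shape, the generic filter matching, and the articleId field are assumptions based on the examples in this README; the real ChunkWithEmbedding and ChunkWithScore types come from the package and may differ.
// In-memory, VectorStore-shaped implementation with brute-force cosine search.
// ChunkLike is an assumed shape; swap in the package's own chunk types.
type ChunkLike = {
  articleId: string;
  content: string;
  embedding: number[];
  [key: string]: unknown;
};
type ScoredChunk = ChunkLike & { score: number };
class InMemoryVectorStore {
  private chunks: ChunkLike[] = [];
  async saveChunks(chunks: ChunkLike[]): Promise<void> {
    this.chunks.push(...chunks);
  }
  async searchSimilar(embedding: number[], limit: number, filter?: Record<string, unknown>): Promise<ScoredChunk[]> {
    // Treat the filter as a set of field/value pairs that must all match.
    const candidates = filter
      ? this.chunks.filter((c) => Object.entries(filter).every(([key, value]) => c[key] === value))
      : this.chunks;
    return candidates
      .map((c) => ({ ...c, score: cosineSimilarity(embedding, c.embedding) }))
      .sort((a, b) => b.score - a.score)
      .slice(0, limit);
  }
  async deleteByArticleId(articleId: string): Promise<void> {
    this.chunks = this.chunks.filter((c) => c.articleId !== articleId);
  }
}
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB) || 1);
}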
Environment Variables
OPENAI_API_KEY=sk-... # Required for embeddings
DATABASE_URL=postgresql://... # For Prisma store
Advanced Usage
Batch Processing
const articles = await getArticlesToProcess();
for (const article of articles) {
  await saveEmbeddings(article.id, article.content);
  console.log(`Processed: ${article.title}`);
}
Custom Embedder
class CustomEmbedder implements Embedder {
  async embed(texts: string[]): Promise<number[][]> {
    // Your embedding logic: return one vector per input text
    throw new Error('Not implemented');
  }
}
Cost Estimation
- 100k words ≈ 75k tokens ≈ $0.01 with text-embedding-3-small
- 1M words ≈ 750k tokens ≈ $0.10
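If you want to budget a corpus up front, the arithmetic behind these figures is easy to reproduce. In the sketch below, the 0.75 tokens-per-word ratio is the usual rule of thumb and the per-million-token price is backed out of the numbers above rather than quoted from a price list; check your embedding model's current pricing before relying on it.
// Rough embedding-cost estimate: words -> tokens -> dollars.
const TOKENS_PER_WORD = 0.75;          // rule of thumb for English text
const PRICE_PER_MILLION_TOKENS = 0.13; // assumed rate derived from the figures above
function estimateEmbeddingCost(wordCount: number): number {
  const tokens = wordCount * TOKENS_PER_WORD;
  return (tokens / 1_000_000) * PRICE_PER_MILLION_TOKENS;
}
console.log(estimateEmbeddingCost(100_000).toFixed(2)); // ~0.01 (dollars)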
License
MIT