Core indexing and search functionality for h-codex

    @hpbyte/h-codex-core

    Core package for h-codex semantic code indexing and search.

    ✨ Features

    • AST-Based Chunking: Parse code using tree-sitter for intelligent chunk boundaries
    • Semantic Embeddings: Generate embeddings using OpenAI text-embedding models
    • File Discovery: Explore codebases with configurable ignore patterns
    • Vector Search: Store and search embeddings in PostgreSQL with pgvector

    🚀 Quick Start

    Installation

    pnpm add @hpbyte/h-codex-core

    Environment Setup

    Create a .env file with:

    LLM_API_KEY=your_llm_api_key_here
    # Optional; defaults to the OpenAI base URL (https://api.openai.com/v1)
    LLM_BASE_URL=your_llm_base_url_here
    EMBEDDING_MODEL=text-embedding-3-small
    DB_CONNECTION_STRING=postgresql://postgres:password@localhost:5432/h-codex
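
As a rough illustration of how these variables might be consumed, the sketch below reads them from a plain object and applies the documented default for LLM_BASE_URL. The names `CoreConfig` and `loadConfig` are hypothetical and not part of the package, and treating the other three variables as required is an assumption.

```typescript
// Hypothetical config shape for illustration; the package's internals may differ.
interface CoreConfig {
  apiKey: string;
  baseUrl: string;
  embeddingModel: string;
  dbConnectionString: string;
}

// Default stated in the environment setup above.
const DEFAULT_BASE_URL = "https://api.openai.com/v1";

function loadConfig(env: Record<string, string | undefined>): CoreConfig {
  // Assumption: every variable except LLM_BASE_URL must be set.
  const required = (name: string): string => {
    const value = env[name];
    if (value === undefined || value === "") {
      throw new Error(`Missing required environment variable: ${name}`);
    }
    return value;
  };

  return {
    apiKey: required("LLM_API_KEY"),
    // Fall back to the OpenAI base URL when the variable is unset or empty.
    baseUrl: env["LLM_BASE_URL"] || DEFAULT_BASE_URL,
    embeddingModel: required("EMBEDDING_MODEL"),
    dbConnectionString: required("DB_CONNECTION_STRING"),
  };
}
```

In Node.js, `loadConfig(process.env)` would read the values after the .env file has been loaded.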

    Usage Example

    import { indexer, semanticSearch } from '@hpbyte/h-codex-core'
    
    // Index a codebase
    const indexResult = await indexer.index('./path/to/codebase')
    console.log(`Indexed ${indexResult.indexedFiles} files and ${indexResult.totalChunks} code chunks`)
    
    // Search for code
    const searchResults = await semanticSearch.search('database connection implementation')
    console.log(searchResults)

    🛠️ API Reference

    Indexer

    Indexes code repositories by exploring files, chunking code, and generating embeddings.

    indexer.index(
      path: string,               // Path to the codebase
      options?: {
        ignorePatterns?: string[], // Additional glob patterns to ignore
        maxChunkSize?: number      // Override default chunk size
      }
    ): Promise<{
      indexedFiles: number,       // Number of indexed files
      totalChunks: number         // Total code chunks created
    }>
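
The `maxChunkSize` option bounds how large a chunk may grow. The package chunks along AST boundaries using tree-sitter; the sketch below swaps in a much simpler line-based splitter, purely to illustrate a size bound and the resulting line ranges. `chunkByLines` and `LineChunk` are hypothetical names, and measuring size in characters is an assumption.

```typescript
interface LineChunk {
  content: string;
  startLine: number; // 1-based, inclusive
  endLine: number;   // 1-based, inclusive
}

// Simplified stand-in for AST-based chunking: greedily pack whole lines
// into chunks of at most maxChunkSize characters (assumed unit).
function chunkByLines(source: string, maxChunkSize: number): LineChunk[] {
  if (maxChunkSize <= 0) throw new Error("maxChunkSize must be positive");

  const chunks: LineChunk[] = [];
  let buffer: string[] = [];
  let bufferSize = 0;
  let startLine = 1;

  const flush = (endLine: number): void => {
    if (buffer.length === 0) return;
    chunks.push({ content: buffer.join("\n"), startLine, endLine });
    buffer = [];
    bufferSize = 0;
  };

  const lines = source.split("\n");
  lines.forEach((line, index) => {
    const lineNumber = index + 1;
    // Size if this line were appended, counting the joining newline.
    const joined = buffer.length > 0 ? bufferSize + 1 + line.length : line.length;
    if (buffer.length > 0 && joined > maxChunkSize) {
      flush(lineNumber - 1);
      startLine = lineNumber;
      buffer.push(line);
      bufferSize = line.length;
    } else {
      if (buffer.length === 0) startLine = lineNumber;
      buffer.push(line);
      bufferSize = joined;
    }
  });
  flush(lines.length);

  // Note: a single line longer than maxChunkSize becomes its own oversized chunk.
  return chunks;
}
```

An AST-aware chunker would instead cut at function or class boundaries, so that each chunk stays a meaningful unit of code.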

    Semantic Search

    Searches indexed code using natural-language queries.

    semanticSearch.search(
      query: string,                // Natural language search query
      options?: {
        limit?: number,             // Max results to return (default: 10)
        threshold?: number          // Minimum similarity score (default: 0.5)
      }
    ): Promise<Array<{
      id: string,                   // Chunk identifier
      content: string,              // Code content
      relativePath: string,         // File path relative to indexed root
      absolutePath: string,         // Absolute file path
      language: string,             // Programming language
      startLine: number,            // Starting line in file
      endLine: number,              // Ending line in file
      score: number                 // Similarity score (0-1)
    }>>
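
To make the `limit` and `threshold` options concrete, here is a minimal in-memory sketch of ranking chunks by similarity to a query embedding. It assumes cosine similarity as the score, which the README does not confirm, and the package performs this step in PostgreSQL rather than in application code. `cosineSimilarity`, `rankChunks`, and the input shapes are hypothetical.

```typescript
interface EmbeddedChunk {
  id: string;
  embedding: number[];
}

interface ScoredChunk {
  id: string;
  score: number;
}

// Cosine similarity of two equal-length vectors; returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep chunks scoring at least `threshold`, best first, at most `limit` of them.
// The defaults mirror the documented ones (10 and 0.5).
function rankChunks(
  query: number[],
  chunks: EmbeddedChunk[],
  limit = 10,
  threshold = 0.5
): ScoredChunk[] {
  return chunks
    .map((c) => ({ id: c.id, score: cosineSimilarity(query, c.embedding) }))
    .filter((c) => c.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

Raising `threshold` trades recall for precision, while `limit` only caps how many of the passing results are returned.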

    🏗️ Architecture

    Ingestion Pipeline

    • Explorer (ingestion/explorer/) - Discover files in repositories
    • Chunker (ingestion/chunker/) - Parse and chunk code using AST
    • Embedder (ingestion/embedder/) - Generate semantic embeddings
    • Indexer (ingestion/indexer/) - Orchestrate the full ingestion pipeline
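
The four stages above could be wired together roughly as follows. This is a sketch under stated assumptions, not the package's implementation: `PipelineStages`, `runIngestion`, and the stage signatures are hypothetical, and a `store` step is added so that the example ends with persistence.

```typescript
interface Chunk {
  filePath: string;
  content: string;
}

// Hypothetical stage signatures; the real modules may expose different APIs.
interface PipelineStages {
  explore: (rootPath: string) => Promise<string[]>;        // Explorer: list files
  chunk: (filePath: string) => Promise<Chunk[]>;           // Chunker: split one file
  embed: (texts: string[]) => Promise<number[][]>;         // Embedder: one vector per text
  store: (chunks: Chunk[], embeddings: number[][]) => Promise<void>;
}

// Orchestrates explore -> chunk -> embed -> store and reports the same
// statistics as the documented index() result.
async function runIngestion(
  rootPath: string,
  stages: PipelineStages
): Promise<{ indexedFiles: number; totalChunks: number }> {
  const files = await stages.explore(rootPath);
  let indexedFiles = 0;
  let totalChunks = 0;

  for (const filePath of files) {
    const chunks = await stages.chunk(filePath);
    if (chunks.length === 0) continue; // nothing to embed for this file

    const embeddings = await stages.embed(chunks.map((c) => c.content));
    await stages.store(chunks, embeddings);

    indexedFiles += 1;
    totalChunks += chunks.length;
  }

  return { indexedFiles, totalChunks };
}
```

Passing the stages in as functions keeps the orchestration testable with fakes, without a database or an embedding API.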

    Storage

    • Repository (storage/repository/) - Database operations for chunks and embeddings
    • Schema (storage/schema/) - Drizzle ORM schema definitions
    • Migrations - Managed with Drizzle ORM

    Search

    • Semantic Search (search/) - Vector similarity search with filtering
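
For orientation, a vector similarity query against pgvector might look like the one built below. The `<=>` operator is pgvector's cosine distance, so `1 - distance` gives a similarity score. The table name `code_chunks`, the column names, and the helper `buildSimilarityQuery` are assumptions made for illustration; the package defines its actual schema with Drizzle ORM.

```typescript
interface SqlQuery {
  text: string;
  values: (string | number)[];
}

// Builds a parameterized pgvector query (hypothetical table and columns).
function buildSimilarityQuery(
  queryEmbedding: number[],
  limit = 10,
  threshold = 0.5
): SqlQuery {
  // pgvector accepts vectors in text form, e.g. '[0.1,0.2,0.3]'.
  const vectorLiteral = `[${queryEmbedding.join(",")}]`;

  const text = [
    "SELECT id, content, 1 - (embedding <=> $1::vector) AS score",
    "FROM code_chunks",
    "WHERE 1 - (embedding <=> $1::vector) >= $2",
    "ORDER BY embedding <=> $1::vector",
    "LIMIT $3",
  ].join("\n");

  return { text, values: [vectorLiteral, threshold, limit] };
}
```

The resulting `text` and `values` can be handed to any PostgreSQL client that supports positional parameters.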

    🧑‍💻 Development

    # Install dependencies
    pnpm install
    
    # Run database migrations
    pnpm run db:migrate
    
    # Build the package
    pnpm build
    
    # Run in development mode with hot reload
    pnpm dev

    📄 License

    This project is licensed under the MIT License.