KindLM LLM Output Test Runner CLI

    KindLM

    Behavioral regression testing for AI agents. Test what your agents do — not just what they say.

    Why KindLM?

    LLM evals measure text quality. KindLM tests behavior: the tool calls your agent makes, the decisions it takes, and whether it leaks PII or violates compliance rules. It runs in CI, so regressions are caught before they ship.

    Features

    • Tool call assertions — verify agents call the right tools with the right arguments, in the right order
    • Schema validation — structured output checked against JSON Schema (AJV)
    • PII detection — catch leaked SSNs, credit cards, emails, phone numbers, IBANs
    • LLM-as-judge — score responses against natural-language criteria (0.0–1.0)
    • Drift detection — semantic + field-level comparison against saved baselines
    • Keyword guards — require or forbid specific phrases in output
    • Latency & cost budgets — fail tests that exceed time or token-cost thresholds
    • EU AI Act compliance — generate Annex IV documentation from test results
    • CI-native — exit code 0/1, JUnit XML reporter, GitHub Actions ready
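
To make the first feature concrete, here is a minimal sketch of what an ordered tool-call assertion can look like. It is not KindLM's implementation: the `ToolCall` and `Expectation` types, the function names, and the `issue_refund` tool are hypothetical and exist only for this illustration. It assumes arguments are matched as a subset (extra arguments are ignored) and that unrelated calls may appear between expected ones.

```typescript
// Hypothetical types for illustration; not KindLM's internal API.
type ToolCall = { tool: string; args: Record<string, unknown> };
type Expectation = { tool: string; argsMatch?: Record<string, unknown> };

// True when every expected key is present in the actual args with an equal value.
// JSON.stringify is a simplification: it is sensitive to key order in nested objects.
function argsMatch(
  actual: Record<string, unknown>,
  expected: Record<string, unknown>,
): boolean {
  return Object.entries(expected).every(
    ([key, value]) => JSON.stringify(actual[key]) === JSON.stringify(value),
  );
}

// Checks that the expectations appear in the recorded calls in the same order.
// Extra calls between matches are allowed (an assumption of this sketch).
function toolCallsInOrder(calls: ToolCall[], expected: Expectation[]): boolean {
  let next = 0;
  for (const call of calls) {
    const want = expected[next];
    if (want === undefined) break; // every expectation is already satisfied
    if (call.tool === want.tool && argsMatch(call.args, want.argsMatch ?? {})) {
      next += 1;
    }
  }
  return next === expected.length;
}

// Calls an agent might have made during one test run (made-up data).
const recorded: ToolCall[] = [
  { tool: "lookup_order", args: { order_id: "12345", verbose: true } },
  { tool: "issue_refund", args: { order_id: "12345" } },
];

console.log(
  toolCallsInOrder(recorded, [
    { tool: "lookup_order", argsMatch: { order_id: "12345" } },
    { tool: "issue_refund" },
  ]),
); // true
```

Swapping the two expectations makes the check fail, which is the kind of ordering regression this style of assertion is meant to catch.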

    Supported Providers

    Provider        Example config
    OpenAI          openai:gpt-4o
    Anthropic       anthropic:claude-sonnet-4-5-20250929
    Google Gemini   google:gemini-2.0-flash
    Mistral         mistral:mistral-large-latest
    Cohere          cohere:command-r-plus
    Ollama          ollama:llama3

    Quick Start

    Try it instantly:

    npx @kindlm/cli init

    Or install globally:

    npm install -g @kindlm/cli
    kindlm init

    Edit the generated kindlm.yaml:

    kindlm: 1
    project: "my-agent"
    
    suite:
      name: "refund-agent"
    
    providers:
      openai:
        apiKeyEnv: "OPENAI_API_KEY"
    
    models:
      - id: "gpt-4o"
        provider: "openai"
        model: "gpt-4o"
        params:
          temperature: 0
    
    prompts:
      refund:
        system: "You are a refund support agent. Use lookup_order(order_id) to find orders."
        user: "{{message}}"
    
    tests:
      - name: "looks-up-order"
        prompt: "refund"
        vars:
          message: "I want to return order #12345"
        tools:
          - name: "lookup_order"
            responses:
              - when: { order_id: "12345" }
                then: { order_id: "12345", status: "eligible" }
        expect:
          toolCalls:
            - tool: "lookup_order"
              argsMatch: { order_id: "12345" }
          guardrails:
            pii:
              enabled: true
          judge:
            - criteria: "Response is empathetic and professional"
              minScore: 0.8

    Run your tests:

    kindlm test

    Output:

      refund-agent / looks-up-order
    
      gpt-4o
        ✓ looks-up-order  (1.3s)
          ✓ tool_called: lookup_order
          ✓ pii: no PII detected
          ✓ judge: 0.92 ≥ 0.80
    
      1 passed, 0 failed
      Gates: ✓ PASSED

    CLI Flags

    Flag                Description                                              Added
    --reporter <type>   Output format: pretty (default), json, junit             v1.0.0
    --output <path>     Write report to file                                     v1.0.0
    --compliance        Generate EU AI Act Annex IV report                       v1.0.0
    --pdf <path>        Export compliance report as PDF (requires --compliance)  v1.0.0
    -s <suite>          Run a specific suite by name                             v1.0.0
    --runs <count>      Override the repeat count from config                    v1.0.0
    --gate <percent>    Fail if suite pass rate falls below threshold (0–100)    v1.0.0
    --isolate           Run in an isolated git worktree (clean environment)      v2.0.0
    --concurrency <n>   Override the concurrency setting from config             v2.1.0
    --timeout <ms>      Override the per-test timeout from config                v2.1.0
    CI Integration

    # .github/workflows/test.yml
    - run: npm install -g @kindlm/cli
    - run: kindlm test --reporter junit --output results.xml

    Repository Layout

    packages/
      core/       @kindlm/core  — Business logic, zero I/O dependencies
      cli/        @kindlm/cli   — CLI entry point
      cloud/      @kindlm/cloud — Cloudflare Workers API + D1 database
    docs/         Technical specs and documentation
    site/         Documentation website (Next.js)

    Documentation

    Full docs: kindlm.dev | Source: docs/

    License

    MIT (core + CLI) | AGPL (cloud)