Found 5 results for ai-evaluation

@wundr.io/agent-eval

Agent evaluation framework with LLM-based grading for AI agent quality assessment

@codervisor/devlog-ai

AI Chat History Extractor & Docker-based Automation - TypeScript implementation for GitHub Copilot and other AI coding assistants with automated testing capabilities

@hmodecode/multimind-mcpserver

Tool to get the best results out of an LLM!

tkyodrift

Lightweight CLI tool and library for detecting AI model drift using embeddings and scalar metrics. Tracks semantic, conceptual, and lexical change over time.

@handit.ai/ai-wrapper

🤖 Intelligent AI execution system with built-in tracking, evaluation, and self-improvement capabilities. The complete AI intelligence platform for enterprise applications.