@wundr.io/agent-eval
Agent evaluation framework with LLM-based grading for AI agent quality assessment
Found 5 results for ai-evaluation
Agent evaluation framework with LLM-based grading for AI agent quality assessment
AI Chat History Extractor & Docker-based Automation - TypeScript implementation for GitHub Copilot and other AI coding assistants with automated testing capabilities
A tool to get the best results out of an LLM.
Lightweight CLI tool and library for detecting AI model drift using embeddings and scalar metrics. Tracks semantic, conceptual, and lexical change over time.
🤖 Intelligent AI execution system with built-in tracking, evaluation, and self-improvement capabilities. The complete AI intelligence platform for enterprise applications.