serdes-ai-evals
Evaluation framework for testing and benchmarking serdes-ai agents
This crate provides evaluation and testing capabilities for SerdesAI:
- Test case definitions
- Evaluation metrics (accuracy, latency, cost)
- Benchmark harness
- Regression testing
- LLM-as-judge evaluators (see the custom evaluator sketch after this list)
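As a taste of the evaluator API, here is a minimal sketch of a custom evaluator. Only the `Evaluator` name comes from the crate's exports (see Usage below); the trait's method signature and the boolean pass/fail shape are assumptions for illustration:

```rust
use serdes_ai_evals::Evaluator;

// Hypothetical custom evaluator: passes when the agent's output
// mentions a required phrase. The `evaluate` signature shown here
// is an assumption; consult the crate docs for the real trait.
struct MustMention {
    phrase: String,
}

impl Evaluator for MustMention {
    fn evaluate(&self, output: &str) -> bool {
        output.contains(&self.phrase)
    }
}
```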
Installation

```toml
[dependencies]
serdes-ai-evals = "0.2"
```
Usage

```rust
use serdes_ai_evals::{EvalSuite, TestCase, Evaluator};

// Build a suite of named test cases, each pairing an input prompt
// with a substring the agent's output is expected to contain.
let suite = EvalSuite::new("my-agent-tests")
    .case(TestCase::new("greeting")
        .input("Hello!")
        .expected_contains("Hello"))
    .case(TestCase::new("math")
        .input("What is 2+2?")
        .expected_contains("4"));

// `agent` is assumed to be constructed elsewhere (see the main serdes-ai crate).
// `run` is async, so call it from an async context.
let results = suite.run(&agent).await?;
println!("Pass rate: {:.1}%", results.pass_rate() * 100.0);
```
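The pass rate feeds naturally into regression testing: gate CI on a baseline so quality drops fail the build. This is a hedged sketch assuming a tokio test runner; `build_agent` and `build_suite` are hypothetical helpers standing in for the setup shown above:

```rust
// Hedged sketch: `build_agent` and `build_suite` are hypothetical helpers
// wrapping the construction shown in the Usage example.
#[tokio::test]
async fn agent_meets_baseline() -> Result<(), Box<dyn std::error::Error>> {
    let agent = build_agent();
    let suite = build_suite();
    let results = suite.run(&agent).await?;
    // Fail the build if the pass rate regresses below the agreed baseline.
    assert!(
        results.pass_rate() >= 0.90,
        "pass rate dropped to {:.1}%",
        results.pass_rate() * 100.0
    );
    Ok(())
}
```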
Part of SerdesAI
This crate is part of the SerdesAI workspace.
For most use cases, depend on the main serdes-ai crate, which re-exports these types.
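If you depend on the umbrella crate, the evaluation types would be reached through whatever module it re-exports them under. The path below is an assumption for illustration, not a confirmed API:

```rust
// Hypothetical re-export path: verify against serdes-ai's documentation.
use serdes_ai::evals::{EvalSuite, TestCase};
```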
License
MIT License - see LICENSE for details.