SanityHarness exists to provide high-signal, agent-agnostic evaluation for AI agents. In a sea of noise and cherry-picked benchmarks, we strive for isolation, weight, and clarity.
Running comprehensive evals is expensive. We appreciate API credits (GPT-5, Opus, etc.) or donations to sustain the harness.