Evals & Specs

Keep your Agents on Spec with Kiln

Overview

Kiln has two powerful features that help you confirm your AI systems perform as expected, guide optimization, and prevent quality regressions:

  • Evaluations: powerful AI evals, made easy.

  • Specifications [Docs Coming Soon]: work with Kiln's Copilot to build specifications for agent behaviour, including automatically generating matching evals/judges.

Guides

  • Evals 101: build your first eval from start to finish

  • Evaluate RAG Accuracy: Kiln can generate custom Q&A evals that test your RAG system against knowledge from your documents

  • Evaluate Tool Use: use tool use evals to ensure your agents call the right tools, at the right time, with the right parameters
