Keep your Agents on Spec with Kiln
Kiln has two powerful features that help you ensure your AI systems perform as expected, guide optimizations, and catch regressions in quality:
Evaluations: powerful AI evals, made easy.
Specifications [Docs Coming Soon]: work with Kiln's Copilot to build specifications for agent behaviour, and automatically generate evals/judges from them.
Evals 101: build your first eval start to finish
Evaluate RAG Accuracy: Kiln can generate custom Q&A evals that test your RAG system using knowledge from your documents
Evaluate Tool Use: use tool use evals to ensure your agents call the right tools, at the right time, with the right parameters