Specifications

Kiln Specs combine evals, synthetic data generation, and Kiln Copilot into one easy-to-use feature.

Demo & Quick Start


Note: Kiln Specs require a Kiln Copilot account. Registration is free and easy inside the Kiln app.

What are Kiln Specs?

Kiln Specs combine three of Kiln's best features into an interactive tool: evals, synthetic data generation, and Copilot. Together they go beyond making an eval manually in several ways:

  • Identify Gaps with AI: Kiln reads your judge prompt and helps refine it. It detects underspecified aspects of your judge, conflicts with your task definition, ambiguous aspects that judges may struggle with, and other common issues. It then works with you to close gaps and resolve conflicts.

  • Interactive Human Alignment & Accuracy: Building an LLM-as-judge that is as good as a human isn't easy. Human judges make subtle, subjective decisions and have a hard time articulating their judgement process in a way LLMs can duplicate. Our alignment loop finds tough edge cases, compares the LLM judge's decisions to human preference, and works with you iteratively until your judge is aligned with your preference.

  • Automatic Synthetic Data: Build a robust synthetic dataset generator as you work. By the time you save your spec, you’ll have large and accurate datasets for evals and training.

  • Judge Meta-prompting: Humans often struggle to write effective eval judge prompts. Our judge meta-prompting takes your issues and concerns in human terms and turns them into accurate, judgeable evals.

  • Easy to Use: Subject matter experts can easily create accurate evals without a lengthy iteration loop with data scientists. Kiln's Copilot walks you through every step: defining your judge, creating synthetic data, building a golden dataset, aligning your judge, and creating training datasets. You get the same rigorous process without managing each step yourself.

  • Fast: Creating a Spec can take as little as 5 minutes, compared to over 30 minutes for a manually built eval.
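
To make the alignment idea above concrete, here is a minimal sketch of how agreement between an LLM judge and a human reviewer could be measured on a small labeled set. This is an illustration only, not Kiln's implementation or API: the `LabeledExample` class, the `agreement_rate` and `disagreements` functions, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LabeledExample:
    """One model output with a pass/fail verdict from a human and from the judge.

    Hypothetical structure for illustration; not a Kiln data type.
    """
    output: str
    human_pass: bool
    judge_pass: bool


def agreement_rate(examples: list[LabeledExample]) -> float:
    """Fraction of examples where the judge's verdict matches the human's."""
    if not examples:
        raise ValueError("need at least one labeled example")
    matches = sum(e.human_pass == e.judge_pass for e in examples)
    return matches / len(examples)


def disagreements(examples: list[LabeledExample]) -> list[LabeledExample]:
    """Examples where judge and human differ; these are the cases worth reviewing."""
    return [e for e in examples if e.human_pass != e.judge_pass]


if __name__ == "__main__":
    # Made-up sample data, purely for demonstration.
    sample = [
        LabeledExample("Polite, complete answer", human_pass=True, judge_pass=True),
        LabeledExample("Answer ignores the question", human_pass=False, judge_pass=False),
        LabeledExample("Correct but curt answer", human_pass=True, judge_pass=False),
        LabeledExample("Off-topic joke", human_pass=False, judge_pass=False),
    ]
    print(f"agreement: {agreement_rate(sample):.2f}")
    for e in disagreements(sample):
        print("review:", e.output)
```

In a loop like the one described above, the disagreeing cases are the ones a person would inspect before revising the judge prompt and measuring agreement again.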

How to Get Started

Creating a Spec is easy:

  • Open the Kiln App to any task

  • Click "Specs & Evals" in the sidebar

  • Click "Create Spec"

  • Connect your Kiln Copilot account (if you haven't already)

  • Follow the interactive steps until complete!

See the video above for a complete walkthrough.
