Edition #12

Evals Are the New Product Spec

Dan Toma · June 16, 2026 · 4 min read
Key Takeaway

When the same prompt can give different answers, a fixed spec cannot define "good." Evals do, by scoring real examples on every change. Build the eval first, then let the score pick the model. The eval is the new product spec.


FAQ

What is an eval in AI product development?

An eval is a set of test cases paired with a definition of a good answer, run automatically whenever you change a prompt, model, or data source. It scores whether the new version performs better or worse than the last. It replaces the fixed specification, which stops working once a model's output varies from run to run.
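To make that definition concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `EvalCase` type, the `score` function, and the two stand-in versions `version_a` and `version_b` are hypothetical names, and the canned strings take the place of real model calls. It shows only the shape of the idea: test cases, a definition of good for each, and one number per version.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    prompt: str
    # The "definition of a good answer" for this case (hypothetical check).
    is_good: Callable[[str], bool]


def score(generate: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Return the fraction of cases whose output is judged good."""
    passed = sum(1 for case in cases if case.is_good(generate(case.prompt)))
    return passed / len(cases)


# Stand-ins for two versions of a product; a real setup would call a model here.
def version_a(prompt: str) -> str:
    return "Refunds are processed within 5 business days."


def version_b(prompt: str) -> str:
    return "Please contact support."


CASES = [
    EvalCase("How long do refunds take?", lambda out: "5 business days" in out),
    EvalCase("When will I get my money back?", lambda out: "refund" in out.lower()),
]

score_a = score(version_a, CASES)
score_b = score(version_b, CASES)
print(f"version_a: {score_a:.2f}, version_b: {score_b:.2f}")
```

Because a real model's output varies, a real run would likely sample each case several times and average. The checks here are simple substring tests only to keep the sketch short.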

Why are evals replacing product requirement documents?

AI output is not deterministic, so you cannot specify exact behavior in advance. Instead of describing what the software must do, you define how you will judge whether it is good and test against real examples continuously. As one practitioner put it, evals are the modern version of a PRD.

How do I start building evals for an AI product?

Collect real examples of your product task and label each output as good or not, with a reason. Turn that library into automated test cases, then run every prompt or model change against it before shipping. Assign a person to own the definition of good, since that judgment is the core of the product.
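Those steps can be sketched in Python. This is one possible shape, not a prescribed method: `LabeledExample`, `passes`, and `ready_to_ship` are hypothetical names, the `phrase` field assumes each reviewer's reason can be reduced to a phrase check, and `candidate` stands in for whatever prompt or model change is under review.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class LabeledExample:
    task_input: str
    output: str   # an output a person actually reviewed
    good: bool    # the label
    reason: str   # why it got that label
    phrase: str   # assumption: the phrase the reason hinges on


def passes(example: LabeledExample, candidate_output: str) -> bool:
    found = example.phrase.lower() in candidate_output.lower()
    # Good example: the phrase must appear. Bad example: it must not.
    return found if example.good else not found


def ready_to_ship(
    generate: Callable[[str], str],
    library: List[LabeledExample],
    baseline_score: float,
) -> Tuple[bool, float]:
    """Run a candidate against the library; ship only if it matches the baseline."""
    results = [passes(ex, generate(ex.task_input)) for ex in library]
    candidate_score = sum(results) / len(results)
    return candidate_score >= baseline_score, candidate_score


LIBRARY = [
    LabeledExample("Cancel my subscription",
                   "Your plan is cancelled; no further charges.",
                   True, "Confirms the cancellation", "cancelled"),
    LabeledExample("Cancel my subscription",
                   "Have you considered upgrading instead?",
                   False, "Pushes an upsell instead of acting", "upgrading"),
]


def candidate(task_input: str) -> str:
    # Stand-in for the changed prompt or model.
    return "Done. Your plan is cancelled."


ok, candidate_score = ready_to_ship(candidate, LIBRARY, baseline_score=0.5)
print(f"ship: {ok}, score: {candidate_score:.2f}")
```

The `reason` field is never executed, but it is what lets the owner of the definition of good revisit a label later and decide whether it still holds.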
