Objective Optimization
Systematic experiment loop for any well-defined objective
When to use
- Lowering a benchmark metric or increasing a pass rate
- Reducing latency or cost
- Improving answer quality against a fixed evaluator
- Satisfying an acceptance test while keeping guardrails intact
The loop
- Define the objective card. Objective, direction, evaluator, guardrails, allowed levers, stop rule.
- Measure baseline. Run the evaluator and store the result as a span.
- Pick one bottleneck or hypothesis.
- Run the smallest experiment. Note git state, apply the change, measure.
- Record an attempt. Include objective metric, value, and keep/revert decision.
- Audit. Run `audit_trace_budget` on the objective claims.
- Keep or revert. If it improves the objective without breaking guardrails, keep. Otherwise revert.
- Repeat until the stop rule is satisfied.
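The loop above can be sketched in code. This is a minimal illustration, not the tool's actual implementation: every name here (`ObjectiveCard`, `Attempt`, `run_loop`, the toy evaluator) is a hypothetical placeholder for whatever your evaluator and guardrails actually are.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical objective card: objective, direction, evaluator,
# guardrails, and stop rule, as in the loop above. Illustrative only.
@dataclass
class ObjectiveCard:
    objective: str
    direction: str                          # "minimize" or "maximize"
    evaluator: Callable[[dict], float]
    guardrails: Callable[[dict], bool]
    stop_rule: Callable[[float, int], bool]

@dataclass
class Attempt:
    hypothesis: str   # what changed and why (EXPERIMENT_MECHANISM)
    value: float      # measured objective value
    kept: bool        # keep/revert decision

def improves(card: ObjectiveCard, new: float, best: float) -> bool:
    return new < best if card.direction == "minimize" else new > best

def run_loop(card, state, experiments):
    """Measure baseline, try one change at a time, keep or revert."""
    best = card.evaluator(state)            # baseline measurement
    attempts = []
    for i, (hypothesis, change) in enumerate(experiments):
        candidate = {**state, **change}     # smallest experiment: one change
        value = card.evaluator(candidate)
        keep = improves(card, value, best) and card.guardrails(candidate)
        attempts.append(Attempt(hypothesis, value, keep))
        if keep:                            # keep: adopt the change
            state, best = candidate, value
        if card.stop_rule(best, i + 1):     # stop rule satisfied
            break
    return state, attempts

# Toy usage: minimize a made-up latency model by tuning a batch size.
card = ObjectiveCard(
    objective="p50 latency (ms)",
    direction="minimize",
    evaluator=lambda s: 100 / s["batch"] + s["batch"] * 2,
    guardrails=lambda s: s["batch"] <= 16,  # e.g. a memory limit
    stop_rule=lambda best, n: best < 30 or n >= 10,
)
final, log = run_loop(card, {"batch": 1}, [
    ("larger batches amortize overhead", {"batch": 4}),
    ("even larger batches", {"batch": 8}),
    ("past the guardrail", {"batch": 32}),  # would violate the guardrail
])
```

Each iteration changes exactly one thing, measures, and records the keep/revert decision, which is what makes the attempt log auditable afterwards.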
Verified claims
- OBJECTIVE_DEFINED — the objective and evaluator are correctly specified
- BASELINE_MEASURED — the current state has been measured and stored
- EXPERIMENT_MECHANISM — each attempt says what changed and why
- CURRENT_BEST_RESULT — the best retained attempt actually improves the objective
- GUARDRAILS_OK — regressions were checked
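One way to think about these claims is as named checks over the objective card and attempt log. The sketch below is an assumption about how such checks could look, not the auditor's real logic; claim names mirror the list above, everything else (`verify_claims`, the record fields, a minimize-only comparison) is hypothetical.

```python
# Hypothetical claim checks over an attempt log. The claim names match
# the verified-claims list; the check logic is illustrative only and
# assumes a "minimize" objective.
def verify_claims(card: dict, baseline: float, attempts: list[dict]) -> dict:
    kept = [a for a in attempts if a["kept"]]
    best = min(a["value"] for a in kept) if kept else baseline
    return {
        "OBJECTIVE_DEFINED": bool(card.get("objective")) and bool(card.get("evaluator")),
        "BASELINE_MEASURED": baseline is not None,
        "EXPERIMENT_MECHANISM": all(a.get("hypothesis") for a in attempts),
        "CURRENT_BEST_RESULT": best < baseline,
        "GUARDRAILS_OK": all(a.get("guardrails_ok", False) for a in kept),
    }

# Toy data: one kept attempt that improved on the baseline, one reverted.
card = {"objective": "p50 latency (ms)", "evaluator": "bench suite"}
claims = verify_claims(card, baseline=102.0, attempts=[
    {"hypothesis": "larger batch", "value": 33.0, "kept": True,  "guardrails_ok": True},
    {"hypothesis": "cache results", "value": 40.0, "kept": False, "guardrails_ok": True},
])
```

If any check fails, the corresponding claim cannot be made and the loop (or the audit) should flag it rather than report success.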