Workflow Playbooks

Verification-first workflows for common AI-assisted coding tasks

These playbooks show how to use Berry as a verification step to catch when a model "vibed" instead of using real evidence. Each playbook is a tight sequence: collect evidence, write with citations, run the verifier, revise.

Two tools, five workflows. Pick the one that matches what you're doing.

The workflows are built around Berry's two verification tools, detect_hallucination and audit_trace_budget, so any agentic system can plug them in as a verification layer. Every playbook applies the same evidence gate described above.
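The collect, write, verify, revise sequence can be sketched as a small loop. This is a minimal illustration, not Berry's implementation: the `evidence_gate` function, the `Verdict` type, its `notes` field, and the three callables are all hypothetical names. Only the `flagged` field mirrors what the demos below show.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    flagged: bool    # mirrors the {"flagged": ...} field shown in the demos
    notes: str = ""  # hypothetical: feedback passed back to the writer


def evidence_gate(
    collect: Callable[[], list[str]],
    write: Callable[[list[str], str], str],
    verify: Callable[[str, list[str]], Verdict],
    max_rounds: int = 3,
) -> tuple[str, bool]:
    """Run collect -> write -> verify -> revise until the draft passes.

    Returns (draft, passed). All callables are stand-ins; `verify` would
    wrap a call to detect_hallucination or audit_trace_budget.
    """
    evidence = collect()
    feedback = ""
    draft = ""
    for _ in range(max_rounds):
        draft = write(evidence, feedback)
        verdict = verify(draft, evidence)
        if not verdict.flagged:
            return draft, True
        feedback = verdict.notes  # revise using the verifier's feedback
    return draft, False
```

The loop gives up after `max_rounds` and returns `passed=False` rather than shipping a flagged draft, which is the point of the gate: an unverified answer is surfaced as unverified.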

The two verification tools

detect_hallucination

Verifies a cited answer sentence-by-sentence. Use for Q&A, documentation, and any task where the output is text with factual claims.

Used in: Search & Learn, Greenfield Prototyping

Read the deep dive

claude-code
> Run detect_hallucination on this statement: "Give me information about Project Atlas."

claude-code: ... gathering evidence and framing the claim

• Called
  └ hallucination-detector.detect_hallucination({
        "answer":"Project Atlas launched in 2020 and uses the Apollo API for
        analytics...",
        ...
      })
    {"flagged": true}
claude-code: ... the verifier returned a flagged result. The first sentence is supported; the second is contradicted by the source, so it is flagged as a hallucination. The model needs to revise its claim, or you need to provide more evidence or let me gather it. If you want, I can run the same demo with multiple spans or mixed citations per claim.
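To make "sentence-by-sentence" concrete, here is a rough local pre-check that splits a cited answer into sentences and lists the ones carrying no citation. It is a sketch only: the `[S0]`-style marker format, the naive sentence splitter, and the function name are assumptions for illustration, not Berry's documented behavior, and the sample answer is made up.

```python
import re

# Assumed citation marker format, e.g. "[S0]"; the real format may differ.
CITE = re.compile(r"\[(S\d+)\]")


def split_cited_sentences(answer: str) -> list[tuple[str, list[str]]]:
    """Split an answer into (sentence, citation ids) pairs.

    Uses a naive split on whitespace after ., !, or ?; it will mis-split
    text containing abbreviations or version numbers followed by spaces.
    """
    parts = re.split(r"(?<=[.!?])\s+", answer.strip())
    return [(p, CITE.findall(p)) for p in parts if p]


# Illustrative answer, invented for this sketch.
answer = "Project Atlas launched in 2020 [S0]. It uses the Apollo API for analytics."
uncited = [s for s, cites in split_cited_sentences(answer) if not cites]
```

A check like this only finds missing citations. Whether a cited sentence is actually supported by its source is what the verifier itself decides.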

audit_trace_budget

Verifies a structured trace of reasoning steps—each step is a claim plus citations. Use when you want to catch "almost right" reasoning before it becomes a confident patch.

Used in: Refactoring, Boilerplate, Inline

Read the deep dive

claude-code
> "Build me a tool that helps my users calculate their tax."

claude-code: ... "Easy, let me use Next.js with TypeScript and Prisma for the frontend and API layer..." Wait, before I start, let me run hallucination detection.

• Called
  └ hallucination-detector.audit_trace_budget({
        "steps":[
          {"idx":0,"claim":"Use Next.js with TypeScript and Prisma..."}
        ]
      })
    {"flagged": true}

claude-code: ... The user never specified a stack, so my plan lacks the evidence budget to continue. Better to ask for more context.
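The trace in this demo is a list of steps, each a claim plus citations. A minimal sketch of assembling such a payload follows. The `idx` and `claim` keys appear in the demo above; the `cites` key, the `S0` span id, and both helper functions are assumed names, so check the tool's schema before relying on them.

```python
import json


def make_step(idx: int, claim: str, cites: list[str]) -> dict:
    # "idx" and "claim" appear in the demo; "cites" is an assumed field name.
    return {"idx": idx, "claim": claim, "cites": list(cites)}


def uncited_steps(steps: list[dict]) -> list[int]:
    """Return the idx of every step that has no citation attached."""
    return [s["idx"] for s in steps if not s["cites"]]


steps = [
    make_step(0, "The user wants a tool that calculates tax.", ["S0"]),
    make_step(1, "Use Next.js with TypeScript and Prisma.", []),
]

# Serialized arguments for the tool call; the stack choice in step 1 has
# no supporting evidence, which is the situation the demo flags.
payload = json.dumps({"steps": steps})
```

Here `uncited_steps(steps)` points at step 1 before any tool call is made, which matches the demo's outcome: the stack was never requested, so the plan should turn into a question for the user.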

Workflow prompts in practice

These playbooks are written as skills with max-contrast (vibes vs. evidence) examples, aimed at exploration, root-cause analysis (RCA), and greenfield planning.

Search & Learn

"Explain how this repo handles auth end-to-end."

Refactoring & Bug Fixes

"Root cause why the like button broke between v1.0.0 and v1.0.1."

Greenfield Prototyping

"Make me a website."

Plan & Execute (Verified)

"Vibe code me a new CV rewriting tool that hacks the ATS system."

Search & Learn

detect_hallucination

Q&A, repo exploration, API understanding. When you're asking questions or trying to understand unfamiliar code.

Refactoring & Bug Fixes

audit_trace_budget

RCA-gated changes with an audited claim trace. Force a high-signal writeup backed by evidence before merging.

Greenfield Prototyping

detect_hallucination

Move fast with Facts vs Decisions vs Assumptions. Don't let assumptions get smuggled in as facts.
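One way to keep the three categories apart is to tag each statement explicitly and check that anything labeled a fact carries a citation. The sketch below is illustrative only: the `Kind` enum, the `Statement` type, the `cites` field, and the sample statements are hypothetical and not part of Berry.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    FACT = "fact"              # must be backed by evidence
    DECISION = "decision"      # a choice made by the author, no evidence needed
    ASSUMPTION = "assumption"  # unverified, stated openly as such


@dataclass
class Statement:
    kind: Kind
    text: str
    cites: list[str] = field(default_factory=list)


def smuggled_assumptions(statements: list[Statement]) -> list[Statement]:
    """Return statements labeled FACT that carry no citation."""
    return [s for s in statements if s.kind is Kind.FACT and not s.cites]


plan = [
    Statement(Kind.FACT, "The user asked for a website.", ["S0"]),
    Statement(Kind.FACT, "The site needs user accounts."),  # no evidence
    Statement(Kind.DECISION, "Start with a single static page."),
    Statement(Kind.ASSUMPTION, "Traffic will be low at launch."),
]
suspect = smuggled_assumptions(plan)
```

Anything returned by `smuggled_assumptions` should either gain a citation or be relabeled as an assumption before the draft goes to the verifier.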

Generate Boilerplate

audit_trace_budget

Tests, docs, migrations, configs. Verify constraints and decisions before generating code.

Inline Completions

audit_trace_budget

Spot-check high-impact tab-complete with a 3-6 step micro-trace.
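A micro-trace is short by design, so a small builder can enforce the 3-6 step range before the trace is sent for auditing. This is a sketch under assumptions: `build_micro_trace` is a hypothetical helper, and the step shape copies the `idx` and `claim` keys from the audit_trace_budget demo without citations.

```python
def build_micro_trace(
    claims: list[str], min_steps: int = 3, max_steps: int = 6
) -> list[dict]:
    """Turn a short list of claims into trace steps, enforcing the 3-6 range."""
    if not min_steps <= len(claims) <= max_steps:
        raise ValueError(
            f"micro-trace needs {min_steps}-{max_steps} steps, got {len(claims)}"
        )
    # "idx" and "claim" mirror the step shape in the demo; citations are
    # omitted here and would be attached per step in practice.
    return [{"idx": i, "claim": claim} for i, claim in enumerate(claims)]
```

Rejecting traces outside the range keeps the spot-check cheap: fewer than three steps rarely exposes the reasoning, and more than six suggests the completion deserves a full audited trace instead.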

Client tips

These playbooks are written as skills. In practice, MCP clients vary in how strictly they execute the sequence.

  • Codex: Best adherence. Tends to follow the skill end-to-end without deviating.
  • Claude: Start in /plan mode and ask it to create a plan for the workflow. Then execute the plan step-by-step, requiring the verifier call before the final answer.
  • Other clients: May skip tool calls or drift. Pin the "copy/paste prompt" from each playbook as a system instruction.