# Search & Learn

Resolve questions through a persisted evidence loop, not a plausible narrative.
## When to use
- Asking questions about unfamiliar codebases
- Exploring APIs and trying to understand how things work
- Q&A where the answer influences a decision
## The workflow
- Start or load a run. All evidence must be stored as real spans.
- Pick one unresolved claim. Choose the claim that most affects the answer.
- Gather the smallest next piece of evidence. Inspect one file, run one command, or fetch one doc. Store the output as a span.
- Record an attempt. Log what you tried, what spans were used, and what to do next.
- Audit the claim. Run `audit_trace_budget` on the affected claims. If flagged, gather more evidence or downgrade.
- Verify the final answer. Run `detect_hallucination` on the cited answer draft. If flagged, revise.
- Repeat until answer-critical claims are supported or explicitly marked Unknown.
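The loop above can be sketched in a few lines. Everything here is an illustrative stand-in: `gather_evidence` and this `audit_trace_budget` stub are assumptions for the sketch, not Strawberry's actual tool signatures.

```python
# Illustrative sketch of the evidence loop; function bodies are stand-ins.

def gather_evidence(claim):
    # Smallest next step: inspect one file, run one command, fetch one doc.
    # Here we just fabricate a span record tied to the claim.
    return {"id": f"S{claim['id']}", "text": f"evidence for: {claim['text']}"}

def audit_trace_budget(claims, spans):
    # Stub audit: flag any claim that has no span referencing it.
    have = {s["id"] for s in spans}
    return [c for c in claims if f"S{c['id']}" not in have]

def run_loop(claims):
    spans, attempts = [], []
    unresolved = list(claims)  # ordered most answer-critical first
    while unresolved:
        claim = unresolved[0]              # pick one unresolved claim
        span = gather_evidence(claim)      # gather the smallest evidence
        spans.append(span)
        attempts.append({"claim": claim["id"],  # record the attempt
                         "spans": [span["id"]],
                         "next": "audit"})
        unresolved = audit_trace_budget(unresolved, spans)  # audit, repeat
    return spans, attempts
```

The key property is that the loop exits only when the audit stops flagging claims, so every claim that survives is backed by a stored span.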
## Evidence pack
Strawberry does verification, not retrieval. Evidence must be collected by the agent (repo browsing, web search, experiments) or pasted by the user.
Example span types:
- S0: README excerpt describing the API contract
- S1: Code excerpt showing the implementation
- S2: Web excerpt from official docs (if used)
- S3: Experiment output (test run / curl / repro)
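One way to hold such a pack in memory is a list of span records. The field names and source values below are assumptions for illustration; the workflow only requires that evidence be stored as real spans.

```python
# Hypothetical evidence-pack records; the texts are illustrative.
spans = [
    {"id": "S0", "kind": "readme",     "text": "POST /items returns 201 on success."},
    {"id": "S1", "kind": "code",       "text": "return Response(status=201)"},
    {"id": "S2", "kind": "web_doc",    "text": "Excerpt from official docs."},
    {"id": "S3", "kind": "experiment", "text": "HTTP/1.1 201 Created"},
]

def by_id(spans):
    # Index spans so a citation like [S1] resolves to its evidence.
    return {s["id"]: s for s in spans}
```

An index like `by_id` makes it cheap to check each `[S#]` citation in a draft answer against the exact evidence text it points at.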
## Verifier settings
```
detect_hallucination(
  answer="Your answer with [S#] citations...",
  spans=[...],
  require_citations=true,
  context_mode="cited"
)
```

## Copy/paste prompt
```
Answer using **only** evidence in S0–S2.
Every factual sentence must end with citations like [S0].
If something is unknown, say "unknown from evidence."
Then run detect_hallucination(require_citations=true, context_mode="cited") and revise if flagged.
```

## What good looks like
- Every factual claim has a citation
- Unknown facts are explicitly labeled
- The answer stops at what the evidence supports
- Gaps are identified with suggestions for additional evidence
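The first two checks above can be approximated with a local stand-in. This sketch enforces only the citation rule and the "unknown from evidence" label; it is an assumption for illustration, not Strawberry's `detect_hallucination`.

```python
import re

# Stand-in check: every factual sentence must end with a [S#] citation,
# and unknowns must be labeled explicitly. A sketch, not the real verifier.
def check_answer(answer):
    flagged = []
    for sentence in (s.strip() for s in answer.split(".")):
        if not sentence:
            continue
        if "unknown from evidence" in sentence:
            continue                      # explicitly labeled unknown
        if not re.search(r"\[S\d+\]$", sentence):
            flagged.append(sentence)      # uncited factual sentence
    return flagged
```

A draft that passes this check still needs the real verifier, which also tests whether each cited span actually supports its sentence.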