CoVe: Verify Factual Claims One By One
The travel assistant may write a smooth sentence:
West Lake will be sunny tomorrow afternoon, and the tea museum closes at 9 PM.
That sentence contains two factual claims: a weather forecast and a closing time. Asking the same model to "check again" is weak, because it may reread its own answer and simply agree with it.
CoVe (Chain-of-Verification) is stricter: draft, extract verifiable claims, verify each claim on its own, then revise from the verification results.
One Sentence
CoVe turns post-hoc confidence into claim extraction, independent verification, and revision, so factual errors can be caught.
What Breaks Without It
| Problem | What it looks like | Risk |
|---|---|---|
| Fluent answer | Sounds true | Facts may be wrong |
| Vague review | "Checked" | No evidence artifact |
| Citation drift | Has citations | Citation may not support the claim |
What This Pattern Changes
| Who | Owns |
|---|---|
| Draft model | Writes the initial answer |
| Claim extractor | Splits answer into verifiable claims |
| Verifier | Returns ok and evidence for each claim |
| Revising model | Removes or rewrites failed claims |
Verification should produce evidence, not vibes.
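One way to make "evidence, not vibes" concrete is to require that every passing verdict points at something a reviewer can open. The sketch below is illustrative only: `EvidenceRecord`, `is_auditable`, and the `doc-17` id are hypothetical names and placeholder values, not part of `agent_patterns_lab`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceRecord:
    """Hypothetical per-claim record; field names are illustrative."""

    claim: str
    ok: bool
    source_id: str  # e.g. a document id or tool-call id (placeholder values below)
    snippet: str    # the text or calculation that backs the verdict


def is_auditable(record: EvidenceRecord) -> bool:
    """A passing verdict counts only if it carries a source and a snippet."""
    if not record.ok:
        return True  # a failed claim is blocked anyway, so it needs no proof here
    return bool(record.source_id.strip()) and bool(record.snippet.strip())


records = [
    EvidenceRecord("Paris is the capital of France", True, "doc-17", "Paris is the capital of France."),
    EvidenceRecord("Paris has 3 moons", True, "", ""),  # marked ok, but nothing backs it
]

# Claims that were waved through without evidence
unaudited = [r.claim for r in records if not is_auditable(r)]
print(unaudited)  # ['Paris has 3 moons']
```

A check like this could run before the revise step, so that an "ok" without a source is treated as a failure rather than a pass.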
Walk Through One Trace
| Stage | Content | Result |
|---|---|---|
| Draft | Paris is the capital of France. Paris has 3 moons. | One true, one false |
| Extract | ["Paris is the capital of France", "Paris has 3 moons"] | Claim list |
| Verify | capital true, 3 moons unsupported | Failure marked |
| Revise | Paris is the capital of France. | Bad claim removed |
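The Extract stage has to end in a plain list of strings before verification can start. Here is a minimal sketch of that parsing step, assuming the extractor replies with JSON shaped like `{"claims": [...]}` as in this page's example. `parse_claims` is a hypothetical helper, not a function from `agent_patterns_lab`.

```python
import json


def parse_claims(raw: str) -> list[str]:
    """Turn an extractor reply like {"claims": [...]} into a clean claim list."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("extractor reply is not valid JSON") from exc
    if not isinstance(payload, dict) or not isinstance(payload.get("claims"), list):
        raise ValueError('extractor reply must look like {"claims": [...]}')

    seen: set[str] = set()
    claims: list[str] = []
    for item in payload["claims"]:
        if not isinstance(item, str):
            continue  # skip non-string entries instead of guessing their meaning
        text = item.strip()
        if text and text not in seen:  # drop blanks and exact duplicates, keep order
            seen.add(text)
            claims.append(text)
    return claims


reply = '{"claims":["Paris is the capital of France","Paris has 3 moons"]}'
print(parse_claims(reply))
# ['Paris is the capital of France', 'Paris has 3 moons']
```

The helper raises on malformed output instead of returning an empty list. An empty list would look like "nothing to verify" and let an unchecked draft through.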
Flow
flowchart TD
Q["Question"] --> D["Draft"]
D --> C["Extract claims"]
C --> V["Verify each claim"]
V --> R["Revise from evidence"]
R --> O["Final answer"]
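The flow above can be sketched as one function that wires four callables together. This is a simplified illustration under stated assumptions, not the implementation of `chain_of_verification`: `run_cove`, `Check`, and the `toy_*` stand-ins are hypothetical names, and the toy steps use string rules where a real system would call a model or a tool.

```python
from typing import Callable, NamedTuple


class Check(NamedTuple):
    """Hypothetical stand-in for a per-claim verification result."""

    claim: str
    ok: bool
    evidence: str


def run_cove(
    question: str,
    draft: Callable[[str], str],
    extract: Callable[[str], list[str]],
    verify: Callable[[str], Check],
    revise: Callable[[str, list[Check]], str],
) -> tuple[str, list[Check]]:
    """Draft -> extract claims -> verify each claim -> revise from the results."""
    first_answer = draft(question)
    claims = extract(first_answer)
    checks = [verify(claim) for claim in claims]  # one verification per claim
    return revise(first_answer, checks), checks


# Toy stand-ins so the sketch runs without a model
def toy_draft(question: str) -> str:
    return "Paris is the capital of France. Paris has 3 moons."


def toy_extract(answer: str) -> list[str]:
    return [part.strip() for part in answer.split(".") if part.strip()]


def toy_verify(claim: str) -> Check:
    ok = "capital of France" in claim
    return Check(claim, ok, "widely known" if ok else "unsupported")


def toy_revise(answer: str, checks: list[Check]) -> str:
    return " ".join(f"{c.claim}." for c in checks if c.ok)  # keep verified claims only


final, checks = run_cove("Tell me about Paris.", toy_draft, toy_extract, toy_verify, toy_revise)
print(final)  # Paris is the capital of France.
```

Returning the checks alongside the final answer keeps the claim-to-evidence mapping available to callers and to traces.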
Code Walk
The example verifier is deterministic:
def verify(claim: str) -> ClaimVerification:
    if "capital of France" in claim:
        return ClaimVerification(claim=claim, ok=True, evidence="widely known")
    if "3 moons" in claim:
        return ClaimVerification(claim=claim, ok=False, evidence="unsupported")
    return ClaimVerification(claim=claim, ok=False, evidence="unknown")
Full example:
from __future__ import annotations

from pathlib import Path

from agent_patterns_lab.patterns.cove import ClaimVerification, chain_of_verification
from agent_patterns_lab.runtime import MockLLM, Tracer


def main() -> None:
    tracer = Tracer()
    # Scripted replies, one per model call: draft, claim extraction, revision.
    model = MockLLM(
        [
            "Paris is the capital of France. Paris has 3 moons.",
            '{"claims":["Paris is the capital of France","Paris has 3 moons"]}',
            "Paris is the capital of France.",
        ]
    )

    # Deterministic verifier: anything it does not recognize fails with "unknown".
    def verify(claim: str) -> ClaimVerification:
        if "capital of France" in claim:
            return ClaimVerification(claim=claim, ok=True, evidence="widely known")
        if "3 moons" in claim:
            return ClaimVerification(claim=claim, ok=False, evidence="unsupported")
        return ClaimVerification(claim=claim, ok=False, evidence="unknown")

    out = chain_of_verification(model, question="Tell me about Paris.", verify_claim=verify, tracer=tracer)
    print(out)

    # Export the trace so each stage can be inspected afterwards.
    trace_path = tracer.export_jsonl(Path(".traces") / "32_cove.jsonl")
    print(f"[trace] {trace_path}")


if __name__ == "__main__":
    main()
Run:
UV_CACHE_DIR=.uv_cache PYTHONPATH=src uv run --no-sync python examples/32_cove.py
Nearby Patterns
| Pattern | Who decides next | Use when |
|---|---|---|
| Maker-Checker | Checker reviews whole output | Style, structure, completeness |
| Voting | Candidates vote | Short answers and variance |
| CoVe | Verifier checks each claim | Many factual claims |
| Agentic RAG | Agent searches and builds evidence | Evidence must be gathered over turns |
When To Use It
- Output contains multiple factual claims.
- You can verify claims with tools, retrieval, rules, or humans.
- False facts are costly.
- You need a claim-to-evidence mapping.
When Not To Use It
- There is no verification source.
- The task is creative writing.
- The output is short and low-risk.
- You need to gather evidence first; that is closer to Agentic RAG.
Costs And Common Failures
| Failure | Symptom | Fix |
|---|---|---|
| Missing claims | Bad assertion is never checked | Force atomic claim extraction |
| Weak evidence | "Widely known" everywhere | Store doc ids, snippets, calculations |
| Easy-only checks | Risky claim skipped | Verify high-risk claims first |
| High cost | Long article has many claims | Verify key claims or batch by section |
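The "verify high-risk claims first" and "verify key claims" fixes can be combined into a budgeted selection step. The sketch below assumes each claim already carries a risk score. `ScoredClaim`, `pick_claims_to_verify`, and the scores themselves are hypothetical and exist only for illustration.

```python
from typing import NamedTuple


class ScoredClaim(NamedTuple):
    text: str
    risk: float  # hypothetical 0..1 score; higher means a wrong claim costs more


def pick_claims_to_verify(
    claims: list[ScoredClaim], budget: int
) -> tuple[list[ScoredClaim], list[ScoredClaim]]:
    """Return (to_verify, deferred): highest risk first, at most `budget` verified."""
    if budget < 0:
        raise ValueError("budget must be non-negative")
    # sorted() is stable, so claims with equal risk keep their input order
    ranked = sorted(claims, key=lambda c: c.risk, reverse=True)
    return ranked[:budget], ranked[budget:]


claims = [
    ScoredClaim("West Lake will be sunny tomorrow afternoon", 0.4),  # made-up score
    ScoredClaim("The tea museum closes at 9 PM", 0.9),               # made-up score
]
to_verify, deferred = pick_claims_to_verify(claims, budget=1)
print([c.text for c in to_verify])  # ['The tea museum closes at 9 PM']
print([c.text for c in deferred])   # ['West Lake will be sunny tomorrow afternoon']
```

Deferred claims are returned rather than dropped, so the revise step can mark them as unverified instead of silently keeping them.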
What To Read Next
CoVe fits when a draft answer already exists and its factual claims need checking.
If the agent must search until evidence is enough, read Agentic RAG. If you need overall quality feedback, read Maker-Checker.