CoVe: Verify Factual Claims One By One

A travel assistant may write a fluent sentence like this:

West Lake will be sunny tomorrow afternoon, and the tea museum closes at 9 PM.

That sentence contains two factual claims: a weather forecast and a closing time. Asking the same model to "check again" is a weak test, because it may reread its own answer and simply agree with it.

CoVe (Chain-of-Verification) is stricter: draft an answer, extract its verifiable claims, verify each claim on its own, then revise from the verification results.
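As a rough illustration of verifying each claim on its own, the sketch below checks the two claims from the travel sentence against stubbed sources. The `Check` tuple, both lookup tables, and the values in them (including the 5 PM closing time) are invented for this example; a real verifier would call a weather service or an opening-hours source instead.

```python
from typing import NamedTuple


class Check(NamedTuple):
    claim: str
    ok: bool
    evidence: str


# Hypothetical stand-ins for real tools; the values are made up for illustration.
FORECAST = {("West Lake", "tomorrow afternoon"): "sunny"}
CLOSING_TIME = {"tea museum": "5 PM"}


def check_forecast(place: str, when: str, expected: str) -> Check:
    """Compare the claimed weather with what the (stubbed) forecast source says."""
    claim = f"{place} will be {expected} {when}"
    actual = FORECAST.get((place, when))
    if actual is None:
        return Check(claim, False, "no forecast available")
    return Check(claim, actual == expected, f"forecast source says {actual}")


def check_closing(venue: str, expected: str) -> Check:
    """Compare the claimed closing time with the (stubbed) opening-hours source."""
    claim = f"the {venue} closes at {expected}"
    actual = CLOSING_TIME.get(venue)
    if actual is None:
        return Check(claim, False, "no opening hours available")
    return Check(claim, actual == expected, f"hours source says {actual}")


checks = [
    check_forecast("West Lake", "tomorrow afternoon", "sunny"),
    check_closing("tea museum", "9 PM"),
]
for c in checks:
    print("PASS" if c.ok else "FAIL", c.claim, "|", c.evidence)
```

With this stub data the forecast claim passes and the closing-time claim fails, and each result carries the source value that decided it.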

One Sentence

CoVe replaces post-hoc confidence with three explicit steps (claim extraction, independent verification, and revision) so that factual errors are caught before the final answer.

What Breaks Without It

Problem	What it looks like	Risk
Fluent answer	Sounds true	Facts may be wrong
Vague review	"Checked"	No evidence artifact
Citation drift	Has citations	Citation may not support the claim

What This Pattern Changes

Who	Owns
Draft model	Writes the initial answer
Claim extractor	Splits answer into verifiable claims
Verifier	Returns `ok` and evidence for each claim
Revising model	Removes or rewrites failed claims

Verification should produce evidence, not vibes.
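One way to enforce "evidence, not vibes" is to make a verdict without evidence impossible to construct. The sketch below assumes a record with the same three fields the example verifier returns (`claim`, `ok`, `evidence`). The class name `VerifiedClaim` and the document id in the sample are hypothetical, and the library's own `ClaimVerification` may be defined differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerifiedClaim:
    """Illustrative record; not the library's ClaimVerification class."""

    claim: str
    ok: bool
    evidence: str

    def __post_init__(self) -> None:
        # A verdict with no evidence is the "vibes" case, so refuse to build it.
        if not self.evidence.strip():
            raise ValueError(f"no evidence recorded for claim: {self.claim!r}")


# Hypothetical evidence string: a document id plus the supporting snippet.
good = VerifiedClaim(
    claim="Paris is the capital of France",
    ok=True,
    evidence="doc:atlas-12 | 'Paris, capital of France'",
)

try:
    VerifiedClaim(claim="Paris has 3 moons", ok=False, evidence="")
    rejected = False
except ValueError:
    rejected = True

print(good.ok, rejected)
```

The guard applies to failed claims too: "unsupported after searching source X" is still evidence, while an empty string is not.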

Walk Through One Trace

Stage	Content	Result
Draft	`Paris is the capital of France. Paris has 3 moons.`	One true, one false
Extract	`["Paris is the capital of France", "Paris has 3 moons"]`	Claim list
Verify	capital true, 3 moons unsupported	Failure marked
Revise	`Paris is the capital of France.`	Bad claim removed
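The Extract and Verify rows of this trace can be reproduced with a few lines of plain Python. This is only a sketch: `parse_claims` and the rule-based `verify` are illustrative helpers, not functions from `agent_patterns_lab`, and the JSON shape is taken from the example's mock extractor reply.

```python
import json

# Raw extractor reply, in the same JSON shape the trace uses.
extractor_output = '{"claims":["Paris is the capital of France","Paris has 3 moons"]}'


def parse_claims(raw: str) -> list[str]:
    """Parse the extractor reply, failing loudly on an unexpected shape."""
    data = json.loads(raw)
    claims = data.get("claims") if isinstance(data, dict) else None
    if not isinstance(claims, list) or not all(isinstance(c, str) for c in claims):
        raise ValueError('expected JSON like {"claims": ["...", "..."]}')
    return claims


def verify(claim: str) -> tuple[bool, str]:
    """Toy rule-based check standing in for a real tool or retrieval lookup."""
    if "capital of France" in claim:
        return True, "widely known"
    return False, "unsupported"


claims = parse_claims(extractor_output)
results = [(claim, *verify(claim)) for claim in claims]
for claim, ok, evidence in results:
    print(f"{claim!r}: ok={ok}, evidence={evidence}")

# Claims the revising step must remove or rewrite.
failed = [claim for claim, ok, _ in results if not ok]
```

Validating the extractor reply before verification matters because a malformed claim list would otherwise let the draft through unchecked.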

Flow

flowchart TD
  Q["Question"] --> D["Draft"]
  D --> C["Extract claims"]
  C --> V["Verify each claim"]
  V --> R["Revise from evidence"]
  R --> O["Final answer"]
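The flow can be sketched as one function that takes each stage as a callable. This is an assumption-laden outline, not the implementation of `chain_of_verification`: the `cove` function, the `Verdict` tuple, and the toy stages are all hypothetical, and in a real system the draft, extract, and revise stages would be model calls.

```python
from typing import Callable, NamedTuple


class Verdict(NamedTuple):
    claim: str
    ok: bool
    evidence: str


def cove(
    question: str,
    draft: Callable[[str], str],
    extract: Callable[[str], list[str]],
    verify: Callable[[str], Verdict],
    revise: Callable[[str, list[Verdict]], str],
) -> tuple[str, list[Verdict]]:
    """Run draft -> extract -> verify each claim -> revise, in that order."""
    answer = draft(question)
    claims = extract(answer)
    verdicts = [verify(claim) for claim in claims]  # one check per claim
    return revise(answer, verdicts), verdicts


# Toy stages so the sketch runs without a model.
def toy_draft(question: str) -> str:
    return "Paris is the capital of France. Paris has 3 moons."


def toy_extract(answer: str) -> list[str]:
    return [part.strip() for part in answer.split(".") if part.strip()]


def toy_verify(claim: str) -> Verdict:
    ok = "capital of France" in claim
    return Verdict(claim, ok, "widely known" if ok else "unsupported")


def toy_revise(answer: str, verdicts: list[Verdict]) -> str:
    # Simplest possible revision: keep only the claims that passed.
    return " ".join(f"{v.claim}." for v in verdicts if v.ok)


final, verdicts = cove("Tell me about Paris.", toy_draft, toy_extract, toy_verify, toy_revise)
print(final)
```

Returning the verdicts alongside the final answer keeps the claim-to-evidence mapping available for logging or review.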

Code Walk

The example verifier is deterministic:

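def verify(claim: str) -> ClaimVerification:
    # Hard-coded rules keep the demo reproducible; a real verifier would
    # query a tool, a retrieval index, or a human reviewer.
    if "capital of France" in claim:
        return ClaimVerification(claim=claim, ok=True, evidence="widely known")
    if "3 moons" in claim:
        return ClaimVerification(claim=claim, ok=False, evidence="unsupported")
    # Fail closed: any claim the rules do not recognize counts as unverified.
    return ClaimVerification(claim=claim, ok=False, evidence="unknown")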

Full example:

from __future__ import annotations

from pathlib import Path

from agent_patterns_lab.patterns.cove import ClaimVerification, chain_of_verification
from agent_patterns_lab.runtime import MockLLM, Tracer


def main() -> None:
    tracer = Tracer()

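    # Scripted replies that mirror the trace above:
    # the draft, the extracted claims as JSON, then the revised answer.
    model = MockLLM(
        [
            "Paris is the capital of France. Paris has 3 moons.",
            '{"claims":["Paris is the capital of France","Paris has 3 moons"]}',
            "Paris is the capital of France.",
        ]
    )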

    def verify(claim: str) -> ClaimVerification:
        if "capital of France" in claim:
            return ClaimVerification(claim=claim, ok=True, evidence="widely known")
        if "3 moons" in claim:
            return ClaimVerification(claim=claim, ok=False, evidence="unsupported")
        return ClaimVerification(claim=claim, ok=False, evidence="unknown")

    out = chain_of_verification(model, question="Tell me about Paris.", verify_claim=verify, tracer=tracer)
    print(out)

    trace_path = tracer.export_jsonl(Path(".traces") / "32_cove.jsonl")
    print(f"[trace] {trace_path}")


if __name__ == "__main__":
    main()

Run:

UV_CACHE_DIR=.uv_cache PYTHONPATH=src uv run --no-sync python examples/32_cove.py

Nearby Patterns

Pattern	Who decides next	Use when
Maker-Checker	Checker reviews whole output	Style, structure, completeness
Voting	Candidates vote	Short answers whose samples vary
CoVe	Claims are verified	Many factual claims
Agentic RAG	Agent searches and builds evidence	Evidence must be gathered over turns

When To Use It

Output contains multiple factual claims.
You can verify claims with tools, retrieval, rules, or humans.
False facts are costly.
You need a claim-to-evidence mapping.

When Not To Use It

There is no verification source.
The task is creative writing.
The output is short and low-risk.
You need to gather evidence first; that is closer to Agentic RAG.

Costs And Common Failures

Failure	Symptom	Fix
Missing claims	Bad assertion is never checked	Force atomic claim extraction
Weak evidence	"Widely known" everywhere	Store doc ids, snippets, calculations
Easy-only checks	Risky claim skipped	Verify high-risk claims first
High cost	Long article has many claims	Verify key claims or batch by section
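The "verify high-risk claims first" and "verify key claims" fixes can be combined by ranking claims and spending a fixed verification budget from the top. The sketch below uses a crude keyword heuristic as an assumption; `RISK_CUES`, `risk_score`, and `verify_with_budget` are hypothetical names, and a production system would likely use a domain-specific scorer.

```python
from typing import Callable

# Hypothetical risk cues: times, dates, and amounts are easy to get wrong
# and costly to act on.
RISK_CUES = ("PM", "AM", "%", "$", "tomorrow")


def risk_score(claim: str) -> int:
    """Count risk cues in the claim, plus one if it contains any digit."""
    score = sum(cue in claim for cue in RISK_CUES)
    score += any(ch.isdigit() for ch in claim)
    return score


def verify_with_budget(
    claims: list[str],
    verify: Callable[[str], bool],
    budget: int,
) -> tuple[dict[str, bool], list[str]]:
    """Verify the riskiest claims first and report the ones left unchecked."""
    ranked = sorted(claims, key=risk_score, reverse=True)
    checked = {claim: verify(claim) for claim in ranked[:budget]}
    skipped = ranked[budget:]
    return checked, skipped


claims = [
    "West Lake is in Hangzhou",
    "The tea museum closes at 9 PM",
    "West Lake will be sunny tomorrow afternoon",
]
# Placeholder verifier that fails everything; swap in a real check.
checked, skipped = verify_with_budget(claims, verify=lambda claim: False, budget=2)
print(list(checked))
print(skipped)
```

Returning the skipped claims explicitly lets the revising step flag them as unverified instead of silently keeping them.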

What To Read Next

CoVe fits when a draft answer already exists and its factual claims need checking.

If the agent must search until evidence is enough, read Agentic RAG. If you need overall quality feedback, read Maker-Checker.