
CoVe: Verify Factual Claims One By One

The travel assistant may write a smooth sentence:

West Lake will be sunny tomorrow afternoon, and the tea museum closes at 9 PM.

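That sentence contains two factual claims: the weather forecast and the closing time. Asking the same model to "check again" is a weak check, because it may reread its own answer and simply keep believing it.

CoVe (Chain-of-Verification) is stricter: draft an answer, extract its verifiable claims, verify each claim independently, then revise using the verification results.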

One Sentence

CoVe turns post-hoc confidence into claim extraction, independent verification, and revision, so factual errors can be caught.

What Breaks Without It

| Problem | What it looks like | Risk |
| --- | --- | --- |
| Fluent answer | Sounds true | Facts may be wrong |
| Vague review | "Checked" | No evidence artifact |
| Citation drift | Has citations | Citation may not support the claim |

What This Pattern Changes

| Who | Owns |
| --- | --- |
| Draft model | Writes the initial answer |
| Claim extractor | Splits answer into verifiable claims |
| Verifier | Returns ok and evidence for each claim |
| Revising model | Removes or rewrites failed claims |

Verification should produce evidence, not vibes.
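One way to make "evidence, not vibes" concrete is to reject passing results whose evidence is only a vague label. The sketch below is illustrative: `VerificationRecord`, `VAGUE_EVIDENCE`, and `has_concrete_evidence` are hypothetical names, not part of `agent_patterns_lab`, and the document id in the example is made up. Only the three field names (`claim`, `ok`, `evidence`) come from the verifier used in this article.

```python
from __future__ import annotations

from dataclasses import dataclass


# Hypothetical record type for illustration; the field names mirror the
# claim / ok / evidence fields returned by the verifier in this article.
@dataclass(frozen=True)
class VerificationRecord:
    claim: str     # the atomic claim that was checked
    ok: bool       # True only if the evidence supports the claim
    evidence: str  # what the verifier found: a snippet, doc id, or calculation


# Assumption: placeholder strings a lazy verifier might emit instead of evidence.
VAGUE_EVIDENCE = {"", "checked", "looks fine", "widely known"}


def has_concrete_evidence(record: VerificationRecord) -> bool:
    """Return True if the evidence is more than a vague label."""
    return record.evidence.strip().lower() not in VAGUE_EVIDENCE


good = VerificationRecord(
    claim="The tea museum closes at 9 PM.",
    ok=True,
    evidence="doc:museum-hours#3 'Open daily until 21:00'",  # made-up doc id
)
vague = VerificationRecord(
    claim="West Lake will be sunny tomorrow afternoon.",
    ok=True,
    evidence="Checked",
)

print(has_concrete_evidence(good))   # True
print(has_concrete_evidence(vague))  # False
```

A check like this can run before the revise step, so that a claim marked ok without usable evidence is treated as unverified.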

Walk Through One Trace

| Stage | Content | Result |
| --- | --- | --- |
| Draft | Paris is the capital of France. Paris has 3 moons. | One true, one false |
| Extract | ["Paris is the capital of France", "Paris has 3 moons"] | Claim list |
| Verify | capital true, 3 moons unsupported | Failure marked |
| Revise | Paris is the capital of France. | Bad claim removed |
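The Extract stage is where a malformed model reply can silently skip verification. Below is a minimal sketch of a strict parser, assuming the extractor replies with a JSON object of the shape `{"claims": [...]}` as in this article's example. `parse_claims` is a hypothetical helper, not a function from `agent_patterns_lab`.

```python
from __future__ import annotations

import json


def parse_claims(raw: str) -> list[str]:
    """Parse extractor output shaped like {"claims": ["...", "..."]}.

    Raises ValueError on anything else, so a malformed extraction fails
    loudly instead of producing an empty claim list.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"extractor did not return JSON: {exc}") from exc

    claims = payload.get("claims") if isinstance(payload, dict) else None
    if not isinstance(claims, list) or not all(isinstance(c, str) for c in claims):
        raise ValueError('expected an object with a "claims" list of strings')

    # Drop blank entries and exact duplicates while keeping the original order.
    seen: set[str] = set()
    cleaned: list[str] = []
    for claim in (c.strip() for c in claims):
        if claim and claim not in seen:
            seen.add(claim)
            cleaned.append(claim)
    return cleaned


raw = '{"claims":["Paris is the capital of France","Paris has 3 moons"]}'
print(parse_claims(raw))
# ['Paris is the capital of France', 'Paris has 3 moons']
```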

Flow

flowchart TD
  Q["Question"] --> D["Draft"]
  D --> C["Extract claims"]
  C --> V["Verify each claim"]
  V --> R["Revise from evidence"]
  R --> O["Final answer"]
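The flow above can be sketched as a single function that takes each stage as a callable. This is a self-contained illustration under stated assumptions, not the implementation of `chain_of_verification` in `agent_patterns_lab`: `Check`, `cove_sketch`, and the four stub stages are hypothetical, and the stubs stand in for real model calls and a real verification source.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Check:
    claim: str
    ok: bool
    evidence: str


def cove_sketch(
    question: str,
    draft: Callable[[str], str],
    extract: Callable[[str], list[str]],
    verify: Callable[[str], Check],
    revise: Callable[[str, list[Check]], str],
) -> tuple[str, list[Check]]:
    """Run draft -> extract -> verify -> revise and return (answer, checks)."""
    first = draft(question)
    claims = extract(first)
    checks = [verify(c) for c in claims]  # each claim is verified on its own
    final = revise(first, checks)         # revision is driven by the check results
    return final, checks


# Stub stages (assumptions): a real system would call a model or a tool here.
def draft(question: str) -> str:
    return "Paris is the capital of France. Paris has 3 moons."


def extract(text: str) -> list[str]:
    # Naive sentence split; real extraction would aim for atomic claims.
    return [s.strip() for s in text.split(".") if s.strip()]


def verify(claim: str) -> Check:
    if "capital of France" in claim:
        return Check(claim, True, "widely known")
    return Check(claim, False, "unsupported")


def revise(text: str, checks: list[Check]) -> str:
    # Keep only claims that passed verification.
    return " ".join(f"{c.claim}." for c in checks if c.ok)


final, checks = cove_sketch("Tell me about Paris.", draft, extract, verify, revise)
print(final)  # Paris is the capital of France.
```

Returning the checks alongside the answer keeps the claim-to-evidence mapping available to callers instead of discarding it after revision.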

Code Walk

The example verifier is deterministic:

def verify(claim: str) -> ClaimVerification:
    if "capital of France" in claim:
        return ClaimVerification(claim=claim, ok=True, evidence="widely known")
    if "3 moons" in claim:
        return ClaimVerification(claim=claim, ok=False, evidence="unsupported")
    return ClaimVerification(claim=claim, ok=False, evidence="unknown")

Full example:

from __future__ import annotations

from pathlib import Path

from agent_patterns_lab.patterns.cove import ClaimVerification, chain_of_verification
from agent_patterns_lab.runtime import MockLLM, Tracer


def main() -> None:
    tracer = Tracer()

    model = MockLLM(
        [
            "Paris is the capital of France. Paris has 3 moons.",
            '{"claims":["Paris is the capital of France","Paris has 3 moons"]}',
            "Paris is the capital of France.",
        ]
    )

    def verify(claim: str) -> ClaimVerification:
        if "capital of France" in claim:
            return ClaimVerification(claim=claim, ok=True, evidence="widely known")
        if "3 moons" in claim:
            return ClaimVerification(claim=claim, ok=False, evidence="unsupported")
        return ClaimVerification(claim=claim, ok=False, evidence="unknown")

    out = chain_of_verification(model, question="Tell me about Paris.", verify_claim=verify, tracer=tracer)
    print(out)

    trace_path = tracer.export_jsonl(Path(".traces") / "32_cove.jsonl")
    print(f"[trace] {trace_path}")


if __name__ == "__main__":
    main()

Run:

UV_CACHE_DIR=.uv_cache PYTHONPATH=src uv run --no-sync python examples/32_cove.py
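After the run, the exported trace can be inspected with a few lines of Python. The sketch below assumes only that the file is standard JSON Lines with one JSON object per line, which is what the `.jsonl` extension suggests. It does not assume any particular field names, since the trace schema is not shown here. `parse_jsonl_lines`, `read_jsonl`, and `summarize` are hypothetical helpers.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Iterable


def parse_jsonl_lines(lines: Iterable[str]) -> list[dict]:
    """Parse JSON Lines: one JSON object per non-empty line."""
    records: list[dict] = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        if not isinstance(record, dict):
            raise ValueError(f"line {line_no}: expected a JSON object")
        records.append(record)
    return records


def read_jsonl(path: Path) -> list[dict]:
    """Load a JSON Lines file from disk."""
    with path.open(encoding="utf-8") as fh:
        return parse_jsonl_lines(fh)


def summarize(records: list[dict]) -> list[str]:
    """List the sorted keys of each record without assuming a schema."""
    return [", ".join(sorted(r)) for r in records]


trace_file = Path(".traces") / "32_cove.jsonl"
if trace_file.exists():
    for i, keys in enumerate(summarize(read_jsonl(trace_file))):
        print(f"event {i}: {keys}")
```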

Nearby Patterns

| Pattern | Who decides next | Use when |
| --- | --- | --- |
| Maker-Checker | Checker reviews whole output | Style, structure, completeness |
| Voting | Candidates vote | Short answers and variance |
| CoVe | Claims are verified | Many factual claims |
| Agentic RAG | Agent searches and builds evidence | Evidence must be gathered over turns |

When To Use It

  • Output contains multiple factual claims.
  • You can verify claims with tools, retrieval, rules, or humans.
  • False facts are costly.
  • You need a claim-to-evidence mapping.

When Not To Use It

  • There is no verification source.
  • The task is creative writing.
  • The output is short and low-risk.
  • You need to gather evidence first; that is closer to Agentic RAG.

Costs And Common Failures

| Failure | Symptom | Fix |
| --- | --- | --- |
| Missing claims | Bad assertion is never checked | Force atomic claim extraction |
| Weak evidence | "Widely known" everywhere | Store doc ids, snippets, calculations |
| Easy-only checks | Risky claim skipped | Verify high-risk claims first |
| High cost | Long article has many claims | Verify key claims or batch by section |
CoVe fits when a draft answer already exists and its factual claims need checking.

If the agent must keep searching until it has enough evidence, read Agentic RAG. If you need feedback on overall quality, read Maker-Checker.
