AI Accuracy & Human Review

We use AI to extract and synthesise findings from your research, but AI can make mistakes. This page explains the limitations, what we do to reduce errors, and how you can help.

What AI gets wrong

Large language models are powerful but not infallible. Common failure modes include:

  • Misreading context or tone — a cautious statement may be extracted as a confident claim.
  • Over-generalising — drawing a broad conclusion from a narrow data point in one source.
  • Omission — missing a nuanced or implicit finding that a human reader would catch.
  • Hallucination — in rare cases, generating plausible-sounding content that is not grounded in the source material.

We cannot guarantee that every AI-generated finding is accurate. Critical decisions should always be validated against the original source material.

How we try to reduce errors

  • Source grounding. Every finding is extracted directly from an artifact you provide. The model is instructed to cite only what is present in the source, not to infer beyond it.
  • Reasoning traces. Findings include an optional reasoning field that shows why the model extracted a particular point, making it easier to spot flawed logic.
  • Line references. Where possible, findings are pinned to specific lines in the source document so you can jump directly to the evidence and verify it.
  • Framework-aware prompting. Ingestion prompts are tailored to your selected strategic framework, reducing the chance of irrelevant or misclassified findings.
  • Human review weighting. During synthesis, findings marked as human-reviewed are explicitly prioritised. The more vetted evidence you build up, the more reliable your strategic outputs become.
  • Triple-pass synthesis (Pro). When enabled in Settings → AI Models, ingestion and synthesis run a three-step pipeline: an initial draft (ingestion model), a review pass that refines the summary for accuracy (review model), and, where applicable, verification of extracted findings against the source. Each extra pass uses additional tokens, so triple-pass work typically costs about three times the tokens of single-pass generation for the same content, with a corresponding impact on AI credits. The sparkles + "3×" badge on summaries and findings indicates this path was used.
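The three-pass flow above can be sketched in a few lines. This is an illustrative outline only, assuming one model call per pass; the function names, prompts, and model identifiers are hypothetical and are not Axon's actual internals or API.

```python
# Illustrative sketch of a three-pass ingestion/synthesis pipeline.
# All names here (run_call, "ingestion-model", "review-model") are
# hypothetical stand-ins, not the product's real implementation.

def run_call(model, task, text):
    # Stand-in for a real LLM call; it just tags the text so the
    # flow of the three passes is visible when you print the result.
    return f"[{model}:{task}] {text}"

def triple_pass(source_text):
    # Pass 1: initial draft by the ingestion model.
    draft = run_call("ingestion-model", "extract findings", source_text)
    # Pass 2: review pass refines the draft for accuracy.
    reviewed = run_call("review-model", "refine for accuracy", draft)
    # Pass 3: verify the extracted findings against the source.
    verified = run_call("review-model", "verify against source", reviewed)
    return verified
```

Because each pass issues its own model call over roughly the same content, total token usage is about three times that of a single-pass run, which is where the 3× credit cost comes from.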

What you can do

  • Mark findings as reviewed. Use the Mark as reviewed action on any finding card to flag that a human has verified it. This raises its weight in synthesis and signals confidence to collaborators.
  • Edit findings you disagree with. Editing a finding inline automatically marks it as reviewed and preserves your corrected version across re-ingestion.
  • Delete findings that are wrong. Removing an incorrect finding keeps your evidence base clean and prevents it from influencing future syntheses. On Pro plans, deletions are also recorded as negative signals — the AI will avoid surfacing similar content in future runs. See Adaptive AI.
  • Check line references. Where a line reference is shown, click it to read the source text and confirm the extraction is accurate before acting on it.
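The effect of marking findings as reviewed can be pictured as a simple weighted ordering during synthesis. The sketch below is a toy illustration under assumed weights; the actual weighting Axon applies is not documented here, and the field names and values are hypothetical.

```python
# Hypothetical sketch: human-reviewed findings are prioritised
# during synthesis. Weight values and field names are illustrative.

REVIEWED_WEIGHT = 3.0   # assumed boost for human-reviewed findings
DEFAULT_WEIGHT = 1.0    # assumed baseline for unreviewed findings

def rank_findings(findings):
    # Order findings so vetted evidence is considered first.
    def weight(finding):
        return REVIEWED_WEIGHT if finding.get("reviewed") else DEFAULT_WEIGHT
    return sorted(findings, key=weight, reverse=True)

findings = [
    {"text": "Users want faster onboarding", "reviewed": False},
    {"text": "Churn concentrates in month two", "reviewed": True},
]
# After ranking, the reviewed finding sorts ahead of the unreviewed one.
```

The point of the sketch is the ordering, not the numbers: the more findings you review, the larger the share of high-weight evidence feeding your strategic outputs.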