ResAIKit
Research Integrity Toolkit
G3 | Text analysis | Hallucination | Layer 1 (Deterministic)

Fact Hallucination

Flags confident assertions made without visible support: guidance attributed to a named body or to vague authorities with no citation, cause-and-effect claims stated without evidence, misused technical terms, sweeping unattributed claims, and absolute, overstated wording. It works from the text alone, in English and Romanian.

Technical description

G3 detects the confident, unsupported assertion that a model produces when it fills a gap with something authoritative-sounding. It is a screen for internal support: it marks claims by their linguistic form and by the absence of a nearby citation, and leaves the question of external truth to the fact-checking and citation indicators. It runs on texts of at least 200 words, splits the text into sentences, and sums five sub-checks into a single score from 0 to 5 (capped at 5.0). A claim counts as supported when a citation in (Author, Year) or [N] form appears in its own sentence or in either neighbouring sentence.
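
The support rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the toolkit's code: the names CITATION, split_sentences, long_enough and is_supported are hypothetical, the two citation patterns are only one plausible reading of "(Author, Year) or [N]", and the sentence splitter is deliberately naive.

```python
import re

# Assumed citation shapes (hypothetical patterns, not taken from the toolkit).
CITATION = re.compile(
    r"\([A-Z][^()]*?,?\s(?:19|20)\d{2}[a-z]?\)"  # (Smith, 2020), (Smith et al., 2020)
    r"|\[\d+(?:\s*[,-]\s*\d+)*\]"                # [3], [3, 4], [3-5]
)

MIN_WORDS = 200  # minimum length stated in the description


def long_enough(text: str) -> bool:
    """True when the text meets the 200-word minimum."""
    return len(text.split()) >= MIN_WORDS


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def is_supported(sentences: list[str], i: int) -> bool:
    """True when sentence i, or either neighbour, contains a citation."""
    window = sentences[max(0, i - 1): i + 2]
    return any(CITATION.search(s) for s in window)


if __name__ == "__main__":
    demo = split_sentences(
        "Sleep loss is common. The WHO recommends eight hours of sleep. "
        "This is discussed elsewhere (Smith, 2020). Diet matters too."
    )
    for i, sentence in enumerate(demo):
        print(is_supported(demo, i), sentence)
```

Because the window spans one sentence on each side, a single citation covers up to three sentences, which is why a claim next to a cited sentence is not flagged.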

How it works

The implementation is deterministic and runs at Layer 1 over compiled regular expressions, with English and Romanian pattern sets selected by the document language.

Authority attribution (sub-check 1). Two kinds of statement are checked for a supporting citation in their sentence window: guidance attributed to a named body ("the WHO recommends", "the FDA advises"), matched against a list that includes the major health and science organisations, and vaguer appeals ("experts agree", "studies confirm") that invoke an authority without naming it. Each unsupported attribution adds 0.5, capped at 2.0.
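
A hedged sketch of this sub-check follows. The organisation list, the verb list and the function name authority_check are illustrative assumptions, and the supported argument stands in for whatever citation-window test the real implementation uses.

```python
import re
from typing import Callable

# Illustrative lists only; the toolkit's actual lists are not shown in this entry.
NAMED_BODY = re.compile(
    r"\b(?:WHO|FDA|CDC|EMA|NIH)\s+(?:recommends|advises|warns|states)\b"
)
VAGUE_AUTHORITY = re.compile(
    r"\b(?:experts\s+agree|studies\s+confirm)\b", re.IGNORECASE
)

WEIGHT = 0.5  # added per unsupported attribution
CAP = 2.0     # ceiling for this sub-check


def authority_check(sentences: list[str], supported: Callable[[int], bool]) -> dict:
    """Count authority attributions and score the unsupported ones."""
    total = 0
    unsupported = 0
    for i, sentence in enumerate(sentences):
        n = len(NAMED_BODY.findall(sentence)) + len(VAGUE_AUTHORITY.findall(sentence))
        total += n
        if n and not supported(i):
            unsupported += n
    return {
        "institutional": total,
        "institutional_unsupported": unsupported,
        "score": min(CAP, WEIGHT * unsupported),
    }
```

The acronym pattern is kept case-sensitive on purpose: with case-insensitive matching, "people who recommends" would match as if it named the WHO.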

Causal claims (sub-check 2). Cause-and-effect statements such as "X causes Y" or "X leads to Y", and their Romanian equivalents (determină, duce la, declanșează), are checked for a citation in their sentence window. Each unsupported causal claim adds 0.3, capped at 1.5.
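
The sketch below shows how a language-keyed pattern set might drive this sub-check. The dictionary CAUSAL_PATTERNS, the function causal_score, the fallback to English and the exact verb lists are assumptions made for illustration; only the example phrases, the 0.3 weight and the 1.5 cap come from the description above.

```python
import re
from typing import Callable

# Hypothetical pattern sets keyed by document language.
CAUSAL_PATTERNS = {
    "en": re.compile(r"\b(?:causes?|leads?\s+to|triggers?)\b", re.IGNORECASE),
    "ro": re.compile(r"\b(?:determină|duce\s+la|declanșează)\b", re.IGNORECASE),
}

WEIGHT = 0.3
CAP = 1.5


def causal_score(sentences: list[str], lang: str,
                 supported: Callable[[int], bool]) -> float:
    """Score causal claims whose sentence window carries no citation."""
    # Assumption: unknown languages fall back to the English set.
    pattern = CAUSAL_PATTERNS.get(lang, CAUSAL_PATTERNS["en"])
    unsupported = sum(
        len(pattern.findall(sentence))
        for i, sentence in enumerate(sentences)
        if not supported(i)
    )
    # Rounding hides float noise such as 0.3 * 3 = 0.8999999999999999.
    return round(min(CAP, WEIGHT * unsupported), 2)
```

Python's str patterns are Unicode-aware by default, so the word boundary \b treats ă and ș as word characters and the Romanian verbs match as whole words.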

Terminology misuse (sub-check 3). A set of patterns catches confidently misapplied vocabulary, including the slide from correlation to causation, the non-statistical use of "significant", and a mismatch between the term prevalence and a cohort-study design. Each match adds 0.3, capped at 1.0.
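
The three misuse types named above could be approximated as follows. These heuristics are guesses written for illustration, since the entry does not list the actual patterns; misuse_hits and terminology_score are hypothetical names.

```python
import re

# "correlation ... proves/shows/means (that) ... causes", skipping negated verbs.
CORRELATION_AS_CAUSE = re.compile(
    r"\bcorrelat\w*\b[^.]*?(?<!not )\b(?:proves?|shows?|means?)\s+"
    r"(?:that\s+)?[^.]*?\bcaus\w*",
    re.IGNORECASE,
)
SIGNIFICANT = re.compile(r"\bsignificant(?:ly)?\b", re.IGNORECASE)
STATISTICAL_CONTEXT = re.compile(r"\bstatistical|\bp\s*[<=>]", re.IGNORECASE)
COHORT_STUDY = re.compile(r"\bcohort\s+study\b", re.IGNORECASE)

WEIGHT = 0.3
CAP = 1.0


def misuse_hits(sentence: str) -> list[str]:
    """Return the labels of the misuse heuristics that fire on one sentence."""
    hits = []
    if CORRELATION_AS_CAUSE.search(sentence):
        hits.append("correlation_as_causation")
    if SIGNIFICANT.search(sentence) and not STATISTICAL_CONTEXT.search(sentence):
        hits.append("nonstatistical_significant")
    if "prevalence" in sentence.lower() and COHORT_STUDY.search(sentence):
        hits.append("prevalence_in_cohort")
    return hits


def terminology_score(sentences: list[str]) -> tuple[int, float]:
    """Return (number of matches, capped score) for a list of sentences."""
    count = sum(len(misuse_hits(s)) for s in sentences)
    return count, round(min(CAP, WEIGHT * count), 2)
```

Heuristics of this kind illustrate the over-firing risk of pattern matching: a careful sentence that uses "significant" in its everyday sense, with no statistics in view, would still be flagged.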

Strong unattributed claims (sub-check 4). Sweeping openers such as "it is well established that" and "studies have shown", in English and Romanian (se știe că, cercetările au arătat), add 0.3 each when they carry no attribution, capped at 1.5.

Absolute wording (sub-check 4b). Unqualified language such as "completely eliminates", "100% effective" and "beyond any doubt", and its Romanian equivalents, adds 0.3 per unsupported occurrence, capped at 1.0. Genuine scientific writing is hedged; absolute phrasing without backing is a marker of overclaiming.

Aggregation. The five contributions are summed and reported as min(5.0, total). The metadata returns the institutional, causal, terminology-misuse, strong-claim and overstated-claim counts, with the unsupported subset of each.
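
The arithmetic of the aggregation step can be sketched as below. The weights and caps are the ones stated for each sub-check; the dictionary keys and the function g3_score are hypothetical names, and the input is assumed to be the count of unsupported matches per sub-check.

```python
# Per-match weights and per-check caps as stated in the sub-check descriptions.
WEIGHTS = {
    "authority": 0.5,
    "causal": 0.3,
    "terminology": 0.3,
    "strong_claim": 0.3,
    "overstated": 0.3,
}
CAPS = {
    "authority": 2.0,
    "causal": 1.5,
    "terminology": 1.0,
    "strong_claim": 1.5,
    "overstated": 1.0,
}
MAX_SCORE = 5.0


def g3_score(unsupported_counts: dict[str, int]) -> float:
    """Sum the five capped contributions and cap the total at 5.0."""
    total = sum(
        min(CAPS[name], WEIGHTS[name] * unsupported_counts.get(name, 0))
        for name in CAPS
    )
    return round(min(MAX_SCORE, total), 2)


if __name__ == "__main__":
    print(g3_score({"authority": 2, "causal": 1}))
    print(g3_score({name: 10 for name in CAPS}))
```

The five caps add up to 7.0, so the final min(5.0, total) is a real constraint: a text that saturates every sub-check still reports 5.0.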

Score thresholds

Score     Meaning
0 to 1    Strong claims are attributed and supported; institutional guidance is cited and wording is appropriately hedged.
2 to 3    A few unsupported attributions or causal claims, a misused term, or some overstated wording.
4 to 5    Several confident, sweeping, institutional or absolute claims made with no support at all.

Why this matters

The defining failure of language models is fluent, confident invention: grammatically and rhetorically flawless text that asserts things that are simply untrue. Recent reviews separate two problems, faithfulness to a given source and factuality against the wider world, and G3 works on the first by asking whether a strong claim carries the support it implies. In a manuscript the most dangerous hallucinations are not obvious nonsense but plausible, authoritative-sounding claims: a guideline attributed to a real institution that never issued it, an appeal to unnamed experts, a causal mechanism stated as settled fact, a sweeping summary of a literature that was never read. Misattribution of this kind is now a documented mode of failure in AI-assisted work. The danger is compounded by how readers process language: people infer causation from merely correlational wording even when the text is careful, so a model that states cause and effect outright is believed all the more readily. The same pull runs through overstatement; scientific writing has historically leaned on hedging, and the practice of inflating findings beyond what the results justify, long studied under the name spin, is what confident generated prose amplifies. These claims pass casual reading precisely because they are well written. G3 targets the structural giveaway they share, strength of claim without strength of support, so reviewers can focus on the assertions most likely to be invented.

Limitations

G3 reads form, not truth: it detects that a strong claim was made without visible support, so a correct, well-known fact stated without a citation can be flagged, and a false claim dressed in a citation can pass. The institutional and terminology checks rely on fixed lists and will miss bodies or terms outside them, while substring matching can occasionally over-fire. The authority, overstatement and strong-claim signals need enough running text to be meaningful, so very short passages pass quietly. The patterns cover English and Romanian; other languages reduce coverage. G3 is a prompt to verify the flagged claims, and the verdict on whether a claim is true rests with the fact-checking and citation indicators.

Theoretical background

G3 rests on three lines of work. The first is the faithfulness-versus-factuality distinction from the hallucination literature, which separates whether a text is supported by its source from whether it is true of the world; G3 measures the former at the surface. The second is the psycholinguistics of stance and inference: appeals to authority and boosters are the devices by which writers claim certainty, and readers infer causation from correlational statements even against explicit hedging, so unsupported authority and bare causal claims carry outsized persuasive weight. The third is the spin literature in biomedical reporting, which classifies the ways authors inflate findings beyond what results justify; absolute, unhedged wording is the surface form of that inflation. G3 turns each of these into a deterministic, form-based check, leaving the truth verdict to the indicators that compare claims against external sources.

References

  1. Ji Z, Lee N, Frieske R, Yu T, Su D, Xu Y, Ishii E, Bang Y, Madotto A, Fung P. Survey of hallucination in natural language generation. ACM Computing Surveys. 2023;55(12):1-38. DOI: 10.1145/3571730
  2. Rahman SS, Islam MA, Alam MM, Zeba M, Rahman MA, Chowa SS, Raiaan MAK, Azam S. Hallucination to truth: a review of fact-checking and factuality evaluation in large language models. arXiv preprint arXiv:2508.03860. 2025. https://arxiv.org/abs/2508.03860
  3. Kim H, Yu H, Yi H. The LLM fallacy: misattribution in AI-assisted cognitive workflows. arXiv preprint arXiv:2604.14807. 2026. https://arxiv.org/abs/2604.14807
  4. Gershman SJ, Ullman TD. Causal implicatures from correlational statements. PLoS One. 2023;18(5):e0286067. DOI: 10.1371/journal.pone.0286067
  5. Boutron I, Dutton S, Ravaud P, Altman DG. Reporting and interpretation of randomized controlled trials with statistically nonsignificant results for primary outcomes. JAMA. 2010;303(20):2058-2064. DOI: 10.1001/jama.2010.651