Citation Hallucination
Flags fabricated and malformed bibliographic references from the document text alone, using surface patterns rather than any external lookup: placeholder citations, in-text references missing from the bibliography, template titles, fake identifiers, and statistically implausible reference profiles.
Technical description
G1 detects fabricated references deterministically, with no database call, and is the Layer 1 partner of the external citation stack: it screens the manuscript for the internal signatures of invention, leaving existence and support verification to L3, G4 and the verification engine. It runs on text of at least 200 words, extracts the in-text citations (author-year and numeric) and the bibliography, and sums eight sub-checks into a single 0 to 5 score (capped at 5.0). Flagged references are labelled with a category from the five-part NeurIPS-2025 fabricated-citation taxonomy, with Placeholder Hallucination and Identifier corruption emitted directly into the fabrication_categories metadata.
How it works
The implementation is deterministic and runs at Layer 1 over compiled regular expressions. In-text citations are extracted in Harvard form, (Author et al., 2024) and (Author & Other, 2020), and numeric form, [1] or [1, 2, 3]; identifiers are matched as 10.xxxx/...; and the bibliography section, when present, is located and split into entries.
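The extraction step can be sketched as follows. This is a minimal illustration, not the production code: the three patterns and the `extract_citations` helper are assumptions written to cover the example forms quoted above, and the real compiled expressions may be broader or stricter.

```python
import re

# Hypothetical patterns for illustration; the production expressions may differ.
HARVARD = re.compile(
    r"\(([A-Z][A-Za-z'-]+)"                              # first author surname
    r"(?:\s+et al\.|\s*(?:&|and)\s*[A-Z][A-Za-z'-]+)?"   # optional "et al." or second author
    r",\s*((?:19|20)\d{2})\)"                            # comma, then a four-digit year
)
NUMERIC = re.compile(r"\[(\d+(?:\s*,\s*\d+)*)\]")        # [1] or [1, 2, 3]
DOI = re.compile(r"\b10\.\w{4,9}/\S+")                   # 10.xxxx/... shape


def extract_citations(text):
    """Return (harvard, numeric, dois) found in the text."""
    harvard = [(m.group(1), int(m.group(2))) for m in HARVARD.finditer(text)]
    numeric = [int(n) for m in NUMERIC.finditer(text) for n in m.group(1).split(",")]
    # Trailing sentence punctuation is stripped from each identifier.
    dois = [d.rstrip(".,;)") for d in DOI.findall(text)]
    return harvard, numeric, dois
```

For example, `extract_citations("Shown by (Smith et al., 2024); see [2, 3].")` yields one Harvard citation and two numeric ones.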
Placeholder citations (sub-check 1b). A dedicated pattern matches stub citations left in the text, such as [CITATION], [citation needed], [REF], [x] and bare (Author, Year) templates (the numeric form [1] is excluded). Each placeholder adds 0.5 up to a cap of 2.5, the count is recorded under the Placeholder Hallucination category, and the first five are reported individually. Because a document can consist of placeholders alone, this check runs before the no-citations early exit.
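A minimal sketch of the placeholder check, assuming a pattern built only from the stub forms listed above. The pattern, the case-insensitive matching and the `placeholder_score` name are illustrative assumptions; only the 0.5 increment, the 2.5 cap and the five-item report come from the description.

```python
import re

# Assumed stub forms taken from the examples in the text; matched case-insensitively.
PLACEHOLDER = re.compile(
    r"\[(?:CITATION|citation needed|REF|x)\]|\(Author,\s*Year\)",
    re.IGNORECASE,
)


def placeholder_score(text):
    """Return (score, count, first_five) for stub citations in the text."""
    hits = [m.group(0) for m in PLACEHOLDER.finditer(text)]
    score = min(2.5, 0.5 * len(hits))   # 0.5 per placeholder, capped at 2.5
    return score, len(hits), hits[:5]   # only the first five are reported individually
```

A numeric citation such as `[1]` does not match the pattern, so it contributes nothing to this sub-check.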
Citation-to-bibliography matching (sub-check 3). Each Harvard in-text citation is matched against the parsed bibliography entries. An in-text citation with no corresponding entry adds 0.3, capped at 2.0 across the document, since a reference cited in the body but absent from the list is a common mark of invented support.
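The matching rule is not specified beyond "corresponding entry", so the sketch below assumes the simplest plausible one: a citation matches when an entry contains both the surname and the year. The `unmatched_score` helper and its input shapes are hypothetical.

```python
def unmatched_score(citations, bib_entries):
    """citations: list of (surname, year); bib_entries: list of entry strings."""
    unmatched = []
    for surname, year in citations:
        # Assumed rule: surname and year must both appear in the same entry.
        found = any(
            surname.lower() in entry.lower() and str(year) in entry
            for entry in bib_entries
        )
        if not found:
            unmatched.append((surname, year))
    # 0.3 per unmatched citation, capped at 2.0; rounded to hide float noise.
    return round(min(2.0, 0.3 * len(unmatched)), 2), unmatched
```

Whether repeated citations of the same missing reference count once or several times is not stated; this sketch counts every occurrence.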
Template titles (sub-check 4). Bibliography entries are tested for the generic, AI-template title shapes that recur in fabricated reference lists. Each suspicious title adds 0.5, capped at 1.5.
Identifier validation (sub-check 5). Digital object identifiers (DOIs) are validated for form, and identifiers matching the fake-identifier pattern, such as the 10.0000 and 10.xxxx prefixes or placeholder suffixes, add 0.5 each up to a cap of 2.0 and are recorded under the Identifier category. Prefix-to-publisher consistency is also checked, contributing a further 0.3 on a mismatch.
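A sketch of the fake-identifier part of this sub-check. The 10.0000 and 10.xxxx prefixes come from the description; the suffix alternatives (`xxx...`, `placeholder`, `example`) are assumptions standing in for the unspecified "placeholder suffixes". The prefix-to-publisher check is left out because its lookup table and mismatch rule are not described.

```python
import re

# Prefixes are from the text; the suffix alternatives are hypothetical examples.
FAKE_DOI = re.compile(
    r"^10\.(?:0000|xxxx)/|/(?:x{3,}|placeholder|example)$",
    re.IGNORECASE,
)


def fake_identifier_score(dois):
    """Return (score, fake_dois) for a list of DOI strings."""
    fake = [d for d in dois if FAKE_DOI.search(d)]
    return min(2.0, 0.5 * len(fake)), fake   # 0.5 per fake identifier, capped at 2.0
```

A well-formed identifier such as `10.1016/j.cell.2020.01.001` passes untouched, while `10.0000/abc` is flagged on its prefix alone.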
Year-distribution analysis (sub-check 6). Publication years are extracted from the bibliography and filtered to the plausible range 1900 to the current year. When more than ten references are present, two profiles are flagged: a span narrower than five years between the oldest and newest reference adds 0.5 (consistent with a model drawing from a narrow memory window), and a review-style paper that cites nothing older than ten years adds 0.5.
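The two year profiles can be sketched as below. Several details are assumptions: the "more than ten references" gate is applied to the count of plausible years, "nothing older than ten years" is read as an oldest year no earlier than the current year minus ten, and the review-style decision is passed in as a flag because the text does not say how it is made.

```python
from datetime import date


def year_profile_score(years, is_review=False, current_year=None):
    """years: publication years extracted from the bibliography."""
    current_year = current_year or date.today().year
    plausible = [y for y in years if 1900 <= y <= current_year]
    score = 0.0
    if len(plausible) > 10:                          # assumed: gate on plausible years
        if max(plausible) - min(plausible) < 5:      # narrow span between oldest and newest
            score += 0.5
        if is_review and min(plausible) >= current_year - 10:   # review citing only recent work
            score += 0.5
    return score
```

Passing `current_year` explicitly keeps the result reproducible; by default the function uses the year of the system clock.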
Journal diversity (sub-check 7). Journal names are extracted from the entries; a bibliography of more than twenty references drawn from fewer than three distinct journals adds 0.5, reflecting the low source diversity typical of invented lists.
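As a rough sketch, the journal-diversity rule reduces to one comparison. The input shape (one extracted journal name, or `None`, per bibliography entry) and the lowercase normalisation are assumptions; the real extractor may normalise names differently.

```python
def journal_diversity_score(journals_per_entry):
    """journals_per_entry: one journal name (or None) per bibliography entry."""
    n_refs = len(journals_per_entry)
    # Assumed normalisation: trim whitespace and ignore case.
    distinct = {j.strip().lower() for j in journals_per_entry if j and j.strip()}
    return 0.5 if n_refs > 20 and len(distinct) < 3 else 0.0
```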
Author-count consistency (sub-check 8). For each Author et al. citation, the corresponding bibliography entry is checked; when the entry resolves to two or fewer distinct authors, the et al. is inconsistent and adds 0.3, capped at 0.9.
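A sketch of the et al. consistency check. It assumes Vancouver-style entries in which the author list runs up to the first full stop and names are comma-separated, and it skips entries that themselves abbreviate the author list with "et al". Both the parsing rule and the helper names are hypothetical.

```python
def count_authors(entry):
    """Count distinct comma-separated names before the first '. ' in an entry."""
    author_part = entry.split(". ", 1)[0]
    names = {n.strip().lower() for n in author_part.split(",") if n.strip()}
    return len(names)


def et_al_score(et_al_citations, bib_entries):
    """et_al_citations: list of (surname, year) cited in 'Author et al.' form."""
    inconsistent = 0
    for surname, year in et_al_citations:
        entry = next(
            (e for e in bib_entries
             if e.lower().startswith(surname.lower()) and str(year) in e),
            None,
        )
        # Entries that already truncate their author list cannot be judged.
        if entry and "et al" not in entry.lower() and count_authors(entry) <= 2:
            inconsistent += 1
    return round(min(0.9, 0.3 * inconsistent), 2)   # 0.3 each, capped at 0.9
```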
Aggregation. The sub-check contributions are summed and reported as min(5.0, total). The metadata returns the unmatched-citation, suspicious-title, placeholder and malformed-identifier counts, the year range, the journal count, and the fabrication_categories map.
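The aggregation itself is a capped sum. In the sketch below the dictionary keys are illustrative labels, not the actual metadata field names.

```python
def aggregate(contributions):
    """contributions: mapping of sub-check label to its already-capped score."""
    return min(5.0, sum(contributions.values()))


# Hypothetical labels; each value is already capped by its own sub-check.
example = {"placeholders": 2.5, "unmatched": 2.0, "fake_identifiers": 2.0}
final_score = aggregate(example)   # the raw total of 6.5 is reported as 5.0
```

Because each sub-check is capped individually, no single signal can reach 5.0 on its own; the top of the scale requires several signatures together.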
Score thresholds
| Score | Meaning |
|---|---|
| 0 to 1 | Citations resolve to a bibliography, identifiers are well-formed, and the reference profile is diverse in year and venue. |
| 2 to 3 | Some unmatched citations, a template title or two, a narrow year range, or low journal diversity. |
| 4 to 5 | Several strong signatures together: placeholder citations, fake identifiers, many unmatched references, or a uniform invented-looking list. Strongly consistent with fabricated references. |
Why this matters
Fabricated references became a measurable feature of the literature once language models entered the writing pipeline. Audits of the published record put the rate of papers containing at least one fabricated reference at roughly one in 277 by early 2026, up sharply from one in 458 a year earlier. A coded sample of one hundred fabricated citations, drawn from accepted papers at a leading conference, was spread across dozens of submissions that had passed expert review. Large-scale benchmarks report citation-hallucination rates from 14 to 95 percent across models and domains. A deterministic Layer 1 screen matters for speed and placement. Most of these failures leave a surface trace: a placeholder never filled, an in-text citation with no matching entry, an identifier of the wrong shape, or a reference list too uniform in year or venue to be real. Such traces can be caught instantly without a network call, which concentrates the slower external verification on the references that actually need a database lookup. G1 supplies that screen and labels each problem with a fabrication category, so the downstream stack knows what kind of failure to confirm.
Limitations
G1 reasons from surface form, so it screens for the signatures of invention rather than confirming that any single reference is real; a well-formed, correctly matched citation can still point to a paper that does not exist, which is settled by the Layer 3 verification indicators. The citation-to-bibliography, journal-diversity and author-count checks require a parseable bibliography, so a manuscript submitted without its reference list, or with one in an unrecognised format, loses those signals. The year-range and journal-diversity heuristics are field-sensitive: a fast-moving subfield can legitimately cite a narrow window, and a specialist venue can legitimately dominate a reference list, so those checks contribute small amounts and are framed as profile signals rather than proof. The extraction patterns recognise Harvard and numeric citation styles and Latin-script bibliographies; other citation conventions reduce coverage. The thresholds were calibrated against 2024-2026 output and require periodic recalibration.
Theoretical background
G1 operationalises the five-category taxonomy of fabricated citations established by a 2026 forensic study of one hundred fabricated references: total fabrication, partial attribute corruption, identifier hijacking, placeholder hallucination and semantic hallucination. Two of these have unambiguous surface signatures and are detected directly here, placeholder hallucination through the stub-citation pattern and identifier corruption through the fake-identifier pattern, while the remaining categories require comparison against an external record and are routed to the verification stack. The division of labour, deterministic internal screening at Layer 1 and database verification at Layer 3, mirrors the design of contemporary citation-checking tools, which combine fast pattern filters with multi-source lookups against CrossRef, OpenAlex and Semantic Scholar, and which report that the bulk of fabricated references can be triaged on surface features before any record is fetched.
References
- Ansari MS. Compound deception in elite peer review: a failure mode taxonomy of 100 fabricated citations at NeurIPS 2025. arXiv preprint arXiv:2602.05930. 2026. https://arxiv.org/abs/2602.05930
- Abbonato D. CheckIfExist: detecting citation hallucinations in the era of AI-generated content. arXiv preprint arXiv:2602.15871. 2026. https://arxiv.org/abs/2602.15871
- Shi K, Sun W, Zhang Z, Sun L, Chawla NV, Ye Y. CiteAudit: you cited it, but did you read it? A benchmark for verifying scientific references in the LLM era. arXiv preprint arXiv:2602.23452. 2026. https://arxiv.org/abs/2602.23452
- Xu Z, et al. GhostCite: a large-scale analysis of citation validity in the age of large language models. arXiv preprint arXiv:2602.06718. 2026. https://arxiv.org/abs/2602.06718
- Rao D, Wong E, Callison-Burch C. Detecting and correcting reference hallucinations in commercial LLMs and deep research agents. arXiv preprint arXiv:2604.03173. 2026. https://arxiv.org/abs/2604.03173