ResAIKit
Research Integrity Toolkit
S8 · Statistical analysis · Statistical Consistency · Layer 1 (Deterministic)

Terminal Digit (Stats)

Looks at the last significant digit of every number reported in an article and tests whether those digits are spread evenly across 0 through 9, as genuine measured data should be. People who invent numbers favour certain digits, especially 0 and 5, so a lopsided distribution of last digits is a sign that the data may have been made up or heavily rounded by hand. The indicator runs a chi-squared uniformity test and separately measures how often the last digit is 0 or 5.

Technical description

A deterministic forensic test on the terminal digits of the numbers extracted from the article. The terminal digit of a genuine measurement is the least predictable part of the value, so across measured data the last digits should be near-uniform over 0 to 9; fabricated or hand-massaged data instead show digit preference, most often for 0 and 5. Because the extractor pools every standalone number, S8 first removes the classes that are not free-terminal-digit measurements: exact zeros (a structural zero carries no chosen terminal digit), four-digit-year integers from 1900 to 2100, sub-1 values (p-values, alpha levels, proportions, and correlations, whose terminal digits are legitimately non-uniform), and percentages (a number immediately followed by a percent sign). It then requires at least thirty values, runs a Pearson chi-squared goodness-of-fit test of the ten last-digit counts against the uniform expectation (one tenth each, nine degrees of freedom), and computes the proportion of terminal digits equal to 0 or 5. The test runs only on documents classified as articles; other document types receive a neutral score, and when no statistical context is available the indicator returns the neutral no-data result.
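The four exclusion rules described above can be sketched as a small filter. This is only an illustration under stated assumptions: the regular expression, the function names `is_free_terminal_digit` and `candidate_tokens`, and the choice to work on raw text are all hypothetical, since this entry does not show the toolkit's actual extractor.

```python
import re

# Hypothetical pattern for standalone, unsigned decimal numbers.
# It skips digits glued to letters ("12mg") and dotted runs ("1.2.3").
NUMBER_RE = re.compile(r"(?<![\w.])\d+(?:\.\d+)?(?!\w|\.\d)")


def is_free_terminal_digit(token: str, followed_by_percent: bool) -> bool:
    """Return True if the token survives the four exclusion rules."""
    value = float(token)
    if value == 0:
        return False  # exact zero: no chosen terminal digit
    if token.isdigit() and len(token) == 4 and 1900 <= int(token) <= 2100:
        return False  # four-digit-year integer
    if value < 1:
        return False  # sub-1 value (p-value, proportion, correlation, ...)
    if followed_by_percent:
        return False  # percentage
    return True


def candidate_tokens(text: str) -> list[str]:
    """Collect number tokens (as strings) that pass the filter."""
    kept = []
    for match in NUMBER_RE.finditer(text):
        # "Immediately followed" is read strictly: the very next character.
        followed_by_percent = text[match.end():match.end() + 1] == "%"
        if is_free_terminal_digit(match.group(), followed_by_percent):
            kept.append(match.group())
    return kept
```

Tokens are kept as strings rather than floats so that a later step can still see how each number was written, for example to tell 12.0 from 12.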

How it works

Layer 1 (deterministic): the reported numbers are filtered to plausible measurements by removing exact zeros, four-digit years, sub-1 probabilities and proportions, and percentages (detected from context when number positions are available). The last significant digit of each surviving value is taken after stripping fractional trailing zeros (3.45 gives 5, 12.0 gives 2, 100 gives 0). With at least thirty values, the ten digit counts are compared against equal expected counts by Pearson's chi-squared, giving a uniformity p-value, and the 0-or-5 proportion is computed in parallel. Score: p>0.05 gives 0; 0.01<p<=0.05 gives 2.0 (mild); p<=0.01 gives 3.5 (strong), rising to 4.5 when the 0-or-5 proportion exceeds 0.30. Metadata records numbers_analyzed, chi2, p_value, and preference_05.
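The digit extraction, uniformity test, and scoring rule can be sketched in plain Python as follows. This is a minimal illustration, not the toolkit's code: the function names, the `score` key, and the `None` return for too few values are assumptions made here. The p-value uses the standard closed form of the chi-squared upper tail for odd degrees of freedom, specialised to nine, so that no statistics library is needed.

```python
import math
from collections import Counter


def terminal_digit(token: str) -> int:
    """Last significant digit after stripping fractional trailing zeros."""
    if "." in token:
        token = token.rstrip("0").rstrip(".")  # "12.0" -> "12", "2.50" -> "2.5"
    return int(token[-1])


def chi2_sf_9df(x: float) -> float:
    """Upper-tail probability of a chi-squared variable with 9 degrees of freedom."""
    root = math.sqrt(x)
    series = root + root**3 / 3 + root**5 / 15 + root**7 / 105
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return math.erfc(root / math.sqrt(2)) + 2 * density * series


def score_terminal_digits(tokens, min_values=30):
    """Score already-filtered number tokens (strings); None if too few (assumed)."""
    if len(tokens) < min_values:
        return None
    digits = [terminal_digit(t) for t in tokens]
    n = len(digits)
    counts = Counter(digits)
    expected = n / 10  # uniform expectation: one tenth per digit
    chi2 = sum((counts[d] - expected) ** 2 / expected for d in range(10))
    p_value = chi2_sf_9df(chi2)
    preference_05 = (counts[0] + counts[5]) / n
    if p_value > 0.05:
        score = 0.0
    elif p_value > 0.01:
        score = 2.0
    else:
        score = 4.5 if preference_05 > 0.30 else 3.5
    return {
        "numbers_analyzed": n,
        "chi2": chi2,
        "p_value": p_value,
        "preference_05": preference_05,
        "score": score,  # hypothetical key; the source lists only the four above
    }
```

For example, forty values written as alternating 10 and 15 give a chi-squared statistic of 160 and a 0-or-5 proportion of 1.0, which lands in the 4.5 band, while forty values whose last digits cover 0 to 9 four times each give a statistic of 0 and a score of 0.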

Why this matters

The terminal digit is the part of a measurement that should behave like a lottery draw, so its distribution is one of the oldest and most robust tests for invented data. Mosimann and colleagues showed experimentally that people asked to write random digits cannot do so, systematically favouring some digits, and argued that suspect data should be checked for this non-randomness; their later work formalised the examination of terminal digits in questioned datasets. The same reasoning underpins forensic re-analysis of clinical trials. Because the terminal digit is nearly free of legitimate scientific structure, a clear preference points to a human hand rather than an instrument, and separating mere non-uniformity from the specific 0-and-5 preference keeps the most diagnostic signature visible.

Score thresholds

0
Terminal digits are consistent with a uniform distribution (uniformity p-value above 0.05).
2
Mild non-uniformity, with a uniformity p-value above 0.01 and at or below 0.05 (score 2.0).
3
Strong non-uniformity, with a uniformity p-value at or below 0.01 (score 3.5).
4-5
Strong non-uniformity together with a marked preference for the digits 0 and 5, meaning a 0-or-5 proportion above 0.30 (score 4.5).

Limitations

Needs enough data: the test requires at least thirty numbers and is most reliable with fifty or more, where each digit's expected count clears the small-count regime in which the chi-squared approximation weakens. The filter removes exact zeros, four-digit years, sub-1 probabilities and proportions, and percentages, but other non-measurement values such as counts, sample sizes, degrees of freedom, and test statistics can remain and create non-uniformity that is not fabrication, so the result is a screening signal rather than evidence of misconduct. Excluding all sub-1 values also drops the occasional genuine measurement below one, a deliberate trade-off that favours fewer false positives. Heaping on 0 and 5 also arises from honest coarse rounding or from a known observer or instrument bias, so a high score flags a pattern to investigate. The test runs only on documents classified as articles. The Benford first-digit test on the same numbers is S9, and the terminal-digit test on individual-patient data is D34.
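As a rough worked check of the thirty- and fifty-value figures at the start of this section: with ten digit categories, the expected count per digit for n values is n/10. The cutoff of about five per cell used below is a common rule of thumb for the chi-squared approximation and is an assumption here; the entry itself names no specific cutoff.

```latex
E = \frac{n}{10}, \qquad n = 30 \;\Rightarrow\; E = 3, \qquad n = 50 \;\Rightarrow\; E = 5
```

At the thirty-value minimum each digit is expected only three times, so the p-value is more approximate there than at fifty values or above.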

References

  1. Mosimann JE, Wiseman CV, Edelman RE. (1995). Data fabrication: Can people generate random digits? Accountability in Research 4(1):31-55
  2. Mosimann JE, Dahlberg JE, Davidian NM, Krueger JW. (2002). Terminal digits and the examination of questioned data. Accountability in Research 9(2):75-92
  3. Minhas J, Baird G, Appleby D, et al. (2022). Terminal Digit Preference in Pulmonary Hypertension Endpoints. American Journal of Respiratory and Critical Care Medicine 205(12):1482-1485
  4. Crone G, Green CD. (2025). Tools of the data detective: A review of statistical methods to detect data and result anomalies in psychology. Theory & Psychology 35(3):359-380
  5. Carlisle JB. (2017). Data fabrication and other reasons for non-random sampling in 5087 randomised, controlled trials in anaesthetic and general medical journals. Anaesthesia 72(8):944-952
  6. Bordewijk EM, Li W, van Eekelen R, Wang R, Showell M, Mol BW, van Wely M. (2021). Methods to assess research misconduct in health-related research: A scoping review. Journal of Clinical Epidemiology 136:189-202
  7. Hunter KE, Aberoumand M, Libesman S, et al. (2024). The Individual Participant Data Integrity Tool for assessing the integrity of randomised trials. Research Synthesis Methods 15(6):917-939
  8. Wilkinson J, Heal C, Antoniou GA, et al. (2024). A survey of experts to identify methods to detect problematic studies: stage 1 of the INveStigating ProblEmatic Clinical Trials in Systematic Reviews project. Journal of Clinical Epidemiology 175:111512