Terminal Digit (Table)
Checks whether the last digits of a table's numbers are evenly spread. In genuine measured data the final significant digit is essentially random, so each digit appears about a tenth of the time; people inventing numbers favour round values ending in 0 or 5. Exact-zero cells are excluded so legitimate structural zeros do not create a false pattern.
Technical description
Gathers all numeric cell values, drops exact zeros, and takes the last significant digit of each (ignoring trailing decimal zeros). It then applies a chi-square test of uniformity over the digits 0-9 and separately computes the proportion of digits that are 0 or 5, which is 20 percent under uniformity. Scoring: a chi-square p below 0.01 together with a 0/5 preference above 30 percent scores 4.5; a significant chi-square alone scores 3.5; a marginal chi-square (p 0.01-0.05) or a slight preference (25-30 percent) scores 2.0; otherwise the score is 0. At least 30 non-zero values are required.
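The steps above can be sketched in Python as follows. This is a minimal illustration and not the indicator's actual code: the function names are hypothetical, cells are assumed to arrive as strings or numbers, and the p-value uses a closed form for 9 degrees of freedom so that only the standard library is needed.

```python
import math
from collections import Counter
from decimal import Decimal, InvalidOperation

MIN_VALUES = 30  # minimum number of non-zero values stated in the text


def last_significant_digit(cell):
    """Return the terminal digit of a numeric cell, or None if it is skipped."""
    try:
        value = Decimal(str(cell).strip())
    except InvalidOperation:
        return None  # not a number
    if not value.is_finite() or value == 0:
        return None  # NaN/inf and exact zeros are excluded
    text = format(abs(value), "f")  # plain decimal notation, no exponent
    if "." in text:
        text = text.rstrip("0").rstrip(".")  # ignore trailing decimal zeros
    return int(text[-1])


def chi2_sf_df9(x):
    """Upper-tail probability of a chi-square variable with 9 degrees of freedom.

    Closed form for odd degrees of freedom (Abramowitz and Stegun 26.4.4).
    """
    if x <= 0:
        return 1.0
    root = math.sqrt(x)
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    series, term = 0.0, root       # first term is root**1 / 1
    for r in range(1, 5):          # (9 - 1) / 2 = 4 terms
        series += term
        term *= x / (2 * r + 1)    # next term: root**(2r+1) / (1*3*...*(2r+1))
    return math.erfc(root / math.sqrt(2)) + 2 * density * series


def terminal_digit_stats(cells):
    """Chi-square statistic, p-value and 0/5 share, or None if too few values."""
    digits = [d for d in map(last_significant_digit, cells) if d is not None]
    n = len(digits)
    if n < MIN_VALUES:
        return None  # too few non-zero values, so the check is skipped
    counts = Counter(digits)
    expected = n / 10  # uniform expectation for each digit 0-9
    chi2 = sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))
    pref_05 = (counts.get(0, 0) + counts.get(5, 0)) / n
    return {"n": n, "chi2": chi2, "p": chi2_sf_df9(chi2), "pref_05": pref_05}
```

For example, `last_significant_digit("12.50")` gives 5 because the trailing decimal zero is ignored, while `last_significant_digit("120")` gives 0 because the zero is part of the integer value.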
How it works
Layer 1 (deterministic): collects numeric values, removes exact zeros, extracts each value's last significant digit, runs a chi-square uniformity test, and measures the 0/5 preference. A significant non-uniformity and an excess 0/5 preference combine to set the score (up to 4.5); a marginal result scores 2.0. Findings describe the non-uniformity and any digit preference.
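The way the two signals combine can be sketched as a small decision function, using the thresholds given in the technical description. The function name is hypothetical, "significant" is assumed to mean p below 0.01, and the handling of a strong 0/5 preference without a significant chi-square is an assumption, since the description does not cover that case.

```python
def terminal_digit_score(p_value, pref_05):
    """Map the chi-square p-value and the 0/5 share to a score (sketch)."""
    significant = p_value < 0.01            # assumed meaning of "significant"
    marginal = 0.01 <= p_value < 0.05
    strong_pref = pref_05 > 0.30
    slight_pref = 0.25 <= pref_05 <= 0.30

    if significant and strong_pref:
        return 4.5
    if significant:
        return 3.5
    if marginal or slight_pref or strong_pref:
        # Assumption: a strong preference without a significant chi-square
        # is scored like a slight one; the text does not specify this case.
        return 2.0
    return 0.0
```

Under these assumptions, `terminal_digit_score(0.001, 0.40)` returns 4.5 and `terminal_digit_score(0.50, 0.20)` returns 0.0.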
Why this matters
Terminal-digit uniformity is one of the oldest tests of data authenticity because people are poor at imitating randomness: experiments show that fabricated numbers fall into characteristic digit preferences, especially round values ending in 0 or 5. The last digit of a real measurement is noise at the limit of precision and is therefore approximately uniform regardless of magnitude or unit, so a non-uniform or 0/5-heaped distribution flags numbers that may have been chosen rather than measured. Excluding structural zeros keeps the test honest on sparse count tables.
Score thresholds
- 0-1: Terminal digits are uniformly distributed, as expected of measured data
- 2-3: A marginal non-uniformity or a slight preference for round digits
- 4-5: Strongly non-uniform terminal digits or a strong preference for 0 and 5, consistent with fabricated or heaped data
Limitations
Terminal-digit analysis assumes the recorded precision is fine enough that the last digit is noise. Coarsely rounded data, or values reported to few significant figures, have non-random last digits even when genuine, so rounding can mimic fabrication. Instruments that report to the nearest 5 or 10, and unit conversions, also produce legitimate digit preference. The chi-square test needs enough values, so small tables, and tables left sparse once zeros are excluded, are skipped. The check also depends on accurate OCR of the table, since a misread digit changes the terminal-digit counts. The chart-read version is indicator G11, and first-digit Benford analysis is a separate screen; T6 covers only the last digits of table numbers.
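A small illustration of the rounding limitation, using made-up readings rather than real data: the same hypothetical values show the expected 20 percent share of 0/5 endings at fine resolution, and a 100 percent share once they are reported to the nearest 5, with no fabrication involved.

```python
def share_ending_in_0_or_5(values):
    """Fraction of integer values whose last digit is 0 or 5."""
    return sum(1 for v in values if abs(v) % 10 in (0, 5)) / len(values)


# Hypothetical genuine readings at 1-unit resolution: 101, 102, ..., 200.
fine = list(range(101, 201))

# The same readings as reported by an instrument that rounds to the nearest 5.
coarse = [5 * round(v / 5) for v in fine]

fine_share = share_ending_in_0_or_5(fine)      # 0.2
coarse_share = share_ending_in_0_or_5(coarse)  # 1.0
```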
References
- Mosimann JE, Wiseman CV, Edelman RE. (1995). Data fabrication: Can people generate random digits? Accountability in Research 4(1):31-55
- Preece DA. (1981). Distributions of Final Digits in Data. The Statistician 30(1):31-60
- Carlisle JB. (2017). Data fabrication and other reasons for non-random sampling in 5087 randomised, controlled trials in anaesthetic and general medical journals. Anaesthesia 72(8):944-952