ResAIKit
Research Integrity Toolkit
D22 | Statistical analysis | Fabrication Detection | Layer 1 (Deterministic)

Benford IPD (1st + 2nd Digit)

Applies Benford's Law to the raw participant-level numbers, checking both the first and the second significant digit. In data that spans a wide range, the leading digit is a 1 about thirty percent of the time and falls off logarithmically, and the second digit follows its own gentler version of the same law. Fabricated data often fails one or both, because invented or model-generated numbers do not inherit the logarithmic digit structure of real measurement. The indicator pools the individual-patient values, measures how far the first-digit and second-digit distributions stray from Benford's expectation, and scores accordingly. It runs only on articles.

Technical description

D22 is a deterministic Benford screen on the pooled numeric values of individual-patient data, testing the first and second significant digits. It applies only to documents classified as articles and runs only after the no-data and article checks have confirmed that data is available. It pools all numeric values, keeps those with absolute value at least ten so the leading digit is well defined, and requires at least one hundred qualifying values for full confidence or at least thirty with reduced confidence. It extracts the first significant digit and the second significant digit of each value and compares their distributions against the Benford expectations. For the first digit, the probability of digit d is the base-ten logarithm of one plus one over d. For the second digit, the probability of digit d is the sum, over leading digits k from 1 to 9, of the base-ten logarithm of one plus one over (10k + d). It computes the mean absolute deviation of the first-digit distribution from Benford and a normalised chi-squared for both digits, and scores from the first-digit deviation with an added penalty when the second digit also departs, since failing both digits at once is far less likely under honest data.
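The steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the toolkit's actual code: the function names are hypothetical, and because the text does not say how the chi-squared is normalised, the sketch assumes division by the degrees of freedom.

```python
import math
from collections import Counter

FIRST_DIGITS = range(1, 10)
SECOND_DIGITS = range(0, 10)

def benford_first():
    """P(first digit = d) = log10(1 + 1/d), d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in FIRST_DIGITS}

def benford_second():
    """P(second digit = d) = sum over k = 1..9 of log10(1 + 1/(10k + d))."""
    return {d: sum(math.log10(1 + 1 / (10 * k + d)) for k in FIRST_DIGITS)
            for d in SECOND_DIGITS}

def first_two_digits(x):
    """First and second significant digits, read from scientific notation."""
    s = f"{abs(x):.15e}"          # e.g. 1234.5 -> '1.234500000000000e+03'
    return int(s[0]), int(s[2])

def proportions(digits, support):
    counts = Counter(digits)
    n = len(digits)
    return {d: counts[d] / n for d in support}

def mean_abs_dev(obs, exp):
    return sum(abs(obs[d] - exp[d]) for d in exp) / len(exp)

def chi2_per_dof(obs, exp, n):
    # Assumption: "normalised" is taken to mean chi-squared / degrees of freedom.
    chi2 = n * sum((obs[d] - exp[d]) ** 2 / exp[d] for d in exp)
    return chi2 / (len(exp) - 1)

def benford_screen(values, min_abs=10.0):
    """Pool values, keep |x| >= min_abs, and compare digits with Benford."""
    kept = [v for v in values if math.isfinite(v) and abs(v) >= min_abs]
    n = len(kept)
    if n == 0:
        return None
    pairs = [first_two_digits(v) for v in kept]
    obs1 = proportions([p[0] for p in pairs], FIRST_DIGITS)
    obs2 = proportions([p[1] for p in pairs], SECOND_DIGITS)
    exp1, exp2 = benford_first(), benford_second()
    return {
        "n": n,
        "mad_first": mean_abs_dev(obs1, exp1),
        "mad_second": mean_abs_dev(obs2, exp2),
        "chi2_first": chi2_per_dof(obs1, exp1, n),
        "chi2_second": chi2_per_dof(obs2, exp2, n),
    }
```

Values spaced evenly on a logarithmic axis across several decades give a small first-digit deviation in this sketch, while two-digit integers drawn uniformly from 10 to 99 give a large one, which is the contrast the indicator relies on.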

How it works

The first and second significant digits are extracted from every pooled value whose magnitude is at least ten. The first-digit mean absolute deviation from the Benford proportions drives the base score through the established conformity bands, and the chi-squared statistics quantify the departure of each digit's distribution. When the second-digit distribution also deviates significantly, a bonus is added, because a generator that distorts the leading digit often distorts the next one too, and the joint failure is a stronger signal than either alone. A complementary mantissa arc test (the mantissa arc test of Cinelli's benford.analysis [9]) reads the whole significand rather than a single digit. Each value's base-ten log mantissa is placed on the unit circle; under Benford the mantissa is uniform, so the mean resultant vector is near zero, and its squared length L^2 gives a Rayleigh tail probability of exp(-n * L^2) for circular uniformity, where n is the number of values. When this whole-mantissa distribution is significantly non-uniform and the first-digit deviation is already in the non-conforming range, a small increment is added in corroboration. The available sample size modulates confidence: with fewer than one hundred but at least thirty values, the indicator scores more cautiously. The total is capped at 5.0. The metadata records the first-digit and second-digit deviation measures, a formal first-digit chi-squared goodness-of-fit p-value (reported as a diagnostic only, since the over-powered chi-squared is not used for scoring), the mantissa arc statistic and its Rayleigh tail probability, and the number of values analysed.
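The mantissa arc test described above can be sketched as follows. The function name and the `min_abs` parameter are illustrative assumptions; only the statistic (squared length of the mean resultant vector) and the tail probability exp(-n * L^2) come from the text.

```python
import math

def mantissa_arc_test(values, min_abs=10.0):
    """Return (L2, p, n): squared mean resultant length, Rayleigh tail
    probability exp(-n * L2), and the number of values used."""
    kept = [abs(v) for v in values if math.isfinite(v) and abs(v) >= min_abs]
    n = len(kept)
    if n == 0:
        raise ValueError("no qualifying values")
    x_sum = 0.0
    y_sum = 0.0
    for v in kept:
        mantissa = math.log10(v) % 1.0      # fractional part of log10, in [0, 1)
        angle = 2.0 * math.pi * mantissa    # position on the unit circle
        x_sum += math.cos(angle)
        y_sum += math.sin(angle)
    x_bar, y_bar = x_sum / n, y_sum / n     # mean resultant vector
    l2 = x_bar * x_bar + y_bar * y_bar
    p = math.exp(-n * l2)
    return l2, p, n
```

A small p suggests the mantissas cluster on part of the circle. In the indicator as described, this result only adds to the score when the first-digit deviation is already non-conforming, so the sketch returns the statistic and leaves that decision to the caller.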

Score thresholds

Score     Meaning
0 to 1    First and second digits conform to Benford's Law.
2 to 3    A clear first-digit departure from Benford.
4 to 5    First and second digits both depart from Benford, a strong fabrication signal.
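How the pieces might combine into a score in these ranges can be sketched as below. The first-digit band cut-offs (0.006, 0.012, 0.015) are Nigrini's published values for the first-digit mean absolute deviation. Everything else is a placeholder chosen only so the output lands in the ranges of the table: the base scores, the bonus and increment sizes, the 0.05 cut-off, and the function names are all hypothetical, and the indicator's real weights are not given in this entry.

```python
# Nigrini's first-digit MAD bands (upper bounds, inclusive).
NIGRINI_FIRST_DIGIT_BANDS = [
    (0.006, "close conformity"),
    (0.012, "acceptable conformity"),
    (0.015, "marginally acceptable conformity"),
]

def nigrini_band(mad_first):
    for upper, label in NIGRINI_FIRST_DIGIT_BANDS:
        if mad_first <= upper:
            return label
    return "nonconformity"

# HYPOTHETICAL weights, for illustration only.
BASE_SCORE = {
    "close conformity": 0.0,
    "acceptable conformity": 1.0,
    "marginally acceptable conformity": 2.0,
    "nonconformity": 3.0,
}
SECOND_DIGIT_BONUS = 1.5
MANTISSA_INCREMENT = 0.5
MANTISSA_ALPHA = 0.05
SCORE_CAP = 5.0

def d22_score(mad_first, second_digit_departs, mantissa_p, n):
    """Return None below 30 values, else a dict with score and confidence."""
    if n < 30:
        return None                      # too few values to score
    band = nigrini_band(mad_first)
    score = BASE_SCORE[band]
    if second_digit_departs:
        score += SECOND_DIGIT_BONUS      # joint failure of both digits
    if band == "nonconformity" and mantissa_p < MANTISSA_ALPHA:
        score += MANTISSA_INCREMENT      # corroboration from the mantissa arc test
    confidence = "full" if n >= 100 else "reduced"
    return {"score": min(score, SCORE_CAP), "band": band, "confidence": confidence}
```

Returning the confidence as a separate label, rather than shrinking the score, is a design choice of this sketch. The entry says only that smaller samples score more cautiously.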

Why this matters

Benford's Law is the canonical fingerprint of naturally generated numbers spanning several orders of magnitude, first catalogued by Benford across diverse natural data [1]. Its forensic power, and its limits, were sharpened by Diekmann, who found that the first digits of fabricated scientific data can still look approximately Benford-distributed while the second and later digits deviate, so the test must be applied carefully and gains strength from examining digits beyond the first [2]. Nigrini turned Benford conformity into a practical audit instrument with the mean-absolute-deviation bands this indicator uses [3]. Applying the test to raw individual-patient data, and to two digits at once, is more demanding than a first-digit check on summary numbers: a fabricator might tune the leading digits to look Benford-like, but reproducing the joint first-and-second-digit structure across a whole patient-level dataset is much harder, so a simultaneous failure of both digits is a particularly credible signal that the numbers were not measured. Benford's law has a rigorous statistical derivation as the unique scale-invariant significant-digit distribution [4], and recent forensic re-analyses, scoping reviews, and trustworthiness instruments place digit-distribution checks among the standard screens for fabricated and machine-generated data [5, 6, 7, 8].

Limitations

Benford's Law applies only to data spanning a wide multiplicative range. Pooling individual-patient values across columns usually mixes scales and widens the range, which is what makes the test applicable, but a dataset dominated by one narrow-range variable may not conform even when genuine, and the indicator does not separately verify the magnitude span. It needs enough values, at least thirty and ideally one hundred, and keeps only values of magnitude ten or more, so small-valued variables contribute nothing. The second-digit test is weaker than the first and noisier in small samples. The conformity bands are Nigrini's and are directional. The test runs only on documents classified as articles, returning a neutral skip otherwise. Related indicators cover neighbouring ground: S9 is the first-digit Benford test on the article's reported text numbers and D34 is the terminal-digit test on the same data, whereas D22 is the first-and-second-digit Benford test on the pooled individual-patient values.
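Because the indicator does not check the magnitude span itself, a reader interpreting a D22 result may want to compute it separately. The helper below is a hypothetical companion check, not part of D22: it reports how many orders of magnitude the qualifying values cover, and how wide a span is wide enough is left to the analyst.

```python
import math

def magnitude_span(values, min_abs=10.0):
    """Orders of magnitude covered by the values with |x| >= min_abs."""
    mags = [abs(v) for v in values if math.isfinite(v) and abs(v) >= min_abs]
    if not mags:
        return 0.0
    return math.log10(max(mags) / min(mags))
```

For example, a dataset whose qualifying values all lie between 60 and 100 spans only about 0.22 orders of magnitude, so a Benford departure there says little about fabrication.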

Theoretical background

D22 rests on the scale-invariance that gives rise to Benford's Law and on extending it to the second digit. A collection of numbers whose distribution is smooth over many powers of ten has leading digits distributed logarithmically, so digit d leads with probability equal to the width on a logarithmic axis of the interval whose values begin with d, which is the base-ten logarithm of one plus one over d. The same logic applied one place further gives the second-digit law, a flatter distribution that still slightly favours smaller digits, obtained by summing the leading-digit contributions. Genuine measurement data that spans a wide range obeys both laws closely; a generating process that does not inherit this structure, whether independent sampling from a bounded distribution or a model emitting familiar numbers, departs from them. Testing two digits rather than one roughly doubles the constraints the data must satisfy and, crucially, the failures tend to be correlated under fabrication, so the joint test is more powerful than either single-digit test. The mean absolute deviation provides a calibrated effect size for the first digit, while the chi-squared statistics test the significance of each digit's departure, and combining them with a sample-size-aware confidence reflects that Benford conformity is a large-sample phenomenon best trusted when many values are available. The mantissa arc test generalises the digit view to the entire significand: Benford conformity is equivalent to the base-ten log mantissa being uniform on the unit interval, so mapping each mantissa to an angle and measuring the length of the mean resultant vector tests that uniformity directly, catching a distorted significand distribution that leading-digit summaries can miss.
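Written out, the quantities in this section take the following form. The symbols D_1, D_2, k, M_i and L^2 are notation chosen here for illustration; the formulas themselves restate the definitions given in the text.

```latex
% First-digit law: width on a log axis of the interval [d, d+1)
P(D_1 = d) = \log_{10}(d+1) - \log_{10}(d)
           = \log_{10}\!\left(1 + \frac{1}{d}\right), \qquad d = 1, \dots, 9

% Second-digit law: sum the two-digit probabilities over leading digits k
P(D_2 = d) = \sum_{k=1}^{9} \log_{10}\!\left(1 + \frac{1}{10k + d}\right),
           \qquad d = 0, \dots, 9

% Mantissa arc test: M_i = \log_{10}|x_i| \bmod 1 is uniform on [0, 1) under Benford
L^2 = \left(\frac{1}{n}\sum_{i=1}^{n} \cos 2\pi M_i\right)^{2}
    + \left(\frac{1}{n}\sum_{i=1}^{n} \sin 2\pi M_i\right)^{2},
\qquad p \approx \exp\!\left(-n L^2\right)
```

As a check on the first line, d = 1 gives log10(2), about 0.301, which is the "about thirty percent" quoted for a leading 1.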

References

  1. Benford F. The law of anomalous numbers. Proceedings of the American Philosophical Society. 1938;78(4):551-572. https://www.jstor.org/stable/984802
  2. Diekmann A. Not the First Digit! Using Benford's Law to Detect Fraudulent Scientific Data. Journal of Applied Statistics. 2007;34(3):321-329. DOI: 10.1080/02664760601004940
  3. Nigrini MJ. Benford's Law: Applications for Forensic Accounting, Auditing, and Fraud Detection. Hoboken, NJ: Wiley; 2012. DOI: 10.1002/9781119203094
  4. Hill TP. A Statistical Derivation of the Significant-Digit Law. Statistical Science. 1995;10(4):354-363. DOI: 10.1214/ss/1177009869
  5. Crone G, Green CD. Tools of the data detective: A review of statistical methods to detect data and result anomalies in psychology. Theory & Psychology. 2025;35(3):359-380. DOI: 10.1177/09593543241311861
  6. Bordewijk EM, Li W, van Eekelen R, Wang R, Showell M, Mol BW, van Wely M. Methods to assess research misconduct in health-related research: A scoping review. Journal of Clinical Epidemiology. 2021;136:189-202. DOI: 10.1016/j.jclinepi.2021.05.012
  7. Wilkinson J, Heal C, Antoniou GA, et al. A survey of experts to identify methods to detect problematic studies: stage 1 of the INveStigating ProblEmatic Clinical Trials in Systematic Reviews project. Journal of Clinical Epidemiology. 2024;175:111512. DOI: 10.1016/j.jclinepi.2024.111512
  8. Carlisle JB. Data fabrication and other reasons for non-random sampling in 5087 randomised, controlled trials in anaesthetic and general medical journals. Anaesthesia. 2017;72(8):944-952. DOI: 10.1111/anae.13938
  9. Cinelli C. benford.analysis: Benford Analysis for Data Validation and Forensic Analytics. R package version 0.1.5; 2018. https://cran.r-project.org/package=benford.analysis