ResAIKit
Research Integrity Toolkit
S15 | Statistical analysis | Statistical Consistency | Layer 1 (Deterministic)

No Imperfections

Flags datasets that look too clean to be real. Genuine data collection produces missing values, occasional extreme observations, and dropout, so a study whose tables have no gaps, whose numeric columns contain no outliers, and whose text claims perfect follow-up or zero attrition is unusual enough to warrant a second look. The indicator runs three checks for these signs of suspicious perfection and combines them.

Technical description

A deterministic screen for the absence of real-world imperfections, combining three sub-checks whose activation pattern sets the score.

(1) Zero missing: across all tables, if there are more than fifty cells and none is empty or holds a missing marker, the data are flagged as suspiciously complete. The recognised markers include blanks, N/A variants, the placeholder symbols journals use (em dash, en dash, hyphen, middle dot, bullet, ellipsis), and textual markers like not-detected and not-reported, so a table that does mark its gaps is not mistaken for a complete one.

(2) Zero outliers: in any numeric column with at least ten parsed values, it checks whether every value lies within the robust band median +/- 3*MAD (the median absolute deviation scaled by 1.4826, per Leys et al. 2013, 2019), a rule that resists the masking by which an outlier inflates the mean and SD and so hides itself.

(3) Text signals: it scans for perfection claims such as no dropout, 100% follow-up, no missing data, zero attrition, complete data, no adverse events.
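As a rough illustration of sub-check (2), the robust band could be computed along the lines of the Python sketch below. The function names and the list-of-floats input are assumptions made for this example, not the toolkit's actual code; only the 1.4826 scale, the 3*MAD band, and the ten-value minimum come from the description above, and the skip for a zero MAD follows the How it works section.

```python
import statistics

MAD_SCALE = 1.4826  # makes the MAD comparable to an SD under normality


def column_has_outlier(values, k=3.0, min_n=10):
    """Hypothetical helper: True if any value lies outside median +/- k*MAD,
    False if none does, None if the column is not checkable."""
    if len(values) < min_n:
        return None  # too few parsed values to test
    med = statistics.median(values)
    mad = MAD_SCALE * statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return None  # half or more of the values are identical
    return any(abs(v - med) > k * mad for v in values)


def zero_outliers(columns):
    """Hypothetical helper: the sign is set only if at least one column
    was checkable and none of the checkable columns had an outlier."""
    results = [column_has_outlier(col) for col in columns]
    checked = [r for r in results if r is not None]
    return bool(checked) and not any(checked)
```

For example, a column holding the integers 10 to 19 has median 14.5 and a scaled MAD of about 3.71, so no value leaves the band. Replacing 19 with 100 leaves the median and MAD unchanged, and the new value is flagged, which is the resistance to masking that a mean and SD band lacks.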

How it works

Layer 1 (deterministic): the missing-data check tallies all table cells, classifies a cell as missing when its trimmed lower-cased text matches a known marker, and flags zero-missing only when the cell count exceeds fifty and the missing count is zero (warning). The outlier check parses each column's numerics and, for columns with at least ten values, tests whether all lie within median +/- 3*MAD with the MAD scaled by 1.4826 (a column whose scaled MAD is zero, meaning half or more values are identical, is ignored); if at least one column was checked and none had an outlier, the sign is set (info). The text check matches a fixed phrase list (info). Score: all three signs give 5.0; zero-missing plus zero-outliers give 4.0; zero-missing alone 2.5; a text claim alone or zero-outliers alone 1.5; none 0. Metadata records zero_missing, zero_outliers, and text_signals.
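The missing-data tally, the phrase match, and the score combination described above might look roughly like the following Python sketch. The marker set and phrase list are illustrative guesses built from the examples in the text, and all names are hypothetical. The text does not say how a text claim combined with exactly one other sign is scored, so the sketch assumes such cases fall back to the stronger single-sign score; that choice is an assumption, not documented behaviour.

```python
# Illustrative marker set only; the toolkit's real list is not given in the text.
MISSING_MARKERS = {
    "", "n/a", "na", "n.a.", "nan",
    "\u2014", "\u2013", "-",          # em dash, en dash, hyphen
    "\u00b7", "\u2022",               # middle dot, bullet
    "\u2026", "...",                  # ellipsis (single character and three dots)
    "nd", "not detected", "nr", "not reported",
}

# Phrases quoted in the description; the real list may be longer.
PERFECTION_PHRASES = [
    "no dropout", "100% follow-up", "no missing data",
    "zero attrition", "complete data", "no adverse events",
]


def is_missing(cell):
    """A cell counts as missing when its trimmed, lower-cased text is a marker."""
    return cell.strip().lower() in MISSING_MARKERS


def has_zero_missing(tables, min_cells=50):
    """tables: list of tables, each a list of rows, each a list of cell strings."""
    cells = [cell for table in tables for row in table for cell in row]
    missing = sum(is_missing(cell) for cell in cells)
    return len(cells) > min_cells and missing == 0


def has_text_signal(text):
    """Literal, case-insensitive match against the fixed phrase list."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in PERFECTION_PHRASES)


def s15_score(zero_missing, zero_outliers, text_signals):
    """Combine the three signs into the score described in the text."""
    if zero_missing and zero_outliers and text_signals:
        return 5.0
    if zero_missing and zero_outliers:
        return 4.0
    if zero_missing:
        return 2.5  # ASSUMPTION: also used for zero-missing plus a text claim
    if zero_outliers or text_signals:
        return 1.5  # ASSUMPTION: also used for zero-outliers plus a text claim
    return 0.0
```

Note that the cell threshold is strict: a table set with exactly fifty cells and no gaps does not trigger the zero-missing sign.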

Why this matters

Real data are messy, and the absence of that mess is informative. Simonsohn showed fabrication can be exposed by statistics alone when reported results are too clean or too similar to have come from genuine sampling, identifying two cases from the implausibly low variability of their summary statistics. Carlisle's forensic re-analyses treat improbably consistent and complete data as an integrity signal across many trials, and the classic biostatistical account of fraud lists the lack of normal imperfections (complete follow-up, absent outliers) among the patterns distinguishing invented from genuine data. A fabricator focused on a clean publishable result rarely adds the missing values, extremes, and dropout real studies accumulate, so their absence, especially in combination, is a coherent fingerprint. Because each sign has innocent explanations, S15 reserves its highest scores for their co-occurrence.

Score thresholds

0
The data show the normal imperfections of real collection.
1-2
A single weak sign: a perfection claim in the text, or numeric columns with no outliers.
2-3
Tables with more than fifty cells and no missing values at all.
4-5
Several signs together: complete data and no outliers (4.0), plus explicit perfection claims for the top score (5.0).

Limitations

Each sub-check is a heuristic with benign explanations, so a high score prompts inspection rather than proving misconduct. The zero-outlier check is the weakest: in a small sample, having no value beyond the robust three-MAD band is the normal expectation, not a surprise, so the sign is meaningful only in larger columns, and S15 never relies on it alone for a high score. The band itself is sound: unlike a mean and SD band, the median and MAD band is robust to the very outlier it seeks. Genuinely complete data exist (small, well-managed, or registry studies), so zero missing is not proof. The text-phrase list is fixed and literal, so it misses paraphrases and can match a legitimate fact such as a small safety study that truly saw no adverse events. The checks depend on correct table parsing and a complete marker list. The thresholds (fifty cells, ten values, three scaled MADs) are directional rather than calibrated. Outlier and distributional checks on individual-patient data are in the D series.
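To see why the zero-outlier sign carries little weight in small columns, a back-of-the-envelope calculation helps. The Python sketch below assumes independent, normally distributed values and treats the median and scaled MAD as if they were the true centre and SD, which real columns only approximate; the function name is hypothetical.

```python
import math


def prob_no_outlier(n, k=3.0):
    """Probability that none of n independent normal values falls outside
    a band of +/- k standard deviations (idealised approximation)."""
    p_outside = math.erfc(k / math.sqrt(2))  # two-sided tail mass beyond k SDs
    return (1.0 - p_outside) ** n
```

Under these assumptions the chance of seeing no outlier at all is about 0.97 for ten values, about 0.76 for a hundred, and about 0.07 for a thousand, so an outlier-free column becomes surprising only when it is long.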

References

  1. Simonsohn U. (2013). Just Post It: The Lesson From Two Cases of Fabricated Data Detected by Statistics Alone. Psychological Science 24(10):1875-1888
  2. Carlisle JB. (2017). Data fabrication and other reasons for non-random sampling in 5087 randomised, controlled trials in anaesthetic and general medical journals. Anaesthesia 72(8):944-952
  3. Buyse M, George SL, Evans S, et al. (1999). The role of biostatistics in the prevention, detection and treatment of fraud in clinical trials. Statistics in Medicine 18(24):3435-3451
  4. Leys C, Ley C, Klein O, Bernard P, Licata L. (2013). Detecting outliers: Do not use standard deviation around the mean, use absolute deviation around the median. Journal of Experimental Social Psychology 49(4):764-766
  5. Leys C, Delacre M, Mora YL, Lakens D, Ley C. (2019). How to Classify, Detect, and Manage Univariate and Multivariate Outliers, With Emphasis on Pre-Registration. International Review of Social Psychology 32(1):5
  6. Rousseeuw PJ, Hubert M. (2018). Anomaly detection by robust statistics. WIREs Data Mining and Knowledge Discovery 8(2):e1236
  7. Proschan MA, Shaw PA. (2020). Diagnosing fraudulent baseline data in clinical trials. PLoS ONE 15(10):e0239121
  8. Carlisle JB. (2021). False individual patient data and zombie randomised controlled trials submitted to Anaesthesia. Anaesthesia 76(4):472-479
  9. Bordewijk EM, Li W, van Eekelen R, Wang R, Showell M, Mol BW, van Wely M. (2021). Methods to assess research misconduct in health-related research: A scoping review. Journal of Clinical Epidemiology 136:189-202
  10. Wilkinson J, Heal C, Antoniou GA, et al. (2024). A survey of experts to identify methods to detect problematic studies: stage 1 of the INveStigating ProblEmatic Clinical Trials in Systematic Reviews project. Journal of Clinical Epidemiology 175:111512
  11. Crone G, Green CD. (2025). Tools of the data detective: A review of statistical methods to detect data and result anomalies in psychology. Theory & Psychology 35(3):359-380