Precision vs Instrument Mismatch
Checks whether the number of decimal places in each measured variable matches what the measuring instrument actually produces. A haematology analyser reports haemoglobin to one decimal place, so a column of haemoglobin values written to two decimals, such as 13.14 instead of 13.1, is unlikely to have come straight from that instrument. Machine-generated data often gets this wrong, inventing more or fewer decimals than the device yields. The indicator matches each column to a dictionary of instrument precisions and flags columns whose decimal precision does not fit. It works on the individual-patient data.
Technical description
D17 is a deterministic screen comparing the reported decimal precision of each variable against the precision its measuring instrument is known to produce. It loads an instrument-precision dictionary mapping variables and their aliases to an expected number of decimal places and a source, then matches each numeric column of the individual-patient data, normalising the column name by removing unit decorations such as a trailing parenthetical or comma-qualifier so that a header like haemoglobin in grams per decilitre still matches. For a matched column it counts the decimal places of every value and takes the most common count as the column's actual precision. A mismatch is recorded when the actual precision exceeds the expected, which is always suspicious, or when it falls below the expected and the source is an instrument, since an instrument that always reports a fixed number of decimals should not yield fewer. The rate of mismatched columns among matched columns sets the score.
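The dictionary lookup and header normalisation might look roughly like the sketch below. The dictionary entries, the alias sets, and the names `INSTRUMENT_PRECISION`, `normalise_name`, and `lookup` are illustrative assumptions, not the indicator's actual data or API. Only the idea of stripping a trailing parenthetical or comma-qualifier before matching comes from the description above.

```python
import re

# Hypothetical dictionary: canonical variable -> aliases, expected decimals, source.
INSTRUMENT_PRECISION = {
    "haemoglobin": {
        "aliases": {"hemoglobin", "hb", "hgb"},
        "decimals": 1,
        "source": "instrument",
    },
    "bmi": {
        "aliases": {"body mass index"},
        "decimals": 1,
        "source": "derived",
    },
}


def normalise_name(header: str) -> str:
    """Lower-case a header and strip unit decorations."""
    name = header.strip().lower()
    name = re.sub(r"\s*\([^)]*\)\s*$", "", name)  # trailing "(g/dL)"
    name = re.sub(r"\s*,.*$", "", name)           # ", g/dL" qualifier
    return re.sub(r"\s+", " ", name).strip()      # collapse whitespace (assumption)


def lookup(header: str):
    """Return (canonical name, expected decimals, source), or None if unmatched."""
    name = normalise_name(header)
    for canonical, entry in INSTRUMENT_PRECISION.items():
        if name == canonical or name in entry["aliases"]:
            return canonical, entry["decimals"], entry["source"]
    return None
```

With this sketch, `lookup("Haemoglobin (g/dL)")` and `lookup("Hb, g/dL")` both resolve to the haemoglobin entry, while an unrecognised header returns `None` and the column is skipped.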
How it works
Each numeric column whose normalised name matches the dictionary is examined. The decimal count of each non-missing value is taken from a fixed-point string representation, and the mode of those counts is the column's actual precision. Too many decimals relative to the expected value is always a mismatch; too few is a mismatch only when the variable's source is an instrument, because a hand-derived or computed quantity may legitimately be reported to fewer places. The mismatch rate is the number of mismatched columns divided by the number of matched columns, requiring at least one match. The rate maps to the score: above 0.60 scores 4.0, above 0.40 scores 3.0, above 0.25 scores 2.0, above 0.10 scores 1.0, and otherwise 0. Beyond the modal comparison the indicator measures precision consistency, the share of values at the modal decimal count together with the normalised Shannon entropy of the decimal-count distribution; a matched column whose modal precision matches expectation but whose values nonetheless mix decimal counts, so that fewer than sixty percent share the mode, is recorded as a mismatch of an inconsistent-precision type, catching a fixed-precision instrument whose readings vary in decimals. Each mismatch produces a finding naming the column, its actual and expected precision, and the mismatch type. The metadata records the matched and mismatched counts, the split of mismatches into too-many-decimal, too-few-decimal, and inconsistent-precision types, the rate, the mean precision consistency across matched columns, and the per-column details.
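The modal comparison and the rate-to-score mapping described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the indicator's implementation: the function names, the mismatch-type labels, the example columns, and the use of `decimal.Decimal` for the fixed-point rendering are all hypothetical. The thresholds are the ones given in the text.

```python
from collections import Counter
from decimal import Decimal


def decimal_places(value) -> int:
    """Count decimals in a fixed-point rendering (no scientific notation)."""
    text = format(Decimal(str(value)), "f")
    return len(text.split(".")[1]) if "." in text else 0


def modal_precision(values) -> int:
    """Most common decimal count among the non-missing values."""
    counts = Counter(decimal_places(v) for v in values if v is not None)
    return counts.most_common(1)[0][0]


def classify(actual: int, expected: int, source: str):
    """Return a mismatch type, or None when the precision is acceptable."""
    if actual > expected:
        return "too_many_decimals"          # always a mismatch
    if actual < expected and source == "instrument":
        return "too_few_decimals"           # only for instrument sources
    return None


def score_from_rate(rate: float) -> float:
    """Map the mismatch rate to the score using the thresholds in the text."""
    for threshold, score in ((0.60, 4.0), (0.40, 3.0), (0.25, 2.0), (0.10, 1.0)):
        if rate > threshold:
            return score
    return 0.0


# Hypothetical matched columns: name -> (values, expected decimals, source).
matched = {
    "haemoglobin": (["13.14", "12.87", "14.02"], 1, "instrument"),
    "weight": (["70.5", "82.1", "64.3"], 1, "instrument"),
    "bmi": (["24", "27", "31"], 1, "derived"),
}

mismatches = {}
for name, (values, expected, source) in matched.items():
    kind = classify(modal_precision(values), expected, source)
    if kind is not None:
        mismatches[name] = kind

rate = len(mismatches) / len(matched)  # requires at least one matched column
score = score_from_rate(rate)
```

In this example only the haemoglobin column is mismatched: the BMI column has fewer decimals than expected but its source is not an instrument, so it passes. One mismatch among three matched columns gives a rate of about 0.33, which falls in the band above 0.25 and scores 2.0.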
Score thresholds
| Score | Meaning |
|---|---|
| 0 | Reported precision matches the instruments for all or nearly all recognised variables (mismatch rate of 0.10 or less). |
| 1 to 2 | A minority of recognised variables (mismatch rate above 0.10 and up to 0.40) have a precision the instrument would not produce. |
| 3 to 4 | A large share of recognised variables (mismatch rate above 0.40) are reported at the wrong precision, a strong fabrication signal. |
Why this matters
Measuring instruments report values at a fixed precision determined by their design, so the number of decimal places in genuine data is not a free choice but a fingerprint of the device. A haemoglobin analyser yields one decimal, a balance yields a fixed number of significant figures, and a manual count yields integers, so a value at the wrong precision could not have come from the stated instrument. This is a particularly useful tell for machine-generated data: Taloni and colleagues showed that a model fabricating a clinical dataset produces values whose form does not respect real measurement conventions [1]. The principle that the granularity of reported numbers must be consistent with how they were produced underlies the GRIM family of tests, which exploit the discrete set of values a given precision and sample size allow [2], and the forensic study of terminal digits, which reads the fine structure of measured numbers as evidence of their origin [3]. Precision mismatch is a deterministic, hard-edged version of the same idea: it does not estimate a probability but checks a value against the fixed precision its source must yield. Recent forensic re-analyses, scoping reviews, and trustworthiness instruments place reporting-precision and granularity checks among the standard screens for fabricated and machine-generated data [4, 5, 6, 7, 8].
Limitations
The check can only assess variables that are present in its instrument-precision dictionary and whose column names it matches after normalisation, so an unrecognised variable, an unusual alias, or a non-English header is skipped. It uses the mode of the decimal counts as the column's precision and additionally flags inconsistent precision when too few values share that mode, so mixed precision is detected rather than silently mischaracterised. A column dominated by values with trailing zeros that the decimal counter normalises away can still register a lower precision than intended. Reported data are sometimes legitimately rounded or rescaled after measurement, for example through unit conversion or summarisation, which changes the precision without indicating fabrication, so the too-few-decimals rule is restricted to instrument sources to limit false positives. Values stored as floating-point numbers can carry artefactual digits, which the fixed-point string conversion mitigates but does not wholly remove. The thresholds on the mismatch rate are directional: they indicate increasing concern rather than a calibrated probability. Granularity tests on reported statistics are handled by indicators S3 and S4, and terminal-digit checks by indicators S8 and D34, so D17 is confined to the decimal precision of measured columns against their instruments.
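The trailing-zero and floating-point limitations can be seen in a few lines. This is an illustrative sketch only: `decimal_places` is a hypothetical helper that assumes the fixed-point rendering is obtained through Python's `decimal.Decimal`, which may differ from how the indicator itself converts values.

```python
from decimal import Decimal


def decimal_places(value) -> int:
    """Count decimals in a fixed-point rendering of the value (assumed helper)."""
    text = format(Decimal(str(value)), "f")
    return len(text.split(".")[1]) if "." in text else 0


# A value kept as text retains its trailing zero.
as_text = decimal_places("13.10")

# The same value parsed to a float has lost it before any counting happens.
as_float = decimal_places(13.10)

# Binary floating-point arithmetic can add digits no instrument reported.
artefact = decimal_places(0.1 + 0.2)
```

Here `as_text` is 2 and `as_float` is 1, so a column that was parsed to floats before screening can appear less precise than it was reported, and `artefact` is far larger than the single decimal either operand carried. Counting from the raw cell text, where it is available, sidesteps both effects.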
Theoretical background
D17 rests on the fact that measurement precision is a property of the instrument, not of the analyst. A device that quantises its output to a fixed number of decimal places produces values drawn from a grid of that spacing, so every genuine reading from it has at most that many decimals, and the most common decimal count across many readings equals the instrument's precision. Reporting more decimals than the instrument yields requires information the device never produced, so it can only come from invention, miscalculation, or an inappropriate transformation; reporting fewer can arise from honest rounding, which is why that direction is flagged only for instrument sources where the precision is expected to be exact. Counting decimals from a fixed-point representation, rather than the general format, avoids reading scientific-notation artefacts as zero decimals and avoids the floating-point noise that a naive string conversion introduces, so the mode is a faithful estimate of the column's reported precision. Aggregating across columns into a mismatch rate reflects that a single oddly-precise column might be a transcription quirk, whereas a high rate of wrong-precision columns describes a dataset whose numbers were not shaped by real instruments. Reading the consistency of the decimal counts, not only their mode, captures a complementary failure: a genuine instrument emits one fixed precision, so the entropy of its decimal-count distribution is zero, and a column that mixes precisions betrays values assembled rather than measured even when the most common count happens to be right.
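The consistency reading can be sketched as below. The text does not say how the Shannon entropy is normalised, so this sketch assumes division by the logarithm of the number of distinct decimal counts observed. The function names and the `threshold` parameter are hypothetical; the sixty percent cut-off is the one stated in the description of the inconsistent-precision type.

```python
import math
from collections import Counter


def precision_consistency(decimal_counts):
    """Return (share of values at the modal count, normalised Shannon entropy)."""
    counts = Counter(decimal_counts)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("need at least one decimal count")
    modal_share = max(counts.values()) / n
    k = len(counts)
    if k == 1:
        return modal_share, 0.0  # a single fixed precision: zero entropy
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return modal_share, entropy / math.log(k)  # normalisation is an assumption


def is_inconsistent(modal_share: float, threshold: float = 0.60) -> bool:
    """Flag a column when fewer than 60% of its values share the modal count."""
    return modal_share < threshold


# A fixed-precision column versus one that mixes 0, 1 and 2 decimals.
fixed_share, fixed_entropy = precision_consistency([1] * 10)
mixed_share, mixed_entropy = precision_consistency([1] * 5 + [2] * 3 + [0] * 2)
```

The fixed-precision column has a modal share of 1.0 and zero entropy, as the argument above expects of a genuine instrument. The mixed column has a modal share of 0.5, so it would be flagged as inconsistent even if its modal count of one decimal matched the dictionary.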
References
1. Taloni A, Scorcia V, Giannaccare G. Large Language Model Advanced Data Analysis Abuse to Create a Fake Data Set in Medical Research. JAMA Ophthalmology. 2023;141(12):1174-1175. DOI: 10.1001/jamaophthalmol.2023.5162
2. Brown NJL, Heathers JAJ. The GRIM Test: A Simple Technique Detects Numerous Anomalies in the Reporting of Results in Psychology. Social Psychological and Personality Science. 2017;8(4):363-369. DOI: 10.1177/1948550616673876
3. Mosimann JE, Dahlberg JE, Davidian NM, Krueger JW. Terminal digits and the examination of questioned data. Accountability in Research. 2002;9(2):75-92. DOI: 10.1080/08989620212969
4. Carlisle JB. Data fabrication and other reasons for non-random sampling in 5087 randomised, controlled trials in anaesthetic and general medical journals. Anaesthesia. 2017;72(8):944-952. DOI: 10.1111/anae.13938
5. Crone G, Green CD. Tools of the data detective: A review of statistical methods to detect data and result anomalies in psychology. Theory & Psychology. 2025;35(3):359-380. DOI: 10.1177/09593543241311861
6. Carlisle JB. False individual patient data and zombie randomised controlled trials submitted to Anaesthesia. Anaesthesia. 2021;76(4):472-479. DOI: 10.1111/anae.15263
7. Bordewijk EM, Li W, van Eekelen R, Wang R, Showell M, Mol BW, van Wely M. Methods to assess research misconduct in health-related research: A scoping review. Journal of Clinical Epidemiology. 2021;136:189-202. DOI: 10.1016/j.jclinepi.2021.05.012
8. Wilkinson J, Heal C, Antoniou GA, et al. A survey of experts to identify methods to detect problematic studies: stage 1 of the INveStigating ProblEmatic Clinical Trials in Systematic Reviews project. Journal of Clinical Epidemiology. 2024;175:111512. DOI: 10.1016/j.jclinepi.2024.111512