S13 | Statistical analysis | Statistical Consistency | Layer 1 (Deterministic)

Suspicious Rounding

Looks for rounding and precision patterns in reported statistics that are unusual in genuinely computed results: p-values reported exactly on the thresholds 0.05 or 0.01, standard deviations identical across several groups, and reported means whose number of decimal places varies widely. Each is a weak individual clue, but together they point to numbers entered or copied by hand rather than produced by analysis software.

Technical description

A deterministic screen for three rounding and precision irregularities. Each contributes one flag, and the flag count sets the score.
(1) Round p-values: counts p-values equal to exactly 0.05 or 0.01 and flags when two or more occur, since a computed p-value almost never lands precisely on a threshold.
(2) Identical SDs: groups the (mean, SD, n) triplets by source and flags any source in which three or more triplets share an identical standard deviation, a known fingerprint of copied or invented summary statistics.
(3) Precision inconsistency: counts the significant decimal places of each reported mean from a fixed-point representation (so very small or very large means are not misread through scientific notation). When at least four triplets are available, it flags a coefficient of variation (standard deviation divided by mean) of those decimal counts above 0.5.
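The three checks can be sketched in Python as follows. This is a minimal illustration rather than the indicator's actual implementation. The function names and the (source, mean, sd, n) tuple layout are hypothetical, and two details are assumptions the description does not settle: trailing zeros are not counted as significant decimal places, and the coefficient of variation uses the population standard deviation.

```python
from collections import Counter, defaultdict
from decimal import Decimal
from statistics import mean, pstdev


def count_round_p(p_values):
    """Count p-values reported exactly on the 0.05 or 0.01 threshold."""
    return sum(1 for p in p_values if p in (0.05, 0.01))


def identical_sd_sources(triplets):
    """Map each source to the SD values that occur three or more times in it.

    `triplets` is an iterable of (source, mean, sd, n) tuples (assumed layout).
    Sources with no repeated SD are omitted from the result.
    """
    by_source = defaultdict(Counter)
    for source, _mean, sd, _n in triplets:
        by_source[source][sd] += 1  # exact equality, as in the description
    result = {}
    for source, counts in by_source.items():
        repeated = [sd for sd, c in counts.items() if c >= 3]
        if repeated:
            result[source] = repeated
    return result


def decimal_places(value):
    """Count decimal places from a fixed-point string, never scientific notation."""
    fixed = format(Decimal(str(value)), "f")  # 1e-07 becomes '0.0000001'
    fraction = fixed.partition(".")[2]
    return len(fraction.rstrip("0"))  # assumption: trailing zeros are ignored


def precision_cv(means):
    """Coefficient of variation of decimal counts, or None if it cannot be computed."""
    if len(means) < 4:  # the check needs at least four triplets
        return None
    counts = [decimal_places(m) for m in means]
    average = mean(counts)
    if average == 0:  # all integers: CV is undefined, so no flag
        return None
    return pstdev(counts) / average  # assumption: population standard deviation
```

A caller would flag the first check when `count_round_p(...) >= 2`, the second when `identical_sd_sources(...)` is non-empty, and the third when `precision_cv(...)` is not `None` and exceeds 0.5.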

How it works

Layer 1 (deterministic) runs three checks. The round-p-value check counts p-values of exactly 0.05 or 0.01 and flags at two or more. The identical-SD check tallies standard deviations within each source and flags any value occurring three or more times. The precision check counts the significant decimals of each mean from a fixed-point string and, given at least four triplets, flags a coefficient of variation above 0.5. The score follows the flag count: zero flags give 0.0, one gives 2.0, two give 3.0, and three give 4.5, with the score capped at 5.0. Round-p and identical-SD findings are warnings; the precision finding is informational. Metadata records round_p_count, identical_sd_groups, precision_issues, and precision_cv.
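The scoring step can be illustrated with a short sketch that takes the outputs of the three checks and applies the flag-count mapping described above. The function name, the result dictionary layout, and the finding labels are hypothetical, and storing precision_issues as a boolean is an assumption, since the text does not say what that field holds.

```python
# Flag count to score, as stated in the text; the 5.0 cap is applied afterwards.
SCORE_BY_FLAG_COUNT = {0: 0.0, 1: 2.0, 2: 3.0, 3: 4.5}


def score_suspicious_rounding(round_p_count, identical_sd_groups, precision_cv, n_triplets):
    """Combine the three check results into a score, findings, and metadata.

    round_p_count: number of p-values exactly equal to 0.05 or 0.01
    identical_sd_groups: collection of sources with an SD repeated 3+ times
    precision_cv: coefficient of variation of decimal counts, or None
    n_triplets: number of (mean, SD, n) triplets available
    """
    findings = []  # (label, severity) pairs; labels are illustrative only
    if round_p_count >= 2:
        findings.append(("round_p_values", "warning"))
    if identical_sd_groups:
        findings.append(("identical_sds", "warning"))
    precision_issue = (
        n_triplets >= 4 and precision_cv is not None and precision_cv > 0.5
    )
    if precision_issue:
        findings.append(("precision_inconsistency", "info"))

    score = min(SCORE_BY_FLAG_COUNT[len(findings)], 5.0)
    return {
        "score": score,
        "findings": findings,
        "metadata": {
            "round_p_count": round_p_count,
            "identical_sd_groups": identical_sd_groups,
            "precision_issues": precision_issue,  # assumption: stored as a bool
            "precision_cv": precision_cv,
        },
    }
```

With the mapping as stated, three flags yield 4.5, so the 5.0 cap never binds; it is kept here only to mirror the description.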

Why this matters

The fine texture of how numbers are rounded carries information about how they were produced. Garcia-Berthou and Alcaraz found that a substantial fraction of reported results in leading journals are internally incongruent, much of it from rounding and transcription. The granularity of reported statistics underlies the GRIM family of consistency tests, which exploit the fact that real means and SDs for a given sample size can only take certain values at a given precision. Identical summary statistics across groups are a recognised fabrication signal in forensic re-analysis of trials. None of the three patterns proves misconduct on its own. Each is a weak flag reflecting how hand-entered or copied numbers differ from analysis-pipeline output, and the meaningful signal is their concurrence.

Score thresholds

0: None of the three rounding or precision patterns is present.
2: One pattern: exact-threshold p-values, identical standard deviations, or inconsistent precision.
3: Two of the three patterns occur together.
4-5: All three patterns occur, a combination unusual in genuinely computed results.

Limitations

Each sub-check is a weak heuristic. A p-value reported as exactly 0.05 or 0.01 is often legitimate rounding, or a bound (p < 0.05) reported as a point value, and the indicator cannot distinguish these cases. Identical standard deviations can occur legitimately when variables share a scale or when rounding collapses nearby values, and the check uses exact equality. The precision check is the weakest: reporting different variables to different natural precisions is normal, so a high coefficient of variation of decimal places has many benign causes. All checks depend on accurate extraction. The thresholds (two exact-threshold p-values, three identical SDs, and a coefficient of variation of 0.5) are directional. Related checks are covered by other indicators: GRIM and GRIMMER granularity tests are S3 and S4, identical-statistic detection on individual-patient data is D8, and value duplication is S14.
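The weakness of the precision check is easy to reproduce. In the sketch below, four invented but plausible means are each reported at a precision natural to their variable, yet the decimal counts vary enough to cross the 0.5 threshold. The variable names and values are hypothetical, and the helper assumes, as an illustration only, that trailing zeros are ignored and that the population standard deviation is used.

```python
from decimal import Decimal
from statistics import mean, pstdev


def decimal_places(value):
    """Decimal places from a fixed-point string (trailing zeros ignored)."""
    fraction = format(Decimal(str(value)), "f").partition(".")[2]
    return len(fraction.rstrip("0"))


# Hypothetical means, each at a precision that suits its own variable.
reported_means = {
    "age_years": 34.2,
    "bmi": 27.31,
    "n_visits": 12,
    "biomarker_mmol": 0.0045,
}

counts = [decimal_places(m) for m in reported_means.values()]  # [1, 2, 0, 4]
cv = pstdev(counts) / mean(counts)  # roughly 0.85
flagged = len(counts) >= 4 and cv > 0.5  # True, although nothing is wrong
```

Here the flag fires on entirely ordinary reporting, which is why the precision finding is informational and is best read alongside the other two checks.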