D32: GRIM/SPRITE IPD Consistency

Technical description

GRIM (Granularity-Related Inconsistency of Means) rests on the observation that, for an integer-scale variable, the mean over N participants multiplied by N must equal the integer sum of the responses, so a mean whose product with N is not a whole number is impossible. D32 applies this to the individual-patient data (IPD) rather than to text-extracted summaries. It identifies integer-like columns, those where at least ninety percent of values lie within 0.01 of a whole number, requires at least two and at least ten rows, and tests each column: with a group column it checks each group's mean times group size against the nearest integer; otherwise it checks the overall mean times the column size. A SPRITE-style range check confirms that each group mean lies within the observed range. Because the means are recomputed from the IPD, a GRIM failure means that the integer-looking column in fact contains non-integer values. GRIM on raw IPD is therefore near-vacuous, and the discriminating addition is a SPRITE variance-feasibility test on text-reported summaries: a reported mean for an integer variable must lie within the IPD's observed range, and by the Bhatia-Davis bound its reported SD cannot exceed sqrt((max - mean)(mean - min)).
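The feasibility test on text-reported summaries can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the indicator's actual code: the function name summary_is_feasible and its parameters are hypothetical, and the optional n argument is an added assumption that widens the bound by sqrt(n / (n - 1)) for the case where the reported SD is a sample SD, since the Bhatia-Davis bound is stated for the population variance.

```python
import math


def summary_is_feasible(reported_mean, reported_sd, col_min, col_max, n=None):
    """Check a text-reported (mean, SD) against the range observed in the IPD.

    Returns a pair (mean_in_range, sd_within_bound).
    Hypothetical helper for illustration only.
    """
    # A reported mean must lie inside the observed range of the column.
    mean_in_range = col_min <= reported_mean <= col_max
    if not mean_in_range:
        # The variance bound is undefined for a mean outside the range.
        return False, False

    # Bhatia-Davis: variance <= (max - mean) * (mean - min).
    sd_bound = math.sqrt((col_max - reported_mean) * (reported_mean - col_min))

    # Assumption: if n is supplied, treat the reported SD as a sample SD
    # (n - 1 denominator) and widen the bound accordingly.
    if n is not None and n > 1:
        sd_bound *= math.sqrt(n / (n - 1))

    return True, reported_sd <= sd_bound


# A 1-to-5 scale with a reported mean of 4.8 leaves little room for spread:
# the bound is sqrt(0.2 * 3.8), roughly 0.87, so an SD of 1.5 is impossible.
print(summary_is_feasible(4.8, 1.5, 1, 5))
# A mean of 3.0 on the same scale allows an SD of up to 2.0.
print(summary_is_feasible(3.0, 1.5, 1, 5))
```

The bound is tightest near the ends of the scale, which is where an implausibly large reported SD is most likely to be caught.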

How it works

Layer 2 (contextual): a column is integer-like when at least ninety percent of its values are within 0.01 of their nearest integer. A group column is found by matching column-name tokens against group, arm, treatment, condition, or cohort. For each integer-like column and each group of at least two values, the group mean times the group size is compared against its nearest whole number, and a deviation of 0.001 or more is a GRIM violation; without a group column the overall mean and column size are used. A SPRITE-style check flags a group mean that falls outside the observed range. The violation rate maps to the score: a rate above thirty percent gives 4.0, above fifteen percent 3.0, above five percent 2.0, and any violation at all 1.0. A range violation adds 1.0, and the total is capped at 5.0. For each text-reported triplet matching an integer column, the reported mean is checked against the column range and the reported SD against the Bhatia-Davis bound; impossible summaries add to the score and are recorded as mean-range and SPRITE-variance violations. Metadata records the integer columns, the group column, the total number of checks, the GRIM and SPRITE violation counts and rates, and per-finding details.
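The steps above can be sketched as follows. This is an illustrative outline rather than the indicator's implementation: the function names are hypothetical, the thresholds are taken from the description, and the score helper assumes that range violations add 1.0 once in total rather than once per violation, which the description does not specify. The contribution of text-reported summaries is left out because its size is not stated.

```python
def is_integer_like(values, tol=0.01, min_share=0.9):
    """True when at least min_share of values lie within tol of an integer."""
    near = sum(1 for v in values if abs(v - round(v)) <= tol)
    return near / len(values) >= min_share


def grim_violation(values, tol=0.001):
    """True when mean * n deviates from the nearest integer by tol or more."""
    n = len(values)
    mean = sum(values) / n
    product = mean * n
    return abs(product - round(product)) >= tol


def grim_score(n_checks, n_grim, n_range):
    """Map the GRIM violation rate to a score, capped at 5.0."""
    rate = n_grim / n_checks if n_checks else 0.0
    if rate > 0.30:
        score = 4.0
    elif rate > 0.15:
        score = 3.0
    elif rate > 0.05:
        score = 2.0
    elif n_grim > 0:
        score = 1.0
    else:
        score = 0.0
    # Assumption: any number of range violations adds a single point.
    if n_range > 0:
        score += 1.0
    return min(score, 5.0)


# Two groups of one integer-like column; group B hides a 3.004.
groups = {"A": [1, 2, 3, 4], "B": [2.0, 3.004, 4.0, 5.0]}
column = [v for vals in groups.values() for v in vals]

if is_integer_like(column):
    # Only groups with at least two values are checked.
    checked = [vals for vals in groups.values() if len(vals) >= 2]
    n_grim = sum(grim_violation(vals) for vals in checked)
    print(n_grim, grim_score(len(checked), n_grim, n_range=0))
```

In this toy input group A sums to exactly 10, while group B sums to about 14.004, so one of the two checks fails and the violation rate of fifty percent falls in the top band.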

Why this matters

GRIM is an established forensic tool: a reported mean that is not reachable as an integer sum divided by the sample size cannot have come from the integer data it claims to summarise, and SPRITE extends this by reconstructing candidate integer datasets from the mean, dispersion, size, and range. Applying them to the raw IPD is stronger than applying them to a paper's printed means, because it cannot be evaded by selective reporting and it surfaces a column that presents as an integer scale but whose values are subtly non-integer, a signature of generated or altered data.

Score thresholds

0-1: Integer-scale columns give whole-number group sums, as genuine integer data must, or at most five percent of checks fail
2-3: More than five percent of checks on integer-like columns yield impossible means
4-5: More than thirty percent of checks fail, with a mean outside the observed range adding a further point, indicating values that are not the integers they appear to be

Limitations

The signal arises only when an integer-like column contains values that are not exactly integers, so columns stored as exact whole numbers pass by construction, and fabrication that preserves integer arithmetic is not detected. The integer-detection tolerance of 0.01 is looser than the GRIM tolerance of 0.001, so values genuinely near but not on the integers (from rounding or float storage) can be flagged; a flag is therefore a prompt to inspect the raw values. The SPRITE-style range check compares a group mean against the range of the same values it was computed from, a condition it almost always satisfies, so it is a guard rather than a test. The token match can miss an unusually named group column, in which case the overall-column check is used. The granularity of text-reported means is covered by indicators S3 and S4; D32 focuses on the reconstructed integer arithmetic across the IPD.
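The gap between the two tolerances can be seen with a small worked example. The data and the helper name nearest_int_gap are made up for illustration; the thresholds are those quoted above.

```python
def nearest_int_gap(x):
    """Absolute distance from x to its nearest integer."""
    return abs(x - round(x))


# Ten responses on a 1-to-5 scale; one value is stored as 3.004.
values = [3.0, 4.0, 2.0, 5.0, 3.004, 4.0, 1.0, 2.0, 5.0, 4.0]

# Every value is within 0.01 of an integer, so the column is integer-like.
share_near = sum(nearest_int_gap(v) <= 0.01 for v in values) / len(values)

# Mean times n is about 33.004, which misses the nearest integer by about
# 0.004: above the 0.001 GRIM tolerance, so the column is flagged.
mean = sum(values) / len(values)
gap = nearest_int_gap(mean * len(values))

# The same column stored as exact integers passes by construction.
exact = [round(v) for v in values]
exact_gap = nearest_int_gap(sum(exact) / len(exact) * len(exact))

print(share_near, gap >= 0.001, exact_gap >= 0.001)
```

A single stored value that is off by a few thousandths is enough to trigger the flag, which is why a flag calls for a look at the raw values rather than a conclusion.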
