SPRITE Test (Table)
Asks whether a reported mean and standard deviation could come from real bounded data such as a Likert scale. SPRITE searches by simulation for an integer sample that reproduces the statistics; if no such sample exists, the reported values are impossible. An exact variance bound also rejects standard deviations that are too large for the reported mean, which matters most near the ends of the scale.
Technical description
Extracts mean/SD/N triplets, detects the scale from headers or context (default 1-7 Likert), and for each triplet runs the SPRITE Monte Carlo reconstruction plus an analytic maximum-variance bound. By Popoviciu's inequality, as sharpened by Bhatia and Davis, a distribution bounded on [scale_min, scale_max] with mean m has population variance at most (m - scale_min)(scale_max - m). The sample variance (n - 1 denominator) is n/(n-1) times the population variance, so the sample SD cannot exceed sqrt((m - scale_min)(scale_max - m) * n/(n-1)). A triplet fails if the simulation finds no matching integer sample or if the analytic cap is exceeded. The cap is decisive near the scale boundaries, where the loose midpoint fallback (the same bound evaluated at the scale midpoint, which equals Popoviciu's (scale_max - scale_min)^2 / 4) would still pass.
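A minimal sketch of the analytic cap, assuming the reported SD uses the n - 1 denominator and that the mean and SD are rounded to two decimals. The function names and the tol parameter are illustrative assumptions, not part of any published SPRITE code.

```python
import math

def max_sample_sd(mean, n, scale_min=1, scale_max=7):
    """Bhatia-Davis cap on the sample SD (n - 1 denominator)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    if not scale_min <= mean <= scale_max:
        raise ValueError("mean lies outside the scale")
    # Population variance is at most (m - scale_min)(scale_max - m).
    pop_var_cap = (mean - scale_min) * (scale_max - mean)
    # Convert to the sample variance with the n/(n-1) factor.
    return math.sqrt(pop_var_cap * n / (n - 1))

def exceeds_cap(mean, sd, n, scale_min=1, scale_max=7, tol=0.005):
    """True if the reported SD is impossible even after allowing for rounding."""
    # Assumption: tol is half a unit in the last reported decimal place.
    lo = max(scale_min, mean - tol)
    hi = min(scale_max, mean + tol)
    mid = (scale_min + scale_max) / 2
    # The cap is concave in the mean, so within [lo, hi] it peaks at the
    # point closest to the scale midpoint.
    best_mean = min(max(mid, lo), hi)
    return sd - tol > max_sample_sd(best_mean, n, scale_min, scale_max)
```

For a mean of 6.80 with N = 20 on a 1-7 scale, the cap is about 1.105, so a reported SD of 1.50 is rejected, whereas the midpoint bound of about 3.08 would let it through.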
How it works
Layer 2 (simulation): reads mean/SD/N triplets and the scale. For each triplet it draws integer samples nudged toward the reported mean and checks whether the SD can be matched; independently, it applies the closed-form variance cap (m - scale_min)(scale_max - m), scaled by n/(n-1) for the sample SD. A triplet fails if no matching sample is found or if the SD exceeds the cap. The score is 0 for no failures, 4.0 for one, and 4.5 for two or more.
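The simulation step can be sketched roughly as follows. This is a simplified illustration that loosely follows the published SPRITE idea (fix the mean, then shuffle values in pairs until the SD matches), not the indicator's actual implementation. The name sprite_search and the parameters decimals, max_iter, and seed are assumptions made for the example.

```python
import random
import statistics

def sprite_search(mean, sd, n, scale_min=1, scale_max=7,
                  decimals=2, max_iter=20000, seed=0):
    """Return a sorted integer sample matching mean and SD, or None."""
    if n < 2:
        raise ValueError("n must be at least 2")
    rng = random.Random(seed)
    # Half a unit in the last reported decimal, plus float slack.
    tol = 0.5 * 10 ** -decimals + 1e-9

    # Integer totals whose mean rounds to the reported mean.
    totals = [t for t in range(n * scale_min, n * scale_max + 1)
              if abs(t / n - mean) <= tol]
    if not totals:
        return None
    total = rng.choice(totals)

    # Start from the flattest sample with that total (smallest SD).
    base, extra = divmod(total, n)
    values = [base + 1] * extra + [base] * (n - extra)

    for _ in range(max_iter):
        current = statistics.stdev(values)
        if abs(current - sd) <= tol:
            return sorted(values)
        i, j = rng.sample(range(n), 2)
        if values[i] < values[j]:
            i, j = j, i  # ensure values[i] >= values[j]
        if current < sd:
            # Spread the pair apart: the mean is unchanged, the SD grows.
            if values[i] < scale_max and values[j] > scale_min:
                values[i] += 1
                values[j] -= 1
        elif values[i] - values[j] >= 2:
            # Pull the pair together: the mean is unchanged, the SD shrinks.
            values[i] -= 1
            values[j] += 1
    return None  # search budget exhausted; not a proof of impossibility
```

A call such as sprite_search(4.00, 1.83, 10) returns a sorted sample if one is found within max_iter steps and None otherwise. Because the search is bounded, None on its own shows only that no sample was found within the budget.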
Why this matters
SPRITE was designed to catch fabricated summary statistics that pass simpler tests, by reconstructing candidate samples from the reported mean, SD, sample size, and scale and exposing combinations that no real bounded sample can produce; it has helped unravel several high-profile fabrication cases. It is the joint, distribution-level member of the granularity family with GRIM and GRIMMER, catching pairs of statistics each individually plausible but impossible together. For bounded instruments, an impossible standard deviation is strong evidence the numbers were not computed from data.
Score thresholds
- 0-1: Every tested mean and standard deviation is reproducible by a real bounded integer sample
- 2-3: Reserved; SPRITE failures are scored at the higher band
- 4-5: One or more mean and SD pairs cannot arise from any integer sample on the scale, consistent with fabricated statistics
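The mapping from failure count to score stated under "How it works" can be written as a small function. The name sprite_score is hypothetical and the sketch assumes the failure count has already been computed per table.

```python
def sprite_score(n_failures):
    """Map the number of failing mean/SD/N triplets to the indicator score."""
    if n_failures < 0:
        raise ValueError("n_failures must be non-negative")
    if n_failures == 0:
        return 0.0  # 0-1 band: every triplet is reproducible
    if n_failures == 1:
        return 4.0  # 4-5 band: a single impossible pair
    return 4.5      # 4-5 band: two or more impossible pairs
```

No input reaches the reserved 2-3 band, which matches the thresholds listed above.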
Limitations
SPRITE is meaningful only for data on a known bounded integer scale, so the scale must be detected correctly; a wrong or defaulted scale changes the verdict, and continuous or unbounded measures are out of scope. The simulation is stochastic and bounded in effort, so a simulation-only failure is suggestive rather than conclusive: an untried sample might still match. Only an exceeded analytic cap proves impossibility with certainty. The test depends on OCR for both the statistics and the scale. Subscale means, reverse-coded items, and means over unequal groups can fail for legitimate reasons. As a Layer 2 simulation it is slower and less deterministic than the Layer 1 granularity tests. The chart-read reconstruction is indicator G12, and the separate mean and SD tests are indicators T2 and T3.
References
- Heathers JAJ, Anaya J, van der Zee T, Brown NJL. (2018). Recovering data from summary statistics: Sample Parameter Reconstruction via Iterative TEchniques (SPRITE). PeerJ Preprints 6:e26968v1
- Bhatia R, Davis C. (2000). A better bound on the variance. The American Mathematical Monthly 107(4):353-357
- Brown NJL, Heathers JAJ. (2017). The GRIM Test: A Simple Technique Detects Numerous Anomalies in the Reporting of Results in Psychology. Social Psychological and Personality Science 8(4):363-369