ResAIKit
Research Integrity Toolkit
G4-img | Image forensics | Chart Analysis | Layer 1 (Deterministic)

Error Bars

Analyzes error bars in charts for suspicious uniformity or improper centering, which can indicate that the error bars were fabricated rather than calculated from real data.

Technical description

Detects bars and error-bar segments and sums four geometric signals into a score from 0 to 5 (the sum is capped at 5.0). Requires at least two detected error bars.

(1) Centering: an error-bar center within 3 px of the bar top is accepted; anything farther counts as a centering issue, and the issues together add min(2.0, issues × 0.5).
(2) Length uniformity: a coefficient of variation of error-bar lengths below 0.05 adds 2.0. The finding is escalated to pixel-identical copy-paste when every length is the exact same integer.
(3) Symmetric stamped template: every error bar symmetric within 1 px AND uniform lengths adds 1.0. Symmetry alone is not scored, since it is normal for mean ± SD/SEM/CI on a linear axis.
(4) Coverage consistency: with at least 4 bars, error bars on fewer than half of them adds 1.0.
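The four contributions can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the toolkit's actual code: the names `ErrorBarMeasurement` and `score_error_bars` are hypothetical, the measurements are assumed to arrive already extracted in pixels, the coefficient of variation uses the population standard deviation, and "within 3 px" and "within 1 px" are read as inclusive bounds.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List, Optional, Tuple


@dataclass
class ErrorBarMeasurement:
    """Hypothetical container for one detected error bar (all values in pixels)."""
    length_px: float         # full length of the error bar
    center_offset_px: float  # distance from the error-bar center to the bar top
    asymmetry_px: float      # departure from symmetry, as measured by the detector


def score_error_bars(
    error_bars: List[ErrorBarMeasurement], n_bars: int
) -> Optional[Tuple[float, List[str]]]:
    """Return (score, findings), or None when fewer than two error bars exist."""
    if len(error_bars) < 2:
        return None

    score = 0.0
    findings: List[str] = []

    # (1) Centering: 0.5 per off-center error bar, at most 2.0 in total.
    issues = sum(1 for eb in error_bars if abs(eb.center_offset_px) > 3)
    if issues:
        score += min(2.0, issues * 0.5)
        findings.append(f"centering: {issues} off-center error bar(s)")

    # (2) Length uniformity: CV below 0.05 adds 2.0.
    # Assumption: population standard deviation; all-zero lengths count as uniform.
    lengths = [eb.length_px for eb in error_bars]
    avg = mean(lengths)
    cv = pstdev(lengths) / avg if avg > 0 else 0.0
    uniform = cv < 0.05
    if uniform:
        score += 2.0
        identical = len(set(lengths)) == 1 and float(lengths[0]).is_integer()
        findings.append("pixel-identical lengths" if identical else "near-uniform lengths")

    # (3) Symmetric stamped template: scored only together with uniformity.
    if uniform and all(abs(eb.asymmetry_px) <= 1 for eb in error_bars):
        score += 1.0
        findings.append("symmetric stamped template")

    # (4) Coverage: with at least 4 bars, error bars on fewer than half adds 1.0.
    # Assumption: each detected error bar belongs to a different bar.
    if n_bars >= 4 and len(error_bars) < n_bars / 2:
        score += 1.0
        findings.append("partial coverage")

    return min(5.0, score), findings
```

Under these assumptions, three centered error bars that are all exactly 20 px long on a three-bar chart pick up the uniformity and template contributions, while four error bars of clearly different lengths score zero.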

How it works

Layer 1 (deterministic). Detects bars and error-bar line segments and links each error bar to its nearest bar. Measures the offset of each error-bar center from its bar top; the coefficient of variation of error-bar lengths, with an exact-duplicate check for pixel-identical lengths; per-bar symmetry, counted only together with length uniformity; and the share of bars that carry an error bar. Sums the four contributions, caps at 5.0, and reports each anomaly as a finding.
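The linking step and the coverage share could look roughly like the sketch below. It assumes that "nearest" means the smallest horizontal distance between an error bar and a bar center, which the description does not specify; `link_error_bars` and `coverage_share` are hypothetical names chosen for illustration.

```python
from typing import List


def link_error_bars(bar_centers_x: List[float], error_bar_xs: List[float]) -> List[int]:
    """For each error bar, return the index of the horizontally nearest bar."""
    if not bar_centers_x:
        return []
    return [
        min(range(len(bar_centers_x)), key=lambda i: abs(bar_centers_x[i] - ex))
        for ex in error_bar_xs
    ]


def coverage_share(links: List[int], n_bars: int) -> float:
    """Share of bars that carry at least one linked error bar."""
    if n_bars == 0:
        return 0.0
    return len(set(links)) / n_bars


# Three bars centered at x = 50, 150, 250; error bars detected near the first and last.
links = link_error_bars([50, 150, 250], [52, 249])
share = coverage_share(links, 3)
```

Counting distinct bar indices rather than raw error-bar segments keeps two segments that land on the same bar from inflating the coverage share.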

Why this matters

Genuine error bars (standard deviation, standard error of the mean, or confidence interval) vary in length across conditions, sit centered on the bar top, and appear on every measured bar. Bar charts with error bars are a known weak point: a review of 703 articles showed that the whiskers hide the underlying distribution, and an audit found that 64 percent of cardiovascular articles misused the SEM to obtain smaller, more reassuring bars. Decorative or fabricated error bars leave geometric traces (identical or pixel-identical lengths, a symmetric stamped template, misalignment with the bar top, or presence on only some bars) that forensic-statistics reviews tie to the excessive uniformity of invented data.
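The reason the SEM yields smaller bars is arithmetic: it is the sample standard deviation divided by the square root of the sample size. A short sketch with made-up illustrative values (not data from any cited study):

```python
from math import sqrt
from statistics import stdev

# Hypothetical measurements for one condition (illustrative values only).
values = [4.1, 5.0, 5.9, 4.6, 5.4]

sd = stdev(values)             # sample standard deviation: spread of the data
sem = sd / sqrt(len(values))   # standard error of the mean: precision of the mean

# For any sample with more than one value, the SEM is smaller than the SD,
# so an SEM whisker understates the spread of the underlying measurements.
print(f"SD = {sd:.3f}, SEM = {sem:.3f}")
```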

Score thresholds

0-1: Error bars of varying length, centered on their bar tops, present on every bar
2-3: One signal: off-center error bars, near-identical lengths, a uniform symmetric template, or error bars on only some bars
4-5: Several signals together, such as a pixel-identical symmetric template that is also off-center or partial, consistent with decorative or fabricated error bars

Limitations

Analyzes a chart only when at least two error bars are detected, and depends on the bar and error-bar detection recovering the geometry, so faint, very short, or occluded error bars are not scored. Thresholds are directional, and some patterns occur honestly: genuinely similar variability yields similar lengths, and a control bar may legitimately lack error bars, so the uniformity and coverage signals are review cues rather than proof. Symmetry is counted only in the uniform case to avoid flagging the ordinary symmetric error bar. Whether the reported mean and SD are arithmetically possible, and whether a bar's drawn height matches its printed value, are checked by sibling chart indicators.

References

  1. Weissgerber TL, Milic NM, Winham SJ, Garovic VD. (2015). Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm. PLoS Biology 13(4):e1002128
  2. Crone G, Green CD. (2025). Tools of the data detective: A review of statistical methods to detect data and result anomalies in psychology. Theory & Psychology
  3. Wullschleger M, Aghlmandi S, Egger M, Zwahlen M. (2014). High Incorrect Use of the Standard Error of the Mean (SEM) in Original Articles in Three Cardiovascular Journals Evaluated for 2012. PLoS One 9(10):e110364