G3-img | Image forensics | Chart Analysis | Layer 1 (Deterministic)

Scale/Axis Incoherence

Checks whether chart axes are properly scaled and consistent, detecting manipulated axes where tick marks are unevenly spaced or values do not progress logically.

Technical description

Detects axis lines and ticks, OCR-reads the numeric label at each tick, and tests the (pixel position, value) pairs against six structural expectations of a real numeric axis:

(1) Monotonicity: values run in one direction. A non-monotone axis adds 2.5. The x-axis requires at least three unique values, to avoid false positives on categorical axes.
(2) Spacing regularity: flagged when the coefficient of variation of consecutive gaps is at least 0.1 in both linear and log space (a log scale is accepted in either direction). Adds 1.5.
(3) Scale-aware transfer function: the best R-squared of the pixel-to-value fit, trying both linear and log10 scales. A best fit below 0.95 adds 1.0.
(4) Inverted orientation: a clean linear axis (|r| at least 0.9) whose slope sign reverses convention. Adds 1.5.
(5) Truncated value axis: bar charts only; an all-positive axis whose min/max baseline ratio exceeds 0.05. Adds 1.0.
(6) Dual y-axis: independent left and right numeric label columns. Adds 1.0.

Contributions are summed and capped at 5.0.
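The first three tests can be sketched in a few lines of Python. This is an illustrative sketch, not the indicator's actual implementation: the function names are hypothetical, "consecutive gaps" is assumed to mean differences between adjacent label values, the coefficient of variation is assumed to use the population standard deviation, and tick detection and OCR are assumed to have already produced parallel lists of pixel positions and values.

```python
import math
from statistics import mean, pstdev


def is_monotone(values):
    """True when the labelled values run strictly in one direction."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return all(d > 0 for d in diffs) or all(d < 0 for d in diffs)


def gap_cv(values):
    """Coefficient of variation of consecutive gaps (assumed: value gaps)."""
    if len(values) < 3:
        return 0.0  # fewer than two gaps: nothing to compare
    gaps = [b - a for a, b in zip(values, values[1:])]
    m = mean(gaps)
    if m == 0:
        return math.inf
    return pstdev(gaps) / abs(m)


def spacing_irregular(values, cv_threshold=0.1):
    """Flag only when spacing is irregular in BOTH linear and log10 space."""
    linear_cv = gap_cv(values)
    if all(v > 0 for v in values):
        log_cv = gap_cv([math.log10(v) for v in values])
    else:
        log_cv = math.inf  # log space is undefined for non-positive labels
    return min(linear_cv, log_cv) >= cv_threshold


def r_squared(xs, ys):
    """R-squared of the least-squares line ys ~ a * xs + b."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def best_fit_r2(pixels, values):
    """Keep the better of a linear fit and a log10 fit of pixel -> value."""
    fits = [r_squared(pixels, values)]
    if all(v > 0 for v in values):
        fits.append(r_squared(pixels, [math.log10(v) for v in values]))
    return max(fits)


pixels = [100, 200, 300, 400]
suspicious = [0, 10, 15, 50]  # evenly spaced ticks, uneven value steps
print(spacing_irregular(suspicious), best_fit_r2(pixels, suspicious) < 0.95)
```

Because the spacing and transfer tests each keep the better of the linear and log10 readings, a genuine logarithmic axis such as 1, 10, 100, 1000 passes both, while an axis that is regular in neither space is flagged.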

How it works

Layer 1 (deterministic). Detects axis lines and ticks, OCR-reads tick labels, and evaluates six tests: monotonicity of the labeled values; spacing regularity via the coefficient of variation of consecutive gaps in linear and logarithmic space; a scale-aware pixel-to-value transfer function that keeps the better of a linear or a log10 fit; inversion via the slope sign of value against pixel position on a cleanly linear axis; truncation of an all-positive value axis on a detected bar chart; and dual y-axis detection from independent left and right numeric label columns. Sums the contributions, caps at 5.0, and reports each violation as a finding.
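The inversion and truncation tests, and the final aggregation, might look like the following sketch. The weights, the 0.9 and 0.05 thresholds, and the 5.0 cap come from the description above; everything else is an assumption made for illustration: the function and key names are hypothetical, pixel positions are assumed to follow image coordinates (x grows rightward, y grows downward), and each violation type is assumed to be counted once per chart.

```python
import math

# Contribution of each violation, as listed in the technical description.
WEIGHTS = {
    "non_monotone": 2.5,
    "irregular_spacing": 1.5,
    "poor_transfer_fit": 1.0,
    "inverted": 1.5,
    "truncated_bar_axis": 1.0,
    "dual_y_axis": 1.0,
}
CAP = 5.0


def pearson_r(xs, ys):
    """Pearson correlation; its sign matches the sign of the fitted slope."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def is_inverted(pixels, values, axis):
    """Cleanly linear axis whose value-vs-pixel slope reverses convention.

    Assumed convention (image coordinates): x values grow with pixel x,
    y values grow as pixel y shrinks.
    """
    r = pearson_r(pixels, values)
    if abs(r) < 0.9:
        return False  # not cleanly linear, so orientation is not judged
    conventional_positive = axis == "x"
    return (r > 0) != conventional_positive


def is_truncated(values, is_bar_chart, ratio_threshold=0.05):
    """All-positive bar-chart value axis whose minimum sits far above zero."""
    if not is_bar_chart or not values or min(values) <= 0:
        return False
    return min(values) / max(values) > ratio_threshold


def score_findings(violations):
    """Sum the contributions of the detected violations, capped at 5.0."""
    found = set(violations)
    findings = [(name, w) for name, w in WEIGHTS.items() if name in found]
    total = min(CAP, sum(w for _, w in findings))
    return total, findings


total, findings = score_findings(["inverted", "truncated_bar_axis"])
print(total, findings)
```

Under these weights, an inverted axis combined with a truncated bar baseline sums to 2.5, while all six violations together would sum to 8.5 and are reported as the capped 5.0.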

Why this matters

Axis manipulation is one of the most common families of chart deception. A truncated value axis (one that does not start at zero) exaggerates the differences between bars. An inverted axis flips the direction in which a trend appears to move. A dual y-axis overlays two unrelated series to suggest a correlation the data does not support. Non-monotone or irregularly spaced labels betray a hand-built rather than software-rendered scale. Each is a recurring entry in the misleading-visualization taxonomy, and each can be checked with a deterministic test on the numeric labels.

Score thresholds

0-1: Monotone axes, regular linear or log spacing, clean pixel-to-value mapping, conventional orientation, zero-based bar axis, single y-scale
2-3: One manipulation: inverted axis, truncated bar baseline, dual y-axis, chaotic spacing, or a poor transfer fit
4-5: A non-monotone axis, or several manipulations together, consistent with a fabricated or deliberately misleading chart

Limitations

Analyses an axis only when OCR recovers at least two numeric labels from it, so purely categorical axes are scored on the non-numeric checks alone. Thresholds are indicative rather than exact, and some flagged designs are legitimate in context: a non-zero baseline suits data that never approaches zero, an inverted axis is conventional in a few fields, and a dual y-axis is sometimes reasonable. The value-axis checks are therefore review cues rather than proof. Logarithmic axes are handled explicitly by the scale-aware spacing and transfer tests. Localized pixel editing, error-bar fabrication, and bar-height-versus-label mismatch are covered by sibling chart indicators.
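The two-label gate can be illustrated with a small sketch. The parsing rules here are assumptions, not the indicator's documented behaviour: the helper names are hypothetical, and the cleaning steps (dropping thousands separators, a trailing percent sign, and a Unicode minus) are just one plausible way to normalise OCR output.

```python
import math


def parse_numeric_labels(ocr_labels):
    """Keep (pixel, value) pairs whose OCR text parses as a finite number."""
    parsed = []
    for pixel, text in ocr_labels:
        # Assumed normalisation: "1,000" -> "1000", "10%" -> "10", U+2212 -> "-"
        cleaned = text.strip().replace(",", "").replace("\u2212", "-").rstrip("%")
        try:
            value = float(cleaned)
        except ValueError:
            continue  # non-numeric label, e.g. a category name
        if math.isfinite(value):
            parsed.append((pixel, value))
    return parsed


def numeric_axis_or_none(ocr_labels, min_labels=2):
    """Return the numeric pairs, or None when too few labels were recovered."""
    parsed = parse_numeric_labels(ocr_labels)
    return parsed if len(parsed) >= min_labels else None


print(numeric_axis_or_none([(10, "Jan"), (20, "Feb"), (30, "Mar")]))
print(numeric_axis_or_none([(10, "1,000"), (20, "2,000"), (30, "n/a")]))
```

An axis that returns None here would skip the numeric tests entirely, which matches the behaviour described for purely categorical axes.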