Scale/Axis Incoherence
Checks whether chart axes are properly scaled and consistent, detecting manipulated axes where tick marks are unevenly spaced or values do not progress logically.
Technical description
Detects axis lines and ticks, OCR-reads the numeric label at each tick, and tests the (pixel position, value) pairs against six structural expectations of a real numeric axis. Each violation adds a fixed contribution:
- (1) Monotonicity: values should run in one direction. A non-monotone axis adds 2.5. The x-axis is tested only when it has at least three unique values, to avoid false positives on categorical axes.
- (2) Spacing regularity: flagged when the coefficient of variation of consecutive gaps is at least 0.1 in both linear and log space (log spacing is accepted in either direction). Adds 1.5.
- (3) Scale-aware transfer function: the pixel-to-value mapping is fitted on both linear and log10 scales and the better R-squared is kept. A best fit below 0.95 adds 1.0.
- (4) Inverted orientation: a clean linear axis (|r| at least 0.9) whose slope sign reverses the usual convention. Adds 1.5.
- (5) Truncated value axis: bar charts only. An all-positive axis whose min/max baseline ratio exceeds 0.05 adds 1.0.
- (6) Dual y-axis: independent left and right numeric label columns. Adds 1.0.
Contributions are summed and capped at 5.0.
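The first three tests can be illustrated with a minimal, standard-library sketch. It is not the indicator's actual implementation: the function names are hypothetical, the gaps are assumed to be measured between consecutive label values, and the population standard deviation is an arbitrary choice for the coefficient of variation. Only the 0.1 and 0.95 thresholds come from the description above.

```python
import math
from statistics import mean, pstdev


def is_monotone(values):
    """True when the labels strictly increase or strictly decrease."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return all(d > 0 for d in diffs) or all(d < 0 for d in diffs)


def gap_cv(values):
    """Coefficient of variation of the absolute consecutive gaps."""
    gaps = [abs(b - a) for a, b in zip(values, values[1:])]
    if len(gaps) < 2:
        return 0.0  # too few gaps to measure irregularity
    m = mean(gaps)
    return pstdev(gaps) / m if m > 0 else float("inf")


def spacing_irregular(values, cv_threshold=0.1):
    """Flag only when spacing is irregular in BOTH linear and log space."""
    linear_cv = gap_cv(values)
    if all(v > 0 for v in values):
        # Absolute gaps make the log test direction-agnostic.
        log_cv = gap_cv([math.log10(v) for v in values])
    else:
        log_cv = float("inf")  # log scale undefined for non-positive labels
    return linear_cv >= cv_threshold and log_cv >= cv_threshold


def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. R^2 of a simple least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def best_transfer_r2(pixels, values):
    """Keep the better of the linear and log10 pixel-to-value fits."""
    best = r_squared(pixels, values)
    if all(v > 0 for v in values):
        best = max(best, r_squared(pixels, [math.log10(v) for v in values]))
    return best
```

Under these assumptions a regular log axis such as 1, 10, 100, 1000 passes the spacing test because its log-space gaps are constant, while 0, 10, 15, 40 is flagged because it is irregular on the linear scale and has no valid log scale.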
How it works
Layer 1 (deterministic). Detects axis lines and ticks, OCR-reads tick labels, and evaluates six tests: monotonicity of the labeled values; spacing regularity via the coefficient of variation of consecutive gaps in linear and logarithmic space; a scale-aware pixel-to-value transfer function that keeps the better of a linear or a log10 fit; inversion via the slope sign of value against pixel position on a cleanly linear axis; truncation of an all-positive value axis on a detected bar chart; and dual y-axis detection from independent left and right numeric label columns. Sums the contributions, caps at 5.0, and reports each violation as a finding.
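The inversion and truncation tests and the final aggregation might look like the sketch below. The weights, the |r| threshold of 0.9, the 0.05 ratio, and the 5.0 cap are taken from the technical description; everything else is an assumption for illustration, including the function names, the finding keys, and the orientation convention (image coordinates, so values are expected to rise with pixel x and fall with pixel y).

```python
import math

# Contribution weights as stated in the technical description.
WEIGHTS = {
    "non_monotone": 2.5,
    "irregular_spacing": 1.5,
    "poor_transfer_fit": 1.0,
    "inverted": 1.5,
    "truncated": 1.0,
    "dual_y_axis": 1.0,
}
SCORE_CAP = 5.0


def pearson_r(xs, ys):
    """Pearson correlation between pixel positions and label values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def is_inverted(pixels, values, axis, min_abs_r=0.9):
    """Flag a cleanly linear axis whose slope sign reverses convention.

    Assumed convention: image coordinates, where x grows rightward and
    y grows downward, so a normal y-axis has a NEGATIVE pixel/value slope.
    """
    r = pearson_r(pixels, values)
    if abs(r) < min_abs_r:
        return False  # not clean enough to judge orientation
    expected_positive = axis == "x"
    return (r > 0) != expected_positive


def is_truncated(values, is_bar_chart, max_ratio=0.05):
    """Bar charts only: all-positive axis that starts well above zero."""
    if not is_bar_chart or not values or min(values) <= 0:
        return False
    return min(values) / max(values) > max_ratio


def total_score(findings):
    """Sum the contributions of the reported findings and cap the total."""
    return min(SCORE_CAP, sum(WEIGHTS[name] for name in findings))
```

For example, `total_score(["non_monotone", "irregular_spacing", "inverted"])` sums to 5.5 and is capped at 5.0, while a lone truncated bar baseline contributes 1.0.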
Why this matters
Axis manipulation is among the most common families of chart deception. A truncated value axis (one that does not start at zero) exaggerates the differences between bars; an inverted axis flips the direction in which a trend appears to move; a dual y-axis overlays two unrelated series to suggest a correlation the data does not support; non-monotone or irregularly spaced labels betray a hand-built rather than software-rendered scale. Each is a recurring entry in the misleading-visualization taxonomy, and each can be checked with a deterministic test on the numeric labels.
Score thresholds
- 0-1: Monotone axes, regular linear or log spacing, clean pixel-to-value mapping, conventional orientation, zero-based bar axis, single y-scale
- 2-3: One manipulation: inverted axis, truncated bar baseline, dual y-axis, chaotic spacing, or a poor transfer fit
- 4-5: A non-monotone axis, or several manipulations together, consistent with a fabricated or deliberately misleading chart
Limitations
Analyses an axis only when OCR recovers at least two numeric labels from it, so purely categorical axes are scored on the non-numeric checks alone. Thresholds are directional, and some flagged designs are legitimate in context: a non-zero baseline suits data that never approaches zero, an inverted axis is conventional in a few fields, and a dual y-axis is sometimes reasonable, so the value-axis checks are review cues rather than proof. Logarithmic axes are handled explicitly by the scale-aware spacing and transfer tests. Localized pixel editing, error-bar fabrication, and bar-height-versus-label mismatch live in sibling chart indicators.
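The two-label requirement can be sketched as a small gate in front of the numeric tests. This is an illustration only: the helper names are hypothetical, and the normalisation rules (dropping thousands separators and a trailing percent sign, accepting the Unicode minus) are assumptions about what OCR output may contain, not documented behaviour.

```python
import re

# Plain decimal or scientific notation after normalisation.
_NUMERIC = re.compile(r"^[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?$")


def parse_numeric_label(text):
    """Return the label as a float, or None when it is not numeric."""
    cleaned = text.strip().replace(",", "").replace("\u2212", "-")
    cleaned = cleaned.rstrip("%")
    if _NUMERIC.match(cleaned):
        return float(cleaned)
    return None


def numeric_labels(ocr_texts):
    """Keep only the labels that parse as numbers, in reading order."""
    parsed = (parse_numeric_label(t) for t in ocr_texts)
    return [v for v in parsed if v is not None]


def is_analysable(ocr_texts, min_labels=2):
    """An axis is analysed only with at least `min_labels` numeric labels."""
    return len(numeric_labels(ocr_texts)) >= min_labels
```

With this gate an axis labelled Jan, Feb, Mar yields no numeric labels and is skipped by the numeric tests, as is an axis where OCR recovers only a single number.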
References
- Tonglet J, Zimny J, Tuytelaars T, Gurevych I. (2026). Is this chart lying to me? Automating the detection of misleading visualizations. ACL 2026 (arXiv:2508.21675)
- Lalai HN, Shah RS, Pfister H, Varma S, Guo G. (2026). When Visuals Aren't the Problem: Evaluating Vision-Language Models on Misleading Data Visualizations. arXiv:2603.22368
- Chen Z, Song S, Shum K, Lin Y, Sheng R, Wang W, Qu H. (2025). Unmasking Deceptive Visuals: Benchmarking Multimodal Large Language Models on Misleading Chart Question Answering. EMNLP 2025 (arXiv:2503.18172)
- Mahbub R, Islam MS, Laskar MTR, Rahman M, Nayeem MT, Hoque E. (2025). The Perils of Chart Deception: How Misleading Visualizations Affect Vision-Language Models. IEEE VIS 2025 (arXiv:2508.09716)
- Luo J, Li Z, Wang J, Lin CY. (2021). ChartOCR: Data Extraction from Charts Images via a Deep Hybrid Framework. IEEE/CVF WACV 2021
- Cliche M, Rosenberg D, Madeka D, Yee C. (2017). Scatteract: Automated Extraction of Data from Scatter Plots. ECML PKDD 2017, LNCS 10534 (arXiv:1704.06687)