Text-Table Drift
Detects discrepancies between statistics reported in the text and the corresponding values in tables, catching errors or manipulation where numbers do not match across the paper.
Technical description
Cross-references statistical values reported in the narrative text against the corresponding values in tables. Extracts means, SDs, p-values, test statistics, and sample sizes from both text and tables, then matches corresponding values by context (variable name, group or comparison, time point). Flags cases where the same statistic differs between its text and table representations.
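The extraction step for narrative text can be sketched as below. This is a minimal illustration, not the detector's actual implementation: the `Stat` record, the `PATTERNS` table, and `extract_stats` are hypothetical names, and the regular expressions assume a few common reporting styles (`M = 12.4`, `SD = 3.1`, `n = 48`, `p = .003`). Real papers use many more formats.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Stat:
    kind: str     # "mean", "sd", "p", or "n"
    value: float  # parsed numeric value
    raw: str      # number as printed, kept so decimal places can be recovered
    start: int    # character offset of the number in the source text


# Hypothetical patterns; they cover only a handful of reporting styles.
PATTERNS = {
    "mean": re.compile(r"\bM\s*=\s*(-?\d+(?:\.\d+)?)"),
    "sd": re.compile(r"\bSD\s*=\s*(\d+(?:\.\d+)?)"),
    "p": re.compile(r"\bp\s*[=<]\s*(0?\.\d+)"),
    "n": re.compile(r"\b[nN]\s*=\s*(\d+)"),
}


def extract_stats(text: str) -> list[Stat]:
    """Return every recognized statistic in order of appearance."""
    found = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            raw = match.group(1)
            found.append(Stat(kind, float(raw), raw, match.start(1)))
    return sorted(found, key=lambda s: s.start)


sentence = "Anxiety decreased (M = 12.4, SD = 3.1, n = 48), p = .003."
for stat in extract_stats(sentence):
    print(stat.kind, stat.raw)
```

Keeping the printed string (`raw`) alongside the parsed value matters later: the number of decimal places reported determines how much rounding slack a comparison should allow.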
How it works
Layer 1 (deterministic): Extracts statistics from text and tables separately. Matches corresponding values by context (variable name, group or comparison, time point). Compares matched pairs for exact or near-exact agreement. Flags discrepancies that exceed rounding tolerance. Reports each mismatch with its text and table locations.
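The matching and comparison steps might look like the following sketch. The tolerance rule is an assumption, since the exact rule is not specified here: two printed values are treated as consistent when they differ by no more than half a unit in the last decimal place of the less precise one. The function names and the tuple-shaped context keys are hypothetical.

```python
from decimal import Decimal


def decimals(raw: str) -> int:
    """Number of digits printed after the decimal point."""
    return len(raw.split(".")[1]) if "." in raw else 0


def agrees_within_rounding(text_raw: str, table_raw: str) -> bool:
    """True if both printed values could be roundings of one number.

    Assumed rule: allow half a unit at the coarser reported precision.
    """
    coarse = min(decimals(text_raw), decimals(table_raw))
    half_unit = Decimal(1).scaleb(-coarse) / 2  # 0.05 for one decimal place
    return abs(Decimal(text_raw) - Decimal(table_raw)) <= half_unit


def find_mismatches(text_stats: dict, table_stats: dict) -> list:
    """Compare values that share a context key; return the disagreements.

    Keys are (variable, group, time point, statistic) tuples and values are
    the numbers as printed. Keys present on only one side are skipped.
    """
    mismatches = []
    for key in sorted(text_stats.keys() & table_stats.keys()):
        if not agrees_within_rounding(text_stats[key], table_stats[key]):
            mismatches.append((key, text_stats[key], table_stats[key]))
    return mismatches


# Made-up example values for illustration only.
text_stats = {
    ("anxiety", "treatment", "week 8", "mean"): "12.4",
    ("anxiety", "control", "week 8", "mean"): "15.0",
    ("anxiety", "treatment", "week 8", "n"): "48",
}
table_stats = {
    ("anxiety", "treatment", "week 8", "mean"): "12.43",
    ("anxiety", "control", "week 8", "mean"): "15.6",
}

# Only the control-group mean is flagged: 12.4 vs 12.43 is within rounding.
for key, in_text, in_table in find_mismatches(text_stats, table_stats):
    print(key, "text:", in_text, "table:", in_table)
```

Comparing the printed strings through `Decimal` rather than floats avoids binary floating-point noise at the tolerance boundary. Inequality reports such as `p < .05` are not handled by this sketch and would need a separate comparison rule.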
Why this matters
In a legitimately authored paper, statistics in the text should match the corresponding values in tables because both come from the same analysis. Discrepancies arise from reporting errors (wrong numbers copied), selective reporting (the text highlights different statistics than the tables show), or fabrication (text and tables constructed independently). Multiple text-table discrepancies are a strong indicator of data integrity problems.
Score thresholds
- 0-1: Text and table values match consistently
- 2-3: Minor discrepancies, possibly from rounding
- 4-5: Multiple discrepancies between text and table values
Limitations
Matching text statistics to table values requires accurate context parsing. Some discrepancies result from rounding applied at different stages of the analysis. Different analysis subsets (e.g., per-protocol vs. intention-to-treat) may legitimately show different values.
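The rounding limitation can be illustrated with a small worked example. The numbers are made up, and `round2` is a hypothetical helper. A difference computed from unrounded means and then rounded can differ in the last digit from a difference computed from means that were rounded first, even though both come from the same data.

```python
from decimal import Decimal, ROUND_HALF_UP


def round2(x: Decimal) -> Decimal:
    """Round to two decimal places, halves rounded up."""
    return x.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Made-up unrounded group means.
mean_a = Decimal("2.344")
mean_b = Decimal("1.246")

# A text might report the difference computed from the raw means...
diff_from_raw = round2(mean_a - mean_b)              # 1.098 rounds to 1.10

# ...while a reader recomputes it from the rounded means in a table.
diff_from_rounded = round2(mean_a) - round2(mean_b)  # 2.34 - 1.25 = 1.09

print(diff_from_raw, diff_from_rounded)
```

A gap of one unit in the last reported digit on a derived quantity is therefore weak evidence on its own.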