ResAIKit
Research Integrity Toolkit
S20 | Statistical analysis | Statistical Consistency | Layer 1 (Deterministic)

Text-Table Drift

Detects discrepancies between statistics reported in the text and the corresponding values in tables, catching errors or manipulation where numbers do not match across the paper.

Technical description

Cross-references statistical values reported in the narrative text against corresponding values in tables. Extracts means, SDs, p-values, test statistics, and sample sizes from both text and tables, then matches corresponding values by context (variable name, comparison, time point). Flags discrepancies where the same statistic differs between its text and table representations.
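The extraction step on the text side can be illustrated with a minimal Python sketch. It is an illustration only, not the toolkit's own extractor: the `Stat` class, the `PATTERNS` table, and `extract_stats` are hypothetical names, and the regular expressions assume a few common reporting styles (for example "M = 12.34, SD = 2.10", "p = .003", "n = 42"). Values are kept as strings so that the reported number of decimal places is preserved for later comparison.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Stat:
    kind: str      # "mean", "sd", "p_value", or "n"
    value: str     # reported string, so decimal places are preserved
    position: int  # character offset of the value in the text


# Hypothetical patterns; real papers use many more reporting styles.
PATTERNS = {
    "mean": re.compile(r"\bM\s*=\s*(-?\d+(?:\.\d+)?)"),
    "sd": re.compile(r"\bSD\s*=\s*(\d+(?:\.\d+)?)"),
    "p_value": re.compile(r"\bp\s*[=<>]\s*(0?\.\d+)"),
    "n": re.compile(r"\b[Nn]\s*=\s*(\d+)"),
}


def extract_stats(text):
    """Return every statistic the patterns recognise, in reading order."""
    found = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append(Stat(kind, match.group(1), match.start(1)))
    return sorted(found, key=lambda s: s.position)


sentence = ("The treatment group (n = 42) scored higher "
            "(M = 12.34, SD = 2.10), p = .003.")
for stat in extract_stats(sentence):
    print(stat.kind, stat.value, stat.position)
```

Table extraction would need a separate parser that reads cell values together with their row and column headers, which this sketch does not attempt.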

How it works

Layer 1 (deterministic): Extracts statistics from text and tables separately. Matches corresponding values by context (variable name, group, time point). Compares matched pairs for exact or near-exact agreement. Flags discrepancies beyond rounding tolerance. Reports each mismatch with text and table locations.
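The matching and comparison steps can be sketched as follows. This is a sketch under stated assumptions rather than the toolkit's implementation: the function names, the tuple-shaped context key, and the specific tolerance rule are all hypothetical. The tolerance assumed here is that two values agree when they differ by no more than half a unit in the last decimal place of the less precise value, so "12.3" in the text agrees with "12.34" in a table.

```python
from decimal import Decimal


def decimals(reported):
    """Number of decimal places in a reported value such as '12.30'."""
    return len(reported.split(".")[1]) if "." in reported else 0


def agrees_within_rounding(a, b):
    """True if the more precise value could round to the less precise one.

    The boundary is inclusive, so the check is slightly lenient
    (assumed rule, not taken from the toolkit).
    """
    coarse_places = min(decimals(a), decimals(b))
    half_unit = Decimal(5) * Decimal(10) ** -(coarse_places + 1)
    return abs(Decimal(a) - Decimal(b)) <= half_unit


def find_drift(text_stats, table_stats):
    """Compare statistics that share a context key.

    Both arguments map a key such as (variable, group, time point,
    statistic) to a (reported value, location) pair.
    """
    mismatches = []
    for key, (text_value, text_loc) in text_stats.items():
        if key not in table_stats:
            continue  # no table counterpart, so nothing to compare
        table_value, table_loc = table_stats[key]
        if not agrees_within_rounding(text_value, table_value):
            mismatches.append({
                "context": key,
                "text_value": text_value,
                "text_location": text_loc,
                "table_value": table_value,
                "table_location": table_loc,
            })
    return mismatches
```

Using `Decimal` on the reported strings avoids binary floating-point artifacts when comparing values at the edge of the tolerance. Each returned record carries both locations, mirroring the requirement that every mismatch be reported with its text and table positions.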

Why this matters

In a legitimately authored paper, statistics in the text should match the corresponding values in tables because both come from the same analysis. Discrepancies can arise from reporting errors (wrong numbers copied), selective reporting (text highlights different statistics than the tables show), or fabrication (text and tables constructed independently). Multiple text-table discrepancies are a strong indicator of data integrity problems.

Score thresholds

0-1: Text and table values match consistently
2-3: Minor discrepancies possibly from rounding
4-5: Multiple discrepancies between text and table values
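The bands above can be expressed as a small lookup, sketched here in Python. The names `BANDS` and `interpret_score` are hypothetical, and the sketch assumes integer scores from 0 to 5. How the score itself is derived from the detected mismatches is not specified here, so the sketch only maps a given score to its band description.

```python
# Score bands as (lowest score, highest score, interpretation).
BANDS = [
    (0, 1, "Text and table values match consistently"),
    (2, 3, "Minor discrepancies possibly from rounding"),
    (4, 5, "Multiple discrepancies between text and table values"),
]


def interpret_score(score):
    """Return the band description for an integer score from 0 to 5."""
    for low, high, label in BANDS:
        if low <= score <= high:
            return label
    raise ValueError(f"score out of range: {score!r}")
```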

Limitations

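Matching text statistics to table values requires accurate context parsing, so a parsing error can pair the wrong values and produce false flags or missed mismatches. Some discrepancies result from values being rounded differently at different stages of reporting. Values may also differ legitimately when the text and a table report different analysis populations (e.g., per-protocol vs. intention-to-treat).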