ResAIKit
Research Integrity Toolkit
R4 · Statistical analysis · Methodological Coherence · Layer 2 (Contextual)

Protocol Fidelity

Checks whether the statistical analysis described in the methods section matches what was actually reported in the results, detecting undisclosed analysis changes.

Technical description

R4 cross-checks the design claims a paper makes against evidence elsewhere in the text. It detects an intention-to-treat (ITT) claim, a per-protocol mention, the blinding level (single-, double-, triple-blind, or open-label), and the randomisation ratio. Compound terms are matched whether they are joined by a hyphen, a space, or one of the typographic dash characters that typeset documents use. When ITT is claimed, R4 extracts the enrolled/randomised count and the analysed/included count and computes their ratio, since an ITT analysis should include essentially all randomised participants. When a double- or triple-blind design is claimed, it scans the results for mentions of unblinding or unmasking, which contradict an intact blind unless explained. It also detects a modified intention-to-treat (mITT) claim, flagging it as a weaker variant that permits post-randomisation exclusions and has been linked to larger, potentially biased effect estimates (Abraha 2015). Each contradiction yields a finding and raises the score.
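The dash-tolerant matching described above can be sketched roughly as follows. This is an illustrative sketch, not the toolkit's actual code: the separator character set, the pattern names (`SEP`, `ITT`, `MITT`, and so on), and the `detect_claims` helper are all assumptions made for the example.

```python
import re

# Separator class (assumed for illustration): whitespace, ASCII hyphen, and
# typographic dashes U+2010 hyphen, U+2011 non-breaking hyphen, U+2012 figure
# dash, U+2013 en dash, U+2014 em dash, U+2212 minus sign.
SEP = r"[\s\-\u2010\u2011\u2012\u2013\u2014\u2212]"

# Hypothetical patterns; the real indicator's regexes may differ.
ITT = re.compile(rf"\bintention{SEP}+to{SEP}+treat\b|\bITT\b", re.IGNORECASE)
MITT = re.compile(
    rf"\bmodified{SEP}+intention{SEP}+to{SEP}+treat\b|\bmITT\b", re.IGNORECASE
)
PER_PROTOCOL = re.compile(rf"\bper{SEP}+protocol\b", re.IGNORECASE)
BLINDING = re.compile(rf"\b(single|double|triple){SEP}+blind(?:ed)?\b", re.IGNORECASE)
OPEN_LABEL = re.compile(rf"\bopen{SEP}+label\b", re.IGNORECASE)
RATIO = re.compile(r"\b(\d+):(\d+)\b")


def detect_claims(text: str) -> dict:
    """Return the protocol claims found in `text` (hypothetical helper)."""
    blind = BLINDING.search(text)
    if blind:
        blinding = blind.group(1).lower()
    elif OPEN_LABEL.search(text):
        blinding = "open-label"
    else:
        blinding = None
    ratio = RATIO.search(text)
    return {
        "itt": bool(ITT.search(text)),
        "mitt": bool(MITT.search(text)),
        "per_protocol": bool(PER_PROTOCOL.search(text)),
        "blinding": blinding,
        # First colon-separated integer pair, as the description states.
        "ratio": (int(ratio.group(1)), int(ratio.group(2))) if ratio else None,
    }
```

In this sketch a phrase such as "modified intention-to-treat" satisfies both the `ITT` and `MITT` patterns, so a caller would need to consult the `mitt` flag to tell a modified analysis from a strict one.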

How it works

Layer 2 (contextual): protocol claims are extracted by regex (ITT in full and abbreviated form, per-protocol, blinding terms, and a numeric randomisation ratio such as 1:1). The hyphenated terms match any common dash character, so a term typeset with an en dash or non-breaking hyphen is still recognised. If no protocol information is found, the score is zero. When ITT is claimed, the first enrolled/randomised count and the first analysed/included count are read. If both are found, their ratio (the ITT fidelity ratio) is recorded in the metadata as a diagnostic alongside the two counts; if both are positive and the analysed-to-enrolled ratio is below 0.85, a warning is added and the score is set to at least 4.0. For a double- or triple-blind claim, the results section (or the full text if the paper is unsegmented) is searched for unblinding/unmasking; a match adds a warning and raises the score to at least 2.0. A modified intention-to-treat (mITT) claim is detected separately and adds a warning raising the score to at least 2.0, with the metadata recording whether the analysis is modified rather than strict ITT. The final score is the maximum across checks, so any flag yields at least 2.0, and it is capped at 5.0.
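The scoring rules can be summarised in a short sketch. The thresholds (0.85, 4.0, 2.0, 5.0) come from the description; everything else is assumed for illustration, including the `R4Result` container, the `score_r4` signature, the metadata key names, the unblinding pattern, and the choice to take the already-extracted counts as inputs.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

ITT_RATIO_THRESHOLD = 0.85  # analysed/enrolled ratio below this triggers a warning
MAX_SCORE = 5.0

# Hypothetical pattern for unblinding/unmasking mentions.
UNBLINDING = re.compile(r"\bun(?:blind|mask)(?:ed|ing)\b", re.IGNORECASE)


@dataclass
class R4Result:
    score: float = 0.0
    warnings: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


def score_r4(
    itt: bool,
    mitt: bool,
    blinding: Optional[str],
    enrolled: Optional[int],
    analysed: Optional[int],
    results_text: str,
) -> R4Result:
    """Combine the checks; the score is the maximum any check demands."""
    res = R4Result()

    # ITT fidelity: compare the analysed count with the enrolled count.
    if itt and enrolled and analysed and enrolled > 0 and analysed > 0:
        ratio = analysed / enrolled
        res.metadata.update(
            enrolled=enrolled, analysed=analysed, itt_fidelity_ratio=ratio
        )
        if ratio < ITT_RATIO_THRESHOLD:
            res.warnings.append("ITT claimed but analysed count is well below enrolled")
            res.score = max(res.score, 4.0)

    # Blinding: an unblinding mention contradicts a double/triple-blind claim.
    if blinding in ("double", "triple") and UNBLINDING.search(results_text):
        res.warnings.append("unblinding mentioned despite blinded design")
        res.score = max(res.score, 2.0)

    # Modified ITT is flagged as a weaker variant of strict ITT.
    res.metadata["modified_itt"] = mitt
    if mitt:
        res.warnings.append("modified ITT rather than strict ITT")
        res.score = max(res.score, 2.0)

    res.score = min(res.score, MAX_SCORE)
    return res
```

For example, a paper claiming ITT with 200 randomised and 160 analysed participants has a fidelity ratio of 0.8, which falls below 0.85 and lifts the score to 4.0 in this sketch regardless of the other checks.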

Why this matters

A study's stated design and its reported conduct should agree; where they do not, the discrepancy is a quality flaw and a possible sign of after-the-fact change. CONSORT requires specifying the analysis population, blinding, and randomisation, and reconciling the numbers analysed with those randomised. Intention-to-treat is the claim most often made loosely: surveys find the term used with conflicting meanings, and many trials that claim ITT have nonetheless excluded participants. A stated ITT whose analysed count falls short of enrolment is therefore a documented inconsistency. More generally, published analyses frequently diverge from the declared protocol.

Score thresholds

0: Protocol claims are consistent with the reported counts and statements, or none were found
2: A minor inconsistency, such as a blinding claim contradicted by an unblinding mention
4-5: An intention-to-treat claim contradicted by an analysed count well below the enrolled count

Limitations

The checks rest on pattern extraction, so unusually phrased claims are missed. The ITT counts read are the first enrolled and analysed numbers found, which can be wrong when several populations are reported (for example, a per-protocol subset can be mistaken for the ITT analysed count). Legitimate pre-planned unblinding, such as an independent committee's interim look, is flagged like an improper one, so a blinding finding should prompt a reading of the paper's explanation. The randomisation ratio is the first colon-separated integer pair, which can capture a clock time; it is recorded but not scored. The indicator reads text, not the participant-flow diagram where definitive counts reside. Sample-size drift is covered by indicator R2 and test appropriateness by indicator R1; R4 focuses on the consistency of the analysis-population and blinding claims with the rest of the report.
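The clock-time pitfall is easy to reproduce. The snippet below is a minimal illustration that assumes a simple "first colon-separated integer pair" pattern; the `RATIO` regex and `first_ratio` helper are hypothetical stand-ins, not the indicator's own code.

```python
import re

# Assumed pattern: the first pair of integers separated by a colon.
RATIO = re.compile(r"\b(\d+):(\d+)\b")


def first_ratio(text: str):
    """Return the first colon-separated integer pair, or None if absent."""
    m = RATIO.search(text)
    return (int(m.group(1)), int(m.group(2))) if m else None


# The clock time comes first, so it is read instead of the 2:1 allocation.
print(first_ratio("Visits began at 9:30; patients were randomised 2:1."))
# prints (9, 30)
```

Because the ratio is recorded but not scored, a misread like this affects only the metadata, not the indicator's score.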

References

  1. Schulz KF, Altman DG, Moher D (CONSORT Group). (2010). CONSORT 2010 Statement: updated guidelines for reporting parallel group randomised trials. BMJ
  2. Hollis S, Campbell F. (1999). What is meant by intention to treat analysis? Survey of published randomised controlled trials. BMJ
  3. Chan AW, Hróbjartsson A, Haahr MT, Gøtzsche PC, Altman DG. (2004). Empirical evidence for selective reporting of outcomes in randomized trials: comparison of protocols to published articles. JAMA
  4. Mansournia MA, Collins GS, Nielsen RO, Nazemipour M, Jewell NP, Altman DG, Campbell MJ. (2021). CHecklist for statistical Assessment of Medical Papers: the CHAMP statement. British Journal of Sports Medicine 55(18):1002-1003
  5. Parker L, Boughton S, Lawrence R, Bero L. (2022). Experts identified warning signs of fraudulent research: a qualitative study to inform a screening tool. Journal of Clinical Epidemiology 151:1-17
  6. Carlisle JB. (2021). False individual patient data and zombie randomised controlled trials submitted to Anaesthesia. Anaesthesia 76(4):472-479
  7. Wilkinson J, Heal C, Antoniou GA, et al. (2024). A survey of experts to identify methods to detect problematic studies: stage 1 of the INveStigating ProblEmatic Clinical Trials in Systematic Reviews project. Journal of Clinical Epidemiology 175:111512
  8. Crone G, Green CD. (2025). Tools of the data detective: A review of statistical methods to detect data and result anomalies in psychology. Theory & Psychology 35(3):359-380
  9. Bordewijk EM, Li W, van Eekelen R, et al. (2021). Methods to assess research misconduct in health-related research: A scoping review. Journal of Clinical Epidemiology 136:189-202
  10. Abraha I, Cherubini A, Cozzolino F, et al. (2015). Deviation from intention to treat analysis in randomised trials and treatment effect estimates: meta-epidemiological study. BMJ 350:h2445