R11 | Statistical analysis | Methodological Coherence | Layer 2 (Contextual)

Missing Data

Evaluates whether the paper adequately reports and handles missing data, as ignoring missing data can lead to biased results and is a sign of poor methodological rigor.

Technical description

R11 grades the quality of the missing-data handling described in the text. It searches for named methods (multiple imputation or MICE; last observation carried forward, LOCF; complete-case analysis, listwise or pairwise deletion; generic imputation), for a sensitivity analysis, for missingness-mechanism assumptions (missing completely at random, missing at random, missing not at random), and for a stated missing-data percentage. The assumption acronyms are matched case-sensitively so that the month abbreviation "Mar" or the lowercase word "mar" is not mistaken for the MAR assumption. The score grades the approach from best to worst: multiple imputation with a sensitivity analysis; multiple imputation alone; complete-case analysis with a small stated percentage; complete-case analysis with an unknown or high percentage, or LOCF without a sensitivity analysis; and no method at all. Independently of the grade, it counts the TARMOS reporting elements present (amount, method, mechanism, sensitivity).
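The detection step described above can be sketched as follows. This is a minimal illustration, not the indicator's actual pattern set: the regular expressions, the names METHODS, MECHANISM, SENSITIVITY, PERCENT and detect, and the order in which a primary method is chosen are all assumptions. It does show the one detail the description is explicit about, namely that the mechanism acronyms are matched case-sensitively while the spelled-out phrases are not.

```python
import re

# Hypothetical patterns; the real expressions used by R11 are not given in the text.
# Dict order doubles as the (assumed) priority for picking the primary method.
METHODS = {
    "multiple_imputation": re.compile(r"(?i:\bmultiple\s+imputation\b)|\bMICE\b"),
    "complete_case": re.compile(
        r"\b(?:complete[\s-]case|(?:listwise|pairwise)\s+deletion)\b", re.IGNORECASE
    ),
    "locf": re.compile(r"(?i:\blast\s+observation\s+carried\s+forward\b)|\bLOCF\b"),
    "generic_imputation": re.compile(r"\bimput(?:ation|ed|ing)\b", re.IGNORECASE),
}

# Acronyms are case-sensitive so "Mar" (month) or "mar" never counts as MAR;
# the spelled-out phrases are matched case-insensitively via a scoped flag.
MECHANISM = re.compile(
    r"\b(?:MCAR|MAR|MNAR)\b"
    r"|(?i:\bmissing\s+(?:completely\s+|not\s+)?at\s+random\b)"
)

SENSITIVITY = re.compile(r"\bsensitivity\s+analys[ie]s\b", re.IGNORECASE)

# A percentage counts only if "missing" appears nearby in the same sentence.
PERCENT = (
    re.compile(r"(\d+(?:\.\d+)?)\s*%[^.]{0,40}?\bmissing\b", re.IGNORECASE),
    re.compile(r"\bmissing\b[^.]{0,40}?(\d+(?:\.\d+)?)\s*%", re.IGNORECASE),
)


def detect(text: str) -> dict:
    """Return the missing-data signals found in a passage of text."""
    method = next((name for name, pat in METHODS.items() if pat.search(text)), None)
    percentages = [float(m.group(1)) for pat in PERCENT for m in pat.finditer(text)]
    return {
        "method": method,
        "sensitivity": bool(SENSITIVITY.search(text)),
        "mechanism": bool(MECHANISM.search(text)),
        "percentages": percentages,
    }
```

Because matching is purely lexical, a passage that mentions a method only in its background section produces the same signals as one that actually applied it.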

How it works

Layer 2 (contextual): the text is matched against method, sensitivity-analysis, assumption, and missing-percentage patterns. Multiple imputation scores 0.0 with a sensitivity analysis and 1.0 without. Complete-case analysis scores 1.5 when the stated missing percentage is below 5% and 2.5 otherwise, including when no percentage is stated (with a finding). LOCF scores 2.0 with a sensitivity analysis and 3.0 without (with a finding). A mention of generic imputation or of a missingness assumption, with no stronger method, scores 2.0. No method at all scores 3.5 with a finding. The score is capped at 5.0. Metadata records the primary method found, whether a sensitivity analysis was present, the largest stated missing percentage and how many percentages were reported, and whether a missingness mechanism was stated. The indicator also counts the TARMOS reporting elements (amount, method, mechanism, sensitivity): a method that is named but incompletely reported, and that is not best practice, gets an informational finding listing the missing elements, and the metadata records the completeness count and per-element status.
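The scoring rules above can be written as a small decision function. This is a sketch under stated assumptions: the function names score_r11 and tarmos_elements, the method labels, and the (score, has_finding) return shape are hypothetical, and the sketch takes a single primary method because the text does not say how competing methods are ranked. Treating any detected method, including generic imputation, as satisfying the TARMOS "method" element is also an assumption.

```python
from typing import Optional, Tuple

MAX_SCORE = 5.0
LOW_MISSING_PCT = 5.0  # "below 5%" threshold for complete-case analysis


def score_r11(
    method: Optional[str],
    sensitivity: bool,
    max_missing_pct: Optional[float],
    mechanism: bool = False,
) -> Tuple[float, bool]:
    """Return (score, has_finding) for the primary method detected."""
    finding = False
    if method == "multiple_imputation":
        score = 0.0 if sensitivity else 1.0
    elif method == "complete_case":
        if max_missing_pct is not None and max_missing_pct < LOW_MISSING_PCT:
            score = 1.5
        else:
            # Unstated missingness is scored the same as high missingness.
            score, finding = 2.5, True
    elif method == "locf":
        if sensitivity:
            score = 2.0
        else:
            score, finding = 3.0, True
    elif method == "generic_imputation" or mechanism:
        # Weak signal only: generic imputation or a bare assumption mention.
        score = 2.0
    else:
        score, finding = 3.5, True
    return min(score, MAX_SCORE), finding


def tarmos_elements(
    method: Optional[str], sensitivity: bool, mechanism: bool, n_percentages: int
) -> Tuple[int, dict]:
    """Count the four reporting elements and return per-element status."""
    status = {
        "amount": n_percentages > 0,
        "method": method is not None,
        "mechanism": mechanism,
        "sensitivity": sensitivity,
    }
    return sum(status.values()), status
```

For example, a complete-case analysis reporting 3% missingness, with no mechanism or sensitivity analysis, would score 1.5 with no scoring finding and a completeness count of 2 out of 4.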

Why this matters

Almost all studies have missing data, and how it is handled can change the conclusions, so the method and its reporting are part of validity. Multiple imputation is preferred to ad hoc methods when reported credibly. The missingness mechanism (completely at random, at random, not at random) determines which method is valid. Last observation carried forward rests on the implausible assumption that a value stays fixed after dropout and can bias results, so it warrants a sensitivity analysis. A study that describes no handling at all leaves the reader unable to judge whether missing data distorted the result.

Score thresholds

0-1.5: A principled method: multiple imputation, or complete-case analysis with little missing data
2-3: A weaker approach: complete-case with high or unstated missingness, or last-observation-carried-forward
3.5-5: No missing-data handling method is described
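As a small illustration, the bands above can be expressed as a lookup. The function name band and the label strings are hypothetical. The listed ranges leave gaps (between 1.5 and 2, and between 3 and 3.5) that the scoring rules never produce; this sketch assumes, without support from the source, that a value in a gap falls into the next band up.

```python
def band(score: float) -> str:
    """Map an R11 score in [0, 5] to its interpretation band."""
    if not 0.0 <= score <= 5.0:
        raise ValueError(f"R11 score out of range: {score}")
    if score <= 1.5:
        return "principled method"
    if score <= 3.0:
        return "weaker approach"
    return "no method described"
```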

Limitations

Detection is keyword-based, so a method described unconventionally is missed (and scored as no handling), while a method named only in passing or in the background is credited. The missing percentage used for scoring is the largest one stated, since the worst-affected variable governs complete-case adequacy, but a complete-case analysis with genuinely low yet unstated missingness is still scored as if missingness were high. The grading reflects general preferences rather than a given study's specifics, where a simpler method can be appropriate, so the score should be read as guidance. The indicator reads the text and does not confirm that the described method was applied, or applied correctly. The structure of missingness within the individual-patient data is covered by indicator D29; R11 focuses on the description and adequacy of the missing-data handling in the report.