D20 · Statistical analysis · Fabrication Detection · Layer 2 (Contextual)

Conditional Independence Violations

Checks specific pairs of variables against the relationships physiology says they must have. Some pairs should correlate in a known direction (such as systolic and diastolic blood pressure); others should be related only through a third variable, so that controlling for that mediator makes the correlation disappear. The indicator uses a dictionary of domain-knowledge triples and flags pairs whose correlation is missing, has the wrong sign, or persists after the mediator is controlled for.

Technical description

A contextual screen that tests named variable pairs against the dependence or conditional independence expected from a domain-knowledge dictionary. Each entry specifies two variables, an expected correlation range, whether the pair should be independent after conditioning on a third variable (the mediator), and column-name aliases. For a dependent pair, it computes the Pearson correlation and flags the pair when the magnitude is much weaker than expected or when the correlation runs in the wrong direction (an expected-negative relationship appearing positive, or the reverse, which a magnitude-only check would miss). For a mediated pair, it computes the unconditional correlation and, when that correlation is large enough to be worth testing, the partial correlation controlling for the mediator, flagging the pair when a strong residual correlation remains. The failure rates among dependent and mediated pairs set the score.
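A dictionary entry of the kind described above might be represented as follows. This is a minimal sketch, not the indicator's actual schema: the `Triple` class, its field names, the `resolve_column` and `resolve_triple` helpers, and the numeric range in the example are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Triple:
    """One domain-knowledge entry (hypothetical schema)."""
    var_a: str
    var_b: str
    expected_range: Tuple[float, float]  # expected Pearson r as (low, high)
    mediator: Optional[str] = None       # if set, the pair should be independent given it
    aliases: Dict[str, Tuple[str, ...]] = field(default_factory=dict)


def resolve_column(name: str, aliases: Tuple[str, ...], columns: List[str]) -> Optional[str]:
    """Return the dataset column matching `name` or one of its aliases, else None."""
    wanted = {name.lower(), *(a.lower() for a in aliases)}
    for col in columns:
        if col.strip().lower() in wanted:
            return col
    return None


def resolve_triple(triple: Triple, columns: List[str]) -> Optional[Dict[str, str]]:
    """Map every variable of the triple to a column; None if any is unmatched."""
    names = [triple.var_a, triple.var_b]
    if triple.mediator:
        names.append(triple.mediator)
    mapping = {}
    for name in names:
        col = resolve_column(name, triple.aliases.get(name, ()), columns)
        if col is None:
            return None  # the triple cannot be checked on this dataset
        mapping[name] = col
    return mapping


# Example entry; the (0.5, 0.8) range is a made-up placeholder value.
bp = Triple(
    "systolic_bp", "diastolic_bp", (0.5, 0.8),
    aliases={"systolic_bp": ("sbp",), "diastolic_bp": ("dbp",)},
)
matched = resolve_triple(bp, ["patient_id", "SBP", "DBP"])
unmatched = resolve_triple(bp, ["patient_id", "SBP"])
```

Returning `None` for a partially matched triple mirrors the point that only pairs whose names can be mapped to columns are examined at all.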

How it works

Layer 2 (contextual): each triple whose variables map to numeric columns with at least ten complete rows is checked. A dependent pair fails when the magnitude of its correlation falls more than 0.10 below the lower bound of the expected magnitude, or when the expected range lies wholly on one side of zero but the observed correlation lies on the other side by more than the 0.10 sign threshold (wrong sign). A mediated pair is checked only when its unconditional correlation exceeds 0.15, and fails when the partial correlation (after regressing out the mediator) both exceeds 0.20 and is significantly nonzero by the Fisher z-transform test at the 5% level. The score sums a dependent-pair contribution (bands at 20, 40, and 60 percent failure) and a more heavily weighted mediated-pair contribution (bands at 25 and 50 percent failure), capped at 5.0. Findings name the variables, the observed and expected correlations, and any residual partial correlation. Metadata records the per-type and total counts of pairs checked and failed, the failure rates for both pair types, and the smallest Fisher z partial-correlation p-value among mediated pairs.
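The two pair-level rules above can be sketched in plain Python. This is an illustration under stated assumptions rather than the indicator's implementation: the function names are hypothetical, rows are assumed to be complete already, the 0.15 and 0.20 thresholds are applied to absolute values, and the Fisher z statistic uses the standard form atanh(r) · sqrt(n − 3 − k) with k = 1 conditioning variable. Score banding is left out because the points awarded per band are not stated here.

```python
import math
from statistics import NormalDist


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a constant column is treated as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def residuals(y, z):
    """Residuals of y after a simple linear regression on z."""
    n = len(y)
    my, mz = sum(y) / n, sum(z) / n
    szz = sum((c - mz) ** 2 for c in z)
    slope = sum((b - my) * (c - mz) for b, c in zip(y, z)) / szz if szz else 0.0
    return [(b - my) - slope * (c - mz) for b, c in zip(y, z)]


def dependent_pair_fails(r, lo, hi, weak_margin=0.10, sign_margin=0.10):
    """True if r is too weak, or on the wrong side of zero, for range (lo, hi)."""
    one_sided = lo > 0 or hi < 0
    min_mag = min(abs(lo), abs(hi)) if one_sided else 0.0
    too_weak = abs(r) < min_mag - weak_margin
    wrong_sign = (lo > 0 and r < -sign_margin) or (hi < 0 and r > sign_margin)
    return too_weak or wrong_sign


def mediated_pair_fails(x, y, z, min_uncond=0.15, partial_min=0.20, alpha=0.05):
    """None if the pair is not checked, else True when a residual correlation remains."""
    n = len(x)
    if n < 10 or abs(pearson(x, y)) <= min_uncond:
        return None
    r_partial = pearson(residuals(x, z), residuals(y, z))
    r_safe = max(-0.999999, min(0.999999, r_partial))  # keep atanh finite
    z_stat = math.atanh(r_safe) * math.sqrt(n - 3 - 1)  # one conditioning variable
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z_stat)))
    return abs(r_partial) > partial_min and p_value < alpha


# Synthetic data: x and y both track z, with noise that is either
# independent (properly mediated) or shared (residual link remains).
z = [float(i) for i in range(40)]
e1 = [1.0 if i % 2 == 0 else -1.0 for i in range(40)]
e2 = [1.0 if i % 4 < 2 else -1.0 for i in range(40)]
x = [a + b for a, b in zip(z, e1)]
y_mediated = [a + b for a, b in zip(z, e2)]
y_leaky = [a + 0.8 * b + 0.6 * c for a, b, c in zip(z, e1, e2)]
```

On this synthetic data `mediated_pair_fails(x, y_mediated, z)` passes because the partial correlation collapses once `z` is regressed out, while `mediated_pair_fails(x, y_leaky, z)` fails because the shared noise term survives conditioning. A correlation of −0.6 against an expected range of (0.5, 0.8) is strong enough to pass the magnitude test alone, so only the wrong-sign branch of `dependent_pair_fails` catches it.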

Why this matters

Domain knowledge specifies not just that variables relate but exactly how (strength, direction, conditional structure), and these specifics are far harder to fabricate than a generic correlation. Taloni and colleagues showed a model can fabricate a clinical dataset whose variables fail to carry realistic relationships, and the use of correlation structure to separate genuine from invented data is established (Al-Marzouki and colleagues; Simonsohn). D20 sharpens this with literature-anchored expectations: an expected coupling that is absent, a known negative relationship appearing positive, or a mediation that fails to remove a correlation are each specific, checkable contradictions. The wrong-direction check is especially hard to evade, because reproducing the correct sign of every physiological relationship requires the generator to encode the underlying biology.

Score thresholds

0: Expected correlations are present and mediated pairs become independent as predicted.
1-2: Some expected relationships are missing, reversed, or fail to mediate.
3-5: Many domain-expected dependence or independence relationships are violated.

Limitations

Can only test variable pairs in its triples dictionary whose names it matches to columns, so unencoded relationships or unrecognisably named columns are not examined, and the absence of violations is only as informative as the set of pairs that could be checked. Correlation and partial correlation are linear, so a genuine non-linear dependence can appear weak and be misread as missing, and the partial correlation controls for a single linear mediator only. The expected ranges and thresholds (unconditional minimum 0.15, partial threshold 0.20, sign threshold 0.10, weakness margin 0.10) are drawn from the literature but applied as fixed cutoffs. Real populations can differ from reference ranges for legitimate case-mix reasons, so a flag is a prompt for investigation, not proof of fabrication. The global correlation-matrix structure is assessed by D1 and S17, and conditional correlations more broadly by D11.
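The linearity limitation noted above is easy to reproduce. In the following sketch (synthetic data, hypothetical helper name) `y` is fully determined by `x`, yet the Pearson correlation is essentially zero, so a dependent-pair check expecting a strong linear coupling would report the relationship as missing.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


x = [float(v) for v in range(-10, 11)]  # symmetric around zero
y = [v * v for v in x]                  # exact, but non-linear, dependence on x
r_nonlinear = pearson(x, y)             # close to 0 despite perfect dependence
r_linear = pearson(x, [2 * v + 1 for v in x])  # a linear relationship, for contrast
```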