D9 | Statistical analysis | Fabrication Detection | Layer 2 (Contextual)

Timeline Implausibility

Reads the key dates a paper reports (ethics approval, the start and end of data collection, trial registration, submission) and checks they fall in a possible order. Ethics approval cannot follow the start of data collection, collection cannot end before it begins, and a trial should be registered before its results exist. It also estimates the recruitment rate from the sample size and collection window and flags a rate implausibly fast for a single site.

Technical description

A contextual screen on the reported chronology. It extracts month-and-year dates from the text for ethics/IRB approval, the data-collection range, trial registration, and submission/receipt, then checks the expected ordering: ethics approval not after collection start, collection start before end, collection end before submission, and registration before collection end. Each hard violation is a serious flag. When the ordering is intact, it notes tight-but-possible gaps (ethics-to-collection or collection-to-submission within one month) as low-level observations. When a collection duration and a sample size are both available, it computes a recruitment rate in patients per month and flags a rate above fifty unless the text indicates more than one site, since such a rate is implausible for a single center but routine for a multi-center trial. Violations, tight gaps, and the recruitment flag set the score.
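The extraction and ordering check described above can be sketched roughly as follows. This is a minimal illustration, not the detector's actual code: the names `to_month_index`, `ORDER_RULES`, and `ordering_violations` are hypothetical, the pattern handles only full English month names, and the sketch assumes each snippet of text has already been associated with the right event.

```python
import re

# Full English month names only; abbreviations and other languages
# would need additional patterns (an assumption of this sketch).
MONTHS = {name: i for i, name in enumerate(
    ["january", "february", "march", "april", "may", "june",
     "july", "august", "september", "october", "november", "december"],
    start=1)}

MONTH_YEAR = re.compile(r"\b([A-Za-z]+)\s+(\d{4})\b")

def to_month_index(text):
    """Return year * 12 + (month - 1) for the first 'Month YYYY' found, else None."""
    for match in MONTH_YEAR.finditer(text):
        month = MONTHS.get(match.group(1).lower())
        if month is not None:
            return int(match.group(2)) * 12 + (month - 1)
    return None

# Each pair is (earlier event, later event). At month resolution a pair is
# violated only when the "earlier" event falls in a later month.
ORDER_RULES = [
    ("ethics_approval", "collection_start"),
    ("collection_start", "collection_end"),
    ("collection_end", "submission"),
    ("registration", "collection_end"),
]

def ordering_violations(dates):
    """Return the (earlier, later) pairs whose reported months are inverted."""
    violations = []
    for earlier, later in ORDER_RULES:
        a, b = dates.get(earlier), dates.get(later)
        if a is not None and b is not None and a > b:
            violations.append((earlier, later))
    return violations
```

Converting each date to a single integer month index keeps every comparison a plain integer comparison, and events that were not found are simply skipped rather than treated as violations.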

How it works

Layer 2 (contextual): dates are captured by language-keyed regular expressions and converted to month resolution. Each ordering violation adds 4.0 with an error/warning finding naming the two dates. With no violation, an ethics-to-collection or collection-to-submission gap of zero or one month each adds 1.0 (informational). The recruitment rate is the largest reported sample size divided by the collection duration in months; a rate above fifty per month with no multi-site language adds 2.0. The total is capped at 5.0. Metadata records dates_found, ordering_violations, recruitment_rate, multisite_detected (whether multi-site language was present, which suppresses the recruitment-rate flag), and timeline_gaps_months (the inter-milestone gaps surfaced from the ordering computation).
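The scoring arithmetic can be illustrated with a short sketch. It assumes the dates have already been converted to integer month indices (year * 12 + month - 1); the function name `timeline_score`, the event keys, and the choice to skip the recruitment rate when the collection window is shorter than one month are assumptions of this example rather than documented behavior. The metadata keys follow the names listed above.

```python
# Pairs (earlier, later) that must not be inverted at month resolution.
ORDER_RULES = [
    ("ethics_approval", "collection_start"),
    ("collection_start", "collection_end"),
    ("collection_end", "submission"),
    ("registration", "collection_end"),
]
# Pairs whose gap counts as "tight" when it is zero or one month.
TIGHT_GAP_PAIRS = [
    ("ethics_approval", "collection_start"),
    ("collection_end", "submission"),
]

def timeline_score(dates, sample_sizes, multisite):
    """Return (score, metadata) for month-index dates keyed by event name."""
    def gap(a, b):
        if dates.get(a) is None or dates.get(b) is None:
            return None
        return dates[b] - dates[a]

    gaps = {f"{a}->{b}": gap(a, b)
            for a, b in ORDER_RULES if gap(a, b) is not None}
    violations = [(a, b) for a, b in ORDER_RULES
                  if gap(a, b) is not None and gap(a, b) < 0]

    # Each hard ordering violation adds 4.0.
    score = 4.0 * len(violations)

    # Tight gaps are only reported when the ordering is intact.
    if not violations:
        for a, b in TIGHT_GAP_PAIRS:
            g = gap(a, b)
            if g is not None and g <= 1:
                score += 1.0

    # Recruitment rate: largest reported sample size per month of collection.
    # Assumption: a window of zero months (or an inverted one) yields no rate.
    rate = None
    duration = gap("collection_start", "collection_end")
    if sample_sizes and duration is not None and duration > 0:
        rate = max(sample_sizes) / duration
    if rate is not None and rate > 50 and not multisite:
        score += 2.0

    metadata = {
        "dates_found": [k for k, v in dates.items() if v is not None],
        "ordering_violations": violations,
        "recruitment_rate": rate,
        "multisite_detected": multisite,
        "timeline_gaps_months": gaps,
    }
    return min(score, 5.0), metadata
```

For instance, a single-site paper reporting 600 patients over a six-month window, with ethics approval in the same month as collection start, would score 1.0 for the tight gap plus 2.0 for the rate of 100 per month. One ordering violation plus the recruitment flag would sum to 6.0 and be capped at 5.0.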

Why this matters

The chronology of a study is a web of constraints that fabricated or hastily assembled papers often violate, because invented dates are not cross-checked the way real records are. Carlisle's examination of trials with individual-patient data found impossible and inconsistent timelines among the features exposing false data and zombie trials. Reviews of clinical-trial fraud list timeline and recruitment anomalies (enrolling more patients than a site could plausibly see) among recognised markers. The ordering checks encode hard logical requirements: a study cannot collect data before approval, cannot end collection before starting, and cannot be submitted before collection finishes, so a violation is impossible, not merely improbable. The recruitment-rate check encodes a softer single-site capacity bound, suspended for multi-center studies. Registration after collection signals retrospective registration, undermining the safeguard against selective reporting.

Score thresholds

0: The reported dates are consistent and the recruitment rate is plausible.
1-2: Tight but possible gaps, or a single soft concern.
4-5: An impossible ordering of dates, or a wildly implausible recruitment rate.

Limitations

Depends on dates stated in the text in a recognisable month-and-year form and on the extraction patterns associating each date with the right event, so unusual phrasing, day-level dates, or dates in tables can be missed or mis-assigned. It resolves dates only to the month, so events reported in the same month are treated as consistently ordered. The recruitment-rate check uses the largest sample size found anywhere in the text, which may not be the enrolled count, and the multi-site suppression depends on the text describing the study as multi-center. Retrospective registration is common and not always misconduct, so that flag is directional. The thresholds (fifty per month, one-month tight-gap window) are heuristic. This detector reads dates from the narrative text; date-level anomalies within individual-patient records (for example, weekend visit clustering) are handled by D3.