ResAIKit
Research Integrity Toolkit
W1 · Image forensics · Western Blot · Layer 1 (Deterministic)

Duplicate Bands

Finds protein bands in a western blot that are copies of one another. Reusing one band in different lanes is a common form of fabrication. The indicator detects every band, compares each pair by structural similarity across flips and rotations, and flags close matches. Only bands of compatible size are compared, because a copied band keeps its dimensions.

Technical description

Detects bands by adaptive thresholding and contour analysis, extracts a patch around each, and compares every pair with the structural similarity index (SSIM) over four orientations: original, horizontal flip, 90-degree rotation, and 180-degree rotation. Before any comparison, a size-compatibility gate requires the two bands to match in width and height within 30 percent, or to match with the dimensions swapped to allow for a 90-degree copy. The gate exists because a paste preserves size, and resizing dissimilar bands to a common shape would manufacture similarity. A pair whose best SSIM exceeds 0.85 is recorded as a duplicate. The number of duplicate pairs, or for a single pair its SSIM excess over the threshold, sets the score (up to 5.0).
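The size gate and the four orientations can be sketched in plain Python. This is a minimal illustration, not the toolkit's code: the function names are hypothetical, "within 30 percent" is assumed to mean a difference of at most 30 percent of the larger dimension, and the 90-degree rotation is assumed to be clockwise.

```python
from typing import List, Tuple

Patch = List[List[float]]  # rows of grey-level values (illustrative type)


def within_tolerance(a: float, b: float, tol: float = 0.30) -> bool:
    """True when a and b differ by at most `tol` of the larger value.

    Assumption: the source does not define the 30 percent rule precisely.
    """
    larger = max(a, b)
    return larger > 0 and abs(a - b) <= tol * larger


def sizes_compatible(size_a: Tuple[int, int], size_b: Tuple[int, int],
                     tol: float = 0.30) -> bool:
    """Size gate: (width, height) must match directly or with axes swapped."""
    (wa, ha), (wb, hb) = size_a, size_b
    direct = within_tolerance(wa, wb, tol) and within_tolerance(ha, hb, tol)
    # A 90-degree copy has its width and height exchanged.
    swapped = within_tolerance(wa, hb, tol) and within_tolerance(ha, wb, tol)
    return direct or swapped


def orientations(patch: Patch) -> List[Patch]:
    """Original, horizontal flip, 90-degree (clockwise) and 180-degree rotation."""
    flipped = [row[::-1] for row in patch]
    rot90 = [list(col) for col in zip(*patch[::-1])]
    rot180 = [row[::-1] for row in patch[::-1]]
    return [patch, flipped, rot90, rot180]
```

A pair would pass through `sizes_compatible` first; only then would each orientation of one patch be resized to the other's shape and scored with SSIM, keeping the best value.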

How it works

Layer 1 (deterministic): detects bands, then for each size-compatible pair resizes the patches to a common shape and computes the best SSIM over the four orientations. A best SSIM above 0.85 records a duplicate pair. With no duplicate pairs the score is 0. A single pair scores 1.5 plus ten times its excess over the threshold; for example, a pair at SSIM 0.95 scores 1.5 + 10 × 0.10 = 2.5. Two or more pairs score 2.5 plus 0.5 per pair. Both cases are capped at 5.0. Each duplicate marks the two band locations and is flagged as critical when its SSIM exceeds 0.95.
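The scoring rule above is simple enough to write out directly. The sketch below follows the stated numbers (threshold 0.85, critical level 0.95, cap 5.0); the function and constant names are hypothetical and the input is assumed to be the best SSIM of each size-compatible pair.

```python
from typing import Iterable

THRESHOLD = 0.85   # best SSIM above this records a duplicate pair
CRITICAL = 0.95    # duplicates above this are marked critical
MAX_SCORE = 5.0    # cap applied to every case


def duplicate_score(best_ssims: Iterable[float],
                    threshold: float = THRESHOLD) -> float:
    """Map the best SSIM of each compared pair to a 0-5 score."""
    duplicates = [s for s in best_ssims if s > threshold]
    if not duplicates:
        return 0.0
    if len(duplicates) == 1:
        # One pair: 1.5 plus ten times the excess over the threshold.
        score = 1.5 + 10.0 * (duplicates[0] - threshold)
    else:
        # Several pairs: 2.5 plus 0.5 per duplicate pair.
        score = 2.5 + 0.5 * len(duplicates)
    return min(score, MAX_SCORE)


def is_critical(best_ssim: float) -> bool:
    """True when a duplicate pair exceeds the critical SSIM level."""
    return best_ssim > CRITICAL
```

Under this reading, a single pair cannot exceed 1.5 + 10 × 0.15 = 3.0, since SSIM is at most 1, so the 5.0 cap only takes effect once there are more than five duplicate pairs.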

Why this matters

Band reuse is one of the most documented forms of figure manipulation in the life sciences: large surveys of biomedical figures find inappropriate duplication, including copied bands and panels, to be the dominant category of problematic images. Detecting it requires a similarity measure that tracks whether two bands look the same to a viewer, which SSIM approximates by comparing local luminance, contrast, and structure. The size-compatibility requirement reflects how copying works, since a pasted band keeps its dimensions.
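For reference, the commonly used single-window form of SSIM from Wang et al. (2004) combines the luminance, contrast, and structure terms into one expression. The window size and constants used by this indicator are not stated here; the values noted below are the defaults from the paper, given as an assumption.

```latex
\[
\mathrm{SSIM}(x, y) =
\frac{(2\mu_x \mu_y + C_1)\,(2\sigma_{xy} + C_2)}
     {(\mu_x^2 + \mu_y^2 + C_1)\,(\sigma_x^2 + \sigma_y^2 + C_2)},
\qquad
C_1 = (K_1 L)^2,\quad C_2 = (K_2 L)^2
\]
% \mu_x, \mu_y: local means of patches x and y
% \sigma_x^2, \sigma_y^2: local variances; \sigma_{xy}: local covariance
% L: dynamic range of the pixel values (255 for 8-bit images)
% K_1 = 0.01 and K_2 = 0.03 are the paper's defaults (assumed here)
```

Identical patches give a value of 1. When both variances are close to zero, the second factor is dominated by the constant \(C_2\), which is why low-texture patches carry little structural information for the comparison.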

Score thresholds

0-1: No bands are duplicates of one another
2-3: One band pair matches closely, a possible reuse or two genuinely similar bands
4-5: A near-perfect band match, or several duplicate pairs, consistent with a band copied to fabricate lanes

Limitations

Band detection is the first dependency: faint, smeared, or overlapping bands can be missed or merged. SSIM behaves poorly on near-uniform patches, so a clean low-texture band carries little structure and a genuine copy of it can be missed, while a textured copy is caught reliably. The size gate prevents matching dissimilar bands, but it also means a strongly rescaled copy is never compared. Resizing patches to a common shape before comparison introduces interpolation, which can alter the SSIM value. The four orientations cover a horizontal flip and right-angle rotations, not arbitrary small rotations. Generic copy-move is indicator I6 and panel reuse is indicator M2; W1 specialises in band-to-band duplication within a blot.

References

  1. Bik EM, Casadevall A, Fang FC. (2016). The Prevalence of Inappropriate Image Duplication in Biomedical Research Publications. mBio 7(3):e00809-16
  2. Wang Z, Bovik AC, Sheikh HR, Simoncelli EP. (2004). Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing 13(4):600-612
  3. Cromey DW. (2010). Avoiding twisted pixels: ethical guidelines for the appropriate use and manipulation of scientific digital images. Science and Engineering Ethics 16(4):639-667