ResAIKit
Research Integrity Toolkit
M3 · Image forensics · Microscopy · Layer 1 (Deterministic)

Inpainting Detection

Detects regions that have been filled, erased, or painted over. Inpainting synthesises a patch by smooth interpolation or copied texture, which suppresses the fine sensor noise a genuine capture carries everywhere, so a retouched block shows an abnormally low noise floor and often a flat local histogram.

Technical description

Decomposes the grayscale image with a single-level Haar wavelet and estimates the per-block noise floor on the diagonal detail band (cD) with the robust median absolute deviation, sigma = median(|cD - median(cD)|) / 0.6745, which tracks noise while ignoring the strong coefficients that edges produce. It reads three signals:

- wavelet outlier blocks, whose floor departs from the mean block floor by more than one standard deviation;
- critical blocks, whose floor nearly vanishes (sigma below 0.05 of the mean, gated on the mean being above 2.0);
- range-suspicious blocks, whose 5th-to-95th-percentile intensity range is below 0.15 of the median range (gated on the median being above 20).

The collapse of the local noise floor is the inpainting signature.
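The three signals can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the toolkit's implementation: the function names (`haar_diagonal`, `mad_sigma`, `block_signals`) are hypothetical, the Haar band is computed by hand with orthonormal scaling, the 16-pixel block size is taken from the Limitations section, and "mean" and "median" in the gates are read as the mean block floor and the median block range across the whole image.

```python
import numpy as np

BLOCK = 16  # block size in pixels (assumed from the Limitations section)


def haar_diagonal(gray):
    """Diagonal detail band (cD) of a single-level 2-D Haar transform."""
    g = np.asarray(gray, dtype=np.float64)
    h, w = g.shape
    g = g[: h - h % 2, : w - w % 2]  # drop an odd trailing row or column
    a, b = g[0::2, 0::2], g[0::2, 1::2]
    c, d = g[1::2, 0::2], g[1::2, 1::2]
    return (a - b - c + d) / 2.0  # orthonormal scaling (assumption)


def mad_sigma(coeffs):
    """Robust noise estimate: median(|cD - median(cD)|) / 0.6745."""
    med = np.median(coeffs)
    return np.median(np.abs(coeffs - med)) / 0.6745


def block_signals(gray, block=BLOCK):
    """Return per-block sigma plus the three boolean signal maps.

    `block` must be even; each image block maps to a (block/2)^2 tile of cD.
    """
    gray = np.asarray(gray, dtype=np.float64)
    cd = haar_diagonal(gray)
    half = block // 2
    rows, cols = gray.shape[0] // block, gray.shape[1] // block
    sigma = np.zeros((rows, cols))
    spread = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            tile_cd = cd[r * half:(r + 1) * half, c * half:(c + 1) * half]
            sigma[r, c] = mad_sigma(tile_cd)
            tile = gray[r * block:(r + 1) * block, c * block:(c + 1) * block]
            p5, p95 = np.percentile(tile, [5, 95])
            spread[r, c] = p95 - p5

    mean, std = sigma.mean(), sigma.std()
    # Outlier: floor departs from the mean by more than one standard deviation.
    outlier = np.abs(sigma - mean) > std
    # Critical: floor nearly vanishes, only if the image has a real noise floor.
    critical = (sigma < 0.05 * mean) & (mean > 2.0)
    # Range-suspicious: flat local histogram, only if the image has real contrast.
    med_range = np.median(spread)
    range_suspicious = (spread < 0.15 * med_range) & (med_range > 20)
    return sigma, outlier, critical, range_suspicious
```

On a synthetic noisy image with one block-aligned constant patch, the patch blocks have sigma of exactly zero and a zero intensity range, so they are flagged by all three signals, while the gates keep a globally smooth image from producing any critical or range-suspicious blocks.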

How it works

Layer 1 (deterministic): takes the diagonal Haar wavelet band, estimates each block's noise floor with the median absolute deviation, and flags blocks that are outliers, that have a collapsed floor, or that have an unnaturally narrow intensity range. The wavelet score is min(4, 8 * CV + 5 * outlier_fraction) and the range score is min(3, 20 * range_suspicious_fraction). A critical-block bonus of up to 2.0 is added to the larger of the two, and the total is capped at 5.0. Findings carry block bounding boxes, with critical fills listed first.
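The score arithmetic can be written out directly. The sketch below is illustrative only: `inpainting_score` is a hypothetical name, CV is assumed to be the coefficient of variation of the per-block noise floor, and because the text does not say how the critical-block bonus grows with the number of critical blocks, the bonus is taken as an input and simply clipped to the stated 0 to 2.0 range.

```python
def inpainting_score(cv, outlier_fraction, range_suspicious_fraction,
                     critical_bonus=0.0):
    """Combine the Layer 1 signals into a 0-5 score.

    cv                         -- coefficient of variation of the block noise
                                  floor (assumed meaning of "CV")
    outlier_fraction           -- fraction of blocks flagged as wavelet outliers
    range_suspicious_fraction  -- fraction of blocks with a flat local range
    critical_bonus             -- bonus for critical blocks; how it is derived
                                  is not specified in the text, so it is an input
    """
    wavelet_score = min(4.0, 8.0 * cv + 5.0 * outlier_fraction)
    range_score = min(3.0, 20.0 * range_suspicious_fraction)
    bonus = min(2.0, max(0.0, critical_bonus))  # clip to the stated 0-2.0 range
    # The bonus is added to the larger component and the total is capped at 5.
    return min(5.0, max(wavelet_score, range_score) + bonus)
```

For example, with CV = 0.1, 5% outlier blocks, and 2% range-suspicious blocks, the wavelet score is 8 * 0.1 + 5 * 0.05 = 1.05 and the range score is 0.4, so the result is 1.05. With CV = 0.3, 10% outliers, and the full 2.0 bonus, the wavelet score of 2.9 rises to 4.9.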

Why this matters

Erasing or painting in a feature is among the most damaging manipulations because, unlike duplication, it leaves no second copy to compare against, and journals treat adjustments that obscure or fabricate content as misconduct even when locally seamless. The forensic handle is that synthesis cannot reproduce the sensor noise floor: diffusion-based fill solves a smoothing equation and leaves almost no high-frequency energy, and copied texture lacks freshly sampled noise, so a filled region reads as a hole in the otherwise uniform noise floor.

Score thresholds

0-1: Noise floor and local contrast are uniform, consistent with a single unedited capture.
2-3: Some blocks deviate in noise level or local range, a possible local edit or natural texture variation.
4-5: One or more blocks have a collapsed noise floor or a flat histogram, consistent with a filled, erased, or inpainted region.

Limitations

A suppressed noise floor is expected in an inpainted region but is not unique to inpainting: smooth content such as a saturated highlight, a uniform background, or an out-of-focus region also carries little high-frequency energy. This is why the critical-block and range signals are gated on the image having a real noise floor and real contrast. Whole-image denoising flattens the floor everywhere, and JPEG compression can both hide a fill and mimic one along its block grid. The 16-pixel block bounds the resolution, so a fill smaller than a block is averaged away. M3 performs no error level analysis (indicator I1) and tests no tonal response continuity (indicator I9); it differs from the global noise-consistency indicator I3 by looking for a one-sided local collapse of the noise floor rather than the overall spread.

References

  1. Li H, Luo W, Huang J. (2017). Localization of Diffusion-Based Inpainting in Digital Images. IEEE Transactions on Information Forensics and Security 12(12):3050-3064
  2. Mahdian B, Saic S. (2009). Using noise inconsistencies for blind image forensics. Image and Vision Computing 27(10):1497-1503
  3. Donoho DL, Johnstone IM. (1994). Ideal spatial adaptation by wavelet shrinkage. Biometrika 81(3):425-455
  4. Rossner M, Yamada KM. (2004). What's in a picture? The temptation of image manipulation. The Journal of Cell Biology 166(1):11-15