Depth of Field
Maps local focus across a microscopy image and reads two physically grounded anomalies. A single optical capture has a finite depth of field, so its sharpness falls off smoothly away from the focal plane. A fully synthetic micrograph instead tends to be uniformly sharp everywhere, and a composite stitched from different focal planes shows two distinct focus populations separated by a sharp spatial focus boundary. The indicator scores the absence of depth of field and the bimodal-plus-seam composite signature. It works on the pixels alone, with no model.
Technical description
M1 is a deterministic, generator-agnostic screen built on the physics of optical depth of field. A real microscope images a thin focal plane, and the optical transfer function blurs structure that lies away from that plane, so a genuine single capture shows a focus map that is either roughly uniform (a thin, flat specimen wholly in focus) or falls off smoothly from an in-focus region. Two fabrication modes break this prior in opposite ways. A fully synthetic micrograph rendered without an optical system tends to carry the same high sharpness across the whole frame, producing a focus map with almost no variation. A composite assembled from captures at different focal planes, or from different sources, places regions of clearly different sharpness next to one another, producing two focus populations joined at a sharp spatial seam that smooth defocus cannot create.
The indicator estimates local focus with the variance of the Laplacian, the focus operator that comparative studies of microscopy autofocus rank among the most reliable, and reads three quantities from the resulting focus map: its coefficient of variation, its bimodality, and the sharpness of its strongest spatial boundary. The image must be at least 64 by 64 pixels, and the focus map must contain at least four cells, or the indicator returns a zero score and records that it was skipped.
How it works
The image is converted to grayscale and tiled into overlapping windows of 32 by 32 pixels at a stride of 16. Over each window the Laplacian is computed and its variance is taken as the local focus measure, focus = Var(∇²I). This is the variance-of-Laplacian operator: the Laplacian is a high-pass filter, so in-focus structure with sharp edges yields large positive and negative responses and a high variance, while defocused structure is smoothed and yields a low variance. The window responses are stored in a row-major grid, so they reshape to a focus map of n_rows by n_cols cells.
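The tiling and focus measure described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the indicator's actual code: the function names are hypothetical, and the 4-neighbour Laplacian kernel evaluated on each window's interior (no padding) is an assumption, since the text does not say which discrete Laplacian or border handling is used.

```python
import numpy as np

def laplacian(gray):
    """4-neighbour discrete Laplacian on interior pixels (assumed kernel, no padding)."""
    g = np.asarray(gray, dtype=np.float64)
    return (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
            - 4.0 * g[1:-1, 1:-1])

def focus_map(gray, window=32, stride=16):
    """Variance of the Laplacian per overlapping window, as an (n_rows, n_cols) grid."""
    g = np.asarray(gray, dtype=np.float64)
    h, w = g.shape
    if h < window or w < window:
        raise ValueError("image is smaller than one window")
    rows = range(0, h - window + 1, stride)
    cols = range(0, w - window + 1, stride)
    # Row-major traversal, so the flat list reshapes directly to the grid.
    values = [laplacian(g[r:r + window, c:c + window]).var()
              for r in rows for c in cols]
    return np.array(values).reshape(len(rows), len(cols))
```

With these defaults a 64 by 64 image yields a 3 by 3 focus map, which clears the four-cell minimum stated earlier.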
Three statistics summarize the map. First, the coefficient of variation CV = s / mean, with s the standard deviation of the per-window focus values, measures how much focus varies across the frame. A CV below 0.25 marks an image that is uniformly sharp, the synthetic signature, and adds 2.0 to the score.
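As a sketch of this first statistic, the snippet below computes the coefficient of variation and the points it contributes. The function names are hypothetical. Two details are assumptions, because the text does not specify them: the population standard deviation (ddof=0) is used, and a map whose mean focus is zero returns NaN so that a blank frame is not scored as uniformly sharp.

```python
import numpy as np

def focus_cv(fmap):
    """Coefficient of variation s / mean of the per-window focus values."""
    v = np.asarray(fmap, dtype=np.float64).ravel()
    mean = v.mean()
    if mean <= 0.0:
        return float("nan")  # assumption: undefined for a blank frame
    return float(v.std(ddof=0) / mean)  # assumption: population std

def uniform_sharpness_points(cv, threshold=0.25):
    """2.0 points when focus is nearly constant across the frame, else 0.0."""
    return 2.0 if cv < threshold else 0.0  # NaN compares False, so it adds 0.0
```

For example, focus values of 1 and 3 have mean 2 and standard deviation 1, so the CV is 0.5 and no points are added.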
Second, the bimodality coefficient of Pfister and colleagues tests whether the focus values form two populations:
BC = (g1^2 + 1) / (g2 + 3 (n - 1)^2 / ((n - 2)(n - 3)))
where g1 is the sample skewness, g2 the sample excess kurtosis, and n the number of focus cells. The value 5/9 is what a uniform distribution produces, so BC > 5/9 indicates a bimodal focus distribution. The bimodality coefficient is a standard closed-form screen and is more selective than reading kurtosis alone, because a merely flat or platykurtic distribution does not cross the threshold.
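The coefficient can be computed directly from the focus values, as in the sketch below. The function names are hypothetical. It assumes the bias-corrected sample skewness and excess kurtosis (the forms usually paired with this formula); the text says only "sample" moments, so the indicator may use a different estimator. A constant map has no defined moments and returns NaN here, which is also an assumption.

```python
import numpy as np

def bimodality_coefficient(values):
    """BC = (g1^2 + 1) / (g2 + 3 (n - 1)^2 / ((n - 2)(n - 3)))."""
    x = np.asarray(values, dtype=np.float64).ravel()
    n = x.size
    if n < 4:
        raise ValueError("need at least four focus cells")
    d = x - x.mean()
    m2 = np.mean(d ** 2)
    if m2 == 0.0:
        return float("nan")  # assumption: undefined for a constant map
    m3 = np.mean(d ** 3)
    m4 = np.mean(d ** 4)
    # Bias-corrected sample skewness and excess kurtosis (assumed estimators).
    g1 = np.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5
    g2 = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * (m4 / m2 ** 2 - 3.0) + 6.0)
    return float((g1 ** 2 + 1.0) / (g2 + 3.0 * (n - 1) ** 2 / ((n - 2) * (n - 3))))

def is_bimodal(values, threshold=5.0 / 9.0):
    """True when the coefficient exceeds the uniform-distribution value 5/9."""
    return bimodality_coefficient(values) > threshold
```

A sample split evenly between two tight clusters scores well above 5/9, a large normal sample scores near 1/3, and an evenly spaced (uniform) sample lands close to 5/9 itself.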
Third, the spatial boundary sharpness measures whether the two focus populations are joined by a real seam. The map is passed through log1p to compress its heavy tail, the absolute first differences between horizontally and vertically adjacent cells are taken, and the largest such step is compared to the focus range, defined as the 95th minus the 5th percentile of the log focus:
boundary_sharpness = max_step / focus_range
Smooth optical defocus spreads the range over many cells, so each step is roughly range / k for a gradient k cells wide, giving a small ratio; a composite seam concentrates the whole change into one step, giving a ratio near one. The ratio is trusted only when the focus range exceeds 0.5 in log units, because a near-flat map would otherwise divide a small numerator by a near-zero range and report a spurious boundary.
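The boundary measure and its gate can be sketched as follows. The function name is hypothetical, and returning 0.0 when the gate fails is an assumption: the text says only that the ratio is not trusted in that case.

```python
import numpy as np

def boundary_sharpness(fmap, min_range=0.5):
    """Return (max_step / focus_range, focus_range) on the log1p focus map."""
    logf = np.log1p(np.asarray(fmap, dtype=np.float64))
    # Absolute first differences between horizontal and vertical neighbours.
    dh = np.abs(np.diff(logf, axis=1))
    dv = np.abs(np.diff(logf, axis=0))
    max_step = max(dh.max(initial=0.0), dv.max(initial=0.0))
    p5, p95 = np.percentile(logf, [5, 95])
    focus_range = float(p95 - p5)
    if focus_range <= min_range:
        return 0.0, focus_range  # assumption: an untrusted ratio is reported as 0
    return float(max_step / focus_range), focus_range
```

A map whose log focus climbs linearly across nine columns spends one eighth of its range on each step and gives a ratio of 0.125, while a map split into two flat halves puts the whole range into one step and gives a ratio of 1.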
The composite signature is scored from the combination. A bimodal map joined by a sharp boundary (boundary_sharpness > 0.5) adds 2.0 and is the strongest evidence of a focal-plane composite. A bimodal map without a sharp boundary adds 1.0, since a thick specimen or a smoothly blended composite can produce two focus populations without a seam. A sharp boundary without bimodality adds 1.5. The contributions are summed and capped at 5.0. The metadata records the focus CV, the bimodality flag and coefficient, the boundary sharpness, the focus range, and the excess kurtosis.
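The scoring rules above combine as in this sketch. The function name and argument layout are hypothetical, and it assumes the boundary sharpness passed in has already been gated on the focus range.

```python
def depth_of_field_score(cv, bc, sharpness,
                         cv_threshold=0.25, bc_threshold=5.0 / 9.0,
                         sharpness_threshold=0.5, cap=5.0):
    """Sum the uniform-sharpness and composite contributions, capped at `cap`."""
    score = 0.0
    if cv < cv_threshold:
        score += 2.0          # uniformly sharp frame, no depth of field
    bimodal = bc > bc_threshold
    sharp = sharpness > sharpness_threshold
    if bimodal and sharp:
        score += 2.0          # two focus populations joined by a seam
    elif bimodal:
        score += 1.0          # two populations, no clear seam
    elif sharp:
        score += 1.5          # a seam without bimodality
    return min(score, cap)
```

For instance, a CV of 0.1 with a bimodality coefficient of 0.7 and a boundary sharpness of 0.9 scores 2.0 + 2.0 = 4.0.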
Score thresholds
| Score | Meaning |
|---|---|
| 0 to 1 | Focus varies smoothly across the frame, consistent with a single optical capture. |
| 2 to 3 | One strong anomaly: either a uniformly sharp frame with no depth of field, or two focus populations without a clear seam. |
| 4 to 5 | Two focus populations joined by a sharp spatial boundary, or a uniformly sharp frame combined with a focus discontinuity. Consistent with a synthetic image or a composite of different focal planes. |
Why this matters
Depth of field is a hard physical constraint on optical microscopy: the objective resolves a thin focal plane and blurs everything else, so genuine images carry a characteristic focus structure that fabrication tends to violate. Quantifying focus from a single image is a well-studied problem in microscopy autofocus, and the variance of the Laplacian that M1 uses is one of the operators that comparative studies single out as reliable for brightfield microscopy [1] and rank near the top across a wide operator survey [2]. Reading the focus map for fabrication adapts that machinery to integrity screening. A composite built from different focal planes is a recognized manipulation, and the cue M1 reads for it, an abrupt change in local blur where smooth defocus is expected, is the same partial-blur inconsistency that splicing-localization methods exploit to expose a pasted region whose blur does not match its surroundings [5]. The bimodality test that gates the composite cue is the closed-form bimodality coefficient, whose behavior and canonical 5/9 threshold are documented by Pfister and colleagues [4], while the rigorous reference test for the same question is Hartigan's dip statistic [3]. Together these let M1 separate a genuine depth-of-field falloff from the two focus signatures that a single optical capture cannot produce.
Limitations
Depth of field is informative only when the specimen and modality permit it. Z-stack projections and extended-depth-of-focus images intentionally combine focal planes, so they legitimately show uniform sharpness or multiple focus populations and can raise the score without fabrication. Thick specimens, phase contrast, and differential interference contrast carry inherent focus effects, and a thin, flat, fully in-focus section genuinely has little focus variation, which the uniform-sharpness cue can read as synthetic. The boundary measure is gated to ignore near-flat maps, but a strong real texture edge that coincides with a focus change can still inflate it. The 32-pixel window and 16-pixel stride bound the spatial resolution, so a small inserted region is averaged with its surroundings. The thresholds are directional rather than exact. Region reuse within or across panels, generic edge-sharpness uniformity, and frequency fingerprints live in sibling indicators, so M1 stays on the focus map.
Theoretical background
M1 rests on Fourier optics. A microscope objective acts as a low-pass filter whose point spread function widens with distance from the focal plane, so out-of-focus structure loses its high spatial frequencies and its local high-pass energy drops. The variance of the Laplacian is a direct estimate of that high-pass energy, which is why it tracks focus and why it is a standard autofocus criterion. A genuine capture therefore yields a focus map governed by the optics: uniform when the specimen lies in one plane, or smoothly graded when it does not, but never abruptly discontinuous, because the point spread function changes continuously across the field. Synthetic rendering has no such constraint and can hold sharpness constant everywhere, and a composite of different captures places focus levels side by side that no single point spread function would produce, leaving a seam. The bimodality coefficient formalizes the two-population question with the third and fourth sample moments, and the boundary ratio localizes the change in space, so M1 reads a violation of optical physics rather than a learned fingerprint of any one generator.
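The link between a widening point spread function and falling high-pass energy can be checked numerically. The sketch below is illustrative only: it uses a Gaussian low-pass filter applied in the frequency domain as a stand-in for defocus (a real defocus point spread function is not Gaussian), a random texture as the specimen, and hypothetical function names.

```python
import numpy as np

def gaussian_blur_fft(img, sigma):
    """Circular Gaussian blur via its transfer function exp(-2 pi^2 sigma^2 f^2)."""
    fy = np.fft.fftfreq(img.shape[0])[:, None]
    fx = np.fft.fftfreq(img.shape[1])[None, :]
    otf = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * otf))

def laplacian_variance(img):
    """Variance of the 4-neighbour Laplacian over interior pixels."""
    lap = (img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2] + img[1:-1, 2:]
           - 4.0 * img[1:-1, 1:-1])
    return float(lap.var())

rng = np.random.default_rng(0)
texture = rng.random((128, 128))  # synthetic stand-in for in-focus structure
sigmas = (0.0, 1.0, 2.0, 4.0)     # wider blur mimics greater defocus
energies = [laplacian_variance(gaussian_blur_fft(texture, s)) for s in sigmas]
```

Under these assumptions the entries of `energies` decrease as `sigma` grows, which is the behaviour the focus map relies on.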
References
1. Pech-Pacheco JL, Cristóbal G, Chamorro-Martínez J, Fernández-Valdivia J. Diatom autofocusing in brightfield microscopy: a comparative study. Proceedings 15th International Conference on Pattern Recognition (ICPR-2000). 2000;3:314-317. DOI: 10.1109/ICPR.2000.903548
2. Pertuz S, Puig D, Garcia MA. Analysis of focus measure operators for shape-from-focus. Pattern Recognition. 2013;46(5):1415-1432. DOI: 10.1016/j.patcog.2012.11.011
3. Hartigan JA, Hartigan PM. The Dip Test of Unimodality. The Annals of Statistics. 1985;13(1):70-84. DOI: 10.1214/aos/1176346577
4. Pfister R, Schwarz KA, Janczyk M, Dale R, Freeman JB. Good things peak in pairs: a note on the bimodality coefficient. Frontiers in Psychology. 2013;4:700. DOI: 10.3389/fpsyg.2013.00700
5. Bahrami K, Kot AC, Li L, Li H. Blurred Image Splicing Localization by Exposing Blur Type Inconsistency. IEEE Transactions on Information Forensics and Security. 2015;10(5):999-1009. DOI: 10.1109/TIFS.2015.2394231