Frequency Analysis
Examines the image's hidden frequency patterns to detect synthetic or AI-generated images, which often have distinctive repetitive patterns invisible to the naked eye.
Technical description
Computes the 2-D Fourier transform of the grayscale image, forms the power spectrum |F|^2, and reduces it to a 1-D azimuthally averaged profile P(f) over 32 radial frequency rings. Natural images approximately follow a smooth power law, P(f) proportional to 1/f^a, so log P(f) is fit against log f to give the decay slope (the fitted exponent, equal to -a) and the fit R-squared. Three signals are summed: a decay slope outside the natural band [-4.0, -1.0] adds up to 2.0; a fit R-squared below 0.80, which indicates a bumpy or peaky spectrum that departs from a power law, adds up to 1.5; and periodic peaks (cells above mean + 4 standard deviations in the mid-to-high band, with the central low-frequency blob excluded) add up to 1.5. The score is capped at 5.0.
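The profile and fit described above can be sketched in NumPy. This is a minimal sketch under assumptions that the description does not state: the function names `radial_profile` and `fit_power_law` are hypothetical, the 32 rings are taken to be equal-width in cycles per pixel up to the Nyquist radius 0.5, the DC cell is dropped, and each ring's centre stands in for f.

```python
import numpy as np

def radial_profile(gray, n_rings=32):
    """Azimuthally average the power spectrum |F|^2 into radial rings.

    Returns (ring-centre frequencies, mean power per ring) for non-empty rings.
    """
    gray = np.asarray(gray, dtype=float)
    power = np.abs(np.fft.fft2(gray)) ** 2
    # Frequency grid in the same (unshifted) layout as fft2, so no fftshift is needed.
    fy = np.fft.fftfreq(gray.shape[0])[:, None]
    fx = np.fft.fftfreq(gray.shape[1])[None, :]
    r = np.hypot(fy, fx)                      # radial frequency, cycles per pixel
    edges = np.linspace(0.0, 0.5, n_rings + 1)
    # Ring index per cell; clip so that r == 0.5 lands in the last ring.
    ring = np.clip(np.digitize(r, edges) - 1, 0, n_rings - 1)
    valid = (r > 0) & (r <= 0.5)              # drop DC and the corners beyond Nyquist
    sums = np.bincount(ring[valid], weights=power[valid], minlength=n_rings)
    counts = np.bincount(ring[valid], minlength=n_rings)
    ok = counts > 0                           # small images can leave a ring empty
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres[ok], sums[ok] / counts[ok]

def fit_power_law(freqs, profile, eps=1e-12):
    """Least-squares line through (log f, log P); returns (slope, R-squared)."""
    x = np.log(freqs)
    y = np.log(np.maximum(profile, eps))      # eps guards against log(0)
    slope, intercept = np.polyfit(x, y, 1)
    residual = y - (slope * x + intercept)
    ss_res = np.sum(residual ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return float(slope), float(r2)
```

Given a 2-D float array `gray`, `fit_power_law(*radial_profile(gray))` returns the slope and R-squared that the scoring step compares against the band [-4.0, -1.0] and the 0.80 threshold.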
How it works
Layer 1 (deterministic). Applies the 2-D FFT, forms the power spectrum, and azimuthally averages it into a radial profile. Fits a power law in log-log axes to obtain the decay slope and R-squared, and detects periodic peaks in the log-magnitude after excluding the central low-frequency blob. Scores the slope's departure from the natural band, the spectrum's departure from a power law, and the peak count, and reports the slope, R-squared, and peak count.
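The peak detection and the three-signal score can be sketched as follows. The description gives only the maximum contribution of each signal, so everything else here is an assumption made for illustration: the function names, the inner cutoff `inner` that excludes the low-frequency blob, the linear ramps, and the saturation points `slope_sat` and `peaks_sat`.

```python
import numpy as np

def count_periodic_peaks(gray, inner=0.25, k_sigma=4.0):
    """Count spectral cells above mean + k_sigma * std in the log-magnitude.

    `inner` (cycles per pixel) is an assumed cutoff for the central
    low-frequency blob; the actual radius is not given in the description.
    A single real sinusoid counts twice because its spectrum has a
    conjugate-symmetric pair of peaks.
    """
    gray = np.asarray(gray, dtype=float)
    log_mag = np.log1p(np.abs(np.fft.fft2(gray)))
    fy = np.fft.fftfreq(gray.shape[0])[:, None]
    fx = np.fft.fftfreq(gray.shape[1])[None, :]
    band = np.hypot(fy, fx) >= inner          # keep the mid-to-high band only
    vals = log_mag[band]
    threshold = vals.mean() + k_sigma * vals.std()
    return int(np.count_nonzero(vals > threshold))

def frequency_score(slope, r2, n_peaks, band=(-4.0, -1.0), r2_min=0.80,
                    slope_sat=1.0, peaks_sat=8):
    """Sum the three signals (at most 2.0 + 1.5 + 1.5) and cap at 5.0.

    Assumed ramps: each signal grows linearly from 0 to its maximum.
    """
    # Distance of the slope outside the natural band (0 when inside it).
    outside = max(band[0] - slope, slope - band[1], 0.0)
    s_slope = 2.0 * min(outside / slope_sat, 1.0)
    # Shortfall of R-squared below the conformity threshold.
    s_fit = 1.5 * min(max(r2_min - r2, 0.0) / r2_min, 1.0)
    # Peak count, saturating at peaks_sat peaks.
    s_peaks = 1.5 * min(n_peaks / peaks_sat, 1.0)
    return min(s_slope + s_fit + s_peaks, 5.0)
```

A spectrum with a slope inside the band, a good fit, and no peaks scores 0.0 under these ramps; all three signals at saturation reach the 5.0 cap.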
Why this matters
The convolutional up-sampling at the heart of GANs and diffusion models tends not to reproduce the spectral distribution of natural images, and it leaves two traces: a distorted high-frequency decay and a regular grid of peaks at the up-sampling frequency. These traces have been reported across architectures and resolutions, for GANs, diffusion models, and VQ-GANs alike. Natural photographs are approximately scale-invariant, so their azimuthally averaged power spectrum follows 1/f^a closely; measuring the departure from that law and counting the up-sampling peaks turns the frequency domain into a model-free screen for synthetic images.
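The reason a straight-line fit suffices can be written out. Taking logarithms of the power law turns it into a line whose slope is the negated exponent, and R-squared measures how much of the variation in log P that line explains. The constant C and the hat and bar notation are introduced here for illustration only.

```latex
\begin{align}
P(f) &\approx C\, f^{-a} \\
\log P(f) &\approx \log C - a \log f \\
R^2 &= 1 - \frac{\sum_i \bigl(\log P(f_i) - \widehat{\log P}(f_i)\bigr)^2}
                {\sum_i \bigl(\log P(f_i) - \overline{\log P}\bigr)^2}
\end{align}
```

With this convention the fitted slope equals -a, so a slope inside the natural band [-4.0, -1.0] corresponds to an exponent a between 1 and 4.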
Score thresholds
- 0-1: The power spectrum follows a smooth natural 1/f law with no periodic peaks
- 2-3: One signal: a decay outside the natural band, a departure from a power law, or periodic peaks
- 4-5: Several signals together, consistent with the up-sampling fingerprint of a generative model
Limitations
A screen, not proof. The high-frequency spectral discrepancy can be reduced by minor changes to the generator, so a spectrally consistent model evades the test. Ordinary downscaling, blurring, or recompression also reshapes the spectrum. The natural-spectrum band and the conformity threshold are heuristic guides rather than calibrated decision boundaries; authentic but unusual content (fine repetitive textures, halftone prints, heavy noise) can move the slope or add peaks. The test reads a global spectrum, so a small synthetic insert in a real image is diluted. Localized editing, compression history, noise consistency, and learned generator fingerprints are covered by sibling indicators.
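The sensitivity to blurring can be illustrated with a small experiment. This is a sketch under stated assumptions, not the indicator's own code: all helper names are hypothetical, a random-phase image with 1/f amplitude stands in for a natural photograph, and the blur is a Gaussian applied in the frequency domain.

```python
import numpy as np

def radial_freq(shape):
    """Radial frequency (cycles per pixel) of every fft2 cell."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    return np.hypot(fy, fx)

def loglog_slope(gray, n_rings=32):
    """Slope of log P(f) against log f over equal-width radial rings."""
    power = np.abs(np.fft.fft2(gray)) ** 2
    r = radial_freq(gray.shape)
    edges = np.linspace(0.0, 0.5, n_rings + 1)
    ring = np.clip(np.digitize(r, edges) - 1, 0, n_rings - 1)
    valid = (r > 0) & (r <= 0.5)              # drop DC and beyond-Nyquist corners
    sums = np.bincount(ring[valid], weights=power[valid], minlength=n_rings)
    counts = np.bincount(ring[valid], minlength=n_rings)
    ok = counts > 0
    centres = 0.5 * (edges[:-1] + edges[1:])
    slope, _ = np.polyfit(np.log(centres[ok]), np.log(sums[ok] / counts[ok]), 1)
    return float(slope)

def gaussian_blur(gray, sigma):
    """Gaussian blur via the transfer function exp(-2 * (pi * sigma * f)^2)."""
    transfer = np.exp(-2.0 * (np.pi * sigma * radial_freq(gray.shape)) ** 2)
    return np.fft.ifft2(np.fft.fft2(gray) * transfer).real

def synthetic_natural(n=128, seed=0):
    """Random-phase image with 1/f amplitude, i.e. roughly 1/f^2 power."""
    rng = np.random.default_rng(seed)
    r = radial_freq((n, n))
    amp = np.zeros_like(r)
    amp[r > 0] = 1.0 / r[r > 0]
    phase = np.exp(2j * np.pi * rng.random((n, n)))
    return np.fft.ifft2(amp * phase).real

image = synthetic_natural()
slope_before = loglog_slope(image)
slope_after = loglog_slope(gaussian_blur(image, sigma=1.5))
print(f"slope before blur: {slope_before:.2f}, after blur: {slope_after:.2f}")
```

The blur removes high-frequency power, so the fitted slope becomes markedly steeper even though no generator was involved. This is the kind of shift that can push an authentic image toward the edge of the natural band.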
References
- Durall R, Keuper M, Keuper J. (2020). Watch your Up-Convolution: CNN Based Generative Deep Neural Networks are Failing to Reproduce Spectral Distributions. IEEE/CVF CVPR 2020 (arXiv:2003.01826)
- Frank J, Eisenhofer T, Schönherr L, Fischer A, Kolossa D, Holz T. (2020). Leveraging Frequency Analysis for Deep Fake Image Recognition. ICML 2020 (arXiv:2003.08685)
- Corvi R, Cozzolino D, Poggi G, Nagano K, Verdoliva L. (2023). Intriguing properties of synthetic images: from generative adversarial networks to diffusion models. IEEE/CVF CVPR Workshops 2023 (arXiv:2304.06408)
- Chandrasegaran K, Tran NT, Cheung NM. (2021). A Closer Look at Fourier Spectrum Discrepancies for CNN-generated Images Detection. IEEE/CVF CVPR 2021 (arXiv:2103.17195)