Perfection Paradox
Flags text that is suspiciously error-free. Human writing carries minor imperfections (typos, formatting inconsistencies, acronym capitalization variants, natural contraction patterns). AI-generated text is machine-perfect, and that perfection is itself the signal (Penn, 2026; MDPI, 2026).
Technical description
E4 detects the "perfection paradox" through five sub-checks. The core insight is an asymmetry: AI text is generated directly in clean form with no mechanical errors, while human text accumulates imperfections through typing, editing, and multi-tool processing. The indicator runs at Layer 2 because it depends on a spell-checker and a part-of-speech tagger.
How it works
1. Spell-check. Tokens are filtered by part-of-speech to exclude proper nouns, numbers, punctuation, and symbols. The remaining alphabetic tokens are checked against spelling dictionaries for English and Romanian. Zero spelling errors contributes +1.5 on texts over 2000 words and +0.5 on texts over 500 words. The 2026 Multidisciplinary Digital Publishing Institute (MDPI) study of university professors ranked "absence of language errors" as the #1 feature used to identify AI-generated text.
2. Decimal formatting consistency. Extracts all numbers with decimal separators from the text. If more than 20 decimal numbers exist and 100% use the same separator (all dots or all commas), contributes +0.5. Human text, especially when assembled from multiple sources, often mixes decimal conventions.
3. Abbreviation consistency. On texts over 2000 words, checks for co-occurrence of abbreviated and spelled-out forms of common terms (Fig./Figure, Tab./Table). Perfect consistency (only abbreviated or only spelled-out, never both) contributes +0.5.
4. Acronym capitalization consistency. Extracts all-caps words of 3+ characters as potential acronyms. If 5+ distinct acronyms exist and none appears in lowercase form elsewhere, contributes +0.5. Human text typically has occasional capitalization slip-ups; AI capitalization is machine-consistent.
5. Contraction consistency. On English texts over 1000 words, checks the distribution of 30+ common contractions. Either of two patterns contributes +0.5: zero contractions on a text over 2000 words (the AI default formal register), or excessive contraction variety (15+ distinct forms on a text over 1000 words, an AI over-correction when prompted to "be casual").
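The five sub-checks above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the indicator's actual implementation: all function names are hypothetical, the contraction list is an illustrative stand-in for the 30+ forms mentioned above, the spelling-error count is passed in rather than computed, and the abbreviation check assumes that at least one form must occur before consistency counts. The sketch returns the raw sum of contributions only.

```python
import re

def spelling_contribution(n_words: int, n_errors: int) -> float:
    """Sub-check 1: zero errors scores more on longer texts."""
    if n_errors > 0:
        return 0.0
    if n_words > 2000:
        return 1.5
    return 0.5 if n_words > 500 else 0.0

# Simplification: also matches thousands separators such as "1,000".
DECIMAL_RE = re.compile(r"\d+([.,])\d+")

def decimal_contribution(text: str) -> float:
    """Sub-check 2: more than 20 decimals, all using one separator."""
    separators = [m.group(1) for m in DECIMAL_RE.finditer(text)]
    return 0.5 if len(separators) > 20 and len(set(separators)) == 1 else 0.0

ABBREVIATION_PAIRS = [(r"\bFig\.", r"\bFigures?\b"), (r"\bTab\.", r"\bTables?\b")]

def abbreviation_contribution(text: str) -> float:
    """Sub-check 3: never mixes abbreviated and spelled-out forms."""
    if len(text.split()) <= 2000:
        return 0.0
    seen_any = False
    for short_pat, long_pat in ABBREVIATION_PAIRS:
        has_short = re.search(short_pat, text) is not None
        has_long = re.search(long_pat, text) is not None
        if has_short and has_long:
            return 0.0  # mixed forms: a human-like inconsistency
        seen_any = seen_any or has_short or has_long
    # Assumption: a text using neither form gives no signal.
    return 0.5 if seen_any else 0.0

def acronym_contribution(text: str) -> float:
    """Sub-check 4: 5+ distinct acronyms, none also written in lowercase."""
    acronyms = set(re.findall(r"\b[A-Z]{3,}\b", text))
    if len(acronyms) < 5:
        return 0.0
    lowercase_words = set(re.findall(r"\b[a-z]+\b", text))
    return 0.0 if any(a.lower() in lowercase_words for a in acronyms) else 0.5

# Illustrative list only; the real list is not given in this document.
CONTRACTIONS = {
    "don't", "doesn't", "didn't", "can't", "couldn't", "won't", "wouldn't",
    "shouldn't", "isn't", "aren't", "wasn't", "weren't", "haven't", "hasn't",
    "hadn't", "it's", "that's", "there's", "what's", "let's", "i'm", "i've",
    "i'll", "i'd", "you're", "you've", "you'll", "we're", "we've", "they're",
    "they've",
}

def contraction_contribution(text: str) -> float:
    """Sub-check 5 (English only): no contractions at all, or too many forms."""
    n_words = len(text.split())
    if n_words <= 1000:
        return 0.0
    lowered = text.lower().replace("\u2019", "'")  # normalise curly apostrophes
    found = set(re.findall(r"[a-z]+'[a-z]+", lowered)) & CONTRACTIONS
    if not found and n_words > 2000:
        return 0.5
    return 0.5 if len(found) >= 15 else 0.0

def perfection_raw_score(text: str, n_spelling_errors: int) -> float:
    """Raw sum of the five contributions for an English text."""
    return (
        spelling_contribution(len(text.split()), n_spelling_errors)
        + decimal_contribution(text)
        + abbreviation_contribution(text)
        + acronym_contribution(text)
        + contraction_contribution(text)
    )
```

Because the two contraction patterns cannot both hold for the same text, this sketch adds at most +0.5 for sub-check 5.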
Score thresholds
| Score | Meaning |
|---|---|
| 0 to 1 | Normal imperfection profile: some spelling errors, formatting variation, abbreviation mixing, natural contraction patterns. Consistent with human writing. |
| 2 to 3 | Moderate perfection signals: zero spelling errors on medium-length text, consistent formatting, or uniform contractions. |
| 4 to 5 | Strong perfection paradox: zero spelling errors on long text plus multiple consistency signals (formatting, abbreviations, acronyms, contractions). The text is "too perfect", with "not a single comma out of place" (MDPI professor study, 2026). |
Limitations
Spell-check depends on spelling-dictionary availability. Without the required dictionaries, sub-check 1 silently degrades to zero contribution. Because proper nouns are excluded from the check, a genuinely misspelled common word goes undetected if the part-of-speech tagger misclassifies it as a proper noun.
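Both failure modes can be illustrated with a small Python sketch. It assumes tokens arrive already tagged with Universal Dependencies labels (`PROPN`, `NUM`, `PUNCT`, `SYM`) and that the dictionary is a plain set of lowercase words; the function name and both of these choices are hypothetical, since the actual tagger and spell-checker are not named here.

```python
from typing import Iterable, Optional

# Assumed tag set: Universal Dependencies labels for the excluded categories.
EXCLUDED_POS = {"PROPN", "NUM", "PUNCT", "SYM"}

def count_spelling_errors(
    tagged_tokens: Iterable[tuple[str, str]],
    dictionary: Optional[set[str]],
) -> Optional[int]:
    """Count unknown alphabetic tokens, or return None if no dictionary is loaded."""
    if dictionary is None:
        return None  # the caller treats this as "no contribution"
    errors = 0
    for token, pos in tagged_tokens:
        if pos in EXCLUDED_POS or not token.isalpha():
            continue  # skipped tokens are never spell-checked
        if token.lower() not in dictionary:
            errors += 1
    return errors

dictionary = {"the", "report", "was", "sent", "to"}
tagged = [("The", "DET"), ("reprot", "NOUN"), ("was", "AUX"), ("sent", "VERB"),
          ("to", "ADP"), ("Bucharest", "PROPN"), (".", "PUNCT")]
# Same sentence, but the tagger mislabels the typo as a proper noun.
mistagged = [(tok, "PROPN" if tok == "reprot" else pos) for tok, pos in tagged]

print(count_spelling_errors(tagged, dictionary))     # 1
print(count_spelling_errors(mistagged, dictionary))  # 0
print(count_spelling_errors(tagged, None))           # None
```

In the second call the typo "reprot" is filtered out before the dictionary lookup, so the text appears error-free.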
Acronym capitalization consistency (sub-check 4) uses a simple all-caps pattern-matching heuristic. It may pick up uppercase words that are not acronyms (e.g., section headings set in uppercase), and it misses acronyms shorter than 3 characters. The 5-acronym minimum gate reduces false positives from short texts.
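A short Python sketch makes both limitations concrete. The regular expression below is an assumed rendering of "all-caps words of 3+ characters", and the function name is hypothetical.

```python
import re

# Assumed pattern: runs of 3 or more uppercase ASCII letters.
ACRONYM_RE = re.compile(r"\b[A-Z]{3,}\b")

def candidate_acronyms(text: str) -> list[str]:
    """Return every all-caps word of 3+ letters, in order of appearance."""
    return ACRONYM_RE.findall(text)

sample = "RESULTS AND DISCUSSION\nThe EU and NATO funded the AI study."
print(candidate_acronyms(sample))  # ['RESULTS', 'AND', 'DISCUSSION', 'NATO']
```

The uppercase heading contributes three non-acronyms, while "EU" and "AI" are missed because they have only two letters. In this sample the heading word "AND" also appears as lowercase "and" in the body, so a heading like this could make a text look inconsistently capitalized.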
Contraction consistency (sub-check 5) is English-only. Romanian uses different contraction mechanisms (hyphenated verb+pronoun forms) that are not currently modelled. The sub-check is skipped for Romanian.
The perfection paradox is inherently statistical: a well-edited human document can legitimately score highly on multiple sub-checks. The indicator is designed as a heuristic signal to be combined with other indicators, not as a standalone classifier.
References
- Penn CS. The hidden flaw in AI-generated content: why perfectly flawless results should make you suspicious. 2026. https://www.christopherspenn.com/2026/01/the-hidden-flaw-in-ai-generated-content-why-perfectly-flawless-results-should-make-you-suspicious/
- MDPI. Key features to distinguish between human- and AI-generated texts: perspectives from university professors. MDPI Publications. 2026;2(1):2. https://www.mdpi.com/3042-8130/2/1/2
- Vegavid. Why does my writing get flagged as AI? Causes and fixes. 2026. https://vegavid.com/blog/why-writing-flagged-as-ai