Voice Variation
Detects when a document maintains an unnaturally uniform authorial voice throughout, lacking the subtle shifts in tone that occur when a human writes different sections at different times.
Technical description
Uses NLP-based part-of-speech tagging to build a stylistic profile for each section (introduction, methods, results, discussion). Each profile measures sentence complexity (subordinate clause frequency), passive voice ratio, first-person pronoun usage, hedging frequency, and vocabulary sophistication. The coefficient of variation of each measure is then computed across sections; low variation indicates a synthetic, uniform voice.
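The cross-section comparison can be sketched as follows. This is a minimal illustration, not the detector's actual code: the function names, the feature names, and the per-section numbers are all hypothetical, and the population standard deviation is an assumed choice.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Standard deviation divided by the mean of one feature across sections."""
    m = mean(values)
    if m == 0:
        # CV is undefined for a zero mean; this sketch treats it as no variation.
        return 0.0
    return pstdev(values) / abs(m)

def feature_cvs(section_profiles):
    """Map each feature name to its coefficient of variation across sections."""
    names = next(iter(section_profiles.values())).keys()
    return {
        name: coefficient_of_variation([p[name] for p in section_profiles.values()])
        for name in names
    }

# Hypothetical per-section measurements, for illustration only.
profiles = {
    "introduction": {"passive_ratio": 0.20, "hedge_density": 0.02},
    "methods":      {"passive_ratio": 0.50, "hedge_density": 0.01},
    "results":      {"passive_ratio": 0.30, "hedge_density": 0.01},
    "discussion":   {"passive_ratio": 0.20, "hedge_density": 0.04},
}
cvs = feature_cvs(profiles)
```

Dividing by the mean puts features with different scales (ratios, densities, counts) on a comparable footing, which a raw variance would not.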
How it works
Layer 2 (NLP): Applies spaCy POS tagging to each document section and extracts per-section features: passive voice ratio, hedge word density, subordinate clause depth, and vocabulary richness (type-token ratio). Computes the cross-section variance of each feature. A document is flagged as likely AI-generated when several features show low variance at the same time.
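A rough sketch of the extraction and flagging steps is shown below. It assumes the text has already been tagged, with each token given as a (text, pos, dep) tuple such as could be read from spaCy's token.text, token.pos_, and token.dep_. The hedge word list, the dependency label sets, the function names, and the thresholds are illustrative assumptions, and a count of subordinate clauses per sentence stands in for clause depth.

```python
from statistics import pvariance

# Illustrative assumptions, not the detector's actual lists.
HEDGES = {"may", "might", "could", "possibly", "perhaps", "suggest", "suggests", "likely"}
PASSIVE_DEPS = {"nsubjpass", "auxpass"}
SUBORDINATE_DEPS = {"advcl", "ccomp", "xcomp", "acl", "relcl"}

def section_features(sentences):
    """sentences: list of sentences, each a list of (text, pos, dep) tuples."""
    words = [t[0].lower() for s in sentences for t in s if t[1] != "PUNCT"]
    if not sentences or not words:
        raise ValueError("section has no tokens")
    n_sent, n_words = len(sentences), len(words)
    # A sentence counts as passive if any token carries a passive label.
    passive = sum(1 for s in sentences if any(t[2] in PASSIVE_DEPS for t in s))
    subordinate = sum(1 for s in sentences for t in s if t[2] in SUBORDINATE_DEPS)
    hedges = sum(1 for w in words if w in HEDGES)
    return {
        "passive_ratio": passive / n_sent,
        "hedge_density": hedges / n_words,
        "subordinate_per_sentence": subordinate / n_sent,
        "type_token_ratio": len(set(words)) / n_words,
    }

def uniform_features(per_section, thresholds):
    """Names of features whose cross-section variance is below its threshold."""
    flagged = []
    for name, limit in thresholds.items():
        values = [features[name] for features in per_section]
        if pvariance(values) < limit:
            flagged.append(name)
    return flagged
```

A caller would run section_features once per section, pass the results to uniform_features, and flag the document when the returned list contains several features. How many features count as "several" is left open here because the source does not specify it.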
Why this matters
Human authors naturally shift their writing style across the sections of a paper: methods tend to be more passive and procedural, discussions more speculative and hedged. AI-generated text tends to keep its stylistic features consistent throughout, as the model optimizes for a single coherent voice. This uniformity is detectable through statistical analysis of linguistic features across sections.
Score thresholds
- 0-1: Natural voice variation across document sections
- 2-3: Moderate uniformity with some stylistic shifts
- 4-5: Unnaturally uniform voice throughout all sections
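For readers wiring these bands into a report, a small lookup might look like the sketch below. The function name is hypothetical, and it assumes scores are integers from 0 to 5, which the bands imply but do not state.

```python
def interpret_score(score):
    """Return the band description for a voice-variation score (assumed integer, 0 to 5)."""
    if not 0 <= score <= 5:
        raise ValueError("score must be between 0 and 5")
    if score <= 1:
        return "Natural voice variation across document sections"
    if score <= 3:
        return "Moderate uniformity with some stylistic shifts"
    return "Unnaturally uniform voice throughout all sections"
```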
Limitations
Single-section documents (abstracts, short reports) cannot be meaningfully analyzed, because there are no sections to compare. Multi-author papers may show variation that comes from the authors' differing writing styles rather than from section-to-section shifts. Documents written in a single focused session may naturally have a more uniform voice.
References
- Markowitz DM, Hancock JT, Bailenson JN. (2024). Linguistic markers of inherently false AI communication and intentionally false human communication: evidence from hotel reviews. Journal of Language and Social Psychology
- Yin Z, Wang S. (2025). Span-level detection of AI-generated scientific text via contrastive learning and structural calibration. arXiv:2510.00890