ResAIKit
Research Integrity Toolkit
E6 · Text analysis · Forensic · Layer 2 (Contextual)

Rhythm Variation

Detects unnaturally uniform sentence rhythm across five dimensions: length coefficient of variation, kurtosis, burstiness, repetitive openings, and section-level rhythm uniformity. AI-generated text tends toward suspiciously consistent sentence lengths while human writing shows characteristic variability.

Technical description

E6 analyses the sentence-length distribution using four statistical measures plus one structural check. The core insight is that human writing varies sentence length naturally in response to content demands (short sentences for emphasis, long sentences for explanation), while large language model (LLM) output maintains a more uniform rhythm across the document. The indicator runs at Layer 1 using only sentence splitting and word counting.

How it works

1. Sentence-length coefficient of variation (CV). The CV (std / mean) of sentence lengths, measured in words, is computed across all sentences. A CV below 0.25 indicates suspiciously uniform sentence lengths (+1.5). Sentence-length statistics rank among the most predictive features for AI text detection: Joseph et al. (2025) found that average sentence length outperformed perplexity as the top-ranked feature (importance 0.16 vs 0.13) in a 45-feature Random Forest that achieved F1 = 0.95 across GPT-4 and Claude 3 outputs.

2. Kurtosis. The excess kurtosis of the sentence-length distribution is computed. A negative excess kurtosis (platykurtic distribution) means sentence lengths cluster too tightly around the mean, with few very short or very long sentences. Human text is typically leptokurtic (positive excess kurtosis) because of occasional very short and very long sentences. A platykurtic distribution contributes +1.0.

3. Adjacent-sentence burstiness. The mean absolute difference in word count between consecutive sentences is computed. A mean difference below 5 words indicates that adjacent sentences are too similar in length (+1.0). Gerus-lab (2025) measured AI-generated sentence-length standard deviation at ~2.1 words versus ~8.7 for human writing: the "cardiac flatline" pattern, in which every sentence lands between 14 and 18 words with no rhythmic variation.

4. Repetitive sentence openings. The first word of each sentence is extracted and runs of identical openings are counted. A run of more than four consecutive sentences starting with the same word (commonly "The") fires the sub-check (+1.0). This catches the LLM pattern of beginning every sentence with the same structural formula.

5. Section-level rhythm uniformity. The text is partitioned into IMRaD (Introduction, Methods, Results and Discussion) sections. The sentence-length CV is computed independently for each section with at least four sentences. If three or more sections have nearly identical CVs (std of CVs below 0.05), the sub-check fires (+0.5). Human authors vary rhythm between Methods (typically shorter sentences), Results (mixed), and Discussion (longer analytical sentences). Flat section-level CVs are a template-generation signal.
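Sub-checks 1 to 4 can be sketched in a few lines of Python. This is a minimal illustration, not the toolkit's implementation: the function names (rhythm_subscores, excess_kurtosis, longest_run) are hypothetical, population standard deviation and population moments are assumed because the description does not say which estimator is used, and sentence lengths and opening words are taken as already extracted.

```python
from statistics import mean, pstdev


def excess_kurtosis(xs):
    """Population excess kurtosis: m4 / m2**2 - 3 (assumed estimator)."""
    mu = mean(xs)
    m2 = mean([(x - mu) ** 2 for x in xs])
    m4 = mean([(x - mu) ** 4 for x in xs])
    return m4 / m2 ** 2 - 3.0


def longest_run(items):
    """Length of the longest run of identical consecutive items."""
    best = run = 0
    prev = object()  # sentinel that equals nothing in the input
    for item in items:
        run = run + 1 if item == prev else 1
        best = max(best, run)
        prev = item
    return best


def rhythm_subscores(lengths, openings):
    """Return the contribution of sub-checks 1 to 4 as a dict.

    lengths  -- word count of each sentence, in document order
    openings -- first word of each sentence, in document order
    """
    scores = {"cv": 0.0, "kurtosis": 0.0, "burstiness": 0.0, "openings": 0.0}
    if len(lengths) < 4:  # minimum of four valid sentences
        return scores

    mu, sd = mean(lengths), pstdev(lengths)
    if mu > 0 and sd / mu < 0.25:  # 1. uniform sentence lengths
        scores["cv"] = 1.5
    if sd > 0 and excess_kurtosis(lengths) < 0:  # 2. platykurtic
        scores["kurtosis"] = 1.0
    diffs = [abs(a - b) for a, b in zip(lengths, lengths[1:])]
    if mean(diffs) < 5:  # 3. low adjacent-sentence burstiness
        scores["burstiness"] = 1.0
    if longest_run([w.lower() for w in openings]) > 4:  # 4. repeated openings
        scores["openings"] = 1.0
    return scores


# Illustrative inputs (made up for this sketch, not measured data).
uniform = rhythm_subscores([15, 16, 15, 17, 16, 15, 16, 17], ["The"] * 8)
varied = rhythm_subscores(
    [4, 22, 9, 31, 6, 18, 12, 7, 15, 58],
    ["We", "The", "In", "This", "Our", "A", "These", "It", "However", "Most"],
)
```

With these inputs, all four sub-checks fire for the uniform example (total 4.5) and none fire for the varied one.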

Score thresholds

Score Meaning
0 to 1 Natural sentence rhythm: varied lengths, leptokurtic distribution, adjacent burstiness, diverse openings. Typical of human academic prose.
2 to 3 Moderate rhythm uniformity: one or two dimensions show AI-typical flatness. Common in heavily edited text and some AI-assisted drafts.
4 to 5 Severe rhythm uniformity: most dimensions collapse simultaneously. Sentence lengths are suspiciously uniform with low burstiness and repetitive structure. Highly consistent with LLM output.
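The five sub-check weights sum to 1.5 + 1.0 + 1.0 + 1.0 + 0.5 = 5.0, which matches the top of the scale. The bands above do not say where fractional totals such as 1.5 or 3.5 fall, so the sketch below assumes cut-offs at 2 and 4; both the cut-offs and the function name rhythm_band are assumptions made for illustration.

```python
def rhythm_band(score):
    """Map a total E6 score (0 to 5) to a band label.

    Assumption: totals below 2 count as natural and totals below 4 as
    moderate. The published bands leave 1.5 and 3.5 unassigned.
    """
    if not 0 <= score <= 5:
        raise ValueError("score must be between 0 and 5")
    if score < 2:
        return "natural"
    if score < 4:
        return "moderate"
    return "severe"
```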

Limitations

Sentence segmentation uses a simple splitter that fires on ., !, ? followed by whitespace. Abbreviations containing periods (e.g., "Dr.", "e.g.", "et al.") may produce spurious sentence boundaries, inflating the sentence count and distorting length statistics. The indicator requires a minimum of four valid sentences to produce results.
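The abbreviation problem is easy to reproduce. The sketch below assumes a regular-expression splitter equivalent to the rule described above (split after ., ! or ? when followed by whitespace); the function name split_sentences and the sample text are hypothetical.

```python
import re


def split_sentences(text):
    """Split after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


sample = "Dr. Smith et al. reported a large effect. We could not replicate it."
pieces = split_sentences(sample)
# ['Dr.', 'Smith et al.', 'reported a large effect.', 'We could not replicate it.']
lengths = [len(p.split()) for p in pieces]
# [1, 3, 4, 5]
```

Two real sentences become four fragments, which both inflates the sentence count and adds artificially short "sentences" to the length distribution.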

Kurtosis is unstable on small samples. Documents with fewer than 20 sentences may produce spurious kurtosis values. The platykurtosis check itself only requires n >= 4 and std > 0, so on short documents its result should be read with caution.
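A small worked example shows how fragile the sign of the kurtosis is on short inputs. The sketch assumes the population excess-kurtosis formula (m4 / m2^2 - 3); the helper name and the sentence lengths are made up for illustration.

```python
from statistics import mean


def excess_kurtosis(xs):
    """Population excess kurtosis: m4 / m2**2 - 3 (assumed estimator)."""
    mu = mean(xs)
    m2 = mean([(x - mu) ** 2 for x in xs])
    m4 = mean([(x - mu) ** 4 for x in xs])
    return m4 / m2 ** 2 - 3.0


short_doc = [10, 12, 14, 16]                   # four sentence lengths
k_before = excess_kurtosis(short_doc)          # negative: platykurtic
k_after = excess_kurtosis(short_doc + [50])    # positive after one long sentence
```

Adding a single long sentence flips the distribution from platykurtic to leptokurtic, so the +1.0 contribution can appear or vanish on the strength of one sentence.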

Section-level rhythm uniformity requires the IMRaD classifier to recognise at least three sections with four or more sentences each. Documents without standard IMRaD headings skip this sub-check.
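The eligibility rule and the sub-check itself can be sketched as follows. This is an illustration under stated assumptions: sections are passed in as a dict of already-computed sentence lengths (the IMRaD classifier is out of scope), the standard deviation of the CVs is taken over all eligible sections, population standard deviation is used throughout, and the name section_uniformity_score is hypothetical.

```python
from statistics import mean, pstdev


def section_uniformity_score(section_lengths):
    """Return 0.5 if eligible sections have near-identical CVs, else 0.0.

    section_lengths -- dict mapping section name to a list of sentence
                       lengths in words
    """
    cvs = []
    for lengths in section_lengths.values():
        # Only sections with at least four sentences are eligible.
        if len(lengths) >= 4 and mean(lengths) > 0:
            cvs.append(pstdev(lengths) / mean(lengths))
    # At least three eligible sections are needed; otherwise skip the check.
    if len(cvs) >= 3 and pstdev(cvs) < 0.05:
        return 0.5
    return 0.0


# Illustrative inputs (made up for this sketch).
flat = {
    "introduction": [15, 16, 15, 17],
    "methods": [14, 15, 16, 15],
    "results": [16, 17, 16, 15],
    "discussion": [15, 15, 16, 17],
}
mixed = {
    "introduction": [15, 16, 15, 17],
    "methods": [5, 20, 8, 30],
    "results": [10, 12, 40, 9],
}
two_sections = {"methods": [14, 15, 16, 15], "results": [16, 17, 16, 15]}
```

Here the flat document scores 0.5, while the mixed document and the document with only two eligible sections both score 0.0.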

References

  1. Joseph E, Bennet K, Kingsley F. Feature-based detection of AI-generated text: an analysis of stylometric and perplexity markers in contemporary large language models. SSRN. 2025. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5833302
  2. Gerus-lab. Your AI-generated content is fooling nobody, and we have the data to prove it. 2025.
  3. Reinhart A, et al. Linguistic markers of AI-generated text and their detectability. PNAS. 2025.
  4. Pudasaini S, et al. Systematic benchmark of linguistic features for AI text detection across domains and generators. arXiv preprint. 2026.
  5. Mendenhall TC. The characteristic curves of composition. Science. 1887;9(214S):237-246.
  6. Mosteller F, Wallace DL. Inference and Disputed Authorship: The Federalist. Reading, MA: Addison-Wesley; 1964.