L4-Voice | Text analysis | LLM | Layer 4

Voice Analysis

Uses a large language model to analyze whether the authorial voice feels consistent and genuinely human, detecting the kind of subtly artificial quality that rule-based methods may miss.

Technical description

Sends text passages to an LLM provider (Claude, OpenAI, or Ollama) with a prompt asking it to evaluate four things: authorial voice authenticity, consistency of writing maturity, presence of genuine stylistic idiosyncrasies, and whether the text reads as the work of a single human author or as machine-generated. The LLM returns a structured assessment with specific examples and confidence ratings.
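The request and response described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual prompt or schema: the prompt wording, the JSON shape, and the names CRITERIA, build_prompt, and analyze are all assumptions, and the provider call is represented by a plain callable so that no specific client API is implied.

```python
import json
from typing import Callable

# The four evaluation points named in the description above.
CRITERIA = [
    "authorial voice authenticity",
    "consistency of writing maturity",
    "presence of genuine stylistic idiosyncrasies",
    "single human author versus machine generation",
]


def build_prompt(passage: str) -> str:
    """Assemble an evaluation prompt (hypothetical wording)."""
    bullet_list = "\n".join(f"- {c}" for c in CRITERIA)
    return (
        "Evaluate the passage below on each criterion:\n"
        f"{bullet_list}\n"
        "Reply with JSON only, shaped as "
        '{"assessments": [{"criterion": str, "example": str, '
        '"confidence": float}]}.\n\n'
        f"PASSAGE:\n{passage}"
    )


def analyze(passage: str, complete: Callable[[str], str]) -> list[dict]:
    """Send the prompt through `complete` and parse the structured reply.

    `complete` stands in for whichever provider is configured; it takes
    a prompt string and returns the model's raw text response.
    """
    raw = complete(build_prompt(passage))
    return json.loads(raw)["assessments"]
```

Passing the provider in as a callable keeps the sketch independent of any one SDK and makes it easy to test with a canned response.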

How it works

Layer 4 (LLM-powered): Sends the text to a language model with a rubric of three dimensions. The dimensions are judged independently because voice can fail in opposite ways:

Flat voice: an unnaturally even, average voice lacking human idiosyncrasy
Shifts: abrupt changes in formality or expertise and signs of mixed human-machine authorship, judged by degree rather than hard boundaries
Authorial stance: whether a real position and engagement come through, or only an impersonal surface

For each dimension the model returns a sub-score and flagged passages. It is told to abstain rather than guess, and low-confidence flags are dropped. The sub-scores combine into one voice score, with the breakdown kept alongside. The layer runs only when a model is configured.
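The post-processing described above (dropping low-confidence flags, honoring abstentions, and combining sub-scores while keeping the breakdown) might look something like this. The text does not say how sub-scores are combined or where the confidence cutoff sits, so the plain mean and the 0.6 threshold here are assumptions, as are all the names.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Optional

MIN_CONFIDENCE = 0.6  # assumed cutoff; the real threshold is not documented


@dataclass
class Flag:
    passage: str
    confidence: float


@dataclass
class DimensionResult:
    score: Optional[float]  # None means the model abstained
    flags: list = field(default_factory=list)


def combine(results: dict) -> dict:
    """Drop low-confidence flags and merge sub-scores into one voice score."""
    breakdown = {}
    for name, result in results.items():
        kept = [f for f in result.flags if f.confidence >= MIN_CONFIDENCE]
        breakdown[name] = {"score": result.score, "flags": kept}

    # Abstained dimensions are left out of the combined score.
    scored = [b["score"] for b in breakdown.values() if b["score"] is not None]
    voice_score = mean(scored) if scored else None  # assumed: simple mean
    return {"voice_score": voice_score, "breakdown": breakdown}
```

Returning the breakdown next to the combined score means a reader can still see which dimension drove the result, and an abstention stays visible as a missing sub-score instead of being silently counted as zero.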

Why this matters

Voice is where machine writing is most fluent and least itself. Model outputs cluster tightly in stylometric terms, defaulting to a standardized, average profile, and they underuse the hedges, emphasis, and engagement that build a human authorial voice, leaving a smooth but impersonal surface. The opposite failure appears in collaboration: a human draft polished by a model, or the reverse, can shift voice mid-document. Counting variance catches some of this, but judging whether a real person comes through requires reading the text.

Score thresholds

0-1: Strong, distinctive authorial voice detected
2-3: Voice is present but somewhat generic
4-5: Voice feels artificial or inconsistently human
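As a small illustration, the bands above can be mapped to their labels like this. The listed ranges leave gaps between whole numbers (for example 1.5), so this sketch assumes the bands are half-open intervals starting at 0, 2, and 4; the function name is hypothetical.

```python
def describe_voice_score(score: float) -> str:
    """Map a 0-5 voice score to its threshold label.

    Assumption: fractional scores fall into the band whose lower
    bound they have reached, i.e. [0, 2), [2, 4), [4, 5].
    """
    if not 0 <= score <= 5:
        raise ValueError(f"voice score out of range: {score}")
    if score < 2:
        return "Strong, distinctive authorial voice detected"
    if score < 4:
        return "Voice is present but somewhat generic"
    return "Voice feels artificial or inconsistently human"
```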

Limitations

Requires a configured LLM provider (adds 30-60s latency). Quality depends on the evaluating model's capabilities. The LLM may have biases in what it considers 'authentic' voice. Results are not fully reproducible due to LLM sampling randomness.