F2Text analysisFingerprintLayer 1 (Deterministic)

Claude Fingerprint

Reports how strongly a text exhibits the lexical and structural habits associated with Claude output: a characteristic hedged, dialectical register, inserted safety framing, residual reasoning markup, and a pronounced em-dash density.

Technical description

F2 scores a document on two components and normalises the sum to the 0 to 5 scale. The lexical component sums weight × occurrences over a per-language dictionary of Claude-associated phrases, each weighted 1 to 5 (rather than and on the other hand at 2, straightforward at 3, to be transparent and let me think through this at 4, I'd be happy to help at 5); a matched phrase that also belongs to the shared cross-model generic set is multiplied by 0.25 first. The structural component adds fixed points for hierarchical Markdown headings (## followed by ###, +2), dialectical on one hand … on the other hand framing (+3), inserted safety or ethics language (+3), and each residual reasoning block in Claude markup (<thinking>…</thinking>, <artifact>…</artifact>, +5 each). Em-dash density is scored on a sliding curve: given at least two em-dashes and a rate r per 1000 words of at least 3, it adds min(2.0 + (r − 3) × 0.3, 4.0). The raw total R maps to the reported score as min(5.0, R / 15 × 5). Twelve language dictionaries are available; the document language selects one.

How it works

The implementation is deterministic and runs at Layer 1 over compiled regular expressions.

Lexical scoring. The active per-language dictionary maps each phrase to an integer weight from 1 to 5, rising with how distinctive the phrase is of Claude output. The weight concentrates on the model's hedged, transparent register: the assistant openers I'd be happy to help (5) and let me think through this (4), the epistemic hedges to be transparent and I'm not entirely certain (4), and the dialectical connectives rather than, on one hand, on the other hand (2). Each case-insensitive match adds its weight times its occurrence count. A phrase that is over-represented across many assistants is held in a shared cross-model set and has its weight multiplied by 0.25 before the addition, so the lexical score reflects the Claude-specific residue rather than the assistant register common to all models. Matches of weight 4 or more are reported at warning severity, lighter matches at informational severity.

Structural scoring. Five signatures contribute. Hierarchical headings, a level-two heading followed by a level-three heading, add 2 and capture the model's habit of nesting structure into chat answers. The balanced-argument frame, a document containing both on one hand and on the other hand, adds 3 and captures the dialectical default of weighing two sides. Inserted safety or ethics language adds 3 and captures unrequested caveats about responsible use. Each residual reasoning block left in Claude's own markup, <thinking> or <artifact>, adds 5, the single strongest signal because such blocks appear only by direct copy from the interface.

Em-dash density. The em-dash is the headline Claude marker and is scored on its own graded curve rather than a flat threshold. The em-dash count is divided by the word count and multiplied by 1000 to give a rate r; once r reaches 3 per 1000 words (with at least two em-dashes present), the contribution is min(2.0 + (r − 3) × 0.3, 4.0), so density just over the threshold contributes 2 and rises by 0.3 for each additional em-dash per 1000 words up to a ceiling of 4.

Aggregation. The lexical sum, the four structural contributions and the em-dash term are added into a raw score R, reported as min(5.0, R / 15 × 5). The raw score and the detected phrases with their counts and effective weights are returned in the metadata.

Score thresholds

Score	Meaning
0 to 1	Sparse Claude vocabulary, ordinary punctuation, no nested headings or reasoning markup.
2 to 3	A concentration of the hedged dialectical register, or one structural signature such as nested headings or a balanced-argument frame, or em-dash density just over the threshold.
4 to 5	The register, several structural signatures and high em-dash density co-occur, or a residual reasoning block is present. Strongly consistent with unedited Claude output.

Why this matters

Claude carries a register shaped by its alignment process. Trained with feedback that rewards transparency, balance and caution, the model hedges openly (to be transparent, I'm not entirely certain), weighs both sides of a question by default (on one hand … on the other hand), and volunteers safety and ethics caveats even when unprompted. Each of these is rare in a finished manuscript, where an author commits to a position rather than narrating deliberation, so their presence is diagnostic. The punctuation signal is sharper still. The model's marked preference for the em-dash has been documented widely enough to acquire a name, and it produces em-dash densities well above those of edited human prose; scoring that density on a graded curve, rather than a single cutoff, lets the indicator separate light stylistic use from the heavy, uniform deployment characteristic of the model. The reasoning-markup signal is categorical: <thinking> and <artifact> blocks exist only inside the model's own interface, so their survival in a submitted document is direct evidence of an unedited copy.

Limitations

The vocabulary and em-dash thresholds were calibrated against 2024-2025 Claude output and require periodic recalibration as the model and its users adapt. The lexical signal is removable by paraphrase or by replacing the highest-weight openers, and the em-dash signal collapses under a single find-and-replace of the character. In the other direction, an author who writes dialectically, hedges carefully, or favours the em-dash can reach a moderate score without machine assistance, which is why the indicator reports resemblance to a profile rather than a decision. The em-dash density measure shares the em-dash count with E1 and is a deliberate, model-specific complement to it: E1 flags the presence of em-dashes as a general typographic anomaly, while F2 weighs their density as a Claude attribution signal. The dictionaries are most developed for English and thin out across the other eleven languages; the structural and em-dash checks are language-independent. Markdown-stripped documents lose the heading and reasoning-markup signatures.

Theoretical background

F2 combines lexical fingerprinting with two model-specific structural strands. The lexical weights rest on the same excess-vocabulary logic used across the F-series: terms whose frequency rose with the model era, restricted here to the hedged, dialectical openers characteristic of Claude rather than the broad assistant register shared with other systems, which is subtracted through the cross-model generic discount. The safety-and-ethics signature traces to the model's training objective: alignment by a written set of principles rewards the insertion of caution and balance, so that caution becomes a stylistic residue detectable at the surface. The em-dash strand follows the observation, now widely reported, that feedback-tuned models converge on a narrow set of typographic habits; the em-dash is the most pronounced of these for Claude, which is why F2 promotes it from a binary marker to a graded, separately-weighted term.

References

Kobak D, González-Márquez R, Horvát EÁ, Lause J. Delving into LLM-assisted writing in biomedical publications through excess vocabulary. Science Advances. 2025. https://arxiv.org/abs/2406.07016
Bai Y, Kadavath S, Kundu S, Askell A, Kernion J, Jones A, et al. Constitutional AI: harmlessness from AI feedback. arXiv preprint arXiv:2212.08073. 2022. https://arxiv.org/abs/2212.08073
McGill Office for Science and Society. Why did LLMs steal our em-dashes? 2025. https://www.mcgill.ca/oss/article/critical-thinking-student-contributors-technology/why-did-llms-steal-our-em-dashes
Liang W, Zhang Y, Wu Z, Lepp H, Ji W, Zhao X, Cao H, Liu S, He S, Huang Z, Yang D, Potts C, Manning CD, Zou J. Quantifying large language model usage in scientific papers. Nature Human Behaviour. 2025. DOI: 10.1038/s41562-025-02273-8 https://www.nature.com/articles/s41562-025-02273-8