Terminology
Detects inconsistent use of technical vocabulary: the same concept referred to by different terms across the text, which may indicate that the author does not fully understand the terminology.
Technical description
Builds a terminology graph by extracting technical noun phrases and mapping synonymous terms to a shared concept. Uses POS-tagged noun phrases and domain-specific dictionaries to identify when the same concept is referred to inconsistently (e.g., switching between 'participants', 'subjects', and 'respondents' without a clear reason). Measures terminology consistency as the ratio of unique terms to unique concepts: a ratio of 1 means every concept is named by exactly one term, and higher values mean more synonym switching.
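The ratio described above can be illustrated with a minimal sketch. It assumes a small hand-written synonym dictionary (`SYNONYMS`, hypothetical) and replaces POS-tagged noun-phrase extraction with plain single-word token matching, so it is only an approximation of the described approach, not the detector's actual implementation.

```python
import re

# Hypothetical synonym dictionary: surface term -> canonical concept.
# A real system would draw these from domain-specific dictionaries.
SYNONYMS = {
    "participants": "study_population",
    "subjects": "study_population",
    "respondents": "study_population",
    "questionnaire": "instrument",
    "survey": "instrument",
}


def terminology_ratio(text, synonyms=SYNONYMS):
    """Return (unique terms) / (unique concepts) for dictionary terms in text.

    Simplification: matches single lowercase word tokens instead of
    POS-tagged noun phrases.
    """
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    terms = {t for t in tokens if t in synonyms}
    concepts = {synonyms[t] for t in terms}
    if not concepts:
        # No known terms found: treat the text as trivially consistent.
        return 1.0
    return len(terms) / len(concepts)


mixed = (
    "The participants completed a survey. Subjects were then interviewed, "
    "and respondents returned the questionnaire."
)
steady = "The participants completed a survey. Participants returned the survey."

print(terminology_ratio(mixed))   # 5 terms over 2 concepts
print(terminology_ratio(steady))  # 2 terms over 2 concepts
```

In this sketch the mixed passage uses five distinct terms for two concepts, while the steady passage uses one term per concept.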
How it works
Layer 2 (NLP): Extracts technical noun phrases using POS tagging. Groups synonymous terms using dictionary-based matching. Measures how many different terms are used for each concept. Flags concepts with 3+ synonym variants used interchangeably. Computes an overall terminology consistency score.
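The grouping and flagging steps can be sketched as follows. This is an illustration under stated assumptions: `CONCEPT_OF` is a hypothetical dictionary, tokens are matched as single words rather than POS-tagged noun phrases, and the final consistency score is left out because the text does not specify how it is computed.

```python
import re
from collections import defaultdict

# Hypothetical dictionary mapping surface terms to a canonical concept.
CONCEPT_OF = {
    "participants": "study_population",
    "subjects": "study_population",
    "respondents": "study_population",
    "questionnaire": "instrument",
    "survey": "instrument",
}


def variants_by_concept(text, concept_of=CONCEPT_OF):
    """Group the dictionary terms found in text under their concept."""
    found = defaultdict(set)
    for token in re.findall(r"[a-z]+", text.lower()):
        concept = concept_of.get(token)
        if concept is not None:
            found[concept].add(token)
    return dict(found)


def flag_inconsistent(text, min_variants=3, concept_of=CONCEPT_OF):
    """Return concepts referred to by at least min_variants different terms."""
    groups = variants_by_concept(text, concept_of)
    return sorted(c for c, terms in groups.items() if len(terms) >= min_variants)


sample = (
    "The participants completed a survey. Subjects were then interviewed, "
    "and respondents returned the questionnaire."
)
print(variants_by_concept(sample))
print(flag_inconsistent(sample))
```

Here `study_population` is flagged because three variants appear, while `instrument` has only two and stays below the 3+ threshold.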
Why this matters
Expert human authors tend to use terminology consistently within their field because they have internalized the precise meanings. AI models draw from diverse training data and may mix terminology from different subfields or use synonyms interchangeably. This inconsistency suggests a lack of genuine domain expertise and is characteristic of machine-generated text.
Score thresholds
- 0-1: Consistent technical vocabulary throughout
- 2-3: Occasional synonym switching for key terms
- 4-5: Frequent inconsistent use of interchangeable technical terms
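The bands above can be expressed as a simple lookup. This sketch assumes scores are integers from 0 to 5; the function name `describe_score` is hypothetical, and how a raw consistency measure is converted into a 0-5 score is not specified here.

```python
def describe_score(score):
    """Map an integer score (0-5) to its threshold band description."""
    if isinstance(score, bool) or not isinstance(score, int):
        raise ValueError("score must be an integer")
    if not 0 <= score <= 5:
        raise ValueError("score must be between 0 and 5")
    if score <= 1:
        return "Consistent technical vocabulary throughout"
    if score <= 3:
        return "Occasional synonym switching for key terms"
    return "Frequent inconsistent use of interchangeable technical terms"


for s in range(6):
    print(s, describe_score(s))
```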
Limitations
Interdisciplinary papers may legitimately use terminology from multiple fields. Review papers may use different terms when citing different sources. Some style guides require varying word choice to avoid repetition.
References
- Kobak D, Gonzalez-Marquez R, Horvat E-A, Lause J. (2025). Delving into LLM-assisted writing in biomedical publications through excess vocabulary. Science Advances
- Munoz-Ortiz A, Gomez-Rodriguez C, Vilares D. (2024). Contrasting linguistic patterns in human and LLM-generated news text. Artificial Intelligence Review
- Heinisch B. (2025). Large language models for terminology work: a question of the right prompt? Journal for Language Technology and Computational Linguistics
- Leppanen L, et al. (2025). How large language models are changing MOOC essay answers: a comparison of pre- and post-LLM responses
- Schmalz V, Tack A. (2025). Can GPTZero AI vocabulary distinguish between LLM-generated and student-written essays? BEA Workshop
- Shaib C, Li Y, Tetreault J, Jaimes A. (2025). Measuring AI slop in text. arXiv:2509.19163
- Jarvis S, et al. (2025). Lexical diversity analysis across ChatGPT versions vs. humans. arXiv:2508.00086