PASSAGE

Phrase Reuse

Detects long n-gram phrases that are reused repeatedly within a document.

| Field | Value |
| --- | --- |
| Class | PhraseReuseRule |
| Rule name | phrase_reuse |
| Count key | phrase_reuse |

Behavior

| Result | Input text | Why |
| --- | --- | --- |
| Flag | Repeating "at the end of the day" many times in one document. | Same phrase recurs instead of varied expression. |
| Flag | Reusing a 4-8 word clause across multiple paragraphs. | Long repeated n-grams suggest copy-pattern generation. |
| Pass | Repeating short stopword-heavy fragments like "in the end". | Common function phrases are filtered or suppressed. |
| Pass | Using related ideas with different wording across sections. | Semantic repetition without lexical cloning is acceptable. |

Severity

Medium to high. Repeated long phrases are a strong signal of formulaic writing.

Default configuration

| Setting | Default |
| --- | --- |
| penalty | -1 |
| record_cap | 5 |
| repeated_ngram_max_n | 8 |
| repeated_ngram_min_count | 3 |
| repeated_ngram_min_n | 4 |
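
To make the three `repeated_ngram_*` settings concrete, here is a minimal sketch of how repeated n-grams of 4 to 8 tokens could be counted. It is not the `PhraseReuseRule` implementation: the function name `find_repeated_ngrams`, the tokenizer, and the `STOPWORDS` set are assumptions made for illustration, and `penalty` and `record_cap` are not modeled.

```python
import re
from collections import Counter

# Hypothetical stopword list; the rule's actual filter is not documented here.
STOPWORDS = {"a", "an", "the", "in", "on", "at", "of", "to", "and", "or", "is", "it"}


def find_repeated_ngrams(text, min_n=4, max_n=8, min_count=3):
    """Map each n-gram (min_n..max_n tokens) seen at least min_count times to its count."""
    # Simple lowercase word tokenizer (an assumption, not the rule's tokenizer).
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    repeated = {}
    for n in range(min_n, max_n + 1):
        counts = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
        for gram, count in counts.items():
            # Assumed filter: ignore n-grams made up entirely of stopwords.
            if count >= min_count and not all(tok in STOPWORDS for tok in gram):
                repeated[gram] = count
    return repeated


if __name__ == "__main__":
    sample = "At the end of the day, we ship. " * 3
    for gram, count in find_repeated_ngrams(sample).items():
        print(count, " ".join(gram))
```

With these defaults, a three-word fragment such as "in the end" can never be reported on its own, because it is shorter than `repeated_ngram_min_n`. A phrase that occurs only twice also stays below `repeated_ngram_min_count`.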