PASSAGE
Detects reuse of the same long n-gram phrase within a document.
Behavior
| Result | Input text | Why |
|---|---|---|
| Flag | Repeating "at the end of the day" three or more times in one document. | The same six-word phrase recurs instead of varied expression. |
| Flag | Reusing a 4-8 word clause across multiple paragraphs. | Long repeated n-grams suggest copy-pattern generation. |
| Pass | Repeating short stopword-heavy fragments like "in the end". | Common function phrases are filtered or suppressed; this one is also shorter than the default 4-word minimum. |
| Pass | Using related ideas with different wording across sections. | Semantic repetition without lexical cloning is acceptable. |
Severity
Medium to high; repeated long phrases are strong formulaicity signals.
Default configuration
| Option | Default |
|---|---|
| penalty | -1 |
| record_cap | 5 |
| repeated_ngram_max_n | 8 |
| repeated_ngram_min_count | 3 |
| repeated_ngram_min_n | 4 |
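
To make the interaction between these defaults concrete, here is a minimal sketch of how such a check could work. It is not the rule's actual implementation. The names `NgramConfig`, `find_repeated_ngrams`, and `STOPWORDS` are hypothetical, the stopword list is illustrative, and two behaviors are assumptions: `record_cap` is treated as a cap on the number of reported phrases, and n-grams made up entirely of stopwords are skipped. Scoring via `penalty` is not modeled.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Hypothetical stopword list, for illustration only.
STOPWORDS = frozenset(
    "a an and as at by for in is it of on or that the this to with".split()
)


@dataclass(frozen=True)
class NgramConfig:
    repeated_ngram_min_n: int = 4
    repeated_ngram_max_n: int = 8
    repeated_ngram_min_count: int = 3
    record_cap: int = 5  # assumption: caps the number of reported phrases


def find_repeated_ngrams(text, cfg=NgramConfig()):
    """Return (phrase, count) pairs for long n-grams that recur in text."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    hits = []
    # Scan the longest n-grams first so that shorter fragments of an
    # already reported phrase can be skipped.
    for n in range(cfg.repeated_ngram_max_n, cfg.repeated_ngram_min_n - 1, -1):
        counts = Counter(
            tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
        for gram, count in counts.most_common():
            if count < cfg.repeated_ngram_min_count:
                break  # most_common() is sorted by count, so the rest are lower
            if all(tok in STOPWORDS for tok in gram):
                continue  # assumption: pure function-word phrases are ignored
            phrase = " ".join(gram)
            if any(f" {phrase} " in f" {seen} " for seen, _ in hits):
                continue  # fragment of a longer phrase that was already reported
            hits.append((phrase, count))
    return hits[: cfg.record_cap]


sample = (
    "At the end of the day, quality matters. "
    "Costs rise, but at the end of the day users decide. "
    "At the end of the day, we ship."
)
print(find_repeated_ngrams(sample))  # [('at the end of the day', 3)]
```

Scanning from `repeated_ngram_max_n` down to `repeated_ngram_min_n` means a six-word repeat is reported once, instead of also surfacing each of its four- and five-word fragments.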