Same-document paragraph removal too eager #14

@DavidNemeskey

Description

remove_same_p.py removes duplicate paragraphs in documents. It is useful when the same content is included twice in an HTML page for technical reasons, such as once for static and once for dynamic presentation, but detrimental to document cohesion when repetition occurs naturally in the text.

Fix the latter case, e.g. only remove paragraphs if all (or most) of them occur twice in the document.
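One possible shape for the proposed fix is sketched below. It is not the current code of remove_same_p.py: the function name `remove_duplicate_paragraphs`, the `min_dup_ratio` parameter and its default of 0.8 are all assumptions made for illustration. The idea is to deduplicate only when most distinct paragraphs are repeated, which suggests the page content was included twice, and to leave the document alone otherwise.

```python
from collections import Counter


def remove_duplicate_paragraphs(paragraphs, min_dup_ratio=0.8):
    """Drop repeated paragraphs only if most paragraphs in the document repeat.

    Hypothetical helper: `min_dup_ratio` is an assumed threshold for
    "all (or most)" and would need tuning on real data.
    """
    counts = Counter(paragraphs)
    if not counts:
        return []
    # Share of distinct paragraphs that occur more than once
    dup_ratio = sum(1 for c in counts.values() if c > 1) / len(counts)
    if dup_ratio < min_dup_ratio:
        # Repetition looks natural: leave the document untouched
        return list(paragraphs)
    # Repetition looks like duplicated page content: keep first occurrences
    seen = set()
    deduped = []
    for p in paragraphs:
        if p not in seen:
            seen.add(p)
            deduped.append(p)
    return deduped
```

With this sketch, a document whose paragraphs all appear twice is reduced to one copy, while a text with a single naturally repeated paragraph (a refrain, say) is returned unchanged.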

Labels

bug (Something isn't working)
