@Iamsdt Iamsdt commented Oct 30, 2025

This pull request switches the RTF file handler dependency from pyrtf-ng to striprtf, updating the project's dependencies and documentation to match. It also improves error handling in the DOC handler by detecting missing LibreOffice dependencies when using antiword. Together, these changes are intended to make RTF extraction more reliable and the installation instructions clearer.

Dependency and Documentation Updates:

  • Changed the RTF handler dependency from pyrtf-ng to striprtf in pyproject.toml, and updated all related documentation files (docs/api.md, docs/index.md, docs/installation.md, docs/usage.md) to reflect this new requirement.
  • Updated log messages in textxtract/core/registry.py to reference striprtf instead of pyrtf-ng when the RTF handler is not installed.
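
For readers unfamiliar with the pattern, the sketch below shows one way an optional handler can be skipped with a log message when its package is missing. It is a minimal illustration, not the code in textxtract/core/registry.py: the names `OPTIONAL_HANDLERS` and `available_handlers`, the logger name, and the message wording are all assumptions made for this example.

```python
import importlib.util
import logging

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("textxtract.registry.sketch")

# Hypothetical mapping of file extension to the optional package its handler needs.
OPTIONAL_HANDLERS = {".rtf": "striprtf"}


def available_handlers(optional=OPTIONAL_HANDLERS, find_spec=importlib.util.find_spec):
    """Return the extensions whose optional dependency can be imported."""
    available = []
    for ext, package in optional.items():
        # find_spec returns None when a top-level package is not installed.
        if find_spec(package) is None:
            logger.debug("%s handler not available: install %s", ext, package)
            continue
        available.append(ext)
    return available
```

With striprtf installed, the extraction step itself would typically go through `rtf_to_text` from `striprtf.striprtf`; how the project's RTF handler wraps that call is not shown in this PR.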

Error Handling Improvements:

  • Enhanced the DOC handler in textxtract/handlers/doc.py to detect missing LibreOffice dependencies when using antiword, and raise a FileNotFoundError to trigger fallback extraction methods.
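
The fallback behaviour described above could look roughly like the following. This is a hedged sketch under stated assumptions, not the contents of textxtract/handlers/doc.py: the marker strings in `MISSING_DEPENDENCY_HINTS`, the function names `run_antiword` and `extract_doc`, and the choice to inspect stderr are all hypothetical, since the PR summary does not say how the missing dependency is detected.

```python
import subprocess

# Assumed marker strings; the real detection logic is not shown in this PR.
MISSING_DEPENDENCY_HINTS = ("libreoffice", "soffice")


def run_antiword(path, runner=subprocess.run):
    """Run antiword on path; raise FileNotFoundError if a dependency looks missing."""
    # subprocess.run itself raises FileNotFoundError when the antiword binary is absent.
    result = runner(["antiword", path], capture_output=True, text=True)
    stderr = (result.stderr or "").lower()
    if result.returncode != 0:
        if any(hint in stderr for hint in MISSING_DEPENDENCY_HINTS):
            raise FileNotFoundError(f"antiword dependency missing: {stderr.strip()}")
        raise RuntimeError(f"antiword failed: {stderr.strip()}")
    return result.stdout


def extract_doc(path, extractors):
    """Try each extractor in order, moving to the next one on FileNotFoundError."""
    last_error = None
    for extractor in extractors:
        try:
            return extractor(path)
        except FileNotFoundError as exc:
            last_error = exc  # remember why this method was skipped
    raise RuntimeError(f"no DOC extractor available for {path}") from last_error
```

Raising FileNotFoundError for both a missing binary and a missing dependency lets the caller handle the two cases with a single `except` clause, which is one plausible reason for the choice described in the bullet above.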

Minor Codebase Cleanup:

  • Removed an unused import of Path in textxtract/core/registry.py.

Python Version Documentation:

  • Updated the tested Python versions in docs/installation.md to include Python 3.13.

@Iamsdt Iamsdt merged commit 3dcb25e into main Oct 30, 2025
1 check passed