
Add epub rename tool #36

Merged
kzuraw merged 4 commits into main from claude/add-epub-rename-tool-Y4V7K
Jan 16, 2026

Conversation

@kzuraw
Owner

@kzuraw kzuraw commented Jan 16, 2026

New Python script to rename epub files to 'Author - Title.epub' format using metadata extracted from the epub file.

@claude

claude bot commented Jan 16, 2026

Code Review: Add epub rename tool

Overall, this is a solid addition that follows the repository conventions well. The code is clean and functional, but there are several issues to address:

Critical Issues

1. File Collision Handling (python/rename_epubs.py:82)
The script doesn't check if the target filename already exists before renaming. This could cause unintended overwrites or runtime errors.

if new_path.exists():
    click.echo(f"Skipping {epub_path.name}: target file already exists: {new_name}")
    continue

2. Broad Exception Handling (python/rename_epubs.py:42)
The except Exception: block silently swallows all errors, making debugging difficult. This hides issues like corrupted epub files, permission errors, or malformed XML.

Recommended fix:

except (zipfile.BadZipFile, ET.ParseError, KeyError) as e:
    click.echo(f"Error reading {epub_path.name}: {e}", err=True)
    return None, None
except Exception as e:
    click.echo(f"Unexpected error reading {epub_path.name}: {e}", err=True)
    return None, None

Code Quality Issues

3. Potential None Value (python/rename_epubs.py:28)
opf_path = rootfile.get("full-path") could return None, which would cause zf.read(opf_path) to fail on line 31.

Add validation:

opf_path = rootfile.get("full-path")
if not opf_path:
    return None, None
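
Issues 2 and 3 can be combined in one place. The following is a minimal sketch, not the PR's actual code: the function name read_metadata, the NS mapping, and the use of print instead of click.echo are assumptions made to keep the example free of third-party imports.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Optional

# Namespaces used by the OCF container file and by Dublin Core metadata.
NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_metadata(epub_path: Path) -> tuple[Optional[str], Optional[str]]:
    """Return (author, title), or (None, None) if the epub cannot be read."""
    try:
        with zipfile.ZipFile(epub_path) as zf:
            container = ET.fromstring(zf.read("META-INF/container.xml"))
            rootfile = container.find(".//c:rootfile", NS)
            # Validate before use: both the element and the attribute may be absent.
            opf_path = rootfile.get("full-path") if rootfile is not None else None
            if not opf_path:
                return None, None
            package = ET.fromstring(zf.read(opf_path))
    except (zipfile.BadZipFile, ET.ParseError, KeyError) as e:
        # KeyError is what ZipFile.read() raises for a missing archive member.
        print(f"Error reading {epub_path.name}: {e}", file=sys.stderr)
        return None, None
    author = package.findtext(".//dc:creator", namespaces=NS)
    title = package.findtext(".//dc:title", namespaces=NS)
    return author, title
```

Returning (None, None) in every failure path lets the caller treat "unreadable" and "no metadata" the same way while the message on stderr keeps the cause visible.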

4. Python Version Requirement (python/rename_epubs.py:2)
The script requires Python 3.14, the newest release, which excludes anyone still running 3.12 or 3.13. This should be >=3.12 to match the other script in the repo.

5. Inconsistent Sanitization (python/rename_epubs.py:46-51)
The sanitize_filename function doesn't handle consecutive spaces well, which could result in filenames with multiple spaces. Consider using regex like the invoice script does.
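
A minimal sketch of regex-based sanitization with whitespace normalization follows. The character set and the choice to replace invalid characters with a space (rather than delete them) are assumptions; the PR's actual sanitize_filename may differ.

```python
import re

# Characters rejected by common filesystems, plus ASCII control characters.
INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_filename(name: str) -> str:
    """Replace invalid characters, then collapse runs of whitespace."""
    cleaned = INVALID_CHARS.sub(" ", name)
    # Collapsing after substitution also removes spaces the substitution added.
    return re.sub(r"\s+", " ", cleaned).strip()
```

For example, sanitize_filename("A  Tale:  of   Two/Cities ") yields "A Tale of Two Cities".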

Style/Consistency Issues

6. Missing Summary Statistics
Unlike rename_invoices.py which reports counts, this script doesn't provide a summary. Consider adding:

click.echo(f"\nProcessed {len(epubs)} file(s): {renamed_count} renamed, {skipped_count} skipped")

7. Inconsistent Quotes
The repo uses single quotes in rename_invoices.py but this PR uses double quotes. Consider maintaining consistency.

Positive Aspects

  • Correctly follows PEP 723 inline dependency specification
  • Good use of type hints
  • Proper use of Path objects
  • Dry-run functionality works well
  • README documentation is clear and follows existing pattern
  • Proper XML namespace handling for epub metadata

Security Considerations

The script is generally safe for local file operations. The sanitization prevents directory traversal attacks, though it could be more robust.

Recommendations

  1. Fix the file collision issue (critical)
  2. Improve error handling specificity
  3. Add validation for opf_path
  4. Change Python version to >=3.12
  5. Add summary statistics for better UX
  6. Consider standardizing quote style across the repo

The epub metadata extraction logic is sound and handles the EPUB standard correctly. Once the critical issues are addressed, this will be a useful addition to the toolset.

- Add specific exception handling for BadZipFile, ParseError, KeyError
- Log errors to stderr for better debugging
- Validate opf_path before using it
@claude

claude bot commented Jan 16, 2026

Pull Request Review: Add epub rename tool

Summary

This PR adds a new Python script to rename epub files using metadata extraction. Overall, the implementation is solid and follows the repository conventions. However, I've identified several issues that should be addressed.

Issues Found

Critical Issues

1. Python version requirement mismatch (python/rename_epubs.py:2)

  • Requiring Python 3.14, the newest release, excludes users on 3.12 and 3.13
  • The existing script uses >=3.12 which is more appropriate
  • Recommendation: Change to requires-python = ">=3.12" for consistency

2. File collision handling (python/rename_epubs.py:78)

  • No check if new_path already exists as a different file
  • Could overwrite existing files silently
  • Recommendation: Add existence check before renaming

Code Quality Issues

3. Inconsistent output style

  • The new script output differs from existing script style
  • Recommendation: Use arrow for consistency and add summary output like the existing script

4. Missing summary statistics

  • The existing rename_invoices.py provides helpful summaries
  • Recommendation: Add renamed/skipped counts for better user feedback

5. Error handling verbosity

  • Prints errors to stderr but continues silently
  • Recommendation: Consider adding a skipped files summary

Minor Improvements

6. Sanitization could be more robust

  • Doesn't handle consecutive spaces
  • Could result in filenames with multiple spaces
  • Recommendation: Add regex cleanup for whitespace

7. Type hints are incomplete

  • Could add return type hint to main() for completeness

Security Considerations

Good:

  • Uses context manager for zipfile handling
  • Sanitizes filenames to prevent path traversal
  • No execution of untrusted code

Performance Considerations

Good:

  • Efficient use of glob() and lazy evaluation
  • Doesn't load entire epub into memory
  • Only reads necessary metadata files

Testing

Missing:

  • No tests provided (though none exist for the existing script either)

Positive Aspects

  • Follows PEP 723 inline dependency specification
  • Proper docstrings
  • Uses Click for CLI consistency with existing scripts
  • README.md properly updated
  • Handles XML namespaces correctly
  • Good error handling structure
  • Dry-run flag implemented correctly

Recommendation

Request Changes - The Python version issue and file collision handling should be fixed before merging.

Suggested Changes Summary

  1. Fix Python version to >=3.12
  2. Add file collision detection
  3. Add summary statistics (renamed/skipped counts)
  4. Improve filename sanitization for whitespace
  5. Consider using arrow for consistency with existing script

- Use → arrow for consistency with rename_invoices.py
- Add renamed/skipped counters
- Display summary at end of execution
@claude

claude bot commented Jan 16, 2026

PR Review: Add epub rename tool

Summary

This PR adds a new Python script to rename epub files using metadata extracted from the epub file itself. The implementation follows the repository's conventions well and is a solid addition to the tools collection.

✅ Strengths

  1. Consistent with repository patterns: Follows the same structure as rename_invoices.py with PEP 723 inline dependencies, Click CLI, and --dry-run option
  2. Good error handling: Properly catches zipfile.BadZipFile, ET.ParseError, and KeyError exceptions
  3. User-friendly output: Clear messaging for all scenarios (success, skips, dry-run)
  4. Proper filename sanitization: Handles invalid filesystem characters
  5. README documentation: Correctly updated with usage example

🐛 Issues & Concerns

1. Python version requirement too strict (python/rename_epubs.py:2)

Issue: The script requires Python 3.14, the newest release, which excludes 3.12 and 3.13. The existing rename_invoices.py uses >=3.12.

Recommendation: Change to >=3.12 for consistency and compatibility.
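
For reference, a PEP 723 header with the relaxed requirement might look like the sketch below. The dependency list is an assumption based on the script's use of Click; the real header may pin versions or list more packages.

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "click",
# ]
# ///
```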


2. File collision handling (python/rename_epubs.py:82-92)

Issue: If the target filename already exists (e.g., two different epubs with the same author and title), the rename raises FileExistsError on Windows and silently overwrites the existing file on Unix.

Recommendation: Add collision detection before renaming.


3. Potential whitespace issue in filenames (python/rename_epubs.py:56)

Issue: The sanitization only strips leading/trailing whitespace but doesn't handle multiple consecutive spaces in the middle, which could result in filenames like "Author -  Title.epub" (note the doubled space).

Recommendation: Add space normalization to handle multiple consecutive spaces.


4. Inconsistent quote style (python/rename_epubs.py:53)

Minor: Uses single quotes while rest of file uses double quotes.

Recommendation: Use double quotes consistently throughout the file.


🔒 Security Considerations

  • Path traversal: Safe - only processes files in specified directory
  • Zip bomb protection: Missing, but likely acceptable for personal tool
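  • XML entity expansion: Uses ET.fromstring(), which does not resolve external entities (so classic XXE does not apply) but can still be hit by entity-expansion bombs in malicious epub files when Python is linked against an older Expat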

For a personal tool, current security is sufficient. For public/shared use, consider hardening.


⚡ Performance Considerations

No issues identified - sequential processing is appropriate for filesystem operations.


🧪 Test Coverage

No tests exist for this script or rename_invoices.py. While not blocking for utility scripts, basic tests would improve maintainability.


✅ Verdict

Recommend approval with minor fixes

The core functionality is solid and code quality is good. Key issues to address:

Must fix before merge:

  1. Change requires-python = ">=3.14" to ">=3.12"
  2. Add file collision detection

Recommended:

  3. Normalize multiple consecutive spaces in sanitized filenames
  4. Use consistent quote style

Great work on following the repository conventions!

- Normalize multiple consecutive spaces to single space in filenames
- Use double quotes consistently throughout the file
@claude

claude bot commented Jan 16, 2026

Pull Request Review: Add epub rename tool

Overall, this is a well-structured addition that follows the repository conventions. The code is clean and functional, but there are several issues that should be addressed.

Bugs and Issues

CRITICAL - File Collision Handling Missing

  • Location: python/rename_epubs.py:94
  • Issue: The script does not check if the target filename already exists before renaming. If two different epub files have the same author/title metadata, the second rename will overwrite the first file, causing data loss.
  • Fix: Add a check for new_path.exists() and either skip or append a suffix like (2) to avoid collisions.
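
The suffix option can be sketched as follows. The helper name unique_target is hypothetical, and the check-then-rename sequence is not atomic, so this is only suitable for a single-user local tool.

```python
from pathlib import Path


def unique_target(target: Path) -> Path:
    """Return target if it is free, else the first 'stem (n).suffix' that is."""
    if not target.exists():
        return target
    n = 2
    while True:
        candidate = target.with_name(f"{target.stem} ({n}){target.suffix}")
        if not candidate.exists():
            return candidate
        n += 1
```

A caller would rename to unique_target(new_path) instead of new_path, so a second "Author - Title.epub" becomes "Author - Title (2).epub" rather than replacing the first.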

Python Version Requirement Too Restrictive

  • Location: python/rename_epubs.py:2
  • Issue: Requires Python 3.14+, the newest release, which shuts out 3.12 and 3.13 interpreters that could otherwise run the script. This may be unintentional.
  • Fix: Change to >=3.12 (matching the pattern from rename_invoices.py:2) or >=3.8 for broader compatibility. The code uses no Python 3.14-specific features.

Code Quality and Best Practices

XML Parsing Security Concern

  • Location: python/rename_epubs.py:24,35
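  • Issue: Using ET.fromstring() without defusedxml leaves the script exposed to entity-expansion (XML bomb) attacks from malicious epub files when Python is linked against an older Expat. ElementTree does not resolve external entities, so classic XXE is not the main risk.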
  • Recommendation: While this is a local utility script, consider adding a comment acknowledging this limitation, or use defusedxml if processing untrusted files.
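
If adding defusedxml as a dependency is not wanted, one stdlib-only stand-in is to refuse any document that declares entities at all. This is a sketch under that assumption, not defusedxml's API; the names EntitiesForbidden and fromstring_no_entities are hypothetical, and the document is parsed twice for simplicity.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class EntitiesForbidden(ValueError):
    """Raised when a document declares an XML entity."""


def _reject_entity(*args):
    # Expat calls this handler for each <!ENTITY ...> declaration it meets.
    raise EntitiesForbidden("entity declarations are not allowed")


def fromstring_no_entities(data: bytes) -> ET.Element:
    """Parse XML, refusing any document that declares entities.

    Malformed input surfaces as expat.ExpatError from the pre-scan.
    """
    scanner = expat.ParserCreate()
    scanner.EntityDeclHandler = _reject_entity
    # The handler's exception propagates out of Parse(), so scanning stops
    # at the first declaration, before any entity reference is expanded.
    scanner.Parse(data, True)
    return ET.fromstring(data)
```

Rejecting all entity declarations is stricter than necessary, but container.xml and OPF files normally have no reason to declare any.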

Inconsistent Naming Patterns

  • Location: python/rename_epubs.py:55
  • Issue: The sanitization removes invalid filename characters but does not normalize whitespace/case like rename_invoices.py does. This creates inconsistent behavior across tools in the same repo.
  • Suggestion: Consider if you want consistent filename patterns (e.g., should spaces be converted to hyphens, should case be normalized?).

Error Handling Could Be More Specific

  • Location: python/rename_epubs.py:48-50
  • Issue: The catch-all Exception handler masks specific errors. For example, a missing container.xml (which ZipFile.read() reports as KeyError, not FileNotFoundError) should be handled separately from parsing errors.
  • Suggestion: Handle the missing-member KeyError separately with a clearer message like Invalid epub format: missing container.xml.

Positives

  • Follows PEP 723 inline dependency spec correctly
  • Consistent with repo patterns (click usage, dry-run flag, output formatting)
  • Good separation of concerns (metadata extraction, sanitization, main logic)
  • Proper error messages with file context
  • Type hints are present and correct
  • README documentation added appropriately

Minor Improvements

Metadata Extraction Edge Cases

  • Location: python/rename_epubs.py:41-42
  • Suggestion: Some epub files have multiple creator tags (co-authors) or refinements. Consider handling the first/primary author explicitly if multiple are present.
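
One way to make the choice explicit is to take the first non-empty dc:creator in document order. This sketch assumes document order reflects the primary author and ignores EPUB 3 refinements such as role or display sequence; primary_author is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from typing import Optional

DC = {"dc": "http://purl.org/dc/elements/1.1/"}


def primary_author(package: ET.Element) -> Optional[str]:
    """Return the first non-empty dc:creator in document order, if any."""
    for creator in package.iterfind(".//dc:creator", DC):
        name = (creator.text or "").strip()
        if name:
            return name
    return None
```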

Counter Logic

  • Location: python/rename_epubs.py:87-89,96
  • Issue: Files that are already named correctly do not increment renamed_count or skipped_count, making the final summary inaccurate if you have a mix of files.
  • Fix: These should probably increment a separate already_correct_count or not print at all if zero.
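
A sketch of counting every outcome so the summary always adds up to the number of files processed. The outcome labels and the helpers classify and summary_line are hypothetical; new_name stands for the sanitized 'Author - Title.epub' string, or None when metadata is missing.

```python
from collections import Counter
from pathlib import Path
from typing import Optional


def classify(epub_path: Path, new_name: Optional[str]) -> str:
    """Decide what would happen to one file, without touching the disk state."""
    if new_name is None:
        return "skipped"  # metadata could not be read
    if epub_path.name == new_name:
        return "already_correct"
    if epub_path.with_name(new_name).exists():
        return "skipped"  # target name is taken by another file
    return "renamed"


def summary_line(counts: Counter) -> str:
    """Format totals; missing outcomes count as zero in a Counter."""
    total = sum(counts.values())
    return (
        f"Processed {total} file(s): {counts['renamed']} renamed, "
        f"{counts['skipped']} skipped, {counts['already_correct']} already correct"
    )
```

Because each file lands in exactly one bucket, the three counts in the summary sum to the total.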

Recommendations

Priority fixes before merge:

  1. Fix file collision handling (data loss risk)
  2. Correct Python version requirement (blocks usage)

Nice to have:

  3. Handle multiple authors more explicitly
  4. Add comment about XML security limitations
  5. Consider filename normalization consistency with other tools

Let me know if you would like me to elaborate on any of these points!

@kzuraw kzuraw merged commit b97de86 into main Jan 16, 2026
1 check passed
@kzuraw kzuraw deleted the claude/add-epub-rename-tool-Y4V7K branch January 16, 2026 11:12