Conversation
Extracts dates from PDF content and renames invoice files to 'yyyy-mm company_name invoice_number.pdf' format with formatted invoice numbers (whitespace removed, / converted to -). https://claude.ai/code/session_01AgLRG2yeADzgzH9VjTPUW2
Remove pypdf dependency - use file metadata instead of parsing PDF content. https://claude.ai/code/session_01AgLRG2yeADzgzH9VjTPUW2
Code Review for PR #37: Add invoice PDF renaming utility scriptThank you for this contribution! The script follows the established patterns well and implements a useful utility. Here are my findings: ✅ Strengths
🐛 Issues & BugsCritical: Missing backslash in sanitize_filename (python/rename_invoices.py:35)The invalid_chars string is missing the backslash character. It should be: '<>:"/\|?*' to match rename_epubs.py:63. This is important for cross-platform compatibility. Bug: Missing continue statement (python/rename_invoices.py:66-67)When a file already has a date prefix, the code echoes a message but doesn't continue to skip further processing. Add an explicit 'continue' after line 67 to prevent the code from trying to parse and rename these files. Potential Issue: Company name parsing too simplistic (python/rename_invoices.py:71-76)The script assumes company name is the first word using split(maxsplit=1). This works for single-word companies but might not handle multi-word company names as expected. Consider documenting the expected input format more explicitly. 🔍 Code Quality & Best Practices
🚀 PerformanceNo concerns - the script efficiently processes files with minimal I/O operations. 🔒 SecurityNo security concerns identified. The script properly sanitizes filenames and provides dry-run mode. 📝 Suggestions
SummaryThe code is well-structured and functional, but please address the missing backslash in invalid_chars and add the explicit continue statement after detecting files with existing date prefixes. These are important for correctness and preventing potential bugs. Great work overall! 🎉 |
- Process files that already have date prefix (strip and reprocess) - Update to Python 3.14 - Add missing backslash to invalid_chars for cross-platform compatibility https://claude.ai/code/session_01AgLRG2yeADzgzH9VjTPUW2
Deploying tools with
|
| Latest commit: |
8d663f5
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://d2342909.tools-4nd.pages.dev |
| Branch Preview URL: | https://claude-rename-invoice-pdfs-6.tools-4nd.pages.dev |
Pull Request ReviewOverall, this is a well-structured utility script that follows the repository's conventions. The code is clean and readable with good error handling. I've identified a few areas for improvement: ✅ Strengths
🐛 Potential Bugs1. Date prefix detection issue (python/rename_invoices.py:66) if re.match(r"^\d{4}-\d{2}\s", stem):
stem = re.sub(r"^\d{4}-\d{2}\s+", "", stem)Issue: The regex patterns differ - Fix: Use consistent regex pattern: if re.match(r"^\d{4}-\d{2}\s+", stem):
stem = re.sub(r"^\d{4}-\d{2}\s+", "", stem)2. Already-correctly-named files counted incorrectly (python/rename_invoices.py:96) When a file is already correctly named ( Current behavior: Silent skip (no counter increment) Suggested improvement: Add explicit handling like in if new_path == pdf_path:
click.echo(f"Already named correctly: {pdf_path.name}")
continue
|
| Category | Status |
|---|---|
| Code Quality | ✅ Good |
| Consistency | ✅ Follows repo patterns |
| Bugs | |
| Edge Cases | |
| Security | ✅ No concerns |
| Performance | ✅ Appropriate |
Recommended Actions
Must fix:
- Fix the regex inconsistency in the date prefix detection (issue Create index.html with links to HTML files #1)
Nice to have:
- Add "already named correctly" message handling (issue Create GitHub Pages build workflow #2)
- Normalize whitespace in
sanitize_filename()for consistency (issue Add CREATE TABLE formatting support #5) - Document the single-word company name requirement or enhance parsing
Despite these issues, the code is production-ready with the regex fix applied. Great work maintaining consistency with the existing codebase! 🎉
Pull Request ReviewThanks for this contribution! I've reviewed the PR and have the following feedback: ✅ Strengths
🐛 Bug: Date Prefix Stripping LogicLocation: This logic strips existing date prefixes but then re-adds them based on current file creation time. This creates a problem where files already correctly named won't match Recommendation: Remove the date-stripping logic (lines 65-67) and add an early check like 🔧 Minor Issues
🎯 Code QualityType hints, function decomposition, and PEP 723 compliance all look good! 🔒 Security & PerformanceNo concerns identified. SummarySolid addition! Please address the date prefix stripping bug and consider the minor improvements. 🤖 Review by Claude Code |
- Skip files that already have yyyy-mm prefix instead of re-processing - Add early check for files already named correctly before duplicate check https://claude.ai/code/session_01AgLRG2yeADzgzH9VjTPUW2
PR Review: Add invoice PDF renaming utility scriptOverall AssessmentGood implementation that follows the established patterns in the repository. The code is clean, well-documented, and includes proper error handling. However, there are a few issues that should be addressed before merging. Issues Found1. Critical: Missing duplicate filename handling in rename_epubs.py pattern
|
Summary
Add a new utility script
rename_invoices.pythat automatically renames invoice PDFs to a standardized format with date prefixes based on file creation metadata.Changes
python/rename_invoices.py- Renames invoice PDFs fromcompany_name invoice_number.pdftoyyyy-mm company_name invoice_number.pdfformat/to---dry-runflag to preview changes before applying themImplementation Details
st_birthtime(creation time) when available, falls back tost_mtimefor cross-platform compatibilityrename_epubs.pyscript using Click frameworkhttps://claude.ai/code/session_01AgLRG2yeADzgzH9VjTPUW2