Simple python3 script to make PDF searchable. Made to make my life easier when scanning schoolwork.
- Tesseract. Tested with 4.1.1
PILandpdf2image.
usage: makepdf.py [-h] [-o OUTPUT] input_file
Simple script to make PDFs searchable
positional arguments:
input_file filepath for PDF to be targeted
optional arguments:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
output filepath
- Use
/tmp/to store files - Ability to output to plaintext
- Take in other image formats as input, and output a PDF
- Ability to apply an adaptive contrast/sharpening filter to PDF