Shah06/ocr_pdf

ocr_pdf

A simple Python 3 script that makes PDFs searchable. I wrote it to make my life easier when scanning schoolwork.

Requirements:

  1. Tesseract (tested with 4.1.1).
  2. PIL (Pillow) and pdf2image.
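
To show how these requirements might fit together, here is a minimal sketch. It is not the actual contents of makepdf.py. It assumes pdf2image renders each page to a PIL image, the pages are saved as PNG files, and the `tesseract` command-line program is given a text file listing those images along with its `pdf` config. The names `tesseract_command`, `make_searchable`, `workdir` and `dpi` are hypothetical.

```python
import subprocess
from pathlib import Path


def tesseract_command(list_file, output_base):
    """Build a Tesseract call that turns a list of page images into one PDF."""
    # With the "pdf" config, Tesseract writes <output_base>.pdf with a text layer.
    return ["tesseract", str(list_file), str(output_base), "pdf"]


def make_searchable(input_pdf, output_pdf, workdir=".", dpi=300):
    """Hypothetical pipeline: render pages, save them, then OCR them into a PDF."""
    # Imported here so the helper above works even without pdf2image installed.
    from pdf2image import convert_from_path

    work = Path(workdir)
    page_paths = []
    # convert_from_path returns one PIL image per page of the input PDF.
    for number, page in enumerate(convert_from_path(str(input_pdf), dpi=dpi), start=1):
        page_path = work / f"page_{number:04d}.png"
        page.save(page_path)
        page_paths.append(str(page_path))

    # Tesseract accepts a text file with one image path per line as its input.
    list_file = work / "pages.txt"
    list_file.write_text("\n".join(page_paths) + "\n")

    # Assumes output_pdf ends in ".pdf", since Tesseract adds that suffix itself.
    output_base = Path(output_pdf).with_suffix("")
    subprocess.run(tesseract_command(list_file, output_base), check=True)
```

Calling `make_searchable("scan.pdf", "scan_ocr.pdf")` would leave the page images and `pages.txt` in the current directory, so this sketch does no cleanup.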

Usage:

usage: makepdf.py [-h] [-o OUTPUT] input_file

Simple script to make PDFs searchable

positional arguments:
  input_file            filepath for PDF to be targeted

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        output filepath
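
For reference, a help message like the one above can be produced with an `argparse` parser along these lines. This is a sketch reconstructed from the help text, not the script's actual source. The function name `build_parser` is hypothetical.

```python
import argparse


def build_parser():
    """Build a parser whose help output mirrors the usage text above."""
    parser = argparse.ArgumentParser(
        prog="makepdf.py",
        description="Simple script to make PDFs searchable",
    )
    # The one required positional argument: the PDF to process.
    parser.add_argument("input_file", help="filepath for PDF to be targeted")
    # Optional output path; args.output is None when the flag is omitted.
    parser.add_argument("-o", "--output", help="output filepath")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.input_file, args.output)
```

A typical call would be `python3 makepdf.py scan.pdf -o scan_ocr.pdf`. On Python 3.10 and later, argparse labels the second group `options:` rather than `optional arguments:`.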

Upcoming (?) Features:

  1. Use /tmp/ to store files
  2. Ability to output to plaintext
  3. Take in other image formats as input, and output a PDF
  4. Ability to apply an adaptive contrast/sharpening filter to the PDF
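
As a rough illustration of the first item, Python's `tempfile` module can create a scratch directory under the system temp location (usually /tmp/ on Linux) and delete it automatically. This is only a sketch of one possible approach. The names `run_in_scratch_dir` and `demo`, and the `ocr_pdf_` prefix, are hypothetical.

```python
import tempfile
from pathlib import Path


def run_in_scratch_dir(work):
    """Call work(scratch_path) inside a temporary directory that is deleted afterwards."""
    with tempfile.TemporaryDirectory(prefix="ocr_pdf_") as scratch:
        return work(Path(scratch))


def demo(scratch):
    # Stand-in for saving a rendered page image into the scratch directory.
    page = scratch / "page_0001.png"
    page.write_bytes(b"placeholder")
    return page


leftover = run_in_scratch_dir(demo)
# The directory and its contents are removed once the with-block exits.
print(leftover.exists())  # False
```

With this pattern, intermediate page images never end up next to the input or output PDF.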
