Isaccseven/pdf2text

Pdf2text

Description

  • Extracts the text from a PDF using OCR (pytesseract)
  • Converts the extracted text to an MP3 file (gTTS)

Requirements

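pip install Pillow pdf2image pytesseract typer rich click_spinner gtts

pytesseract and pdf2image are wrappers around external tools, so the Tesseract OCR engine and Poppler must also be installed on the system.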

Extract

python pdf2text.py extract "input_path" "output_path"
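
The source of pdf2text.py is not shown here, so the following is only a minimal sketch of how an extract step could be built from the listed dependencies: each PDF page is rendered to an image with pdf2image, then passed to pytesseract. The function names `extract` and `join_pages` and the `lang` parameter are assumptions made for illustration, not the script's actual code.

```python
from pathlib import Path


def join_pages(page_texts):
    """Join per-page OCR output with a blank line between pages (hypothetical helper)."""
    return "\n\n".join(text.strip() for text in page_texts)


def extract(input_path, output_path, lang="eng"):
    """Sketch of an OCR pipeline: PDF -> page images -> text file."""
    # Imported lazily so join_pages can be used without the OCR stack installed.
    from pdf2image import convert_from_path
    import pytesseract

    # convert_from_path returns one PIL image per PDF page (requires Poppler).
    pages = convert_from_path(input_path)
    # image_to_string runs Tesseract on each page image.
    texts = [pytesseract.image_to_string(page, lang=lang) for page in pages]
    Path(output_path).write_text(join_pages(texts), encoding="utf-8")
```

Under these assumptions, `extract("test.pdf", "text.txt")` would correspond to the extract command above.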

Generate

python pdf2text.py generate "input_path" "output_path" language
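
As a rough illustration of the generate step, the sketch below reads a text file and hands it to gTTS, which is in the requirements list. It assumes that the `language` argument is a gTTS language code such as `en`. The names `generate` and `prepare_text` are hypothetical, and the whitespace cleanup is an added assumption (OCR output often breaks lines mid-sentence), not something the script is known to do.

```python
from pathlib import Path


def prepare_text(raw):
    """Collapse line breaks and repeated spaces into single spaces (hypothetical helper)."""
    return " ".join(raw.split())


def generate(input_path, output_path, language="en"):
    """Sketch of a text-to-speech step: text file -> MP3 file."""
    # Imported lazily so prepare_text can be used without gTTS installed.
    from gtts import gTTS

    text = prepare_text(Path(input_path).read_text(encoding="utf-8"))
    if not text:
        raise ValueError("input file contains no text to speak")
    # gTTS sends the text to Google's text-to-speech service and saves an MP3.
    gTTS(text=text, lang=language).save(output_path)
```

Because gTTS calls an online service, this step needs network access.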

Example

python pdf2text.py extract "test.pdf" "text.txt"

python pdf2text.py generate "test.txt" "test.mp3" en
