Open
Labels
enhancement: New feature or request
Description
The current process for extracting images goes as follows:
- Find images within the document
- Render each page of the document as an image
- Use sharp to extract the images from the rendered pages
This process is slow, and it requires an external library (sharp), which in turn has a native dependency (libvips).
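For reference, the cropping step can be sketched as below. This is a minimal illustration, not the project's actual code: the `PdfBox` type, the `toPixelRegion` helper, and the assumption that image positions are known in PDF points with a bottom-left origin are all hypothetical. The returned object has the `left`/`top`/`width`/`height` shape that sharp's `extract()` accepts.

```typescript
// Hypothetical: bounding box of an image on a page, in PDF points,
// measured from the bottom-left corner of the page.
interface PdfBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Region in pixels, measured from the top-left corner of the rendered page.
interface PixelRegion {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Convert a PDF-space box to a pixel region on a page rendered at `scale`
// pixels per point. The y axis is flipped because raster images have their
// origin at the top-left.
function toPixelRegion(box: PdfBox, pageHeightPt: number, scale: number): PixelRegion {
  const left = Math.max(0, Math.floor(box.x * scale));
  const top = Math.max(0, Math.floor((pageHeightPt - box.y - box.height) * scale));
  return {
    left,
    top,
    width: Math.ceil(box.width * scale),
    height: Math.ceil(box.height * scale),
  };
}
```

The result could then be passed to something like `sharp(renderedPage).extract(region).toBuffer()`, where `renderedPage` is the rendered page image. Every page has to be rasterized before this step can run, which is where most of the cost comes from.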
Ideally, we would extract the images directly from the PDF. I haven't found a way to do this yet, but perhaps we could get some clues from pdfcpu (https://github.com/pdfcpu/pdfcpu/blob/6b2e3b4ba26ed6839410ca2fd00f21cb4649efbe/pkg/pdfcpu/extract.go#L51).
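One possible starting point, as a rough sketch and not taken from pdfcpu: image streams that use the `/DCTDecode` filter contain JPEG data as-is, so their bytes can be copied out without rendering anything. The `extractJpegStreams` and `indexOfBytes` helpers below are hypothetical, and the scan is deliberately naive. It assumes unencrypted files, streams that are not stored inside object streams, `/DCTDecode` as the only filter, and no literal `endstream` inside the image data. Images stored with other filters (for example `/FlateDecode`) would need real decoding.

```typescript
// Find the first occurrence of an ASCII `needle` in `haystack` at or after `from`.
function indexOfBytes(haystack: Uint8Array, needle: string, from: number): number {
  outer: for (let i = from; i <= haystack.length - needle.length; i++) {
    for (let j = 0; j < needle.length; j++) {
      if (haystack[i + j] !== needle.charCodeAt(j)) continue outer;
    }
    return i;
  }
  return -1;
}

// Naive scan: for every "/DCTDecode" entry, copy the bytes between the
// following "stream" keyword and the next "endstream" keyword.
function extractJpegStreams(pdf: Uint8Array): Uint8Array[] {
  const images: Uint8Array[] = [];
  let pos = 0;
  while (true) {
    const filter = indexOfBytes(pdf, "/DCTDecode", pos);
    if (filter === -1) break;
    const keyword = indexOfBytes(pdf, "stream", filter);
    if (keyword === -1) break;

    // The "stream" keyword is followed by CRLF or LF before the data begins.
    let start = keyword + "stream".length;
    if (pdf[start] === 0x0d) start++;
    if (pdf[start] === 0x0a) start++;

    const end = indexOfBytes(pdf, "endstream", start);
    if (end === -1) break;

    // Drop the end-of-line marker that usually precedes "endstream".
    let dataEnd = end;
    if (dataEnd > start && pdf[dataEnd - 1] === 0x0a) dataEnd--;
    if (dataEnd > start && pdf[dataEnd - 1] === 0x0d) dataEnd--;

    images.push(pdf.subarray(start, dataEnd));
    pos = end;
  }
  return images;
}
```

Each returned array could be written to disk as a `.jpg` file. A more robust version would parse the cross-reference table and honour each stream's `/Length` entry instead of searching for keywords.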