Slow image extraction

The current process for extracting images goes as follows:
1. Find images within the document
2. Render each page of the document as an image
3. Use [sharp](https://github.com/lovell/sharp) to extract the images from the rendered pages
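
As a rough illustration of step 3, here is a minimal sketch of how an image's position on a page could be mapped to a crop region on the rendered page. It assumes that step 1 yields bounding boxes in PDF points with the origin at the bottom-left, and that the page is rendered at a known scale (pixels per point). The names `PdfBox`, `PixelRegion` and `toPixelRegion` are hypothetical and not taken from the actual code.

```typescript
// Hypothetical shape: an image's bounding box in PDF user space
// (points, origin at the bottom-left corner of the page).
interface PdfBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Same field names as the region object accepted by sharp's extract().
interface PixelRegion {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Convert a PDF-space box to an integer pixel region on the rendered page.
// `scale` is pixels per point, e.g. 2 when rendering at 144 DPI.
function toPixelRegion(box: PdfBox, pageHeightPt: number, scale: number): PixelRegion {
  // Raster images have their origin at the top-left, so the y axis is flipped.
  const topPt = pageHeightPt - box.y - box.height;
  return {
    left: Math.max(0, Math.floor(box.x * scale)),
    top: Math.max(0, Math.floor(topPt * scale)),
    width: Math.max(1, Math.round(box.width * scale)),
    height: Math.max(1, Math.round(box.height * scale)),
  };
}
```

The resulting region could then be passed to sharp, along the lines of `sharp(renderedPage).extract(region).toBuffer()`, where `renderedPage` stands in for the page image produced in step 2.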

This process is slow, and it requires an external library (sharp), which in turn has a native dependency ([libvips](https://github.com/libvips/libvips)).

In an ideal world we would extract the images directly from the PDF. I have not found a way to do this yet, but we could perhaps get some clues from [pdfcpu](https://pdfcpu.io/extract/extract_images.html), in particular its [extract.go](https://github.com/pdfcpu/pdfcpu/blob/6b2e3b4ba26ed6839410ca2fd00f21cb4649efbe/pkg/pdfcpu/extract.go#L51).
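
One possible starting point, sketched below under heavy assumptions: in a PDF, an image stored with only the `/DCTDecode` filter holds raw JPEG data in its stream, so those bytes can be copied out without rendering anything. The function `extractJpegStreams` and its helpers are hypothetical. The sketch scans the raw bytes rather than parsing the file properly, so it ignores object streams, encryption, indirect `/Length` values and every filter other than a lone `/DCTDecode`. It is not how pdfcpu does it.

```typescript
// Encode an ASCII string as bytes (avoids depending on Node or DOM types).
function ascii(s: string): Uint8Array {
  const out = new Uint8Array(s.length);
  for (let i = 0; i < s.length; i++) out[i] = s.charCodeAt(i);
  return out;
}

// Decode bytes as Latin-1 so dictionary text can be inspected with regexes.
function latin1(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i++) out += String.fromCharCode(bytes[i]);
  return out;
}

// Naive byte-sequence search; returns -1 when the needle is not found.
function indexOfBytes(hay: Uint8Array, needle: Uint8Array, from: number): number {
  outer: for (let i = from; i <= hay.length - needle.length; i++) {
    for (let j = 0; j < needle.length; j++) {
      if (hay[i + j] !== needle[j]) continue outer;
    }
    return i;
  }
  return -1;
}

// Return the raw stream bytes of every image XObject whose only filter
// is /DCTDecode. Each returned array should be a complete JPEG file.
function extractJpegStreams(pdf: Uint8Array): Uint8Array[] {
  const STREAM = ascii("stream");
  const ENDSTREAM = ascii("endstream");
  const images: Uint8Array[] = [];
  let pos = 0;
  for (;;) {
    const s = indexOfBytes(pdf, STREAM, pos);
    if (s === -1) break;
    const e = indexOfBytes(pdf, ENDSTREAM, s + STREAM.length);
    if (e === -1) break;

    // Assumption: the stream dictionary is the text after the last "obj"
    // keyword that precedes the "stream" keyword.
    const header = latin1(pdf.subarray(pos, s));
    const k = header.lastIndexOf("obj");
    const dict = k >= 0 ? header.slice(k + 3) : header;
    const isImage = /\/Subtype\s*\/Image\b/.test(dict);
    const onlyDct = /\/Filter\s*(\/DCTDecode\b|\[\s*\/DCTDecode\s*\])/.test(dict);

    if (isImage && onlyDct) {
      // Skip the end-of-line marker that follows the "stream" keyword.
      let start = s + STREAM.length;
      if (pdf[start] === 0x0d) start++;
      if (pdf[start] === 0x0a) start++;
      // Drop at most one end-of-line marker before "endstream".
      let end = e;
      if (end > start && pdf[end - 1] === 0x0a) end--;
      if (end > start && pdf[end - 1] === 0x0d) end--;
      images.push(pdf.slice(start, end));
    }
    pos = e + ENDSTREAM.length;
  }
  return images;
}
```

Images stored with other filters (for example `/FlateDecode`) hold raw pixel data instead, and would need to be decoded and re-encoded into an image format, so this would cover only part of the problem.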

Slow image extraction #4
