Quantcast
Channel: Active questions tagged python - Stack Overflow
Viewing all articles
Browse latest Browse all 23131

pdf2image conversion of multi page PDFs to images returns the last page on all images

$
0
0

So when I use the pdf2image python import, and pass a multi page PDF into the convert_from_bytes()- or convert_from_path() method, the output array does contain multiple images - but all images are of the last PDF page (whereas I would've expected that each image represented one of the PDF pages).

The output looks something like this:

pdf2image conversion bug

Any idea on why this would occur? I can't find any solution to this online. I've found some vague suggestion that the use_cropbox argument might be used, but modifying it has no effect.

def convert(opened_file)    # Read PDF and convert pages to PPM image objects    try:        _ppm_pages = self.pdf2image.convert_from_bytes(            opened_file.read(),            grayscale = True        )    except Exception as e:        print(f"[CreateJPEG] Could not convert PDF pages to JPEG image due to error: \n    '{e}'")        return    # Do stuff with _ppm_pages    for img in _ppm_pages:        img.show() # ...all images in that list are of the last page

Sometimes the output is an empty 1x1 image, instead, which I also haven't found a reason for. So if you have any idea what that is about, please do let me know!

Thanks in advance,Simon

EDIT: Added code.

EDIT: So, when I try this in a random notebook, it actually works fine.

I've removed a few detours I used in my original code, and now it works. Still not sure what the underlying reason was though...

All the same, thanks for your help, everyone!


Viewing all articles
Browse latest Browse all 23131

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>