When I use the pdf2image Python package and pass a multi-page PDF to convert_from_bytes() or convert_from_path(), the returned list does contain multiple images, but every image shows the last page of the PDF (I would have expected each image to represent one of the PDF pages).
The output looks something like this:
Any idea why this would occur? I can't find a solution to this online. I found a vague suggestion that the use_cropbox argument might help, but changing it has no effect.

    def convert(self, opened_file):
        # Read PDF and convert pages to PPM image objects
        try:
            _ppm_pages = self.pdf2image.convert_from_bytes(
                opened_file.read(),
                grayscale=True
            )
        except Exception as e:
            print(f"[CreateJPEG] Could not convert PDF pages to JPEG image due to error: \n '{e}'")
            return

        # Do stuff with _ppm_pages
        for img in _ppm_pages:
            img.show()  # ...all images in that list are of the last page

Sometimes the output is an empty 1x1 image instead, which I also haven't found a reason for. If you have any idea what that is about, please do let me know!
Thanks in advance,
Simon
EDIT: Added code.
EDIT: So, when I try this in a random notebook, it actually works fine.
I've removed a few detours I used in my original code, and now it works. Still not sure what the underlying reason was though...
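For anyone hitting the same symptom, one way to narrow it down is to check whether the returned images really are pixel-identical, or whether something later in the pipeline is reusing the last one. The sketch below is only a diagnostic, not an explanation of the cause; the helper names summarize_pages and find_duplicate_pages are made up for illustration, and it assumes each page object exposes .size and .tobytes(), as PIL images do.

```python
import hashlib


def summarize_pages(pages):
    """Return a (size, short digest) tuple for each page image.

    Assumes every item has `.size` and `.tobytes()`, which holds for the
    PIL images that pdf2image returns.
    """
    return [
        (page.size, hashlib.sha256(page.tobytes()).hexdigest()[:12])
        for page in pages
    ]


def find_duplicate_pages(pages):
    """Return the indices of pages whose size and pixels match an earlier page."""
    seen = set()
    duplicates = []
    for index, key in enumerate(summarize_pages(pages)):
        if key in seen:
            duplicates.append(index)
        else:
            seen.add(key)
    return duplicates


# Hypothetical usage, right after the conversion call:
#   pages = pdf2image.convert_from_bytes(pdf_bytes, grayscale=True)
#   print(summarize_pages(pages))       # a (1, 1) size would reveal the empty image
#   print(find_duplicate_pages(pages))  # non-empty means pages are truly identical
```

If the digests already match straight after the conversion call, the problem is in the conversion step; if they differ there but the images look the same later, the detours after the conversion are the more likely culprit.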
All the same, thanks for your help, everyone!
