Batch Convert PDF to Images in Python

2 years ago

Noah Thompson

PyPDF2 can read a PDF's structure and text, but it cannot render pages to images: its page objects have no to_image() method. Rasterizing pages takes a rendering library. The example below uses pdf2image, which calls Poppler to render each page and returns Pillow Image objects:

import os
from pdf2image import convert_from_path  # renders via Poppler, returns PIL images


def pdf_to_images(pdf_path, output_dir, dpi=200):
    """Save every page of pdf_path as a PNG file in output_dir."""
    # Create the output directory if it does not exist yet
    os.makedirs(output_dir, exist_ok=True)

    # Render all pages; each item is a PIL.Image.Image
    pages = convert_from_path(pdf_path, dpi=dpi)

    for page_num, image in enumerate(pages, start=1):
        image_path = os.path.join(output_dir, f'page_{page_num}.png')
        image.save(image_path, 'PNG')

    print(f'PDF converted to images, saved in: {output_dir}')


# Example usage
pdf_to_images('input.pdf', 'output_images/')

This code converts each page of a PDF into a separate PNG image and saves the images in the specified output directory. Install pdf2image and Pillow with:

pip install pdf2image
pip install Pillow

pdf2image also needs the Poppler command-line tools installed on the system (for example, the poppler-utils package on Debian/Ubuntu, or "brew install poppler" on macOS).

To use this code, call pdf_to_images with the path of the PDF you want to convert and the output directory.
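
The function above handles one PDF at a time, so converting a whole folder needs a small loop around it. The following is a minimal sketch of that loop, not a definitive implementation: batch_convert and its parameters are hypothetical names that do not appear in the original code. It accepts any single-file converter with the signature convert(pdf_path, output_dir), such as pdf_to_images, and gives each PDF its own subfolder so that the page_1.png of one file does not overwrite the page_1.png of another.

```python
from pathlib import Path


def batch_convert(input_dir, output_root, convert):
    """Run convert(pdf_path, output_dir) for every PDF directly inside input_dir.

    Each PDF gets its own subdirectory under output_root, named after the
    file's stem, so pages from differently named PDFs do not overwrite
    each other. Returns the list of output directories in processing order.
    """
    input_dir = Path(input_dir)
    output_root = Path(output_root)
    output_dirs = []

    # Sort for a predictable order; match the extension case-insensitively
    pdf_paths = sorted(
        p for p in input_dir.iterdir()
        if p.is_file() and p.suffix.lower() == '.pdf'
    )

    for pdf_path in pdf_paths:
        output_dir = output_root / pdf_path.stem
        # The converter is expected to create output_dir itself if needed
        convert(str(pdf_path), str(output_dir))
        output_dirs.append(output_dir)

    return output_dirs
```

For example, batch_convert('pdfs/', 'output_images/', pdf_to_images) would pass each PDF found in pdfs/ to pdf_to_images, writing the pages of a file named report.pdf to output_images/report/. Subdirectories of pdfs/ are not searched.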

#batch conversion #convert pdf to image #Pillow library #PyPDF2 #python pdf processing