{"id":5781,"date":"2024-03-14T03:21:21","date_gmt":"2024-03-14T03:21:21","guid":{"rendered":"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/"},"modified":"2025-08-01T19:34:38","modified_gmt":"2025-08-01T19:34:38","slug":"how-can-python-read-a-pdf-and-write-it-to-an-excel-file","status":"publish","type":"post","link":"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/","title":{"rendered":"Python PDF to Excel Conversion Guide"},"content":{"rendered":"<p>To copy the text of a PDF file into an Excel file, you can use the PyPDF2 library to extract the text and the openpyxl library to create and write the workbook. The following example reads each page of the PDF and writes every line of text to a new row in column A:<\/p>\n<pre class=\"post-pre\"><code><span class=\"hljs-keyword\">import<\/span> PyPDF2\r\n<span class=\"hljs-keyword\">from<\/span> openpyxl <span class=\"hljs-keyword\">import<\/span> Workbook\r\n\r\n<span class=\"hljs-comment\"># Open the PDF file for reading<\/span>\r\npdf_file = <span class=\"hljs-built_in\">open<\/span>(<span class=\"hljs-string\">'example.pdf'<\/span>, <span class=\"hljs-string\">'rb'<\/span>)\r\npdf_reader = PyPDF2.PdfReader(pdf_file)\r\n\r\n<span class=\"hljs-comment\"># Create the Excel workbook<\/span>\r\nwb = Workbook()\r\nws = wb.active\r\n\r\n<span class=\"hljs-comment\"># Write each line of PDF text to its own row in column A<\/span>\r\n<span class=\"hljs-comment\"># (row_num keeps counting across pages so later pages do not overwrite earlier ones)<\/span>\r\nrow_num = <span class=\"hljs-number\">1<\/span>\r\n<span class=\"hljs-keyword\">for<\/span> page <span class=\"hljs-keyword\">in<\/span> pdf_reader.pages:\r\n    text = page.extract_text()\r\n    <span class=\"hljs-keyword\">for<\/span> line <span class=\"hljs-keyword\">in<\/span> text.split(<span class=\"hljs-string\">'\\n'<\/span>):\r\n        ws.cell(row=row_num, column=<span 
class=\"hljs-number\">1<\/span>, value=line)\r\n        row_num += <span class=\"hljs-number\">1<\/span>\r\n\r\n<span class=\"hljs-comment\"># Save the Excel file<\/span>\r\nwb.save(<span class=\"hljs-string\">'output.xlsx'<\/span>)\r\n\r\n<span class=\"hljs-comment\"># Close the PDF file<\/span>\r\npdf_file.close()\r\n<\/code><\/pre>\n<p>Note that this is only a simple example: it writes plain text, one line per row, and you may need to adjust it to the structure and content of your PDF file. PDFs that contain only scanned images have no text layer, so nothing will be extracted from them.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>To read a PDF file and write its content into an Excel file, you can use the PyPDF2 library to read the contents of the PDF file, and then use the openpyxl library to create and write the Excel file. Here is an example code to read a PDF file and write its content into [&hellip;]<\/p>\n","protected":false},"author":6,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_import_markdown_pro_load_document_selector":0,"_import_markdown_pro_submit_text_textarea":"","footnotes":""},"categories":[1],"tags":[6609,6610,6608,6611,6607],"class_list":["post-5781","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-openpyxl-guide","tag-pdf-data-extraction","tag-pypdf2-tutorial","tag-python-file-conversion","tag-python-pdf-to-excel"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v21.5 (Yoast SEO v21.5) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Python PDF to Excel Conversion Guide - Blog - Silicon Cloud<\/title>\n<meta name=\"description\" content=\"Learn how to use Python to read PDF files and write content to Excel using PyPDF2 and openpyxl libraries. 
Step-by-step guide with code example.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Python PDF to Excel Conversion Guide\" \/>\n<meta property=\"og:description\" content=\"Learn how to use Python to read PDF files and write content to Excel using PyPDF2 and openpyxl libraries. Step-by-step guide with code example.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/\" \/>\n<meta property=\"og:site_name\" content=\"Blog - Silicon Cloud\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/SiliCloudGlobal\/\" \/>\n<meta property=\"article:published_time\" content=\"2024-03-14T03:21:21+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-08-01T19:34:38+00:00\" \/>\n<meta name=\"author\" content=\"Benjamin Taylor\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@SiliCloudGlobal\" \/>\n<meta name=\"twitter:site\" content=\"@SiliCloudGlobal\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Benjamin Taylor\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/\"},\"author\":{\"name\":\"Benjamin Taylor\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/ac801fe9549a25960ce48aa2e0a691c9\"},\"headline\":\"Python PDF to Excel Conversion Guide\",\"datePublished\":\"2024-03-14T03:21:21+00:00\",\"dateModified\":\"2025-08-01T19:34:38+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/\"},\"wordCount\":92,\"publisher\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#organization\"},\"keywords\":[\"openpyxl guide\",\"PDF data extraction\",\"PyPDF2 tutorial\",\"Python file conversion\",\"Python PDF to Excel\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/\",\"url\":\"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/\",\"name\":\"Python PDF to Excel Conversion Guide - Blog - Silicon Cloud\",\"isPartOf\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#website\"},\"datePublished\":\"2024-03-14T03:21:21+00:00\",\"dateModified\":\"2025-08-01T19:34:38+00:00\",\"description\":\"Learn how to use Python to read PDF files and write content to Excel using PyPDF2 and openpyxl libraries. 
Step-by-step guide with code example.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.silicloud.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Python PDF to Excel Conversion Guide\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#website\",\"url\":\"https:\/\/www.silicloud.com\/blog\/\",\"name\":\"Silicon Cloud Blog\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#organization\"},\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#organization\",\"name\":\"Silicon Cloud Blog\",\"url\":\"https:\/\/www.silicloud.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png\",\"contentUrl\":\"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png\",\"width\":1024,\"height\":1024,\"caption\":\"Silicon Cloud Blog\"},\"image\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/SiliCloudGlobal\/\",\"https:\/\/twitter.com\/SiliCloudGlobal\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/ac801fe9549a25960ce48aa2e0a691c9\",\"name\":\"Benjamin 
Taylor\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/ec2e3d3e2d525fd148047c4520ae7c1cdccd1f4b48a1a488422b31f04f345c14?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/ec2e3d3e2d525fd148047c4520ae7c1cdccd1f4b48a1a488422b31f04f345c14?s=96&d=mm&r=g\",\"caption\":\"Benjamin Taylor\"},\"url\":\"https:\/\/www.silicloud.com\/blog\/author\/benjamintaylor\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Python PDF to Excel Conversion Guide - Blog - Silicon Cloud","description":"Learn how to use Python to read PDF files and write content to Excel using PyPDF2 and openpyxl libraries. Step-by-step guide with code example.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/","og_locale":"en_US","og_type":"article","og_title":"Python PDF to Excel Conversion Guide","og_description":"Learn how to use Python to read PDF files and write content to Excel using PyPDF2 and openpyxl libraries. Step-by-step guide with code example.","og_url":"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/","og_site_name":"Blog - Silicon Cloud","article_publisher":"https:\/\/www.facebook.com\/SiliCloudGlobal\/","article_published_time":"2024-03-14T03:21:21+00:00","article_modified_time":"2025-08-01T19:34:38+00:00","author":"Benjamin Taylor","twitter_card":"summary_large_image","twitter_creator":"@SiliCloudGlobal","twitter_site":"@SiliCloudGlobal","twitter_misc":{"Written by":"Benjamin Taylor","Est. 
reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/#article","isPartOf":{"@id":"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/"},"author":{"name":"Benjamin Taylor","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/ac801fe9549a25960ce48aa2e0a691c9"},"headline":"Python PDF to Excel Conversion Guide","datePublished":"2024-03-14T03:21:21+00:00","dateModified":"2025-08-01T19:34:38+00:00","mainEntityOfPage":{"@id":"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/"},"wordCount":92,"publisher":{"@id":"https:\/\/www.silicloud.com\/blog\/#organization"},"keywords":["openpyxl guide","PDF data extraction","PyPDF2 tutorial","Python file conversion","Python PDF to Excel"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/","url":"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/","name":"Python PDF to Excel Conversion Guide - Blog - Silicon Cloud","isPartOf":{"@id":"https:\/\/www.silicloud.com\/blog\/#website"},"datePublished":"2024-03-14T03:21:21+00:00","dateModified":"2025-08-01T19:34:38+00:00","description":"Learn how to use Python to read PDF files and write content to Excel using PyPDF2 and openpyxl libraries. 
Step-by-step guide with code example.","breadcrumb":{"@id":"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.silicloud.com\/blog\/how-can-python-read-a-pdf-and-write-it-to-an-excel-file\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.silicloud.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Python PDF to Excel Conversion Guide"}]},{"@type":"WebSite","@id":"https:\/\/www.silicloud.com\/blog\/#website","url":"https:\/\/www.silicloud.com\/blog\/","name":"Silicon Cloud Blog","description":"","publisher":{"@id":"https:\/\/www.silicloud.com\/blog\/#organization"},"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.silicloud.com\/blog\/#organization","name":"Silicon Cloud Blog","url":"https:\/\/www.silicloud.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png","contentUrl":"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png","width":1024,"height":1024,"caption":"Silicon Cloud Blog"},"image":{"@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/SiliCloudGlobal\/","https:\/\/twitter.com\/SiliCloudGlobal"]},{"@type":"Person","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/ac801fe9549a25960ce48aa2e0a691c9","name":"Benjamin 
Taylor","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/ec2e3d3e2d525fd148047c4520ae7c1cdccd1f4b48a1a488422b31f04f345c14?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/ec2e3d3e2d525fd148047c4520ae7c1cdccd1f4b48a1a488422b31f04f345c14?s=96&d=mm&r=g","caption":"Benjamin Taylor"},"url":"https:\/\/www.silicloud.com\/blog\/author\/benjamintaylor\/"}]}},"_links":{"self":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts\/5781","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/comments?post=5781"}],"version-history":[{"count":2,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts\/5781\/revisions"}],"predecessor-version":[{"id":150539,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts\/5781\/revisions\/150539"}],"wp:attachment":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/media?parent=5781"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/categories?post=5781"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/tags?post=5781"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}