Use PyPDF2 - which should be used, PyPDF2 or PyPDF3?

December 03, 2018 python


Introduction

In the previous article, we extracted text from a PDF file using PyPDF2.

In this article, I will introduce PyPDF3.

While looking into various uses of PyPDF2, I found the following comment on StackOverflow.

stack_overflow

PyPDF2 development stopped three years ago?! And a new version, PyPDF3, exists?! Really?

Which should I use, PyPDF2 or PyPDF3?

Does PyPDF3 exist on PyPI? Let's check with the pip command.

This is PyPDF2.

pip search PyPDF2
> PyPDF2 (1.26.0)   - PDF toolkit

This is PyPDF3.

pip search PyPDF3
> PyPDF3 (1.0.1)  - Pure Python PDF toolkit

Both really exist!
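
On recent setups, `pip search` may fail because PyPI has disabled the search endpoint it relies on. As an alternative, the same existence check can be sketched in Python against the PyPI JSON API (`https://pypi.org/pypi/<name>/json`). The helper names `pypi_json_url`, `fetch_bytes`, and `latest_version` are my own for illustration, not part of pip or PyPDF2, and the version returned today will differ from the 2018 output above.

```python
import json
import urllib.error
import urllib.request


def pypi_json_url(package):
    """Build the PyPI JSON API URL for a package name."""
    return f"https://pypi.org/pypi/{package}/json"


def fetch_bytes(url):
    """Download a URL and return the raw response body (needs network access)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def latest_version(package, fetch=fetch_bytes):
    """Return the latest version string on PyPI, or None if the package is not found.

    `fetch` is injectable (a hypothetical convenience) so the function can be
    tried without network access.
    """
    try:
        body = fetch(pypi_json_url(package))
    except urllib.error.HTTPError as error:
        if error.code == 404:  # PyPI answers 404 for unknown packages
            return None
        raise
    return json.loads(body)["info"]["version"]


# Example (requires network access):
#   for name in ("PyPDF2", "PyPDF3"):
#       print(name, latest_version(name))
```

A return value of `None` means the package name is not registered on PyPI; any other value is the latest published version.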

In this section, I describe my understanding of PyPDF3, based on the roadmap on GitHub and other resources.

Volunteers started the PyPDF3 project, based on PyPDF2, because PyPDF2 had not been updated for three years.
The initial goals are to fully implement the existing features and fix some of the most critical bugs and performance issues in PyPDF2 before moving on to new functionality.
However, judging from the commit log, development is not active.

Investigating further, I came across a GitHub issue.

reboot_pypdf2

In summary:

We can use PyPDF2 without problems.

I checked the issues and pull requests in the PyPDF2 repository, and as far as I can tell PyPDF2 is still alive.