Old PyPDF2 dependency: cannot read PDF with modern encryption
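Mayan uses an old version of PyPDF2 (1.28.4), which only supports the older RC4-based PDF encryption (algorithm codes 1 and 2 in the error below). PDFs protected with a newer scheme cannot be read.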
# supervisorctl tail -f mayan-edms-worker_c stderr
==> Press Ctrl-C to exit <==
[...], line 446, in _get_num_pages
    raise PdfReadError("File has not been decrypted")
PyPDF2.errors.PdfReadError: File has not been decrypted

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/mayan-edms/lib/python3.10/site-packages/mayan/apps/converter/backends/python.py", line 110, in get_page_count
    pdf_reader.decrypt(password=b'')
  File "/opt/mayan-edms/lib/python3.10/site-packages/PyPDF2/_reader.py", line 1723, in decrypt
    return self._decrypt(password)
  File "/opt/mayan-edms/lib/python3.10/site-packages/PyPDF2/_reader.py", line 1762, in _decrypt
    raise NotImplementedError(
NotImplementedError: only algorithm code 1 and 2 are supported. This PDF uses code 4
mayan.apps.documents.models.document_model_mixins <4879> [INFO] "file_new() line 161 New document file queued for document: document.pdf"
The error is not visible to the end users. The document thumbnail is simply not generated.
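To illustrate the failure mode in the traceback above, here is a minimal sketch of a page-count helper that catches the unsupported-encryption error instead of letting it pass silently. This is not Mayan's actual code: `safe_page_count` and `ALGORITHM_CODES` are hypothetical names, and `reader` is only assumed to behave like a PyPDF2 `PdfReader` (exposing `is_encrypted`, `decrypt()` and `pages`). The code descriptions are my reading of the `/V` entry in the PDF encryption dictionary.

```python
from typing import Any, Optional

# Assumed meaning of the "algorithm code" (/V entry of the PDF
# encryption dictionary); only the common values are listed.
ALGORITHM_CODES = {
    1: "RC4, 40-bit key",
    2: "RC4, key longer than 40 bits",
    4: "crypt filters (RC4 or AES-128)",
    5: "crypt filters (AES-256)",
}


def safe_page_count(reader: Any, password: str = "") -> Optional[int]:
    """Return the page count, or None if the PDF cannot be decrypted.

    `reader` is assumed to behave like PyPDF2.PdfReader (hypothetical
    helper, not part of Mayan or PyPDF2).
    """
    if reader.is_encrypted:
        try:
            # Assumption: decrypt() returns a falsy value (0) when the
            # password does not match.
            if not reader.decrypt(password):
                print("PDF is password protected; wrong or missing password")
                return None
        except NotImplementedError as exc:
            # Old PyPDF2 releases raise this for algorithm codes they do
            # not implement (code 4 in the traceback above).
            print(f"Unsupported PDF encryption: {exc}")
            return None
    return len(reader.pages)
```

With PyPDF2 installed, this could be called as `safe_page_count(PyPDF2.PdfReader("document.pdf"))`. A `None` result would let the caller show a message to the user rather than only skipping the thumbnail.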
I manually upgraded to PyPDF2==2.12.1, and the parsing works:
mayan.apps.documents.models.document_model_mixins <13068> [INFO] "file_new() line 108 Creating new document file for document: Document.pdf"
[2023-09-04 00:03:14,909: INFO/ForkPoolWorker-1] Creating new document file for document: Document.pdf
mayan.apps.documents.models.document_file_model_mixins <13068> [INFO] "_create() line 82 Creating new file for document: Document.pdf"
[2023-09-04 00:03:14,910: INFO/ForkPoolWorker-1] Creating new file for document: Document.pdf
mayan.apps.documents.models.document_file_model_mixins <13068> [INFO] "_create() line 95 New document file "Document.pdf" created for document: Document.pdf"
[2023-09-04 00:03:14,960: INFO/ForkPoolWorker-1] New document file "Document.pdf" created for document: Document.pdf
mayan.apps.documents.models.document_model_mixins <13068> [INFO] "file_new() line 161 New document file queued for document: Document.pdf"
[2023-09-04 00:03:15,333: INFO/ForkPoolWorker-1] New document file queued for document: Document.pdf
Related issue: https://stackoverflow.com/questions/50751267/only-algorithm-code-1-and-2-are-supported
Edited by Arya Senna