Since v0.12.0 I seem to get this sort of backtrace when loading certain .pdf files:
  File "C:\Users\User\AppData\Local\Programs\Python\Python310\lib\site-packages\py_pdf_parser\loaders.py", line 41, in load_file
    return load(in_file, pdf_file_path=path_to_file, la_params=la_params, **kwargs)
  File "C:\Users\User\AppData\Local\Programs\Python\Python310\lib\site-packages\py_pdf_parser\loaders.py", line 75, in load
    for page in extract_pages(
  File "C:\Users\User\AppData\Local\Programs\Python\Python310\lib\site-packages\pdfminer\high_level.py", line 197, in extract_pages
    for page in PDFPage.get_pages(
  File "C:\Users\User\AppData\Local\Programs\Python\Python310\lib\site-packages\pdfminer\pdfpage.py", line 151, in get_pages
    doc = PDFDocument(parser, password=password, caching=caching)
  File "C:\Users\User\AppData\Local\Programs\Python\Python310\lib\site-packages\pdfminer\pdfdocument.py", line 744, in __init__
    self._initialize_password(password)
  File "C:\Users\User\AppData\Local\Programs\Python\Python310\lib\site-packages\pdfminer\pdfdocument.py", line 771, in _initialize_password
    handler = factory(docid, param, password)
  File "C:\Users\User\AppData\Local\Programs\Python\Python310\lib\site-packages\pdfminer\pdfdocument.py", line 358, in __init__
    self.init()
  File "C:\Users\User\AppData\Local\Programs\Python\Python310\lib\site-packages\pdfminer\pdfdocument.py", line 366, in init
    self.init_key()
  File "C:\Users\User\AppData\Local\Programs\Python\Python310\lib\site-packages\pdfminer\pdfdocument.py", line 379, in init_key
    self.key = self.authenticate(self.password)
  File "C:\Users\User\AppData\Local\Programs\Python\Python310\lib\site-packages\pdfminer\pdfdocument.py", line 428, in authenticate
    password_bytes = password.encode("latin1")
AttributeError: 'NoneType' object has no attribute 'encode'
I'm not sure why it only happens with certain files. My guess is that this code path is only reached when if "Encrypt" in trailer: is true in pdfdocument.py of pdfminer.six, which is only the case for certain files. Versions before v0.12.0 are fine. The problem seems to be the password: str = None parameter that was added to load(...) in py_pdf_parser/loaders.py as part of 02f92ce. I guess this needs to be changed to password: str = "" to match the default in pdfminer.six (see get_pages in pdfpage.py), and then everything should be fine again.
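To illustrate the suspected mechanism without needing a PDF that triggers it, here is a minimal sketch. The names authenticate and normalize_password are hypothetical stand-ins, not functions from py_pdf_parser or pdfminer.six; only the password.encode("latin1") call is taken from the backtrace above.

```python
from typing import Optional


def authenticate(password: str) -> bytes:
    # Stand-in for the failing step in the backtrace: the password is
    # encoded to bytes, which only works if it is a str.
    return password.encode("latin1")


def normalize_password(password: Optional[str]) -> str:
    # Hypothetical helper: map None to the empty string, which is the
    # default the report says pdfminer.six itself uses.
    return "" if password is None else password


# A None password reaching the encode step reproduces the reported error.
try:
    authenticate(None)  # type: ignore[arg-type]
except AttributeError as exc:
    error_message = str(exc)

# With the default normalized to "", the same step succeeds.
fixed = authenticate(normalize_password(None))
```

If the diagnosis is right, passing password="" explicitly when calling load or load_file might also serve as a workaround until the default is changed, assuming the keyword is forwarded unchanged to pdfminer.six.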