Errors on re-run if `plom-scan process` interrupted
The error
I got a weird error when trying to `plom-scan process` a PDF. I think what happened is that I first ran `plom-scan process` without passing `--extract-bitmaps`; then, when it was taking a long time, I interrupted it with ctrl+c and tried to re-run with `plom-scan process --extract-bitmaps`. Each time I did so, I got:
plom-scan process --extract-bitmaps CfAs.pdf
Checking if bundle "CfAs.pdf" already exists on server
Logging details to bundles/CfAs.pdf/processing.log
Processing PDF CfAs.pdf to images
Read QR codes
0%| | 0/253 [00:00<?, ?it/s]
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/fkobayashi/.local/lib/python3.10/site-packages/plom/scan/fasterQRExtract.py", line 116, in QRextract
    qrlist = read_barcodes(image, formats=(BarcodeFormat.QRCode | micro))
TypeError: Unsupported type <class 'PIL.PngImagePlugin.PngImageFile'>. Expect a PIL Image or numpy array
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/home/fkobayashi/.local/bin/plom-scan", line 8, in <module>
    sys.exit(main())
  File "/home/fkobayashi/.local/lib/python3.10/site-packages/plom/scan/__main__.py", line 229, in main
    processScans(
  File "/home/fkobayashi/.local/lib/python3.10/site-packages/plom/scan/frontend_scan.py", line 130, in processScans
    readQRCodes.processBitmaps(bundledir, msgr=msgr)
  File "/home/fkobayashi/.local/lib/python3.10/site-packages/plom/scan/start_messenger.py", line 55, in wrapped
    return f(*args, **kwargs)
  File "/home/fkobayashi/.local/lib/python3.10/site-packages/plom/scan/readQRCodes.py", line 287, in processBitmaps
    decode_QRs_in_image_files(bundledir / "pageImages")
  File "/home/fkobayashi/.local/lib/python3.10/site-packages/plom/scan/readQRCodes.py", line 42, in decode_QRs_in_image_files
    _ = list(tqdm(p.imap_unordered(QRextract, stuff), total=N))
  File "/usr/lib/python3.10/site-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 873, in next
    raise value
TypeError: Unsupported type <class 'PIL.PngImagePlugin.PngImageFile'>. Expect a PIL Image or numpy array
Of course, from then on the same thing happened without passing the `--extract-bitmaps` flag: the output and traceback were identical to the ones shown above.
Steps to reproduce
(I haven't actually tried this from scratch yet, but it's what I did to get the error in the first place)
1. `plom-scan process <bundle>.pdf`.
2. ctrl+c to interrupt the process mid-scan.
3. `plom-scan process --extract-bitmaps <bundle>.pdf` (could probably also use the `--extract-bitmaps` version in step 1, but not passing the flag gives more time to interrupt).
4. profit
Solution
Running `rm -r bundles/CfAs.pdf` and then re-running solved the problem.
Is this an issue, or is it the user's fault for interrupting the scan process? I could see a similar thing happening if, for example, someone lost power while scanning exams and then had to restart their machine. Maybe `plom-scan process` should ask whether the user wants to re-run from scratch...?
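To make the "ask before re-running" idea concrete, here is a minimal sketch of how leftover files from an interrupted run could be detected. Everything in it is an assumption for illustration: the marker file name `.processing_complete` and the helpers `prepare_bundle_dir` and `mark_bundle_done` are hypothetical and are not part of Plom. The only thing taken from this report is that the bundle lives in a directory such as `bundles/CfAs.pdf` and that deleting it fixed the problem.

```python
import shutil
from pathlib import Path

# Hypothetical marker file name; Plom itself may not write anything like this.
DONE_MARKER = ".processing_complete"


def prepare_bundle_dir(bundle_dir, ask=input):
    """Return a usable bundle directory, prompting if stale files are found.

    A directory that exists but lacks the marker file is treated as the
    leftover of an interrupted run. `ask` is injectable so the prompt can
    be scripted or tested.
    """
    bundle_dir = Path(bundle_dir)
    if bundle_dir.exists() and not (bundle_dir / DONE_MARKER).exists():
        reply = ask(f"{bundle_dir} looks incomplete. Delete and start over? [y/N] ")
        if reply.strip().lower() not in ("y", "yes"):
            raise SystemExit("Leaving existing bundle directory untouched.")
        # Same effect as the manual `rm -r bundles/<bundle>` workaround.
        shutil.rmtree(bundle_dir)
    bundle_dir.mkdir(parents=True, exist_ok=True)
    return bundle_dir


def mark_bundle_done(bundle_dir):
    """Record that processing finished, so later runs are not flagged as stale."""
    (Path(bundle_dir) / DONE_MARKER).touch()
```

In this sketch, processing would call `prepare_bundle_dir("bundles/CfAs.pdf")` before extracting any images and `mark_bundle_done(...)` only after the last step succeeds, so a ctrl+c or power loss in between leaves the marker missing and triggers the prompt on the next run.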