fix: Handle exceptions in mbox iteration
Mails containing non-ASCII characters in the envelope "From " line (for example, a letter with an accent) cause the hyperkitty_import command to crash.
We got such mails in a Mailman 2 archive. At the time, addresses with accents were accepted by Python 2's mailbox module, but Python 3 now treats them as invalid.
Here is an example stack trace:
(...)
81c91f6f231ff69e8eeee0d10030e93e@gmail.com (13315)
b0f56cffa98121c1476573de0749c99d@maximumbest.eu (13316)
Traceback (most recent call last):
  File "/opt/mailman-web/./manage.py", line 10, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/lib/python3.11/site-packages/django/core/management/__init__.py", line 446, in execute_from_command_line
    utility.execute()
  File "/usr/lib/python3.11/site-packages/django/core/management/__init__.py", line 440, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/lib/python3.11/site-packages/django/core/management/base.py", line 402, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/lib/python3.11/site-packages/django/core/management/base.py", line 448, in execute
    output = self.handle(*args, **options)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/hyperkitty/management/commands/hyperkitty_import.py", line 403, in handle
    importer.from_mbox(mbfile, report_name)
  File "/usr/lib/python3.11/site-packages/hyperkitty/management/commands/hyperkitty_import.py", line 190, in from_mbox
    for msg in mbox:
  File "/usr/lib/python3.11/mailbox.py", line 110, in itervalues
    value = self[key]
            ~~~~^^^^^
  File "/usr/lib/python3.11/mailbox.py", line 74, in __getitem__
    return self.get_message(key)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/mailbox.py", line 781, in get_message
    from_line = self._file.readline().replace(linesep, b'').decode('ascii')
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position 20: ordinal not in range(128)
With this commit, any exception raised during mbox iteration is caught, and the import continues with the rest of the archive.
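For illustration only, here is a minimal sketch of one way to skip unreadable messages while iterating an mbox. It is not necessarily the code in this commit: the helper name iter_messages_safely and the on_error callback are hypothetical. It assumes that listing the keys succeeds and that only fetching an individual message raises, which matches the traceback above (the failure happens in get_message).

```python
import mailbox


def iter_messages_safely(mbox, on_error=None):
    """Yield messages from a mailbox, skipping any that fail to load.

    Hypothetical helper: iterating by key lets us wrap each lookup in its
    own try/except, so one bad message does not end the whole loop.
    """
    for key in mbox.keys():
        try:
            msg = mbox[key]
        except Exception as exc:  # e.g. UnicodeDecodeError on the "From " line
            if on_error is not None:
                on_error(key, exc)
            continue
        yield msg
```

A plain "for msg in mbox:" cannot be resumed after the exception, because the generator behind it is finished once it raises. Fetching by key avoids that. A caller could pass an on_error callback that logs the key and the exception so skipped messages remain visible in the import report.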