Exception in the HyperKitty archiver: 'utf-8' codec can't encode characters

Hi,

I'm seeing the above error in the mailman-core logs each time an email is processed. It is caused by serialized messages sitting in var/archives/hyperkitty/spool/. Some of them are spam; others are legitimate emails.

The error traceback says:

Jun 16 19:06:35 2020 (17) Exception in the HyperKitty archiver: 'utf-8' codec can't encode characters in position 2262-2263: surrogates not allowed
Jun 16 19:06:35 2020 (17) Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/mailman_hyperkitty/__init__.py", line 154, in _archive_message
    url = self._send_message(mlist, msg)
  File "/usr/lib/python3.6/site-packages/mailman_hyperkitty/__init__.py", line 201, in _send_message
    files={"message": ("message.txt", message_text)})
  File "/usr/lib/python3.6/site-packages/requests/api.py", line 116, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/usr/lib/python3.6/site-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 519, in request
    prep = self.prepare_request(req)
  File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 462, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "/usr/lib/python3.6/site-packages/requests/models.py", line 316, in prepare
    self.prepare_body(data, files, json)
  File "/usr/lib/python3.6/site-packages/requests/models.py", line 504, in prepare_body
    (body, content_type) = self._encode_files(files, data)
  File "/usr/lib/python3.6/site-packages/requests/models.py", line 169, in _encode_files
    body, content_type = encode_multipart_formdata(new_fields)
  File "/usr/lib/python3.6/site-packages/urllib3/filepost.py", line 88, in encode_multipart_formdata
    writer(body).write(data)
  File "/usr/lib/python3.6/codecs.py", line 376, in write
    data, consumed = self.encode(object, self.errors)
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 2262-2263: surrogates not allowed
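
For reference, the same UnicodeEncodeError can be reproduced in plain Python without Mailman. This is only a minimal sketch, assuming the surrogates come from undecodable bytes that were mapped to lone surrogates via the "surrogateescape" error handler, which I have not confirmed for these messages. The header bytes below are made up for illustration and are not taken from the attached email.

```python
# Hypothetical raw message bytes containing one non-ASCII byte (0xE9).
raw = b"Subject: caf\xe9\n\nbody\n"

# Decoding as ASCII with surrogateescape keeps the bad byte as a lone
# surrogate code point (U+DCE9) instead of raising an error.
text = raw.decode("ascii", errors="surrogateescape")


def try_encode(s):
    """Return the UTF-8 bytes for s, or the exception if encoding fails."""
    try:
        return s.encode("utf-8")
    except UnicodeEncodeError as exc:
        return exc


# Strict UTF-8 encoding refuses lone surrogates, as in the traceback.
result = try_encode(text)
print(type(result).__name__, "-", result.reason)

# Encoding with the same error handler restores the original bytes.
restored = text.encode("utf-8", errors="surrogateescape")
print(restored == raw)
```

If that assumption holds, any code path that encodes the message text as strict UTF-8 will fail on such messages, regardless of whether they are spam.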

Processing the offending pickle file with 'mailman qfile' fails with the same error:

[----- start pickle -----]
<----- start object 1 ----->
Traceback (most recent call last):
  File "/usr/bin/mailman", line 8, in <module>
    sys.exit(main())
  File "/usr/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3.6/site-packages/mailman/bin/mailman.py", line 68, in invoke
    return super().invoke(ctx)
  File "/usr/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/mailman/commands/cli_qfile.py", line 74, in qfile
    printer.pprint(obj)
  File "/usr/lib/python3.6/pprint.py", line 139, in pprint
    self._format(object, self._stream, 0, 0, {}, 0)
  File "/usr/lib/python3.6/pprint.py", line 176, in _format
    stream.write(rep)
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 2262-2263: surrogates not allowed
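
Since the message text cannot be printed as-is, a small helper like the one below can show where the surrogates sit in a string. It is a sketch only: find_surrogates and the sample string are hypothetical names and data for illustration, and it does not load the Mailman pickle itself.

```python
def find_surrogates(text):
    """Return (index, code point) pairs for every surrogate in text."""
    return [
        (i, hex(ord(ch)))
        for i, ch in enumerate(text)
        if 0xD800 <= ord(ch) <= 0xDFFF
    ]


# Hypothetical sample with two adjacent lone surrogates, similar in shape
# to the two-character range reported in the error message.
sample = "Subject: ok \udcc3\udca9 end"

hits = find_surrogates(sample)
for index, code_point in hits:
    # ascii() escapes the surrogates, so printing cannot fail.
    context = ascii(sample[max(0, index - 10):index + 10])
    print(index, code_point, context)

# backslashreplace gives a lossy but always encodable form for logging.
safe_bytes = sample.encode("utf-8", errors="backslashreplace")
print(safe_bytes)
```

Two adjacent surrogates, as in positions 2262-2263, would be consistent with a two-byte sequence that could not be decoded, but that is a guess on my side.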

Attached is a redacted spam email producing such an error.

I'm using dockerized mailman 0.3 from maxking. The System Information page says:

Product                      Version
Mailman Core Version         GNU Mailman 3.3.0 (La Villa Strangiato)
Mailman Core API Version     3.0
Mailman Core Python Version  3.6.9 (default, Oct 17 2019, 11:17:29) [GCC 6.4.0]

Please let me know if you need any more information.

example.eml