mailman2_download: partial downloads
Some of the larger archive files fail to download with:
Traceback (most recent call last):
  File "/usr/bin/django-admin", line 5, in <module>
    management.execute_from_command_line()
  File "/usr/lib/python2.7/site-packages/django/core/management/__init__.py", line 354, in execute_from_command_line
    utility.execute()
  File "/usr/lib/python2.7/site-packages/django/core/management/__init__.py", line 346, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/lib/python2.7/site-packages/django/core/management/base.py", line 394, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/lib/python2.7/site-packages/django/core/management/base.py", line 445, in execute
    output = self.handle(*args, **options)
  File "/usr/lib/python2.7/site-packages/hyperkitty/management/commands/mailman2_download_fixed.py", line 145, in handle
    [options], options["start"], MONTHS))
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 250, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 554, in get
    raise self._value
httplib.IncompleteRead: IncompleteRead(8105 bytes read)
wget confirms the server behavior:
$ wget https://www.redhat.com/archives/rdo-list/2016-October.txt.gz
--2017-08-01 14:47:41-- https://www.redhat.com/archives/rdo-list/2016-October.txt.gz
Resolving www.redhat.com (www.redhat.com)... 2600:1415:11:4ad::d44, 2600:1415:11:4a7::d44, 104.98.16.99
Connecting to www.redhat.com (www.redhat.com)|2600:1415:11:4ad::d44|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3573 (3.5K) [application/x-gzip]
Saving to: ‘2016-October.txt.gz’
2016-October.txt.g 99%[============> ] 3.49K --.-KB/s in 0s
2017-08-01 14:47:44 (53.1 MB/s) - Connection closed at byte 3572. Retrying.
--2017-08-01 14:47:45-- (try: 2) https://www.redhat.com/archives/rdo-list/2016-October.txt.gz
Connecting to www.redhat.com (www.redhat.com)|2600:1415:11:4ad::d44|:443... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 885509 (865K), 881937 (861K) remaining [application/x-gzip]
Saving to: ‘2016-October.txt.gz’
2016-October.txt.g 100%[=============>] 864.75K 145KB/s in 5.9s
2017-08-01 14:47:53 (145 KB/s) - ‘2016-October.txt.gz’ saved [885509/885509]
I found someone else with the same problem here: https://stackoverflow.com/questions/14149100/incompleteread-using-httplib
The httplib monkey patch suggested there does not work (anymore?). Downgrading to HTTP/1.0 with the following (borrowed from the same page) works:
import httplib
# Workaround: make httplib send HTTP/1.0 requests instead of HTTP/1.1.
httplib.HTTPConnection._http_vsn = 10
httplib.HTTPConnection._http_vsn_str = 'HTTP/1.0'
I have no idea how to fix this properly, so I cannot suggest a PR.
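For what it's worth, the second wget attempt above hints at one possible direction: keep the bytes that did arrive and ask for the rest with a Range request. Below is only a rough sketch of that idea, not a tested fix for the command. It is written for Python 3 (http.client rather than the Python 2.7 httplib in the traceback), the names fetch_range and download_with_resume are made up for illustration, and it assumes the server answers Range requests with 206 as it did for wget.

```python
import http.client
import urllib.request


def fetch_range(url, offset):
    """Fetch `url` starting at byte `offset`.

    Returns (data, complete). `complete` is False when the connection
    was cut before the whole body arrived.
    """
    request = urllib.request.Request(url)
    if offset:
        # Ask only for the bytes we do not have yet.
        request.add_header("Range", "bytes=%d-" % offset)
    with urllib.request.urlopen(request) as response:
        if offset and response.status != 206:
            # Assumption: a resumed request must be answered with 206;
            # anything else would make us append the wrong bytes.
            raise IOError("server ignored the Range header")
        try:
            return response.read(), True
        except http.client.IncompleteRead as e:
            # e.partial holds whatever was read before the cut.
            return e.partial, False


def download_with_resume(url, fetch=fetch_range, max_tries=5):
    """Download `url`, resuming after each truncated response.

    `fetch` is injectable so the retry logic can be exercised
    without touching the network.
    """
    data = b""
    for _ in range(max_tries):
        chunk, complete = fetch(url, len(data))
        data += chunk
        if complete:
            return data
    raise IOError("still incomplete after %d tries" % max_tries)
```

Calling download_with_resume("https://www.redhat.com/archives/rdo-list/2016-October.txt.gz") would be the obvious way to try it, but I have not verified it against that server.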