Mailman3 uses way too much memory
I run Mailman3 in a resource-constrained environment (a shared host, 64-bit Python, the VM has 20 cores, my cap is 1.5 gigabytes of memory). Mailman3 uses something like 800 megabytes to 1 gigabyte of that (depending a little on how you count). This not only prevents me from running other programs, but sometimes a runner is killed for using too much memory and is not restarted (I did some investigating into that: #887 (comment 1219546699)).
I think that we could reduce the memory consumption by a factor of 10 without compromising the architecture and without much effort:
- Reducing the number of Django's Q-runners from the default (the number of cores) to something reasonable (probably installation-dependent; I don't know what the system is used for). In my installation, reducing from the default (20, for my 20 cores) to 1 (for a smallish installation, I figured one should suffice) saved ~100 megabytes (20 runners at a bit over five megabytes of memory each).
- Dropping docstrings from the heaps using `PYTHONOPTIMIZE=2`. This required a tiny fix in zope.component (https://github.com/zopefoundation/zope.component/issues/67). Saves ~5 megabytes per master/runner process, of which there are 1+15, and a bit in the WSGI processes, for a total of ~100 megabytes again.
- There are many runners, each consuming ~50 to ~80 megabytes. I'd guess they share a lot of framework code, but due to how Python works, this memory is not shared across processes. Fortunately, I think this kind of sharing could be achieved with relative ease: load the framework in the master process, use the new freeze API if available (https://docs.python.org/3/library/gc.html#gc.freeze), then fork and run the runner in the child process without execve. The impact is hard to estimate, but I don't think a reduction by a factor of 10 is outlandish, given that there are 16 very similar processes.
The first two points require little or no code changes, but do need a bit of documentation. I did both on my installations to great effect, reducing the memory consumption by ~200 megabytes in total, or ~20%. I'm happy to create merge requests to that effect.
I'm also happy to try the third point, as it will bring the biggest win. As it is a bit more involved, I'd like to get some feedback first.