
mbox export has no limits, can easily hit memory limits

The export mbox option has no hard limits. While it attempts to stream the response back to the user, it ends up loading all the messages into memory, which can trigger OOMs on large mailing lists.

The query uses query.order_by("archived_date").all(). The .all() call itself is lazy, but iterating the resulting QuerySet fetches every result into memory before the first message is yielded. The rows are also kept in the QuerySet's result cache, so the memory seemingly isn't released despite the query result being unlikely to be reused.

An easy improvement would be to use .iterator(), which streams the results on Postgres and bypasses the QuerySet cache.
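As a rough sketch of what that change could look like (not the project's actual export code): the function name `stream_mbox` and the `render` callable are hypothetical, and the only Django calls assumed are `QuerySet.order_by()` and `QuerySet.iterator(chunk_size=...)`.

```python
from typing import Callable, Iterator


def stream_mbox(
    queryset,
    render: Callable[[object], bytes],  # hypothetical: turns one message into mbox bytes
    chunk_size: int = 500,
) -> Iterator[bytes]:
    """Yield rendered messages one at a time, oldest first."""
    # .iterator() reads rows in chunks and does not populate the QuerySet's
    # result cache, so a message can be freed once it has been yielded.
    ordered = queryset.order_by("archived_date")
    for message in ordered.iterator(chunk_size=chunk_size):
        yield render(message)
```

A generator like this could presumably be handed straight to Django's `StreamingHttpResponse`, so the response is written as rows arrive rather than after the whole list has been loaded.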

For MySQL/SQLite users, some kind of batching strategy could be introduced: maybe going month by month to select messages and then loading those batches, or selecting ids first and then batch-loading the messages. I'm sure there are other strategies that could do a better job of balancing memory consumption against throughput.
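One possible batching strategy, sketched here with the standard-library `sqlite3` module instead of the Django ORM so it stays self-contained: page through the table in (archived_date, id) order and remember the last row seen, so each query only loads `batch_size` rows. The `message` table and its `id`, `archived_date`, and `body` columns are assumptions for illustration, not the real schema.

```python
import sqlite3
from typing import Iterator, Tuple

Row = Tuple[int, str, str]  # (id, archived_date, body), hypothetical schema


def iter_messages_in_batches(
    conn: sqlite3.Connection, batch_size: int = 1000
) -> Iterator[Row]:
    """Yield rows ordered by (archived_date, id), one batch in memory at a time."""
    last_key = None  # (archived_date, id) of the last row yielded so far
    while True:
        if last_key is None:
            rows = conn.execute(
                "SELECT id, archived_date, body FROM message "
                "ORDER BY archived_date, id LIMIT ?",
                (batch_size,),
            ).fetchall()
        else:
            last_date, last_id = last_key
            # Resume strictly after the last row; id breaks ties on equal dates.
            rows = conn.execute(
                "SELECT id, archived_date, body FROM message "
                "WHERE archived_date > ? OR (archived_date = ? AND id > ?) "
                "ORDER BY archived_date, id LIMIT ?",
                (last_date, last_date, last_id, batch_size),
            ).fetchall()
        if not rows:
            return
        yield from rows
        last_key = (rows[-1][1], rows[-1][0])
```

Compared with OFFSET-based paging, resuming from the last key avoids rescanning rows that were already exported, though it does rely on an index covering the ordering columns to stay fast. The same idea should translate to the ORM with a filter on the last key plus a slice.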