Run GC compact before forking into Puma workers (nakayoshi_fork)
We recently looked into the new GC compaction feature of Ruby 2.7: https://gitlab.com/gitlab-org/memory-team/memory-team-2gb-week/-/issues/1
The conclusion is that there is a noticeable improvement in shared memory use right after forking Puma workers. However, this effect deteriorates over time, and the memory layout approaches what we would see without compaction.
Since compact-on-fork is relatively cheap, I think we should still go ahead and at least test it in production behind a feature flag, because our tests so far were focused on 2GB Omnibus VMs or local deployments.
We don't have to do this manually; Puma 5 has a new setting called nakayoshi_fork, which when enabled does roughly what is described in this issue: it first runs several minor GCs so as to tenure all retained objects, then runs compact before forking off workers (https://github.com/puma/puma/blob/de632261ac45d7dd85230c83f6af6dd720f1cbd9/lib/puma/util.rb#L26-L35).
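For illustration, the sequence described above could be sketched in plain Ruby roughly as follows. This is not Puma's code: the method name `prepare_for_fork` and the default of 4 minor GC runs are assumptions made for this example, and the linked `util.rb` remains the authoritative version.

```ruby
# A minimal sketch of "tenure, then compact" before forking.
# `prepare_for_fork` is a hypothetical helper, not part of Puma or GitLab.
def prepare_for_fork(minor_gc_runs: 4)
  # Run several minor GCs so that retained objects survive enough
  # collections to be promoted to the old generation.
  # The count of 4 is an assumption for this sketch.
  minor_gc_runs.times { GC.start(full_mark: false) }

  # GC.compact only exists on Ruby 2.7+, and some platforms do not
  # support it, so guard the call.
  return false unless GC.respond_to?(:compact)

  begin
    GC.compact
    true
  rescue NotImplementedError
    # Compaction is defined but unsupported on this platform.
    false
  end
end

# The master process would call this once, right before forking workers.
compacted = prepare_for_fork
puts "heap compacted before fork: #{compacted}"
```

The method returns whether compaction actually ran, which makes it easy to log or to gate behind a flag while we evaluate it. With Puma 5 none of this needs to live in our code, since enabling nakayoshi_fork in the Puma configuration covers it.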
This means we should upgrade to Puma 5 first, which we had planned to do anyway.