
Record container RSS, as well as working set, for memory saturation

Sean McGivern requested to merge use-rss-not-ws-for-container-memory into master

The container's working set size (WSS), as reported by cgroups (this is not necessarily the same as the kernel's workingset_size metric), is not a reliable indicator of memory saturation, because it includes filesystem cache pages that the kernel can evict instead of OOM-killing the cgroup.
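As a rough illustration of why WSS can overstate memory pressure, here is a minimal sketch of one common way the working set is derived: cgroup memory usage minus inactive file cache. This formula, the cgroup v1 field names (`total_inactive_file`, `total_active_file`, `total_rss`), and the sample numbers are all assumptions for illustration; none of them comes from this merge request.

```python
# Sketch only: derive a working set figure from cgroup memory.stat text.
# Assumption: working set = usage - inactive file cache, so *active* file
# cache (which the kernel can still evict) remains counted in the result.

# Hypothetical memory.stat excerpt (cgroup v1 style field names).
SAMPLE_STAT = """\
total_rss 450000000
total_active_file 200000000
total_inactive_file 300000000
"""


def parse_memory_stat(text: str) -> dict[str, int]:
    """Parse 'name value' lines from memory.stat into a dict."""
    stats: dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def working_set_bytes(usage_bytes: int, stats: dict[str, int]) -> int:
    """Usage minus inactive file cache, floored at zero."""
    # Fall back to the unprefixed name in case the hierarchy-wide one is absent.
    inactive_file = stats.get("total_inactive_file", stats.get("inactive_file", 0))
    return max(usage_bytes - inactive_file, 0)


stats = parse_memory_stat(SAMPLE_STAT)
wss = working_set_bytes(1_000_000_000, stats)  # hypothetical usage figure
rss = stats["total_rss"]
print(f"WSS={wss} RSS={rss} difference={wss - rss}")
```

With these made-up numbers, the working set is well above RSS, and part of the gap is active file cache that could still be evicted under pressure.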

This merge request also permits the use of resident set size (RSS). This is still not exactly what is used by the OOM killer - the thing that we're ultimately trying to approximate - but it's a lot clearer what we're measuring, and how our application can influence it. In practice, it's also a more stable metric than the working set size.

However, RSS does have a problem when a process marks memory as MADV_FREE with an madvise call (lazy-free memory). Lazy-free memory is memory that the process no longer needs and that the kernel may reclaim under memory pressure, but that the process can reuse cheaply if it has not yet been reclaimed. Lazy-free memory is included in RSS, but not WSS, so for processes that use lazy-free memory, RSS may dramatically overestimate memory use compared to WSS.
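To gauge how much of a process's RSS is lazy-free, one option is to look at the `Rss` and `LazyFree` fields that Linux exposes in `/proc/<pid>/smaps_rollup`. The sketch below parses text in that format and subtracts lazy-free memory from RSS. The function names and the sample figures are hypothetical, and this is not how the metrics in this merge request are collected.

```python
# Sketch only: estimate how much of a process's RSS is lazy-free memory,
# using text in the format of /proc/<pid>/smaps_rollup.

# Hypothetical excerpt; real files contain more fields and a header line.
SAMPLE_SMAPS_ROLLUP = """\
Rss:              512000 kB
Anonymous:        480000 kB
LazyFree:         128000 kB
"""


def parse_smaps_kb(text: str) -> dict[str, int]:
    """Collect 'Name:  <number> kB' lines into a dict of kB values."""
    fields: dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip the header line and anything not shaped like '<number> kB'.
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def rss_excluding_lazyfree_kb(fields: dict[str, int]) -> int:
    """RSS minus pages marked MADV_FREE that are still resident."""
    return fields.get("Rss", 0) - fields.get("LazyFree", 0)


fields = parse_smaps_kb(SAMPLE_SMAPS_ROLLUP)
print(f"Rss={fields['Rss']} kB, LazyFree={fields['LazyFree']} kB, "
      f"adjusted={rss_excluding_lazyfree_kb(fields)} kB")
```

On a Linux host, the same functions could be pointed at the contents of `/proc/self/smaps_rollup`, assuming the kernel is recent enough to report `LazyFree`.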

For now, we only opt in the GitLab Rails deployments (api, git, internal-api, sidekiq, and web) to using RSS alongside WSS, to trial this approach.

Much, much more detail in the below issue: gitlab-com/gl-infra/scalability#2024 (closed)

Edited by Sean McGivern
