Check for excessive puma restarts (!164) · Merge requests · GitLab.com / GitLab Support Team / toolbox / GitLab Detective

Adding a new check

How many puma restarts are too many? That is the biggest open question here. I've gone with a starting threshold of 2 restarts per minute. I think we could actually go lower, to 1 restart per minute, so this is a somewhat conservative start.

To calculate the rate, I take the timestamps from the first and last lines of the current puma_stdout.log, convert them to seconds, and subtract to get the time span the log covers, then divide by 60 to get minutes instead of seconds. Then I grep for how many times "Sending TERM" appears in that same log file and divide that count by the number of minutes. Because this is bash arithmetic, the result is an integer, so anything past the decimal point is thrown away. Since the threshold is already a rough estimate, I think that's a reasonable approach.
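The calculation described above can be sketched roughly as follows. This is not the code from this MR: the function names are hypothetical, the timestamp extraction assumes each log line is JSON with a "timestamp" field, and the conversion assumes GNU date (`-d`). Whether the threshold comparison should be "at least 2" or "more than 2" is also an assumption.

```shell
#!/bin/sh
# Sketch only: names, log format, and threshold comparison are assumptions.

# Integer restarts per minute from a restart count and the first/last
# timestamps in epoch seconds. Prints 0 when the log spans less than one
# full minute, which avoids dividing by zero.
restarts_per_minute() {
  count=$1
  first_epoch=$2
  last_epoch=$3
  minutes=$(( (last_epoch - first_epoch) / 60 ))
  if [ "$minutes" -le 0 ]; then
    echo 0
    return
  fi
  # Shell arithmetic truncates: 25 restarts over 10 minutes gives 2, not 2.5.
  echo $(( count / minutes ))
}

# Number of lines in the log that mention "Sending TERM".
# grep -c exits non-zero on zero matches, so the exit status is ignored.
count_term_signals() {
  grep -c 'Sending TERM' "$1" || true
}

# Epoch seconds for one log line.
# Assumes a JSON line with a "timestamp" field and GNU date.
line_epoch() {
  ts=$(printf '%s\n' "$1" | sed -n 's/.*"timestamp":"\([^"]*\)".*/\1/p')
  [ -n "$ts" ] || return 1
  date -d "$ts" +%s
}

# Prints "high" when the rate reaches the assumed threshold of 2 per minute.
check_log() {
  log=$1
  first=$(line_epoch "$(head -n 1 "$log")") || return 1
  last=$(line_epoch "$(tail -n 1 "$log")") || return 1
  rate=$(restarts_per_minute "$(count_term_signals "$log")" "$first" "$last")
  if [ "$rate" -ge 2 ]; then echo "high"; else echo "ok"; fi
}
```

One thing the sketch makes visible: a log that covers less than a minute yields zero minutes after integer division, so the division needs a guard like the one in `restarts_per_minute`.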

Closes #165

Verification steps for review

I took a set of test data from a customer's system where we recommended tuning puma memory limits. With that puma_stdout.log on my test system I get this response when running this check:

spot --ssh-agent -v -p all_playbook.yml -u root -k ~/.ssh/id_rsa -e GITLAB_VERSION:17.3.5 -n "20251107_165 - run check"
spot v1.19.1-74e1afa-2025-08-29T17:32:47Z
[diana interview-instance.env-57c7bd43.gcp.gitlabsandbox.net:22] run task "20251107_165 - run check", commands: 1
[diana interview-instance.env-57c7bd43.gcp.gitlabsandbox.net:22] run command "Run the check and store result"
[diana interview-instance.env-57c7bd43.gcp.gitlabsandbox.net:22]  > sudo /bin/sh -c /tmp/.spot-2780404833747254272/spot-script3340370579
[diana interview-instance.env-57c7bd43.gcp.gitlabsandbox.net:22]  > setvar JSON_DATA_20251107_165_all_diana={ "ref_url": "https://gitlab.com/gitlab-com/support/toolbox/gitlab-detective/-/issues/165", 
  "title": "Check for high puma restart frequency", "host": "interview-instance.env-57c7bd43.gcp.gitlabsandbox.net", 
  "workaround_url": "https://docs.gitlab.com/administration/operations/puma/#reducing-memory-use", "version_started": "14.0.0", 
  "version_fixed": null, "message": "Your system is showing signs of frequent puma worker restarts due to hitting configured memory 
  limits. This can result in perceived performance degradation on the system. You can alleviate this problem by tuning the puma 
  max memory configuration." }
[diana interview-instance.env-57c7bd43.gcp.gitlabsandbox.net:22] completed command "Run the check and store result" {script: /bin/sh -c [multiline script]} (3.501s)
[diana interview-instance.env-57c7bd43.gcp.gitlabsandbox.net:22] completed task "20251107_165 - run check", commands: 1 (4.444s)

With the boring, little-to-no-activity log on my test system, I get this response, which has no "message" field:

spot --ssh-agent -v -p all_playbook.yml -u root -k ~/.ssh/id_rsa -e GITLAB_VERSION:17.3.5 -n "20251107_165 - run check"
spot v1.19.1-74e1afa-2025-08-29T17:32:47Z
[diana interview-instance.env-57c7bd43.gcp.gitlabsandbox.net:22] run task "20251107_165 - run check", commands: 1
[diana interview-instance.env-57c7bd43.gcp.gitlabsandbox.net:22] run command "Run the check and store result"
[diana interview-instance.env-57c7bd43.gcp.gitlabsandbox.net:22]  > sudo /bin/sh -c /tmp/.spot-6044029276355193856/spot-script2584336099
[diana interview-instance.env-57c7bd43.gcp.gitlabsandbox.net:22]  > setvar JSON_DATA_20251107_165_all_diana={ "ref_url": "https://gitlab.com/gitlab-com/support/toolbox/gitlab-detective/-/issues/165", 
  "title": "Check for high puma restart frequency", "host": "interview-instance.env-57c7bd43.gcp.gitlabsandbox.net", 
  "workaround_url": "https://docs.gitlab.com/administration/operations/puma/#reducing-memory-use", "version_started": "14.0.0", 
  "version_fixed": null }
[diana interview-instance.env-57c7bd43.gcp.gitlabsandbox.net:22] completed command "Run the check and store result" {script: /bin/sh -c [multiline script]} (3.451s)
[diana interview-instance.env-57c7bd43.gcp.gitlabsandbox.net:22] completed task "20251107_165 - run check", commands: 1 (4.126s)
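Comparing the two runs, the only difference in the emitted JSON is the "message" field, which appears only for the customer log. A reviewer could script that comparison along these lines. The helper name `has_finding` is hypothetical, and treating the presence of "message" as the signal is inferred from the two outputs above, not from the tool's documentation.

```shell
# Hypothetical helper: succeeds when a result JSON string carries a "message" key.
# A plain grep is enough for a quick check; it is not a real JSON parser.
has_finding() {
  printf '%s' "$1" | grep -q '"message"[[:space:]]*:'
}
```

For example, `has_finding "$json" && echo "check fired"` would print only for the first run's output.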

Author checklist

After opening the MR:
- Set it to the current milestone
- Ask the Maintainer from the Reviewer roulette suggestion for review

Reviewer checklist

Edited Nov 11, 2025 by Diana Stanley
