Improve ability to debug Unicorn problems
Problem to solve
Improve the ability to debug Unicorn problems.
Target audience
- Devon, DevOps Engineer, https://design.gitlab.com/research/personas#persona-devon
- Sidney, Systems Administrator, https://design.gitlab.com/research/personas#persona-sidney
Further details
Unicorn can be a bottleneck for the performance and reliability of a GitLab instance.
Common problems:
- OOM conditions
- CPU saturation/starvation
- Too few workers causing request queuing
Proposal
Add metrics to track Unicorn performance:
- process_start_time_seconds{worker="ID"} - Gather the per-worker start time from /proc/$PID/stat.
- process_cpu_seconds_total{worker="ID"} - Gather the per-worker CPU time from /proc/$PID/stat.
- process_max_fds{worker="ID"} - How many FDs are available to the process.
- unicorn_workers - The number of running unicorn workers.
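As a rough illustration of the two /proc/$PID/stat metrics, the sketch below shows one way the per-worker CPU time and start time could be derived on Linux. It is not the proposed implementation: the function names (parse_proc_stat, read_boot_time, worker_metrics) are hypothetical, and it assumes the usual Linux layout in which utime, stime, and starttime are fields 14, 15, and 22 of the stat line, measured in clock ticks, with the boot time taken from the btime line of /proc/stat.

```python
import os


def parse_proc_stat(stat_line, ticks_per_second):
    """Extract CPU seconds and start time (seconds since boot) from one /proc/PID/stat line."""
    # Field 2 (the command name) is wrapped in parentheses and may itself
    # contain spaces or parentheses, so split only what follows the last ')'.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field N is found at rest[N - 3].
    utime_ticks = int(rest[14 - 3])
    stime_ticks = int(rest[15 - 3])
    starttime_ticks = int(rest[22 - 3])
    return {
        "cpu_seconds_total": (utime_ticks + stime_ticks) / ticks_per_second,
        "start_seconds_since_boot": starttime_ticks / ticks_per_second,
    }


def read_boot_time(proc_stat_text):
    """Return the system boot time (Unix epoch seconds) from the contents of /proc/stat."""
    for line in proc_stat_text.splitlines():
        if line.startswith("btime "):
            return int(line.split()[1])
    raise ValueError("btime line not found")


def worker_metrics(pid):
    """Read both values for one worker PID (hypothetical helper, Linux only)."""
    ticks = os.sysconf("SC_CLK_TCK")
    with open(f"/proc/{pid}/stat") as f:
        parsed = parse_proc_stat(f.read(), ticks)
    with open("/proc/stat") as f:
        boot_time = read_boot_time(f.read())
    return {
        "process_cpu_seconds_total": parsed["cpu_seconds_total"],
        "process_start_time_seconds": boot_time + parsed["start_seconds_since_boot"],
    }
```

Calling worker_metrics(os.getpid()) on a Linux host returns both values for the current process. Attaching the worker="ID" label and exporting the values are left out here, since the proposal does not specify how workers are enumerated.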
What does success look like, and how can we measure that?
We have new metrics available for monitoring.
Links / references
Edited by Joshua Lambert