Memory Group - 14.8 Planning
This page may contain information related to upcoming products, features, and functionality. It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes. Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features, or functionality remain at the sole discretion of GitLab Inc.
Capacity
No noteworthy PTO that could impact capacity is expected for %14.8.
Planning
In %14.8 we continue our focus on performance-related tooling, monitoring, and initiatives that can increase GitLab's reliability, availability, and performance.
Top Priorities for %14.8
Performance related tooling
Who: @alipniagov, @rzwambag
We want to enable every GitLab Team Member to make data-driven decisions and understand performance-related issues, memory consumption, or CPU saturation. That requires sharp tooling and clear, easy-to-understand performance guidelines. If we are successful, releases will become more performant and less resource intensive as we iterate on this topic.
Priority topics for %14.8:
- Consolidate our profiling tools
We currently provide an extended set of tools for performance profiling, with overlapping functionality and no clear guidance on when these tools should be used. Our plan is to review which of these tools are in use, add value, and are most widely applicable, and to determine whether we can condense them into a more manageable tool belt. We'll deprecate unused tools, fix useful tools that are broken, and provide guidelines for developers.
Our long term plan is to:
- Consolidate our tools for performance profiling
- Define which tools are useful for which cases
- Improve documentation around performance practices and create clear guidelines for developers (in one place)
- Improve development practices around performance
- Add more measurements, tools or ways to automate our profiling (e.g. by adding an API endpoint)
- Investigate how we can automate performance profiling (shift left)
Depending on capacity, we may add more tasks from gitlab-org&5413 and gitlab-org&1415, such as improving the documentation around performance practices (gitlab-org/gitlab#333647).
Move metrics server out of Puma primary
Who: @mkaeppler
Following the successful completion of exporting Sidekiq metrics from a separate process, we are looking to extract the metrics server thread running in the Puma primary into a separate server process to improve fault tolerance and GitLab availability.
This is in response to past incidents such as gitlab-org/gitlab#118839 (closed), in which we found that the in-process Rack server in the Puma primary can lock up the entire process.
In the context of this effort we will also investigate how to more efficiently export metrics and may consider rewriting the metrics exporter in Golang. As a proof of concept, we have implemented app-export, a Prometheus exporter for the GitLab application written in Go. We will benchmark the performance and decide whether we can use it to replace the existing Ruby implementation.
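To illustrate why a separate exporter process improves fault tolerance, here is a minimal sketch. It is written in Python purely for illustration and is not the GitLab implementation or the Go proof of concept. It assumes the application publishes its samples to a file and a standalone process serves that file over HTTP. The function names, the file-based hand-off, and the gauge-only output are all assumptions made for this sketch.

```python
import os
import tempfile
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(samples):
    """Render a {name: value} mapping in Prometheus text exposition format.

    Every sample is emitted as a gauge to keep the sketch small.
    """
    lines = []
    for name in sorted(samples):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {samples[name]}")
    return "\n".join(lines) + "\n"


def write_metrics(path, samples):
    """Called by the application: atomically replace the metrics file.

    Writing to a temporary file and renaming it means the exporter
    never reads a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        tmp.write(render_metrics(samples))
    os.replace(tmp_path, path)


def make_handler(path):
    """Build a request handler that serves the metrics file at /metrics."""

    class MetricsHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/metrics":
                self.send_error(404)
                return
            with open(path, "rb") as f:
                body = f.read()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return MetricsHandler


def serve_metrics(path, port):
    """Entry point for the standalone exporter process (blocks forever)."""
    HTTPServer(("127.0.0.1", port), make_handler(path)).serve_forever()
```

In this arrangement the application only calls `write_metrics`, while `serve_metrics` runs in its own process. A slow or stuck scrape can then stall only the exporter, not the application server.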
Instrument tracking of application boot time
Who: @nmilojevic1
Internal reports (gitlab-org/gitlab#213992) indicate that GitLab may be taking too long to start. We want to collect data from both GitLab.com and self-managed instances about how long it actually takes GitLab to boot in different environments.
This is related to both the composable codebase and other performance initiatives we want to work on. Having metrics on this performance vector of GitLab will help us drive further decisions forward.
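As a rough illustration of what boot time instrumentation could look like, the sketch below times named boot phases with a monotonic clock. It is written in Python for illustration only and does not reflect how GitLab measures boot time. The `BootTimer` class and the phase names are hypothetical.

```python
import time
from contextlib import contextmanager


class BootTimer:
    """Records the duration of named boot phases using a monotonic clock."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the timer can be tested deterministically.
        self._clock = clock
        self._started_at = clock()
        self.phases = {}

    @contextmanager
    def phase(self, name):
        """Time the enclosed block and store its duration under `name`."""
        start = self._clock()
        try:
            yield
        finally:
            self.phases[name] = self._clock() - start

    def total(self):
        """Seconds elapsed since the timer was created."""
        return self._clock() - self._started_at


# Usage: wrap each (hypothetical) boot phase, then report the results.
timer = BootTimer()
with timer.phase("load_configuration"):
    time.sleep(0.01)  # stand-in for real work
with timer.phase("load_application_code"):
    time.sleep(0.01)  # stand-in for real work
boot_report = {"phases": timer.phases, "total_seconds": timer.total()}
```

A monotonic clock is used because it is unaffected by system clock adjustments, which matters when comparing durations across different environments.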
Composable codebase
Who: @nmilojevic1
Due to limited capacity in %14.7 and having to extend our work on the Redis instance for session keys due to gitlab-com/gl-infra/production#6090, we had to postpone working on this topic by one milestone.
Our plan for %14.8 is to clarify the Composable codebase effort (blueprint) and create an actionable plan. Our goal is to identify the impact that the composable codebase may have at various phases of our rollout plan and the additional performance benefits that we'll be able to achieve with initiatives that may depend on it (like speeding up the boot-up time).
Secondary Priorities && Stretch goals for %14.8
Address rubyzip related issues
Who: @mkaeppler
We have found that rubyzip can run into performance issues whenever iterating zip files or reading the central directory is required. This is a well-defined, known performance issue (gitlab-org/gitlab#345673) that we would like to address at some point.
Our plan is to either explore alternative libraries or tools (gitlab-org/gitlab#347233), or keep using rubyzip and upgrade it to the latest version (gitlab-org/gitlab#346241).
Additional Issues for consideration
- sidekiq / kube_container_memory saturation
Investigate the cause of the Sidekiq memory issues further and the reason for the increased OOM errors
Group highlights for %14.7
We have successfully rolled out the new Redis instance for session keys on GitLab.com. All session traffic is served by Redis::Sessions, and we can observe a significant drop in Requests Per Second, operation rate, and CPU utilization for redis-persistent.