Import big projects for customers
TL;DR
Customers and prospects may want to import a large project into GitLab.com. This issue is to provide them with a workaround in a timely manner.
Background
In the past, for big imports such as the K8S import, we used a dedicated instance to get this done.
Large imports will either time out (we have a timeout of a few hours) or get killed by the Sidekiq MemoryKiller. This is becoming more frequent as the app grows larger and the Sidekiq process consumes more memory.
We improved the import mechanism so it doesn't use as much memory, at the cost of being a bit slower (executing fewer transactions per commit and keeping fewer objects in memory). This can be tweaked further, but either way we'll eventually hit a memory or a timeout problem. The next big step would be to split the import into independent workers to save memory, but that is not a small refactor.
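The memory/speed trade-off above can be sketched as a batching pattern. This is a toy illustration only, not GitLab's actual importer code; `import_in_batches` is a hypothetical name:

```ruby
# Illustrative sketch: persisting records in small slices means only one
# slice of objects needs to be alive at a time, trading more (smaller)
# commits for a lower memory ceiling.
def import_in_batches(rows, batch_size: 100)
  imported = 0
  rows.each_slice(batch_size) do |batch|
    # In the real importer, each slice would be saved inside its own
    # smaller database transaction instead of one huge commit.
    batch.each do |row|
      imported += 1 # placeholder for persisting one row
    end
    # After this point the batch can be garbage-collected.
  end
  imported
end
```

Shrinking `batch_size` lowers peak memory but increases the number of commits, which is why the import got slower.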
Why?
This will help Support/Sales
Current workarounds
For a self-hosted instance, this is easy to work around by increasing the allowed Sidekiq RSS and/or disabling the worker that kills imports once a certain timeout is reached.
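On an Omnibus self-hosted instance, raising the RSS limit can be done through the `SIDEKIQ_MEMORY_KILLER_MAX_RSS` environment variable (per GitLab's Sidekiq MemoryKiller docs); the value below is only an example:

```ruby
# /etc/gitlab/gitlab.rb -- example value only
gitlab_rails['env'] = {
  # Max allowed RSS in kilobytes; setting it to '0' disables the MemoryKiller.
  'SIDEKIQ_MEMORY_KILLER_MAX_RSS' => '3000000'
}
# Apply with: sudo gitlab-ctl reconfigure && sudo gitlab-ctl restart sidekiq
```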
For GitLab.com, we can't tweak those limits. So we normally strip pipelines or other heavy objects from the exported project to free a bit more memory. But we could still encounter the problem.
Example
I managed to import a ~900MB export on my DigitalOcean instance, averaging an RSS of 850MB with a peak of 950MB (note that practically half of this is just the loaded Rails app). The default max RSS at GitLab.com is 1GB (2GB now), yet this wouldn't work there. What's the difference? On my DigitalOcean instance, the thread running the import wasn't sharing the process with any other busy thread; the import was the only thing the process was doing, and the other threads were free.
Proposal
For prospects or customers, use the deploy node to run a script that performs the import for us, free of the memory limits we would normally hit.
We'll need to:
- (James) Provide a script that, given an export archive and a target, calls the I/E logic (easy).
- (James) Document somewhere how this works. We may need to use tmux/screen so the import survives a dropped SSH session.
- (Infra) Confirm this is OK. I would suggest pinging the on-call every time we have to do this.
- (Support) Confirm the identity of the customer to verify access to the target namespace.
- (Support) Ping the on-call when a customer requests this.
Alternative
Maybe we can automate this by hooking it into ChatOps and using a runner with enough memory.
Related:
https://gitlab.com/gitlab-com/support/dotcom/dotcom-escalations/issues/2