Hardcode registration TTL
Currently, agent pods registered via the Register() RPC are never removed from kas' RAM because AgentTracter.Unregister() is never called for them. This is a problem for two reasons: it is a very slow memory leak, and, since the data in Redis does not expire either, metrics become more and more incorrect over time. On GitLab.com this is not a problem in practice because kas is redeployed every few hours. Agent churn is also very low, so we don't even notice it.
The first commit fixes the above problem by persisting registration info only in Redis and not in kas' RAM. That way, those k->v mappings in Redis are not refreshed as part of the periodic refresh. They are GCed once they expire, as today.
The second commit hardcodes the registration TTL, since it must be consistent across agentk and kas and hence cannot be configurable. agentk calls Register() every 5 minutes. If kas were configured with a TTL of 1 minute, metrics would be incorrect all the time: each registration would expire after a minute, and the metrics would be wrong for the next 4 minutes.
Activity
changed milestone to %16.8
added devops::deploy group::environments section::cd type::bug labels
requested review from @takax
assigned to @ash2k
added 32 commits
- 05eb6136...5b370ffd - 30 commits from branch master
- 0d990dfd - Make registrations via Register() RPC expire after TTL
- 7dd30e31 - Hardcode registration TTL
requested review from @timofurrer
mentioned in commit 129a4c66
added workflow::staging-canary label
added workflow::canary label and removed workflow::staging-canary label
added workflow::staging label and removed workflow::canary label
added workflow::production label and removed workflow::staging label
mentioned in issue gitlab-org/gitlab#432929
added released::candidate label
added released::published label and removed released::candidate label