Hardcode registration TTL

Mikhail Mazurskiy requested to merge ash2k/expiring-registration into master

Currently, agent pods registered via the Register() RPC are never removed from the RAM of kas because AgentTracker.Unregister() is never called for them. This is a very slow memory leak, and because the data in Redis does not expire either, metrics become increasingly inaccurate over time. On GitLab.com this is not a problem in practice because kas is redeployed every few hours and agent churn is very low, so we don't even notice it.

The first commit fixes the above problem by persisting registration info only in Redis and not in kas' RAM. That way, those k->v mappings in Redis are no longer refreshed as part of the periodic refresh. They are GCed once they expire, as today.


The second commit hardcodes the registration TTL, since it must be consistent across agentk and kas and hence cannot be configurable. agentk calls Register() every 5 minutes. If kas were configured with a TTL of 1 minute, metrics would be incorrect most of the time: each registration would expire after one minute, and the metrics would be wrong for the remaining 4 minutes until the next Register() call.