Leader election in agentk
agentk may be running as more than one Pod, and for some functionality (e.g. GitOps) that would mean every instance tries to do the same work concurrently. This is not desirable:
- noise in logs
- extra memory consumption
- potential for race conditions
- extra load on server
Proposal
To fix that, agentk should do leader election using the recently introduced coordination.k8s.io API.
Not all functionality should be guarded by leader election, as some features are "passive" rather than "active", i.e. they only do work when invoked via an API call. For "passive" features we want all instances of agentk to be capable of servicing them.
Active:
- GitOps
- container security scanning
- ??
Passive:
- CI tunnel
- ??
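The active/passive split above could be expressed as a per-module flag. The following is a rough sketch only: the `Module` type, the `runnable` helper, and the module names are hypothetical and not taken from the agentk codebase.

```go
package main

import "fmt"

// Module describes one agent module (hypothetical type for illustration).
type Module struct {
	Name   string
	Active bool // true: does work on its own, so it must run on the leader only
}

// runnable returns the names of the modules this instance should run,
// given whether it is currently the leader. Passive modules always run.
func runnable(mods []Module, isLeader bool) []string {
	var out []string
	for _, m := range mods {
		if !m.Active || isLeader {
			out = append(out, m.Name)
		}
	}
	return out
}

func main() {
	mods := []Module{
		{Name: "gitops", Active: true},
		{Name: "container_scanning", Active: true},
		{Name: "ci_tunnel", Active: false},
	}
	fmt.Println(runnable(mods, true))  // leader runs everything
	fmt.Println(runnable(mods, false)) // non-leaders run passive modules only
}
```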
We need to design an API that agentk exposes internally for agent modules, so that the ones that need leader election can use it. If no modules require it (e.g. if all active modules are disabled), then agentk shouldn't do leader election and should release the lock if it currently holds it (i.e. because a previous configuration required it).
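One possible shape for this logic, as a sketch under stated assumptions: a coordinator re-evaluates on every configuration change whether any enabled module needs leader election, and starts or stops the elector accordingly. `Elector`, `Coordinator`, `ModuleConfig`, and `logElector` are hypothetical names; the real elector would wrap an actual leader election client.

```go
package main

import "fmt"

// Elector is a hypothetical abstraction over the real leader election client.
type Elector interface {
	Start() // begin campaigning for the lock
	Stop()  // stop campaigning and release the lock if held
}

// ModuleConfig is the per-module state the coordinator cares about.
type ModuleConfig struct {
	Name        string
	Enabled     bool
	NeedsLeader bool
}

// Coordinator runs the elector only while some enabled module needs it.
type Coordinator struct {
	elector Elector
	running bool
}

// Reconcile is meant to be called whenever the configuration changes.
func (c *Coordinator) Reconcile(mods []ModuleConfig) {
	need := false
	for _, m := range mods {
		if m.Enabled && m.NeedsLeader {
			need = true
			break
		}
	}
	switch {
	case need && !c.running:
		c.elector.Start()
		c.running = true
	case !need && c.running:
		c.elector.Stop() // releases the lock held due to the previous configuration
		c.running = false
	}
}

// logElector records calls instead of talking to a cluster.
type logElector struct{ events []string }

func (e *logElector) Start() { e.events = append(e.events, "start") }
func (e *logElector) Stop()  { e.events = append(e.events, "stop+release") }

func main() {
	e := &logElector{}
	c := &Coordinator{elector: e}
	c.Reconcile([]ModuleConfig{
		{Name: "gitops", Enabled: true, NeedsLeader: true},
		{Name: "ci_tunnel", Enabled: true, NeedsLeader: false},
	})
	c.Reconcile([]ModuleConfig{
		{Name: "gitops", Enabled: false, NeedsLeader: true},
		{Name: "ci_tunnel", Enabled: true, NeedsLeader: false},
	})
	fmt.Println(e.events) // [start stop+release]
}
```

Keeping the decision in one place means individual modules only declare whether they need a leader and never touch the lock themselves.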
Here is a blog post explaining how leader election works in Kubernetes: https://carlosbecker.com/posts/k8s-leader-election.