Do not run Redis keywatcher for non-Runner WH workloads
While monitoring CPU and memory during the Action Cable rollout, I noticed that a significant portion of CPU time was spent observing Redis keys in keywatcher. I am unfamiliar with this code, but it appears to exist to support GitLab Runner, which is not relevant in every context in which Workhorse runs. For instance, the websockets fleet does not appear to need it: the number of keywatchers is 0 in all deployments except those of type api:
https://thanos-query.ops.gitlab.net/graph?g0.range_input=1h&g0.max_source_resolution=0s&g0.expr=sum%20by%20(type)%20(gitlab_workhorse_keywatcher_keywatchers)&g0.tab=0
However, because these instances are connected to the same Redis cluster, which has keyspace activity, every notification published to these channels is still picked up by these WH instances and consumes CPU and memory.
We should not run the keywatcher on Workhorses that do not service clients interested in these notifications.
Here is a CPU profile I pulled from stackprof from the websockets fleet, which seems to indicate that 50% of average CPU time was spent processing Redis notifications that no one is subscribed to:
We should consider not entering the Process function when there are no subscribers.
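To make the suggestion concrete, here is a minimal sketch of the early-return idea in Go. It is not the actual keywatcher code: the `watcher` type, its `subscribe` and `handle` methods, and the key name are all hypothetical stand-ins, assuming the real code keeps some per-key registry of waiting clients.

```go
package main

import (
	"fmt"
	"sync"
)

// watcher is a hypothetical stand-in for the keywatcher state:
// a registry of clients waiting on a change to a given Redis key.
type watcher struct {
	mu          sync.Mutex
	subscribers map[string][]chan string
}

func newWatcher() *watcher {
	return &watcher{subscribers: make(map[string][]chan string)}
}

// subscribe registers interest in key and returns a channel that
// receives the next value published for it.
func (w *watcher) subscribe(key string) <-chan string {
	ch := make(chan string, 1) // buffered so handle never blocks
	w.mu.Lock()
	defer w.mu.Unlock()
	w.subscribers[key] = append(w.subscribers[key], ch)
	return ch
}

// handle is called for every keyspace notification. It returns false
// without doing any further work when nobody is waiting on the key.
func (w *watcher) handle(key, value string) bool {
	w.mu.Lock()
	defer w.mu.Unlock()

	// Cheap check first: with no subscribers at all, skip everything.
	if len(w.subscribers) == 0 {
		return false
	}
	chans := w.subscribers[key]
	if len(chans) == 0 {
		return false
	}

	// Each subscriber is notified once and then removed.
	delete(w.subscribers, key)
	for _, ch := range chans {
		ch <- value
	}
	return true
}

func main() {
	w := newWatcher()
	key := "example:key:1" // hypothetical key name

	fmt.Println(w.handle(key, "v1")) // false: no subscribers, skipped
	ch := w.subscribe(key)
	fmt.Println(w.handle(key, "v2")) // true: delivered to the subscriber
	fmt.Println(<-ch)                // v2
}
```

Note that a guard like this only trims the per-message processing cost. The instance would still receive every notification from Redis, so not starting the keywatcher at all on fleets that do not need it is likely the more complete fix.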