Sidekiq doesn't start up when sidekiq.metrics.enabled is set to false
Summary
When metrics are disabled on the Sidekiq pods, the liveness and readiness probes stop working. This happens because the chart does not expose the port in the deployment when the setting is false: https://gitlab.com/gitlab-org/charts/gitlab/-/blob/4fc17f462001da727b9d1a577053dbfcaac3fd72/charts/gitlab/charts/sidekiq/templates/deployment.yaml#L163
Without that port definition, the probes have nothing to validate the state of the Pod against: https://gitlab.com/gitlab-org/charts/gitlab/-/blob/4fc17f462001da727b9d1a577053dbfcaac3fd72/charts/gitlab/charts/sidekiq/templates/deployment.yaml#L227-230
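To make the mismatch concrete, here is a simplified, hypothetical view of what the rendered container spec might look like with metrics disabled. Only port 3807 and the /readiness and /liveness paths come from this report (see the log lines further down); the container name and overall layout are assumptions and are not copied from the chart.

```yaml
# Hypothetical, simplified sketch; not the chart's actual rendered output.
containers:
  - name: sidekiq            # assumed container name
    # ports: block is not rendered because metrics.enabled is false
    livenessProbe:
      httpGet:
        path: /liveness      # path seen in the probe failure events
        port: 3807           # port seen in the probe failure events
    readinessProbe:
      httpGet:
        path: /readiness
        port: 3807
```

The probes are still rendered while the endpoint they target is switched off, which would match the "connection refused" events.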
Steps to reproduce
Disable metrics on the Sidekiq deployment, then watch the Sidekiq pod sit in a crash loop.
Configuration used
gitlab:
  sidekiq:
    metrics:
      enabled: false
Current behavior
Sidekiq never settles into a Ready state.
NAMESPACE   NAME                                     READY   STATUS    RESTARTS   AGE
default     a-sidekiq-all-in-1-v1-76846bcf96-sf8gj   0/1     Running   1          14m
Expected behavior
Sidekiq should start and reach a Ready state, even with metrics disabled.
Workaround
- Don't disable the metrics for sidekiq.
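In values terms, the workaround amounts to leaving the metrics flag on. A minimal sketch, mirroring the configuration from this report with the flag flipped (it assumes nothing else in your values overrides it):

```yaml
# Workaround sketch: keep Sidekiq metrics enabled so the probes have an endpoint to hit.
gitlab:
  sidekiq:
    metrics:
      enabled: true
```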
Questions
- Since we allow metrics to be disabled on the Sidekiq deployment, what should we do with the readiness and liveness probes?
- SHOULD we allow metrics to be disabled for Sidekiq?
- Are there other services where this configuration style is present that we've not yet come across?
- Does disabling the deployment of Prometheus also disable the metrics endpoint?
Relevant logs
0s Warning Unhealthy pod/a-sidekiq-all-in-1-v1-649bff8584-xjjv8 Readiness probe failed: Get http://172.17.0.8:3807/readiness: dial tcp 172.17.0.8:3807: connect: connection refused
0s Warning Unhealthy pod/a-sidekiq-all-in-1-v1-649bff8584-xjjv8 Liveness probe failed: Get http://172.17.0.8:3807/liveness: dial tcp 172.17.0.8:3807: connect: connection refused
Reference:
/cc @ricardofbarros /cc @WarheadsSE