How to structure Sidekiq charts for splitting queues
While working on #38 (closed) and creating the container, the question of how to structure the charts got a bit complicated. How exactly will we structure the chart in order to handle splitting the queues into groups, but not have them exhaust the resources of a single `Pod` on a single `Node` of a K8s cluster? If we have all instances as `Container`s in a single `Deployment`, how will that handle scaling? Incrementing `replicas` will result in more `Pod`s being allocated, hopefully on different `Node`s. The question comes down to whether it makes sense to split the queues into separate `Deployment`s so that we can scale the workers on a specific set of queues, and not all workers at once.
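To make the trade-off concrete, here is a minimal sketch of the per-group approach (the name, labels, and image are placeholders I'm assuming for illustration, not anything settled in the chart). With one `Deployment` per queue group, each group carries its own `replicas` count, so one busy set of queues can be scaled without touching the rest:

```yaml
# Hypothetical per-group Deployment: bumping `replicas` here scales
# only the pipeline workers, not the other queue groups.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gitlab-sidekiq-pipeline      # placeholder name
spec:
  replicas: 4                        # scale only this queue group
  selector:
    matchLabels:
      app: sidekiq
      group: pipeline
  template:
    metadata:
      labels:
        app: sidekiq
        group: pipeline
    spec:
      containers:
        - name: sidekiq
          image: registry.example.com/gitlab-sidekiq:latest  # placeholder
```

With everything in a single shared `Deployment` instead, the same `replicas` increment would multiply every worker group at once.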
## Resources
GitLab has 59 queues, spread across increasing weight levels (1, 2, 3, 5), where a higher value means higher precedence. How should we break these down, based upon the experience from our production deployment on GitLab.com? Are any of these jobs more resource intensive than others in terms of CPU, memory, etc.?
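For context on what those weights do, here is a fragment of a plain `sidekiq.yml` (illustration only, not our full queue list). Sidekiq checks queues in proportion to their weight, so a weight-5 queue is checked five times as often as a weight-1 queue:

```yaml
# sidekiq.yml fragment (illustrative): weights are relative polling
# frequency, so post_receive (5) is checked 2.5x as often as new_note (2).
:queues:
  - [post_receive, 5]
  - [process_commit, 3]
  - [new_note, 2]
```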
Lastly, the documentation for Sidekiq configuration states:
> I don't recommend having more than a handful of queues. Lots of queues makes for a more complex system and Sidekiq Pro cannot reliably handle multiple queues without polling. `M` Sidekiq Pro processes polling `N` queues means `O(M*N)` operations per second slamming Redis.
The observations made by the Sidekiq developers give credence to our choice to support multiple Redis instances, though that is not currently a work item in this cycle. To put the `O(M*N)` cost in perspective: 10 Sidekiq Pro processes each polling all 59 queues would already be on the order of 590 Redis operations per polling pass.
## Templating
Depending on the answer from above, we'll need to look at how to dynamically create multiple `Container`s or `Deployment`s based on the way the configuration declares how to split the queues. Offhand, I think it makes sense to drive this from a hash-map in the configuration, as in the example below, which would then result in the creation of multiple named `Deployment`s and thus split the `Pod`s' load at the time of creation. Each `Deployment` would then have a separate `ConfigMap` passed to it in order to provide the configuration for the specific worker set. In the case of the example below, there would be `xx-sidekiq-workflow`, `xx-sidekiq-pipeline`, and so on. The open question is a method to actually do this, as range walking inside `deployment.yaml` and `configmap.yaml` will prove complicated, especially since we may need to extend this to include differences in `resources` or `nodeSelector` definitions. A rough template sketch follows the example.
```yaml
gitlab:
  sidekiq:
    pods:
      - name: workflow
        replicas: 2
        queues:
          - [post_receive, 5]
          - [merge, 5]
          - [update_merge_requests, 3]
          - [process_commit, 3]
          - [new_note, 2]
          - [new_issue, 2]
          - [new_merge_request, 2]
          - [build, 2]
      - name: pipeline
        replicas: 2
        queues:
          - [pipeline, 2]
          - [pipeline_processing, 5]
          - [pipeline_default, 3]
          - [pipeline_cache, 3]
          - [pipeline_hooks, 2]
      - name: email
        replicas: 1
        queues:
          - [email_receiver, 2]
          - [emails_on_push, 2]
          - [mailers, 2]
```
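One possible shape for that range walk, as a minimal sketch under assumed names (the labels, image, and name prefix are placeholders, and the real chart would still need `resources`/`nodeSelector` plumbing per group):

```yaml
# templates/deployment.yaml (sketch): emit one Deployment per entry in
# .Values.gitlab.sidekiq.pods, each wired to its own ConfigMap.
{{- range .Values.gitlab.sidekiq.pods }}
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ printf "%s-sidekiq-%s" $.Release.Name .name }}
spec:
  replicas: {{ .replicas }}
  selector:
    matchLabels:
      app: sidekiq
      group: {{ .name }}
  template:
    metadata:
      labels:
        app: sidekiq
        group: {{ .name }}
    spec:
      containers:
        - name: sidekiq
          image: registry.example.com/gitlab-sidekiq:latest  # placeholder
          volumeMounts:
            - name: config
              mountPath: /etc/sidekiq
      volumes:
        - name: config
          configMap:
            name: {{ printf "%s-sidekiq-%s" $.Release.Name .name }}
{{- end }}
```

The matching per-group `ConfigMap` could use the same `range` in `configmap.yaml`, rendering each group's weighted queue list:

```yaml
# templates/configmap.yaml (sketch): one ConfigMap per queue group,
# carrying that group's queue/weight pairs.
{{- range .Values.gitlab.sidekiq.pods }}
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ printf "%s-sidekiq-%s" $.Release.Name .name }}
data:
  sidekiq_queues.yml: |
    :queues:
    {{- range .queues }}
      - {{ toJson . }}
    {{- end }}
{{- end }}
```

Whether `range` plus `printf` stays manageable once per-group `resources` and `nodeSelector` overrides enter the picture is exactly the complication raised above.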
## Sidekiq Cluster
As I have never set this up myself, I need a better understanding of Sidekiq Cluster as a whole before I can judge its impact on the question above. Aside from that, since Sidekiq Cluster is an EE item, we'll need to handle it as an on/off feature once the EE charts are complete.
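The on/off switch could plausibly be a values flag guarding the EE-only resources; the flag name and structure below are assumptions, nothing is settled:

```yaml
# values.yaml (sketch): hypothetical EE-only toggle, off by default.
gitlab:
  sidekiq:
    cluster:
      enabled: false
```

```yaml
# templates/sidekiq-cluster-deployment.yaml (sketch): render nothing
# unless the EE toggle is set.
{{- if .Values.gitlab.sidekiq.cluster.enabled }}
# ... EE-only sidekiq-cluster resources rendered here ...
{{- end }}
```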