
[SaaS] Seat overage email schedules excessive number of jobs and API requests

This is a follow up from the discussion here: https://gitlab.com/gitlab-org/gitlab/-/issues/348487#note_900886862

Problem

Currently, GitLab makes an API request to CustomersDot ~40K times daily to notify the customers app that a new member has been added to a group that has exceeded its number of purchased seats. This happens in the GitlabSubscriptions::NotifySeatsExceededWorker. Only ~100 of these requests (roughly 0.25%) actually result in an email being sent to the group owners.

This many network requests and jobs may impact performance or delay other jobs.

This problem led to the Sidekiq queues in CustomersDot getting inundated with jobs, SendSeatOverageNotificationJob continually enqu... (customers-gitlab-com#4934 - closed). In particular, we saw that one namespace had thousands of jobs and continued to enqueue more every few seconds. We put in a quick fix (https://gitlab.com/gitlab-org/customers-gitlab-com/-/merge_requests/5581) to prevent requests for this namespace from enqueuing more Reconciliations::SendSeatOverageNotificationJob jobs. We also added a CDot feature flag, block_seat_overage_notification, to make this blocking logic easy to enable. Once this issue is resolved, we should be able to roll back this logic in CDot and remove the feature flag as part of customers-gitlab-com#4948 (closed).

Proposal

We should flag the groups that have recently exceeded their purchased seat allowance and use a scheduled job to send this information to CustomersDot in batches, rather than individually.
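
As a rough illustration of the flag-then-batch idea, here is a minimal, self-contained Ruby sketch. It is not the actual GitLab or CustomersDot implementation: OverageFlagStore, BatchedSeatOverageNotifier, RecordingClient, the notify_seats_exceeded method, and the batch size of 100 are all hypothetical names and values chosen for the example. It assumes flags can be kept in a set-like store and that CustomersDot could accept a list of namespace IDs in one request.

```ruby
require 'set'

# Hypothetical stand-in for wherever the flags would live (for example a
# Redis set or a database column). Not an existing GitLab class.
class OverageFlagStore
  def initialize
    @namespace_ids = Set.new
  end

  # Flagging is idempotent: adding many members to the same group
  # records that group only once.
  def flag(namespace_id)
    @namespace_ids.add(namespace_id)
  end

  # Returns all flagged IDs and empties the store for the next run.
  def drain
    ids = @namespace_ids.to_a
    @namespace_ids.clear
    ids
  end
end

# Hypothetical stub for the CustomersDot client; it only records the
# batches it receives instead of making network requests.
class RecordingClient
  attr_reader :requests

  def initialize
    @requests = []
  end

  def notify_seats_exceeded(namespace_ids)
    @requests << namespace_ids
  end
end

# Hypothetical scheduled job: one request per batch of flagged
# namespaces, rather than one request per added member.
class BatchedSeatOverageNotifier
  def initialize(store:, client:, batch_size: 100)
    @store = store
    @client = client
    @batch_size = batch_size
  end

  # Sends every flagged namespace and returns the number of requests made.
  def perform
    batches = @store.drain.each_slice(@batch_size).to_a
    batches.each { |batch| @client.notify_seats_exceeded(batch) }
    batches.size
  end
end
```

With this shape, the number of requests per run depends on how many distinct groups were flagged, not on how many members were added. One caveat of the sketch: it clears the flags before sending, so a real implementation would need to decide how to retry a batch whose request fails.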

A potential solution would be:

Note: a more in-depth discussion of this issue started at https://gitlab.com/gitlab-org/gitlab/-/issues/348487#note_900886862, in case more context is needed.

Edited by Ryan Cobb