Provision service may overwrite namespace attributes with undesired parameters when executing for an identical namespace
Summary
It seems that when multiple provisions are scheduled to run at the same time for the same namespace ID, nothing blocks one provision from overwriting the results of other provision(s), potentially resulting in an ambiguous end state.
During a recent renewal, an Ultimate tier group was left with a monthly compute quota of 400, immediately blocking pipelines due to an insufficient quota, triggering a customer emergency.
The renewal involved a debook/rebook, so a cancelled subscription existed alongside a newly created one. Provisioning jobs were queued for both subscriptions and executed within seconds of each other. The Gitlab::Namespaces::UpdateService task for the cancelled subscription ran after the one for the new subscription, so the attribute shared_runners_minutes_limit was overwritten with the free tier value, 400, while the group remained on Ultimate.
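One way to picture a guard against this ordering problem is sketched below. It is an illustration under stated assumptions, not the provision service's actual code: the Namespace and ProvisionUpdate classes, the subscription_version and applied_version fields, and the apply_if_current function are all hypothetical names. The idea is that an update coming from an older subscription is skipped rather than applied last.

```python
from dataclasses import dataclass, field


@dataclass
class Namespace:
    attributes: dict = field(default_factory=dict)
    # Hypothetical field: version of the newest subscription applied so far.
    applied_version: int = 0


@dataclass
class ProvisionUpdate:
    # Hypothetical field: a higher number means a newer subscription.
    subscription_version: int
    attributes: dict


def apply_if_current(namespace: Namespace, update: ProvisionUpdate) -> bool:
    """Apply the update unless it comes from an older subscription.

    Returns True when the update was applied, False when it was skipped.
    """
    if update.subscription_version < namespace.applied_version:
        return False  # stale: a newer subscription has already provisioned
    namespace.attributes.update(update.attributes)
    namespace.applied_version = update.subscription_version
    return True


namespace = Namespace()
new_sub = ProvisionUpdate(2, {"plan": "ultimate", "shared_runners_minutes_limit": 50000})
old_sub = ProvisionUpdate(1, {"plan": None, "shared_runners_minutes_limit": 400})

# Same order as in the incident: new subscription first, cancelled one second.
applied_new = apply_if_current(namespace, new_sub)
applied_old = apply_if_current(namespace, old_sub)
print(applied_new, applied_old, namespace.attributes)
```

In a real concurrent setting the version check and the write would also need to happen atomically per namespace (for example under a lock or inside a transaction); the sketch leaves that out.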
Both provisions ran at the start of the day on 2025-04-10, at roughly 00:00 UTC:
- New subscription Order: https://customers.gitlab.com/admin/order/1033149
  - related provision: https://customers.gitlab.com/admin/provision/182881
- Old subscription Order: https://customers.gitlab.com/admin/order/709844
  - related provision: https://customers.gitlab.com/admin/provision/182349
    - notice this is skipped and the checkpoints are from 2025-04-03, but the updated_at timestamp matches the timeframe here
The series of log events for provision 182881 (new subscription):
https://cloudlogging.app.goo.gl/Rtqs5hJeuoeKKLWRA
Updating Namespace occurs at 2025-04-10T00:00:07.035Z, with these attributes:
attributes: {
additional_purchased_storage_ends_on: null
additional_purchased_storage_size: 0
event_type: "sync"
plan: "ultimate"
shared_runners_minutes_limit: 50000
}
The series of log events for provision 182349 (old subscription):
https://cloudlogging.app.goo.gl/a8RtWEAERD2mVxJd7
Updating Namespace occurs at 2025-04-10T00:00:12.874Z, with these attributes:
attributes: {
additional_purchased_storage_ends_on: null
additional_purchased_storage_size: 0
event_type: "sync"
plan: null
shared_runners_minutes_limit: 400
}
Notably, this second event includes plan: null, yet the namespace ended up with both plan "ultimate" and shared_runners_minutes_limit 400 (evidence of this is included in the ticket). This suggests that the null plan was not applied while the limit was.
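The mixed end state can be reproduced with a small simulation. This is only a sketch of one possible explanation, not the confirmed behaviour of Gitlab::Namespaces::UpdateService: it assumes that updates are applied in execution order and that null-valued attributes are skipped rather than written. The apply_update helper and the dictionary layout are hypothetical; the plan and quota values and the timestamps come from the two log events above (the storage attributes and event_type are omitted for brevity).

```python
from datetime import datetime, timezone


def apply_update(namespace: dict, attributes: dict) -> dict:
    """Hypothetical merge: copy attributes onto the namespace, skipping nulls."""
    merged = dict(namespace)
    for key, value in attributes.items():
        if value is not None:  # assumption: null attributes are not written
            merged[key] = value
    return merged


# Payloads taken from the two "Updating Namespace" log events.
events = [
    {
        "provision": 182349,  # old (cancelled) subscription
        "at": datetime(2025, 4, 10, 0, 0, 12, 874000, tzinfo=timezone.utc),
        "attributes": {"plan": None, "shared_runners_minutes_limit": 400},
    },
    {
        "provision": 182881,  # new subscription
        "at": datetime(2025, 4, 10, 0, 0, 7, 35000, tzinfo=timezone.utc),
        "attributes": {"plan": "ultimate", "shared_runners_minutes_limit": 50000},
    },
]

# Replay the updates in the order they were logged (last writer wins).
namespace = {}
for event in sorted(events, key=lambda e: e["at"]):
    namespace = apply_update(namespace, event["attributes"])

print(namespace)
# → {'plan': 'ultimate', 'shared_runners_minutes_limit': 400}
```

Under these assumptions the later update from the cancelled subscription leaves the plan untouched but lowers the quota, which matches the state observed on the namespace.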
Workarounds
I manually resynced the new subscription via admin tooling.
Reported examples
Support Priority Score: (1, -, -, -, -, -, 3, -, 3, -, 3) => 10