Security policy bot creation/removal race condition

Why are we doing this work

There is a race condition that can leave policy bots orphaned or erroneously removed. It results from OrchestrationConfigurationRemoveBotWorker not accepting, as a parameter, the ID of the bot user we expect it to delete.

Let's say we have a group containing 9 projects. Initially the group has a configuration linked, and none of the projects have a configuration of their own. We unassign the group's configuration, and as a result 9 OrchestrationConfigurationRemoveBotWorker jobs get enqueued. Now we assign a configuration to one of the projects, which enqueues OrchestrationConfigurationCreateBotWorker once. Let's say Sidekiq runs with 10 workers and there was a queue backlog. Eventually Sidekiq pops the 10 jobs (9x removal, 1x creation). Whether the bot user survives this sequence depends on job execution order: the removal jobs get dequeued before the creation job, but they are not necessarily executed before it.
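
The order dependence can be reproduced with a small in-memory simulation. This is only a sketch, not GitLab code: `Project`, `create_bot`, `remove_bot`, and `run` are hypothetical stand-ins, and it assumes that the creation job is a no-op when the project already has a bot and that the removal job deletes whatever bot is present.

```ruby
# Hypothetical in-memory stand-in for a project; not GitLab's actual model.
Project = Struct.new(:configured, :bot_id)

# Assumption: creation does nothing when a bot already exists.
def create_bot(project, new_id)
  project.bot_id ||= new_id
end

# Mirrors the reported problem: removal deletes whatever bot is present.
def remove_bot(project)
  project.bot_id = nil
end

# Runs the two jobs for the newly configured project in the given order
# and returns the bot ID that is left afterwards (nil means no bot).
def run(order)
  # Bot 1 exists from the former group-level configuration;
  # the project has just been assigned its own configuration.
  project = Project.new(true, 1)
  order.each do |job|
    job == :create ? create_bot(project, 2) : remove_bot(project)
  end
  project.bot_id
end

puts "removal first:  bot_id=#{run(%i[remove create]).inspect}"
puts "creation first: bot_id=#{run(%i[create remove]).inspect}"
```

With removal executed first, the project ends up with a fresh bot. With creation executed first, the project is left without a bot even though it has a configuration.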

To avoid this, we would need to pass the bot user ID to the removal worker, or guard both workers with a lock.
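
A minimal sketch of the first option, assuming the removal job receives the ID of the bot that existed when it was enqueued. The `Project` struct, the `remove_bot` signature, and `stale_removal_demo` are hypothetical and only illustrate the guard; the real worker interface may differ.

```ruby
# Hypothetical in-memory stand-in for a project; not GitLab's actual model.
Project = Struct.new(:bot_id)

# Removal that only deletes the bot it was enqueued for (assumed signature).
# Returns true when a bot was removed, false when the job is stale.
def remove_bot(project, expected_bot_id)
  return false if expected_bot_id.nil? || project.bot_id != expected_bot_id

  project.bot_id = nil
  true
end

# A removal job that runs after a newer bot was created becomes a no-op.
def stale_removal_demo
  project = Project.new(1)
  first = remove_bot(project, 1)   # removes bot 1
  project.bot_id = 2               # a creation job then adds bot 2
  second = remove_bot(project, 1)  # late job still targeting bot 1
  [first, second, project.bot_id]
end

p stale_removal_demo
```

Under these assumptions the guard protects a bot that was created after the removal job was enqueued. If the creation job instead runs while the old bot still exists and does nothing, the removal would still delete that bot, so that ordering likely needs the lock or an additional check that no configuration applies.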

Relevant links

n/a

Non-functional requirements

  • Documentation:
  • Feature flag:
  • Performance:
  • Testing:

Implementation plan

Verification steps

n/a

Edited by 🤖 GitLab Bot 🤖