Enable rolling update for runner managers for terraform changes
Problem
Every time we change the machine definition for our runner managers, Terraform ends up re-creating them by first deleting the machine and then creating it again. This is not ideal, because destroying an active runner manager terminates the jobs it is running.
For example, updating the base OS image results in runner manager deletion:
- Merge request: https://ops.gitlab.net/gitlab-com/gitlab-com-infrastructure/-/merge_requests/2794
- Plan: https://ops.gitlab.net/gitlab-com/gitlab-com-infrastructure/-/jobs/4399398

Plan: 3 to add, 0 to change, 3 to destroy.

This is because Terraform first deletes the machines and then creates them again.
Goal
Be able to apply Terraform changes without any downtime, with an easy or automated rollout.
Proposal
- Investigate usage of `lifecycle` `create_before_destroy` so that we can first create the runner manager and then provision a new set of runners without any downtime.
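
As a starting point for that investigation, here is a minimal sketch of what the `lifecycle` block could look like. It assumes the runner managers are `google_compute_instance` resources, which this issue does not state; the resource names, variable, machine type, zone, and network are all hypothetical. Because the old and new machines exist at the same time, the sketch also assumes the instance name needs a suffix that changes with the image, here generated with `random_id`.

```hcl
# Hypothetical variable: the base OS image for the runner manager.
variable "runner_manager_image" {
  type = string
}

# A new suffix is generated whenever the image changes, so the replacement
# machine gets a name that does not collide with the one still running.
resource "random_id" "runner_manager_suffix" {
  byte_length = 2

  keepers = {
    image = var.runner_manager_image
  }
}

# Assumed resource type and illustrative values only.
resource "google_compute_instance" "runner_manager" {
  name         = "runner-manager-${random_id.runner_manager_suffix.hex}"
  machine_type = "n1-standard-2"
  zone         = "us-east1-c"

  boot_disk {
    initialize_params {
      image = random_id.runner_manager_suffix.keepers.image
    }
  }

  network_interface {
    network = "default"
  }

  lifecycle {
    # Create the replacement machine first, then destroy the old one.
    create_before_destroy = true
  }
}
```

Note that `create_before_destroy` only changes the ordering: the old machine is still destroyed once the new one exists, so jobs running on the old runner manager would likely still need to be drained some other way before that point.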
Edited by Steve Xuereb