Create a GitLab Runner process wrapper
As per gitlab-org/ci-cd/runner-tools/grit!90 (Add draft of the deployment mechanism design; merged; Tomasz Maczukin; 17.10), we want a way to control the existence of the GitLab Runner process programmatically, so that we can automate triggering graceful shutdown and checking the process status. However, this should be an optional setup. To avoid disrupting existing configurations, the main Runner process (started with gitlab-runner run) should keep working as it does today, and the wrapper should be, well, a wrapper around that process.
The idea is that we should implement a new runner command, for example gitlab-runner process-wrapper, that when started would:
- start a gRPC server at a given listening address (should be either placed in config.toml or configured by the CLI parameters; the latter seems better in this case)
- start the gitlab-runner run process under its own control.
gitlab-runner run would then work as it works today - it would read the right configuration file etc. We should make sure that through gitlab-runner process-wrapper we can pass the CLI options that gitlab-runner run may need (so that in the future we can easily integrate it with the system service managers, like we do for gitlab-runner run today).
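The option pass-through described above could be sketched as follows. This is only an illustration under stated assumptions: the `--` separator, the `--grpc-listen` option name, and the helper names `splitArgs`, `childArgs` and `startRunner` are all hypothetical and not part of any existing Runner interface. Only `gitlab-runner run` and its `--config` option are taken from the existing CLI.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// splitArgs separates the wrapper's own options from the options that are
// forwarded to the inner process. Using "--" as the separator is an
// assumption made for this sketch.
func splitArgs(args []string) (wrapperArgs, runArgs []string) {
	for i, a := range args {
		if a == "--" {
			return args[:i], args[i+1:]
		}
	}
	return args, nil
}

// childArgs builds the argument list of the inner process: the "run"
// subcommand followed by every forwarded option, unchanged.
func childArgs(runArgs []string) []string {
	return append([]string{"run"}, runArgs...)
}

// startRunner starts the inner process and returns without waiting for it,
// so the wrapper keeps control of the process handle.
func startRunner(binary string, runArgs []string) (*exec.Cmd, error) {
	cmd := exec.Command(binary, childArgs(runArgs)...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Start(); err != nil {
		return nil, fmt.Errorf("starting %s: %w", binary, err)
	}
	return cmd, nil
}

func main() {
	// Dry run: only print what would be started. Call startRunner to
	// actually launch the inner process.
	wrapperArgs, runArgs := splitArgs(os.Args[1:])
	fmt.Println("wrapper options:", wrapperArgs)
	fmt.Println("would start: gitlab-runner", childArgs(runArgs))
}
```

With this shape, a hypothetical invocation such as `gitlab-runner process-wrapper --grpc-listen tcp://127.0.0.1:7777 -- --config /etc/gitlab-runner/config.toml` would leave the wrapper options on the left of `--` and forward everything on the right to `gitlab-runner run` untouched.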
The gRPC server exposes two methods:
- Trigger for initiating graceful shutdown. When called, the wrapper should send SIGQUIT to the main runner process it owns. This initiates the graceful shutdown of that process, but the wrapper process keeps running. That is important for cases like deployments on Kubernetes, where we don't want to shut down the pod (yet!), as this would instantly remove the K8S service and, for example, drop the runner node from monitoring while it is still finishing jobs.
- Check for the current runner process state. While the runner is running, it returns running. When graceful shutdown was triggered but the inner process is still running, it returns shutting_down. When the inner process is gone, it returns gone. With that, the external tooling managing deployments can decide whether it's time to terminate the deployment entirely (e.g., terminate the K8S Pod in which the runner was started) or not.
Optionally, the graceful shutdown trigger could accept a webhook URL. Once the shutdown is done and the inner process has terminated, this webhook URL would be called, telling the deployment tooling that termination is complete and that whatever next steps are planned can now be executed.