Increase number of Ansible forks used by deployment pipelines
Ansible has a configuration setting, `forks` (environment variable `ANSIBLE_FORKS`), that controls the maximum number of parallel processes it uses to run tasks on target hosts.
The value of this configuration parameter is controlled by the `ANSIBLE_NUM_FORKS` environment variable, set in the Deployer CI/CD settings at https://ops.gitlab.net/gitlab-com/gl-infra/deployer/-/settings/ci_cd. It applies to all pipelines executed in Deployer.
Executing more tasks in parallel means that the overall operation takes less time. But executing more in parallel also requires more computing resources.
Increasing the number of Ansible forks used by deployment pipelines has the potential to make the pipelines much faster. But we need to do it carefully in order to make sure we don't hit any resource limits in the runners.
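As a rough illustration of the relationship between the two variables, the sketch below shows one way a wrapper could translate `ANSIBLE_NUM_FORKS` into `ANSIBLE_FORKS` before invoking Ansible. How Deployer actually wires this up is not shown here, so the helper name `ansible_env` and the fallback behavior are assumptions made for the example.

```python
import os

# Ansible's built-in default for the `forks` setting.
ANSIBLE_DEFAULT_FORKS = 5


def ansible_env(base_env=None):
    """Return a copy of the environment with ANSIBLE_FORKS derived from ANSIBLE_NUM_FORKS.

    Hypothetical helper: falls back to Ansible's default when the CI/CD
    variable is missing, non-numeric, or less than 1.
    """
    env = dict(os.environ if base_env is None else base_env)
    raw = env.get("ANSIBLE_NUM_FORKS", "")
    try:
        forks = int(raw)
    except ValueError:
        forks = ANSIBLE_DEFAULT_FORKS
    if forks < 1:
        forks = ANSIBLE_DEFAULT_FORKS
    env["ANSIBLE_FORKS"] = str(forks)
    return env


# Example use (playbook name is a placeholder, not a real Deployer file):
#   subprocess.run(["ansible-playbook", "site.yml"], env=ansible_env(), check=True)
```

Falling back to the default rather than failing keeps a typo in the CI/CD variable from blocking a deployment, at the cost of a slower run.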
@skarbek previously increased the number of forks from the default of 5 to 15. We saw major improvements in the coordinated pipeline duration due to this change.
Action in progress
We are currently increasing `ANSIBLE_NUM_FORKS` in steps of 1 per day and watching for any unusual behavior in deployment pipelines.
Log of changes to `ANSIBLE_NUM_FORKS`
Date | Change | Changer |
---|---|---|
2023-08-22 05:25 UTC | Increased `ANSIBLE_NUM_FORKS` to 17 | Reuben |
2023-08-24 06:47 UTC | Increased `ANSIBLE_NUM_FORKS` to 18 | Reuben |
2023-08-29 15:19 UTC | Increased `ANSIBLE_NUM_FORKS` to 19 | Reuben |
2023-09-01 10:53 UTC | Increased `ANSIBLE_NUM_FORKS` to 20 | Reuben |
2023-09-05 12:50 UTC | Increased `ANSIBLE_NUM_FORKS` to 21 | Reuben |
2023-09-07 08:18 UTC | Increased `ANSIBLE_NUM_FORKS` to 22 | Reuben |
2023-09-12 12:43 UTC | Increased `ANSIBLE_NUM_FORKS` to 23 | Reuben |
2023-09-19 08:21 UTC | Increased `ANSIBLE_NUM_FORKS` to 24 | Reuben |
2023-09-21 12:29 UTC | Increased `ANSIBLE_NUM_FORKS` to 25 | Reuben |
2023-10-10 11:27 UTC | Increased `ANSIBLE_NUM_FORKS` to 26 | Reuben |
2023-10-16 13:13 UTC | Increased `ANSIBLE_NUM_FORKS` to 31 | Reuben |
2023-10-19 12:01 UTC | Increased `ANSIBLE_NUM_FORKS` to 36 | Reuben |
2023-10-31 11:51 UTC | Increased `ANSIBLE_NUM_FORKS` to 41 | Reuben |
2023-12-11 10:06 UTC | Increased `ANSIBLE_NUM_FORKS` to 51 | Reuben |
2023-12-20 16:40 UTC | Increased `ANSIBLE_NUM_FORKS` to 61 | Reuben |
2024-01-18 12:42 UTC | Increased `ANSIBLE_NUM_FORKS` to 70 | Reuben |
2024-01-19 07:42 UTC | Decreased `ANSIBLE_NUM_FORKS` to 61 | Reuben |
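To see how the step size evolved over the log above, a small sketch like the following computes the difference between consecutive values. The fork values are copied from the table; the function name `step_sizes` is just for illustration.

```python
# Fork values copied from the change log, in chronological order.
FORK_VALUES = [17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 31, 36, 41, 51, 61, 70, 61]


def step_sizes(values):
    """Return the change between each consecutive pair of values."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


steps = step_sizes(FORK_VALUES)

# Pairs (from, to) where the value went down, i.e. a rollback.
rollbacks = [
    (earlier, later)
    for earlier, later in zip(FORK_VALUES, FORK_VALUES[1:])
    if later < earlier
]

print(steps)      # [1, 1, 1, 1, 1, 1, 1, 1, 1, 5, 5, 5, 10, 10, 9, -9]
print(rollbacks)  # [(70, 61)]
```

The output shows the steps growing from 1 to 5 to 10 as confidence increased, with a single rollback from 70 to 61.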
Monitoring
We can monitor resource usage of runners using the following dashboards:
- For non-k8s runners: https://dashboards.gitlab.net/d/bd2Kl9Imk/host-stats?orgId=1&var-env=ops&var-node=runner-release-01-inf-ops.c.gitlab-ops.internal&from=1692735235349&to=1692740070386
  You can get the runner name from the job logs. Look for a log line like the following:
  `Running on runner-372d3b7c-project-151-concurrent-0 via runner-release-01-inf-ops`
  In this case, `runner-release-01-inf-ops` is the runner name. Paste it into the `Host` select box in the dashboard and select the matching entry.
- I'm not sure if deployment pipelines use any K8s runners, but I'm just documenting this here in case they do.
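If you need to pull the runner name out of many job logs, the lookup described above can be sketched with a regular expression. This is a minimal example that assumes the log line has the `Running on ... via ...` shape shown above; the function name `runner_name` is hypothetical, and trailing dots are stripped in case the log line ends with an ellipsis.

```python
import re

# Matches lines such as:
#   Running on runner-372d3b7c-project-151-concurrent-0 via runner-release-01-inf-ops
RUNNER_LINE = re.compile(r"Running on (?P<executor>\S+) via (?P<runner>\S+)")


def runner_name(job_log):
    """Return the runner name from the first matching line, or None if absent."""
    for line in job_log.splitlines():
        match = RUNNER_LINE.search(line)
        if match:
            # Strip trailing dots in case the line ends with "...".
            return match.group("runner").rstrip(".")
    return None


sample = "Running on runner-372d3b7c-project-151-concurrent-0 via runner-release-01-inf-ops"
print(runner_name(sample))  # runner-release-01-inf-ops
```

The returned name is what you would paste into the `Host` select box on the dashboard.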