Add chef-client disabler scripts to Chef-managed hosts
<!-- Please review https://about.gitlab.com/handbook/engineering/infrastructure/change-management/ for the most recent information on our change plans and execution policies. -->

# Production Change - Criticality 4 ~"C4"

| Change Objective | Install the `chef-client-disable` and `chef-client-enable` scripts on Chef-managed hosts |
|:---|:---|
| Change Type | ConfigurationChange |
| Services Impacted | None (adds utility scripts for chef-client) |
| Change Team Members | @msmiley |
| Change Criticality | ~C4 |
| Change Reviewer or tested in staging | Tested on the staging environment (see https://gitlab.com/gitlab-com/gl-infra/production/-/issues/1898#note_317510726) |
| Due Date | 2020-04-06 21:00 UTC (14:00 PDT) |
| Time tracking | 30 minutes (same for rollback) |

## Detailed steps for the change

### Pre-condition

Merge the chef-repo MR that bumps the version of cookbook `gitlab-server` to 1.8.1 in the production environment:

https://ops.gitlab.net/gitlab-cookbooks/chef-repo/-/merge_requests/3065

### Change steps

Run the `apply_to_prod` job of the pipeline for the MR mentioned above.

### Validation steps

Manually run chef-client on any example Chef-managed host in the target environment. Then verify that the scripts and symlinks are present in `/usr/local/bin`:

```shell
$ sudo chef-client
...
$ ls -l /usr/local/bin/chef-client-*
lrwxrwxrwx 1 root root   40 Apr  3 19:06 /usr/local/bin/chef-client-disable -> /usr/local/bin/chef-client-disabler-shim
-rwxr-xr-x 1 root root 6496 Apr  3 19:06 /usr/local/bin/chef-client-disabler-shim
-rwxr-xr-x 1 root root 5627 Apr  3 19:06 /usr/local/bin/chef-client-disabler-shim-test
lrwxrwxrwx 1 root root   40 Apr  3 19:06 /usr/local/bin/chef-client-enable -> /usr/local/bin/chef-client-disabler-shim
```

Optionally, run the acceptance tests.

```shell
$ chef-client-disabler-shim-test
```

Optionally, use the scripts as they would be run in practice, per the [updated runbook](https://gitlab.com/gitlab-com/runbooks/-/blob/b39e0f93dfe008ae4618dd4dd3d83ecf33004638/docs/uncategorized/disable-chef-runs-on-a-vm.md):

```shell
# Disable chef-client.
$ chef-client-disable 'Testing chef-client-disable script, see issue https://gitlab.com/gitlab-com/gl-infra/production/-/issues/1898'

# Show that periodic runs are disabled.
$ sudo systemctl is-active chef-client.service
$ sudo systemctl is-enabled chef-client.service

# Show that manual runs are disabled.
$ sudo chef-client

# Re-enable chef-client.
$ chef-client-enable

# Show that chef-client behavior is restored to normal.
$ sudo chef-client
```

<!--
For each step the following must be considered:
* pre-conditions for execution of the step - how to verify it is safe to proceed
* execution commands for the step - what to do
* post-execution validation for the step - how to verify the step succeeded

It is strongly recommended to:
* Note relevant graphs in grafana to monitor the effect of the change, including how to identify that it has worked, or has caused undue negative effects
* Review alerts that may go off that can be silenced pro-actively
-->

## Rollback steps

Because this change only adds scripts that are run manually by humans, a rollback, if needed, does not have to be rushed. Revert the MR and re-run chef-client.

If chef-client has been disabled using these scripts, it can be manually re-enabled as follows:

```shell
$ sudo rm -v /usr/bin/chef-client && sudo ln -s /opt/chef/bin/chef-client /usr/bin/chef-client
$ sudo systemctl enable chef-client.service
$ sudo systemctl start chef-client.service
```

<!--
* As for the original steps. It may be acceptable to reference the change steps as the process, with variations (e.g. Revert commit and run deployment).
* It is acceptable to list a full rollback process, and allow for the applier to select where to start based on how far through they got.
-->

## Changes checklist

<!--
Before commencing work, inform the person on-call at minimum. To find out who is on-call, in #production channel run: /chatops run oncall production.
-->

- [x] Detailed steps and rollback steps have been filled prior to commencing work
- [x] Person on-call has been informed prior to change being rolled out
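
The manual `ls -l` check under "Validation steps" could also be scripted when validating several hosts. The following is a minimal sketch and not part of the `gitlab-server` cookbook: the function names `check_link` and `verify_disabler_scripts` and the `BIN_DIR` variable are hypothetical, and it assumes the symlinks use the absolute targets shown in the `ls -l` listing above.

```shell
#!/bin/sh
# Sketch only: names below are illustrative, not provided by the cookbook.
# BIN_DIR defaults to the directory shown in the validation listing.
BIN_DIR="${BIN_DIR:-/usr/local/bin}"

# check_link NAME
# Succeeds only if $BIN_DIR/NAME is a symlink whose target is the shim.
check_link() {
    link="$BIN_DIR/$1"
    shim="$BIN_DIR/chef-client-disabler-shim"
    if [ ! -L "$link" ]; then
        echo "FAIL: $link is not a symlink" >&2
        return 1
    fi
    target=$(readlink "$link")
    if [ "$target" != "$shim" ]; then
        echo "FAIL: $link -> $target (expected $shim)" >&2
        return 1
    fi
    echo "OK: $link -> $target"
}

# verify_disabler_scripts
# Checks that the shim is executable and that both entry points link to it.
verify_disabler_scripts() {
    if [ ! -x "$BIN_DIR/chef-client-disabler-shim" ]; then
        echo "FAIL: $BIN_DIR/chef-client-disabler-shim is missing or not executable" >&2
        return 1
    fi
    check_link chef-client-disable && check_link chef-client-enable
}
```

After sourcing the file on a host, calling `verify_disabler_scripts` returns a non-zero status if the shim or either symlink is missing or points elsewhere. It does not replace the acceptance tests in `chef-client-disabler-shim-test`.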