Re-balance file-28,26,23,27 gitaly node repositories
C2
Production Change - Criticality 2

Change Objective | Prevent a Gitaly node from running out of space |
---|---|
Change Type | C2 |
Services Impacted | Gitaly |
Change Team Members | @cmcfarland |
Change Severity | ~S2 |
Buddy check | A colleague will review the change |
Tested in staging | This process has been tested, but testing it in staging is not possible |
Schedule of the change | TBD - See Comments |
Duration of the change | Several hours, but unknown duration |
Detailed steps for the change. Each step must include: | See below |
These steps will be performed for the following migrations:
file-28 -> file-31
file-26 -> file-30
file-23 -> file-32
file-27 -> file-29
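
The four migrations above can be kept as a single list of source/target pairs so that each command is generated rather than typed by hand. The following is a minimal sketch, not the tooling used for this change: it only prints one dry-run command per pair for review and runs nothing. The function name `print_dry_run_commands` is hypothetical, the flags are copied from the command used in this change, and the mapping from `file-NN` to the server name `nfs-fileNN` is an assumption based on that command.

```shell
#!/bin/sh
# Sketch only: prints (does not run) one dry-run command per migration pair.
# Assumption: node "file-NN" corresponds to the server name "nfs-fileNN".
set -eu

print_dry_run_commands() {
  # Each input line is "<source node> <target node>".
  while read -r src dst; do
    [ -n "$src" ] || continue
    printf 'gitlab-rails runner /tmp/storage_rebalance.rb --current-file-server nfs-file%s --target-file-server nfs-file%s --dry-run true --wait 10800 --move-amount 1000\n' \
      "${src#file-}" "${dst#file-}"
  done <<'EOF'
file-28 file-31
file-26 file-30
file-23 file-32
file-27 file-29
EOF
}

print_dry_run_commands
```

Printing the commands instead of executing them keeps the operator in control of when each migration starts, which matters here because each run can take several hours.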
- Install the migration script on a common system (the console server).
- Dry run the migration script and look for problems.
- Execute the migration script in a tmux session on the console server during a low-utilization period.
  `time gitlab-rails runner /tmp/storage_rebalance.rb --current-file-server nfs-file22 --target-file-server nfs-file30 --dry-run true --wait 10800 --move-amount 1000 2>&1 | tee "migration.$(date +%Y-%m-%d_%H:%M).log"`
- Review any timed-out transactions and restore or repair any affected repositories to their proper writable status.
- Create a list of moved repositories to delete on file-22, then total their size.
  `find /var/opt/gitlab/git-data/repositories/@hashed -mindepth 2 -maxdepth 3 -name '*+moved*.git' > files_to_remove.txt`
  `< files_to_remove.txt xargs du -ch | tail -n1`
- Have another SRE review the files to be removed to avoid loss of data.
- Create a GCP snapshot of the disk on file-22.
- Record disk usage before the removal: `df -h /dev/sdb`
- Remove the files: `< files_to_remove.txt xargs -rn1 ionice -c 3 rm -fr`
- Record disk usage after the removal: `df -h /dev/sdb`
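
The manual review of `files_to_remove.txt` can be backed by a mechanical check before anything is deleted. The sketch below is an illustration and not part of the original procedure: the function name `check_removal_list` is hypothetical, and the accepted pattern (a path under the hashed repository root whose final component contains `+moved` and ends in `.git`) is inferred from the `find` command in the steps above. It prints every line that does not match and returns non-zero if any is found.

```shell
#!/bin/sh
# Sketch only: flags entries in a removal list that do not look like
# moved repositories under the hashed storage root.
REPO_ROOT="/var/opt/gitlab/git-data/repositories/@hashed"

check_removal_list() {
  list="$1"
  # Collect every line that is NOT "<REPO_ROOT>/...<name>+moved<...>.git".
  # grep exits non-zero when nothing is printed, hence the "|| true".
  bad=$(grep -v -E "^${REPO_ROOT}/.*\+moved[^/]*\.git\$" "$list" || true)
  if [ -n "$bad" ]; then
    printf 'Unexpected entries:\n%s\n' "$bad" >&2
    return 1
  fi
  return 0
}
```

A possible use is `check_removal_list files_to_remove.txt && echo "list looks consistent"`, run before the removal step. It complements the second SRE's review and does not replace it, since a path can match the pattern and still be a repository that should be kept.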
Edited by Cameron McFarland