Change Request [PRD] Run rake task `gitlab:cleanup:delete_orphan_job_artifact_final_objects`
Production Change
Change Summary
This is the next step after we listed all orphan artifact objects in https://gitlab.com/gitlab-com/gl-infra/production/-/issues/17579.
This is a request to execute the `gitlab:cleanup:delete_orphan_job_artifact_final_objects` rake task (added in gitlab-org/gitlab!146093) on production.
What does the rake task do?
In https://gitlab.com/gitlab-com/gl-infra/production/-/issues/17579, we executed the `gitlab:cleanup:list_orphan_job_artifact_final_objects` rake task to identify the orphan artifact objects. The rake task listed them in a CSV file (`/opt/gitlab/embedded/service/gitlab-rails/tmp/orphan_job_artifact_final_objects.csv`).
Because the rake task can run for a long time (depending on the size of the CSV list), we added "resume from last page marker" functionality to it. If the rake task is abruptly interrupted (e.g. the node is killed) after it has already processed hundreds of objects in the list, it resumes from the last known file cursor when we re-run it.
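The resume behaviour can be pictured with a minimal Ruby sketch. It is not the rake task's actual code: the helper name `process_with_cursor`, the `batch_size` parameter, and the use of a local file to persist the cursor are all assumptions made for illustration (where the real task stores its cursor is not stated in this issue). The idea is to record the file's byte offset after every few lines and to seek to that offset on the next run.

```ruby
# Hypothetical helper, not the rake task's real implementation.
# Yields each line of `csv_path`, starting from the byte offset stored in
# `cursor_path` (or from the beginning if no cursor has been saved yet).
def process_with_cursor(csv_path, cursor_path, batch_size: 5)
  # ASSUMPTION: the cursor is persisted in a local file as a decimal offset.
  start = File.exist?(cursor_path) ? File.read(cursor_path).to_i : 0

  File.open(csv_path, "r") do |file|
    file.seek(start)
    file.each_line.with_index(1) do |line, count|
      yield line.chomp
      # Checkpoint the offset of the next unread line after each full batch.
      File.write(cursor_path, file.pos.to_s) if (count % batch_size).zero?
    end
    # Final checkpoint so that a completed run is not repeated.
    File.write(cursor_path, file.pos.to_s)
  end
end
```

In this sketch an interruption costs at most one batch of repeated work: lines handled after the last saved offset are yielded again on the next run, so the per-line action needs to be safe to repeat.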
Here's how to execute the rake task:
If `FILENAME` is not specified, the rake task will look for a CSV file named `orphan_job_artifact_final_objects.csv`:
> rake 'gitlab:cleanup:delete_orphan_job_artifact_final_objects'
If we want to use a custom filename, or if the file is located outside of the gitlab rails directory, we can do:
> FILENAME='some/path/custom_filename.csv' rake 'gitlab:cleanup:delete_orphan_job_artifact_final_objects'
As the rake task processes orphan objects and deletes them one by one, it also logs the deleted objects to a separate CSV file. The filename of the list of deleted objects has the format `deleted_from--<filename_of_the_base_csv>`. For example, if `FILENAME` was `some/path/custom_filename.csv`, we will have `some/path/deleted_from--custom_filename.csv`. This separate file is kept for emergencies: if something goes wrong and we need to roll back the deletions, a separate rake task processes this list. See the Rollback section for more information.
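As an illustration of the naming convention only, the mapping from the input file to the deleted-objects file can be sketched in Ruby. The helper name `deleted_list_path` is hypothetical and is not taken from the rake task's source.

```ruby
# Hypothetical helper: derive the deleted-objects filename by prefixing the
# base name of the input CSV with "deleted_from--", keeping the directory.
def deleted_list_path(filename)
  File.join(File.dirname(filename), "deleted_from--#{File.basename(filename)}")
end

puts deleted_list_path("some/path/custom_filename.csv")
# prints "some/path/deleted_from--custom_filename.csv"
```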
Please note that when we re-run the rake task and there's an existing file for the list of deleted objects, it will just append new entries into the file. If no existing file is found, the rake task will create a new file and then continue adding deleted entries into it.
The generated list of deleted objects doesn't have headers, and will have entries with 3 comma-separated values that will look like:
35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/1a/1a/5abfa4ec66f1cc3b681a4d430b8b04596cbd636f13cdff44277211778f26,201,1711616743796587
The first value is the object path, the second value is the object size, and the third one is the last known generation/version of the deleted object.
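For anyone who needs to inspect the deleted-objects file by hand, a small Ruby sketch of reading one entry follows. The struct and helper names are hypothetical, and the sketch assumes that no field contains a comma, which holds for the sample entry above.

```ruby
# Hypothetical container for one line of the deleted-objects CSV.
DeletedEntry = Struct.new(:path, :size_bytes, :generation)

# ASSUMPTION: fields never contain commas, so a plain split is sufficient.
def parse_deleted_entry(line)
  path, size, generation = line.chomp.split(",", 3)
  DeletedEntry.new(path, Integer(size), Integer(generation))
end

entry = parse_deleted_entry(
  "35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/1a/1a/5abfa4ec66f1cc3b681a4d430b8b04596cbd636f13cdff44277211778f26,201,1711616743796587"
)
puts entry.size_bytes  # object size in bytes
puts entry.generation  # last known generation of the deleted object
```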
Sample rake task output:
I, [2024-04-02T00:07:36.690721 #213] INFO -- : Processing orphan_job_artifact_final_objects.csv...
I, [2024-04-02T00:07:36.693097 #213] INFO -- : No last cursor position found, starting from beginning.
I, [2024-04-02T00:07:40.006406 #213] INFO -- : Deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/0a/9c/66f4284c21a6cf6d51ecca8334522d49233d6501c5033a904cb1930c76a0 (201 bytes)
I, [2024-04-02T00:07:40.336193 #213] INFO -- : Deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/14/b4/887a988ab3c97fef27a8a2d2b172bfc49f042d79a1296795c91d1ecd8e3b (201 bytes)
I, [2024-04-02T00:07:40.458631 #213] INFO -- : Deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/28/0b/eb705c309aa41c53428b08ec24c01cf294981db3bb8b63345d8898dea9b6 (201 bytes)
I, [2024-04-02T00:07:40.573119 #213] INFO -- : Deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/2e/ca/cc269beba0591b258d7d63b1dac738a78375e35e88ff055c6a1c2fee40e6 (201 bytes)
I, [2024-04-02T00:07:40.700862 #213] INFO -- : Deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/3a/bd/34ec6bb5e44aff3550a8b38ffa424f16261c118869ccfa64d1b1b98892c2 (201 bytes)
I, [2024-04-02T00:07:40.701850 #213] INFO -- : Saved current cursor position: 745
I, [2024-04-02T00:07:43.999245 #213] INFO -- : Deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/4b/78/2377f3bb1d9f32fd53aa5d3940c9964b183429f1d81c1ff606a7f89d7b13 (201 bytes)
I, [2024-04-02T00:07:44.122188 #213] INFO -- : Deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/53/c5/78bb489733f62ec9690379b02744704398c4413148a682232b2556e7fc17 (201 bytes)
I, [2024-04-02T00:07:44.228025 #213] INFO -- : Deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/56/70/4a4e5b68da53d50a676f183a6c6f045e55fb135ab654561f7bfea98fce2e (201 bytes)
I, [2024-04-02T00:07:44.338502 #213] INFO -- : Deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/73/82/edb85f423ff130436f2868f6da7ca1eb4a97b8927a51e86cd1befb61f6b4 (201 bytes)
I, [2024-04-02T00:07:44.469596 #213] INFO -- : Deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/7c/42/80b5bfa69392722e6a757d578a52dca12597956226d56a3c8794f9049e21 (201 bytes)
I, [2024-04-02T00:07:44.471887 #213] INFO -- : Saved current cursor position: 1490
I, [2024-04-02T00:07:47.235749 #213] INFO -- : Deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/82/eb/6ca46c9710eead0ce34be5ef71673a00effdebe921f6926216cf0525b606 (201 bytes)
I, [2024-04-02T00:07:47.356105 #213] INFO -- : Deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/94/6d/59d67571fcc6cf03c36be0a14268d4fcb09d15bbd459e5cc26c23cb06389 (201 bytes)
I, [2024-04-02T00:07:47.470968 #213] INFO -- : Deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/94/8d/5b506c487d346c916544fdc25d05f7c3815689689a22cbd4855be8fe65f5 (201 bytes)
I, [2024-04-02T00:07:47.619308 #213] INFO -- : Deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/9b/39/a3f1ef5539bfb5998f2b6cf5378d42f10709bb33911d59436e0f5641735e (201 bytes)
I, [2024-04-02T00:07:47.968405 #213] INFO -- : Deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/9d/74/2852dc3abd4c79ff51a669533c1842f137f05156480a9833956b1353bd89 (201 bytes)
I, [2024-04-02T00:07:47.971062 #213] INFO -- : Saved current cursor position: 2235
I, [2024-04-02T00:07:51.287346 #213] INFO -- : Deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/a6/d8/c8b6098b9a3217aa8df331c52031618f1d680096230eb8d526d83b6b6e28 (201 bytes)
I, [2024-04-02T00:07:51.406300 #213] INFO -- : Deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/a8/d6/ece83a3a0fdf9aa7c39f64c9fc008acd87cc07464024b36b10410c2cbed1 (201 bytes)
I, [2024-04-02T00:07:51.512683 #213] INFO -- : Deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/c9/9e/adb50d75bddf3d40416d8aeb153da2a476b25e4bbe4e6af10b6cedc29702 (201 bytes)
I, [2024-04-02T00:07:51.628065 #213] INFO -- : Deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/d0/f5/13e10c65af876e49963c9368dbae7a1e1cd601134096d1ed1aaf72ca2f0d (201 bytes)
I, [2024-04-02T00:07:51.811070 #213] INFO -- : Deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/ea/3f/a644943797e33707a7ce6b9dd4fb9e5d953a179d8a9ae90a78002f838e95 (201 bytes)
I, [2024-04-02T00:07:51.811785 #213] INFO -- : Saved current cursor position: 2980
I, [2024-04-02T00:07:51.812348 #213] INFO -- : Done. All deleted objects are listed in deleted_from--orphan_job_artifact_final_objects.csv.
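The cursor positions in the sample output are consistent with byte offsets into the input CSV, saved after every five deletions. The check below is a sketch under one assumption that is not stated in this issue: that each input line has the form `<object path>,<size>` followed by a newline. The second calculation applies the same idea to the deleted-objects file, whose three-value line format is described above.

```ruby
# Object path copied from the first "Deleted object" line of the sample output.
path = "35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/0a/9c/66f4284c21a6cf6d51ecca8334522d49233d6501c5033a904cb1930c76a0"

# ASSUMPTION: each line of the input CSV is "<path>,<size>\n".
input_line = "#{path},201\n"
bytes_per_batch = input_line.bytesize * 5

# Line format of the deleted-objects CSV: "<path>,<size>,<generation>\n".
deleted_line = "#{path},201,1711616739755059\n"

puts input_line.bytesize    # bytes per input line
puts bytes_per_batch        # compare with "Saved current cursor position: 745"
puts deleted_line.bytesize  # compare with the rollback cursor step of 166
```

If the assumption holds, the saved cursor advances by a fixed amount per batch only because every object path in this sample has the same length; in a real list the steps would vary with path and size lengths.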
Change Details
- Services Impacted - ~Service::GoogleCloudStorage
- Change Technician - @iamricecake, SRE to execute the rake task TBD
- Change Reviewer - @igorwwwwwwwwwwwwwwwwwwww
- Time tracking - Hours
- Downtime Component - N/A
Detailed steps for the change
Change Steps - steps to take to execute the change
NOTE: Execute all these first on staging.
Estimated Time to Complete (mins) - Estimated Time to Complete in Minutes
- Set label ~change::in-progress `/label ~change::in-progress`
- Run the rake task on the `console-01-sv-gprd.c.gitlab-production.internal` host.
  - Since this is a potentially long-running task, we should run it inside of a `screen` or `tmux` session.
  - Run `sudo -u git env FILENAME='/opt/gitlab/embedded/service/gitlab-rails/tmp/orphan_job_artifact_final_objects.csv' gitlab-rake 'gitlab:cleanup:delete_orphan_job_artifact_final_objects'`
    - In the case the rake task is abruptly interrupted, we just re-execute the same command and it will resume from last known file cursor, appending newly deleted objects to the previously generated `deleted_from--orphan_job_artifact_final_objects.csv` (default filename).
  - Ensure that the rake task finished completely by checking that it printed out `Done`.
  - Ensure there is the file generated `deleted_from--orphan_job_artifact_final_objects.csv` in the `tmp` folder.
  - Keep the file for later use just in case we need to rollback the deletion.
- Set label ~change::complete `/label ~change::complete`
Rollback
Rollback steps - steps to be taken in the event of a need to rollback this change
In the unexpected event that something goes wrong because of the deletions, we can roll back the deleted objects.
How to run the rollback rake task:
sudo -u git env FILENAME='/opt/gitlab/embedded/service/gitlab-rails/tmp/deleted_from--orphan_job_artifact_final_objects.csv' gitlab-rake 'gitlab:cleanup:rollback_deleted_orphan_job_artifact_final_objects'
This will restore deleted objects to the saved generation values in the CSV.
Sample output:
I, [2024-04-02T00:37:37.325905 #2392] INFO -- : Processing deleted_from--orphan_job_artifact_final_objects.csv...
I, [2024-04-02T00:37:37.327973 #2392] INFO -- : No last cursor position found, starting from beginning.
I, [2024-04-02T00:37:38.750753 #2392] INFO -- : Rolled back deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/0a/9c/66f4284c21a6cf6d51ecca8334522d49233d6501c5033a904cb1930c76a0 to generation 1711616739755059
I, [2024-04-02T00:37:38.752243 #2392] INFO -- : Saved current cursor position: 166
I, [2024-04-02T00:37:39.566428 #2392] INFO -- : Rolled back deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/14/b4/887a988ab3c97fef27a8a2d2b172bfc49f042d79a1296795c91d1ecd8e3b to generation 1711616740487864
I, [2024-04-02T00:37:39.567106 #2392] INFO -- : Saved current cursor position: 332
I, [2024-04-02T00:37:40.242750 #2392] INFO -- : Rolled back deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/28/0b/eb705c309aa41c53428b08ec24c01cf294981db3bb8b63345d8898dea9b6 to generation 1711616741181104
I, [2024-04-02T00:37:40.243715 #2392] INFO -- : Saved current cursor position: 498
I, [2024-04-02T00:37:40.835241 #2392] INFO -- : Rolled back deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/2e/ca/cc269beba0591b258d7d63b1dac738a78375e35e88ff055c6a1c2fee40e6 to generation 1711616741904817
I, [2024-04-02T00:37:40.835951 #2392] INFO -- : Saved current cursor position: 664
I, [2024-04-02T00:37:41.531578 #2392] INFO -- : Rolled back deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/3a/bd/34ec6bb5e44aff3550a8b38ffa424f16261c118869ccfa64d1b1b98892c2 to generation 1711985733917213
I, [2024-04-02T00:37:41.532086 #2392] INFO -- : Saved current cursor position: 830
I, [2024-04-02T00:37:42.024401 #2392] INFO -- : Rolled back deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/4b/78/2377f3bb1d9f32fd53aa5d3940c9964b183429f1d81c1ff606a7f89d7b13 to generation 1711616742606343
I, [2024-04-02T00:37:42.026683 #2392] INFO -- : Saved current cursor position: 996
I, [2024-04-02T00:37:42.494218 #2392] INFO -- : Rolled back deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/53/c5/78bb489733f62ec9690379b02744704398c4413148a682232b2556e7fc17 to generation 1711985762443278
I, [2024-04-02T00:37:42.495893 #2392] INFO -- : Saved current cursor position: 1162
I, [2024-04-02T00:37:42.952732 #2392] INFO -- : Rolled back deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/56/70/4a4e5b68da53d50a676f183a6c6f045e55fb135ab654561f7bfea98fce2e to generation 1711616743311151
I, [2024-04-02T00:37:42.952971 #2392] INFO -- : Saved current cursor position: 1328
I, [2024-04-02T00:37:43.634809 #2392] INFO -- : Rolled back deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/73/82/edb85f423ff130436f2868f6da7ca1eb4a97b8927a51e86cd1befb61f6b4 to generation 1711985741007525
I, [2024-04-02T00:37:43.635095 #2392] INFO -- : Saved current cursor position: 1494
I, [2024-04-02T00:37:44.438048 #2392] INFO -- : Rolled back deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/7c/42/80b5bfa69392722e6a757d578a52dca12597956226d56a3c8794f9049e21 to generation 1711959334851138
I, [2024-04-02T00:37:44.438995 #2392] INFO -- : Saved current cursor position: 1660
I, [2024-04-02T00:37:44.951492 #2392] INFO -- : Rolled back deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/82/eb/6ca46c9710eead0ce34be5ef71673a00effdebe921f6926216cf0525b606 to generation 1711985748158900
I, [2024-04-02T00:37:44.952330 #2392] INFO -- : Saved current cursor position: 1826
I, [2024-04-02T00:37:45.481900 #2392] INFO -- : Rolled back deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/94/6d/59d67571fcc6cf03c36be0a14268d4fcb09d15bbd459e5cc26c23cb06389 to generation 1711985754952829
I, [2024-04-02T00:37:45.484499 #2392] INFO -- : Saved current cursor position: 1992
I, [2024-04-02T00:37:45.985180 #2392] INFO -- : Rolled back deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/94/8d/5b506c487d346c916544fdc25d05f7c3815689689a22cbd4855be8fe65f5 to generation 1711985658409855
I, [2024-04-02T00:37:45.985917 #2392] INFO -- : Saved current cursor position: 2158
I, [2024-04-02T00:37:46.406216 #2392] INFO -- : Rolled back deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/9b/39/a3f1ef5539bfb5998f2b6cf5378d42f10709bb33911d59436e0f5641735e to generation 1711985771208802
I, [2024-04-02T00:37:46.406862 #2392] INFO -- : Saved current cursor position: 2324
I, [2024-04-02T00:37:47.150244 #2392] INFO -- : Rolled back deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/9d/74/2852dc3abd4c79ff51a669533c1842f137f05156480a9833956b1353bd89 to generation 1711616744241924
I, [2024-04-02T00:37:47.153147 #2392] INFO -- : Saved current cursor position: 2490
I, [2024-04-02T00:37:47.713330 #2392] INFO -- : Rolled back deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/a6/d8/c8b6098b9a3217aa8df331c52031618f1d680096230eb8d526d83b6b6e28 to generation 1711985522192036
I, [2024-04-02T00:37:47.714177 #2392] INFO -- : Saved current cursor position: 2656
I, [2024-04-02T00:37:48.210059 #2392] INFO -- : Rolled back deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/a8/d6/ece83a3a0fdf9aa7c39f64c9fc008acd87cc07464024b36b10410c2cbed1 to generation 1711616744665833
I, [2024-04-02T00:37:48.210855 #2392] INFO -- : Saved current cursor position: 2822
I, [2024-04-02T00:37:48.728336 #2392] INFO -- : Rolled back deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/c9/9e/adb50d75bddf3d40416d8aeb153da2a476b25e4bbe4e6af10b6cedc29702 to generation 1711985726270726
I, [2024-04-02T00:37:48.729460 #2392] INFO -- : Saved current cursor position: 2988
I, [2024-04-02T00:37:49.280418 #2392] INFO -- : Rolled back deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/d0/f5/13e10c65af876e49963c9368dbae7a1e1cd601134096d1ed1aaf72ca2f0d to generation 1711985590976109
I, [2024-04-02T00:37:49.281159 #2392] INFO -- : Saved current cursor position: 3154
I, [2024-04-02T00:37:49.782883 #2392] INFO -- : Rolled back deleted object 35/13/35135aaa6cc23891b40cb3f378c53a17a1127210ce60e125ccf03efcfdaec458/@final/ea/3f/a644943797e33707a7ce6b9dd4fb9e5d953a179d8a9ae90a78002f838e95 to generation 1711616745077695
I, [2024-04-02T00:37:49.784920 #2392] INFO -- : Saved current cursor position: 3320
I, [2024-04-02T00:37:49.785650 #2392] INFO -- : Done. Rolled back deleted objects listed in deleted_from--orphan_job_artifact_final_objects.csv.
Monitoring
We need to watch for throttled or rejected requests from GCP caused by making too many requests in a short amount of time. If this happens, we will need to modify the rake task to add a delay between requests.
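One possible shape for such a delay is a retry wrapper with exponential backoff, sketched here in Ruby. This is only an illustration and not a proposed patch: the helper name `with_backoff` and its parameters are hypothetical, and the error classes to retry on are left as a parameter because this issue does not say which exception the storage client raises when GCP throttles a request.

```ruby
# Hypothetical wrapper: run the block, and on a retryable error wait
# base_delay, then 2x, then 4x, ... before trying again.
# ASSUMPTION: `retry_on` would be set to whatever error class signals
# throttling; StandardError is only a placeholder default.
def with_backoff(max_attempts: 5, base_delay: 0.5, retry_on: [StandardError],
                 sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield
  rescue *retry_on
    # Give up and re-raise the original error once attempts are exhausted.
    raise if attempt >= max_attempts
    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end

# Usage sketch: wrap each per-object request (placeholder block body).
with_backoff(max_attempts: 3) { :request_succeeded }
```

A fixed `sleep` between every request would be simpler, but backing off only after a failure keeps the common case fast and slows down only when the service pushes back.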
Change Reviewer checklist
- Check if the following applies:
  - The scheduled day and time of execution of the change is appropriate.
  - The change plan is technically accurate.
  - The change plan includes estimated timing values based on previous testing.
  - The change plan includes a viable rollback plan.
  - The specified metrics/monitoring dashboards provide sufficient visibility for the change.
- Check if the following applies:
  - The complexity of the plan is appropriate for the corresponding risk of the change. (i.e. the plan contains clear details).
  - The change plan includes success measures for all steps/milestones during the execution.
  - The change adequately minimizes risk within the environment/service.
  - The performance implications of executing the change are well-understood and documented.
  - The specified metrics/monitoring dashboards provide sufficient visibility for the change.
    - If not, is it possible (or necessary) to make changes to observability platforms for added visibility?
  - The change has a primary and secondary SRE with knowledge of the details available during the change window.
  - The change window has been agreed with Release Managers in advance of the change. If the change is planned for APAC hours, this issue has an agreed pre-change approval.
  - The labels blocks deployments and/or blocks feature-flags are applied as necessary.
Change Technician checklist
- Check if all items below are complete:
  - The change plan is technically accurate.
  - This Change Issue is linked to the appropriate Issue and/or Epic.
  - Change has been tested in staging and results noted in a comment on this issue.
  - A dry-run has been conducted and results noted in a comment on this issue.
  - The change execution window respects the Production Change Lock periods.
  - For C1 and C2 change issues, the change event is added to the GitLab Production calendar.
  - For C1 and C2 change issues, the SRE on-call has been informed prior to change being rolled out. (In #production channel, mention `@sre-oncall` and this issue and await their acknowledgement.)
  - For C1 and C2 change issues, the SRE on-call provided approval with the eoc_approved label on the issue.
  - For C1 and C2 change issues, the Infrastructure Manager provided approval with the manager_approved label on the issue.
  - Release managers have been informed prior to any C1, C2, or blocks deployments change being rolled out. (In #production channel, mention `@release-managers` and this issue and await their acknowledgment.)
  - There are currently no active incidents that are severity1 or severity2.
  - If the change involves doing maintenance on a database host, an appropriate silence targeting the host(s) should be added for the duration of the change.