
Enable transactions for Gitaly's CleanupService, ObjectPoolService, RefService, RemoteService, WikiService

Production Change

Change Summary

To achieve strong consistency in Gitaly, we have introduced transactions via Praefect. All Gitaly nodes taking part in a transaction vote on what they think the result of a given Git operation should be. If the vote succeeds, they commit the operation; otherwise it is rejected.
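The voting idea can be illustrated with a minimal sketch. It is not Praefect's implementation: the function names, the use of SHA-256 over the intended reference updates, and the rule that every node must cast the same vote are all assumptions made for illustration, and the real quorum rule may differ.

```python
import hashlib


def vote_for(ref_updates):
    """Derive a node's vote from the reference updates it intends to make.

    `ref_updates` is a list of (ref, old_oid, new_oid) tuples (hypothetical shape).
    """
    digest = hashlib.sha256()
    for ref, old_oid, new_oid in sorted(ref_updates):
        digest.update(f"{old_oid} {new_oid} {ref}\n".encode())
    return digest.hexdigest()


def vote_succeeds(votes):
    """Assumed rule: the vote succeeds only if all nodes cast the same vote."""
    return len(votes) > 0 and len(set(votes.values())) == 1


def run_transaction(updates_by_node):
    """Collect one vote per node and tell every node to commit or reject."""
    votes = {node: vote_for(updates) for node, updates in updates_by_node.items()}
    outcome = "commit" if vote_succeeds(votes) else "reject"
    return {node: outcome for node in votes}
```

If three nodes compute the same reference update, all of them are told to commit. If a single node computes a different new object ID, the vote fails and the operation is rejected on every node.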

Transactions have been tested over the last few months, as a subset of our RPCs already has them unconditionally enabled. Yesterday, transactions were enabled for the RemoteService (#3431) without any issues. Today brings the second batch of RPCs:

  • CleanupService/ApplyBfgObjectMapStream: apply cleanups via BFG
  • ObjectPoolService/FetchIntoObjectPool: fetch changes from pool member into an object pool
  • RefService/DeleteRefs: delete references
  • RemoteService/AddRemote: add a remote
  • RemoteService/FetchInternalRemote: fetch an internal remote
  • RemoteService/RemoveRemote: remove a remote
  • RemoteService/UpdateRemoteMirror: mirror changes into a remote repository
  • WikiService/WikiDeletePage: delete a wiki page
  • WikiService/WikiUpdatePage: update a wiki page
  • WikiService/WikiWritePage: write a wiki page

Note that this only has an effect for repositories which are hosted by Praefect. This currently includes the gitlab-org group and a few thousand other repos. The feature flags have all been enabled in staging since December 8th.

Change Details

  1. Services Impacted - Gitaly
  2. Change Technician - @pks-t
  3. Change Criticality - C4
  4. Change Type - changescheduled
  5. Change Reviewer -
  6. Due Date - February 1st, 10:30 UTC
  7. Time tracking - 15 minutes
  8. Downtime Component - none

Detailed steps for the change

Change Steps - steps to take to execute the change

Estimated Time to Complete (mins) - 1 minute per RPC

  • CleanupService/ApplyBfgObjectMapStream: /chatops run feature set gitaly_tx_apply_bfg_object_map_stream true
  • ObjectPoolService/FetchIntoObjectPool: /chatops run feature set gitaly_tx_fetch_into_object_pool true
  • RefService/DeleteRefs: /chatops run feature set gitaly_tx_delete_refs true
  • RemoteService/AddRemote: /chatops run feature set gitaly_tx_add_remote true
  • RemoteService/FetchInternalRemote: /chatops run feature set gitaly_tx_fetch_internal_remote true
  • RemoteService/RemoveRemote: /chatops run feature set gitaly_tx_remove_remote true
  • RemoteService/UpdateRemoteMirror: /chatops run feature set gitaly_tx_update_remote_mirror true
  • WikiService/WikiDeletePage: /chatops run feature set gitaly_tx_wiki_delete_page true
  • WikiService/WikiUpdatePage: /chatops run feature set gitaly_tx_wiki_update_page true
  • WikiService/WikiWritePage: /chatops run feature set gitaly_tx_wiki_write_page true
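The flag names above appear to follow one pattern: `gitaly_tx_` followed by the RPC method name in snake case. The sketch below generates the chatops commands from the RPC list under that assumption. The naming rule is inferred from the ten flags listed here, not from a documented convention, and `flag_name` and `chatops_commands` are hypothetical helpers.

```python
import re

# RPCs covered by this change, as listed in the change summary.
RPCS = [
    "CleanupService/ApplyBfgObjectMapStream",
    "ObjectPoolService/FetchIntoObjectPool",
    "RefService/DeleteRefs",
    "RemoteService/AddRemote",
    "RemoteService/FetchInternalRemote",
    "RemoteService/RemoveRemote",
    "RemoteService/UpdateRemoteMirror",
    "WikiService/WikiDeletePage",
    "WikiService/WikiUpdatePage",
    "WikiService/WikiWritePage",
]


def flag_name(rpc):
    """Map 'Service/MethodName' to 'gitaly_tx_method_name' (inferred rule)."""
    method = rpc.split("/", 1)[1]
    # Insert an underscore before every capital letter except the first one.
    snake = re.sub(r"(?<!^)(?=[A-Z])", "_", method).lower()
    return f"gitaly_tx_{snake}"


def chatops_commands(enabled):
    """Return one '/chatops run feature set' command per RPC."""
    value = "true" if enabled else "false"
    return [f"/chatops run feature set {flag_name(rpc)} {value}" for rpc in RPCS]
```

Calling `chatops_commands(True)` yields the enable commands in the order listed, and `chatops_commands(False)` yields the matching rollback commands. Generating both from one list helps keep the two sets in sync.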

Post-Change Steps - steps to take to verify the change

Estimated Time to Complete (mins) - 5 minutes per RPC

Rollback

Rollback steps - steps to be taken in the event of a need to rollback this change

Estimated Time to Complete (mins) - 1 minute per RPC

  • CleanupService/ApplyBfgObjectMapStream: /chatops run feature set gitaly_tx_apply_bfg_object_map_stream false
  • ObjectPoolService/FetchIntoObjectPool: /chatops run feature set gitaly_tx_fetch_into_object_pool false
  • RefService/DeleteRefs: /chatops run feature set gitaly_tx_delete_refs false
  • RemoteService/AddRemote: /chatops run feature set gitaly_tx_add_remote false
  • RemoteService/FetchInternalRemote: /chatops run feature set gitaly_tx_fetch_internal_remote false
  • RemoteService/RemoveRemote: /chatops run feature set gitaly_tx_remove_remote false
  • RemoteService/UpdateRemoteMirror: /chatops run feature set gitaly_tx_update_remote_mirror false
  • WikiService/WikiDeletePage: /chatops run feature set gitaly_tx_wiki_delete_page false
  • WikiService/WikiUpdatePage: /chatops run feature set gitaly_tx_wiki_update_page false
  • WikiService/WikiWritePage: /chatops run feature set gitaly_tx_wiki_write_page false

Monitoring

Key metrics to observe

Changes checklist

  • This issue has a criticality label (e.g. C1, C2, C3, C4) and a change-type label (e.g. changeunscheduled, changescheduled) based on the Change Management Criticalities.
  • This issue has the change technician as the assignee.
  • Pre-Change, Change, Post-Change, and Rollback steps have been filled out and reviewed.
  • Necessary approvals have been completed based on the Change Management Workflow.
  • Change has been tested in staging and results noted in a comment on this issue.
  • A dry-run has been conducted and results noted in a comment on this issue.
  • SRE on-call has been informed prior to change being rolled out. (In #production channel, mention @sre-oncall and this issue and await their acknowledgement.)
  • There are currently no active incidents.
Edited by Patrick Steinhardt