2023-02-07: Removing PREVENT_LOAD_BALANCER_RETRIES_IN_TRANSACTION env variable
<!-- Please review https://about.gitlab.com/handbook/engineering/infrastructure/change-management/ for the most recent information on our change plans and execution policies. -->

:red_circle: This is dependent on the completion of https://gitlab.com/gitlab-org/gitlab/-/merge_requests/109108 :red_circle: - Done :white_check_mark:

# Production Change

### Change Summary

We introduced the PREVENT_LOAD_BALANCER_RETRIES_IN_TRANSACTION ENV variable in https://gitlab.com/gitlab-org/gitlab/-/merge_requests/90447. It has been in production long enough that we have decided to clean up the ENV variable (defaulting to _true_). The cleanup of the ENV usage in the code is already done in https://gitlab.com/gitlab-org/gitlab/-/merge_requests/109108.

### Change Details

1. **Services Impacted** - ~"Service::Web" ~"Service::API" ~"Service::GitLab Rails"
1. **Change Technician** - @praba.m7n
1. **Change Reviewer** - <!-- woodhouse: '@{{ .Reviewer }}' -->{+ DRI for the review of this change +}
1. **Time tracking** - Unknown
1. **Downtime Component** - N/A

## Detailed steps for the change

### Change Steps - steps to take to execute the change

*Estimated Time to Complete (mins)* - {+Estimated Time to Complete in Minutes+}

- [ ] Set label ~"change::in-progress" `/label ~change::in-progress`
- [ ] Merge the MR (https://gitlab.com/gitlab-com/gl-infra/k8s-workloads/gitlab-com/-/merge_requests/2523) to remove the ENV variable from the `gstg` and `gprd` environments
- [ ] Set label ~"change::complete" `/label ~change::complete`

## Rollback

> We should not end up in this position: every place that referred to this ENV variable has [already been removed](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/109108), so we don't have to reintroduce the variable.

### Rollback steps - steps to be taken in the event of a need to rollback this change

- [ ] Contact the DRI (@praba.m7n) and let the database team know about any incident by tagging (gitlab-org/database-team).
- [ ] Set label ~"change::aborted" `/label ~change::aborted`

## Monitoring

### Key metrics to observe

<!-- * Describe which dashboards and which specific metrics we should be monitoring related to this change using the format below. -->

- Metric: rails_primary_sql SLI Apdex
  - Location: https://dashboards.gitlab.net/d/patroni-main/patroni-overview?orgId=1&viewPanel=2409561530&from=now-6h&to=now
  - What changes to this metric should prompt a rollback: Any drop in apdex
- Metric: PostgreSQL Overview dashboard
  - Location: https://dashboards.gitlab.net/d/000000144/postgresql-overview?orgId=1&from=now-6h&to=now
  - What changes to this metric should prompt a rollback: Any deviation from the normal state

## Change Reviewer checklist

<!-- To be filled out by the reviewer. -->

~C4 ~C3 ~C2 ~C1:
- [x] Check if the following applies:
  - The **scheduled day and time** of execution of the change is appropriate.
  - The [change plan](#detailed-steps-for-the-change) is technically accurate.
  - The change plan includes **estimated timing values** based on previous testing.
  - The change plan includes a viable [rollback plan](#rollback).
  - The specified [metrics/monitoring dashboards](#key-metrics-to-observe) provide sufficient visibility for the change.

~C2 ~C1:
- [ ] Check if the following applies:
  - The complexity of the plan is appropriate for the corresponding risk of the change. (i.e. the plan contains clear details).
  - The change plan includes success measures for all steps/milestones during the execution.
  - The change adequately minimizes risk within the environment/service.
  - The performance implications of executing the change are well-understood and documented.
  - The specified metrics/monitoring dashboards provide sufficient visibility for the change.
    - If not, is it possible (or necessary) to make changes to observability platforms for added visibility?
  - The change has a primary and secondary SRE with knowledge of the details available during the change window.
  - The labels ~"blocks deployments" and/or ~"blocks feature-flags" are applied as necessary

## Change Technician checklist

<!-- To find out who is on-call, in #production channel run: /chatops run oncall production. -->

- [ ] Check if all items below are complete:
  - The [change plan](#detailed-steps-for-the-change) is technically accurate.
  - This Change Issue is linked to the appropriate Issue and/or Epic
  - Change has been tested in staging and results noted in a comment on this issue.
  - A dry-run has been conducted and results noted in a comment on this issue.
  - The change execution window respects the [Production Change Lock periods](https://about.gitlab.com/handbook/engineering/infrastructure/change-management/#production-change-lock-pcl).
  - For ~C1 and ~C2 change issues, the change event is added to the [GitLab Production](https://calendar.google.com/calendar/embed?src=gitlab.com_si2ach70eb1j65cnu040m3alq0%40group.calendar.google.com) calendar.
  - For ~C1 and ~C2 change issues, the SRE on-call has been informed prior to change being rolled out. (In #production channel, mention `@sre-oncall` and this issue and await their acknowledgement.)
  - For ~C1 and ~C2 change issues, the SRE on-call provided approval with the ~eoc_approved label on the issue.
  - For ~C1 and ~C2 change issues, the Infrastructure Manager provided approval with the ~manager_approved label on the issue.
  - Release managers have been informed (If needed! Cases include DB change) prior to change being rolled out. (In #production channel, mention `@release-managers` and this issue and await their acknowledgment.)
  - There are currently no [active incidents](https://gitlab.com/gitlab-com/gl-infra/production/-/issues?scope=all&utf8=%E2%9C%93&state=opened&label_name[]=Incident%3A%3AActive) that are ~severity::1 or ~severity::2
  - If the change involves doing maintenance on a database host, an appropriate silence targeting the host(s) should be added for the duration of the change.