Revert "Patch Prometheus services records"

Mayra Cabrera requested to merge revert-e8bb8a56 into master

What does this MR do?

Reverts !19956 (merged) because its post-migration failed on staging: https://ops.gitlab.net/gitlab-com/gl-infra/deployer/-/jobs/839358

TASK [Run migrations] **********************************************************
fatal: [deploy-01-sv-gstg.c.gitlab-staging-1.internal]: FAILED! => changed=true
  cmd:
  - /usr/bin/gitlab-rake
  - db:migrate
  delta: '0:02:04.549186'
  end: '2020-01-13 21:58:11.302126'
  msg: non-zero return code
  rc: 1
  start: '2020-01-13 21:56:06.752940'
  stderr: |-
    rake aborted!
    StandardError: An error has occurred, all later migrations canceled:

    PG::QueryCanceled: ERROR:  canceling statement due to statement timeout
    : SELECT  "projects"."id" FROM "projects" LEFT JOIN services ON services.project_id = projects.id AND services.type = 'PrometheusService' WHERE (services.id IS NULL OR (services.active = FALSE AND services.properties = '{}')) AND "projects"."id" >= 154014 GROUP BY projects.id ORDER BY "projects"."id" ASC LIMIT 1 OFFSET 500
    /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:78:in `block in each_batch'
    /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:68:in `step'
    /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:68:in `each_batch'
    /opt/gitlab/embedded/service/gitlab-rails/db/post_migrate/20191220102807_patch_prometheus_services_for_shared_cluster_applications.rb:60:in `up'
    /opt/gitlab/embedded/bin/bundle:23:in `load'
    /opt/gitlab/embedded/bin/bundle:23:in `<main>'

    Caused by:
    ActiveRecord::QueryCanceled: PG::QueryCanceled: ERROR:  canceling statement due to statement timeout
    : SELECT  "projects"."id" FROM "projects" LEFT JOIN services ON services.project_id = projects.id AND services.type = 'PrometheusService' WHERE (services.id IS NULL OR (services.active = FALSE AND services.properties = '{}')) AND "projects"."id" >= 154014 GROUP BY projects.id ORDER BY "projects"."id" ASC LIMIT 1 OFFSET 500
    /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:78:in `block in each_batch'
    /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:68:in `step'
    /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:68:in `each_batch'
    /opt/gitlab/embedded/service/gitlab-rails/db/post_migrate/20191220102807_patch_prometheus_services_for_shared_cluster_applications.rb:60:in `up'
    /opt/gitlab/embedded/bin/bundle:23:in `load'
    /opt/gitlab/embedded/bin/bundle:23:in `<main>'

    Caused by:
    PG::QueryCanceled: ERROR:  canceling statement due to statement timeout
    /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:78:in `block in each_batch'
    /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:68:in `step'
    /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:68:in `each_batch'
    /opt/gitlab/embedded/service/gitlab-rails/db/post_migrate/20191220102807_patch_prometheus_services_for_shared_cluster_applications.rb:60:in `up'
    /opt/gitlab/embedded/bin/bundle:23:in `load'
    /opt/gitlab/embedded/bin/bundle:23:in `<main>'
    Tasks: TOP => db:migrate
    (See full trace by running task with --trace)
  stderr_lines: <omitted>
  stdout: |-
    == 20191128162854 DropProjectCiCdSettingsMergeTrainsEnabled: migrating ========
    -- remove_column(:project_ci_cd_settings, :merge_trains_enabled)
       -> 0.0076s
    == 20191128162854 DropProjectCiCdSettingsMergeTrainsEnabled: migrated (0.0077s)

    == 20191220102807 PatchPrometheusServicesForSharedClusterApplications: migrating
  stdout_lines: <omitted>
NO MORE HOSTS LEFT *************************************************************
PLAY RECAP *********************************************************************
deploy-01-sv-gstg.c.gitlab-staging-1.internal : ok=6    changed=0    unreachable=0    failed=1    skipped=1    rescued=0    ignored=0

The post-migration passed on the second retry (https://ops.gitlab.net/gitlab-com/gl-infra/deployer/-/jobs/839428), but it is worrisome that it failed in an environment that has barely any traffic, so I'm reverting to err on the safe side.

The failure can be analyzed, and if it is considered a one-time thing, we can merge it again.
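
As a possible starting point for that analysis, the sketch below rebuilds the statement that timed out. According to the trace, it is the `LIMIT 1 OFFSET 500` boundary lookup issued from `each_batch.rb:78`. The helper name `batch_boundary_sql` and its parameters are hypothetical and not part of GitLab; the SQL text is copied from the staging log, with only the start ID and batch size made configurable. The output could then be pasted into a database console (for example, prefixed with `EXPLAIN`) to check whether the lookup is consistently slow or whether the timeout was a one-off.

```ruby
# Hypothetical helper, not part of GitLab: rebuilds the batch boundary lookup
# seen in the staging log so that it can be inspected by hand.
def batch_boundary_sql(start_id:, batch_size:)
  raise ArgumentError, 'start_id must be an Integer' unless start_id.is_a?(Integer)
  unless batch_size.is_a?(Integer) && batch_size.positive?
    raise ArgumentError, 'batch_size must be a positive Integer'
  end

  # SQL text taken from the log; whitespace is collapsed to a single line.
  <<~SQL.gsub(/\s+/, ' ').strip
    SELECT "projects"."id" FROM "projects"
    LEFT JOIN services ON services.project_id = projects.id
      AND services.type = 'PrometheusService'
    WHERE (services.id IS NULL
      OR (services.active = FALSE AND services.properties = '{}'))
      AND "projects"."id" >= #{start_id}
    GROUP BY projects.id
    ORDER BY "projects"."id" ASC
    LIMIT 1 OFFSET #{batch_size}
  SQL
end

# Values from the failed job: start ID 154014, batch size 500.
puts batch_boundary_sql(start_id: 154014, batch_size: 500)
```

Trying other `start_id` values would show whether only certain ID ranges are slow. This sketch only builds the string and does not connect to any database.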

Does this MR meet the acceptance criteria?

Conformity

Edited by Mayra Cabrera