
Migration differs between .com and self-managed.

Context

!100609 (merged) added a column using a post-migration, which caused problems when tagging and deploying the release candidate for 15.5: pods had to be restarted before they picked up the new column.

Slack conversation details

Slack link - Internal only

Valery Burton - Looks like this was the MR that introduced default_compliance_framework_id, which should also be a db column, so seems like this could be similar to the last incident: !100609 (merged). For the last incident, I remember sidekiq, web, and api pods were all restarted, and the issue was reproducible on both the UI & API. I ran the failing test after sidekiq and web restart and the test was still failing, but restarting the api did the trick.

Ahmad - A restart to the web pods seems to make it work for me, I guess we can restart the rest for good measure

Ahmad - OK, web, api and sidekiq pods are restarted

Nailia Iskhakova - Passed

Amy Phillips - Does this mean all self-managed users will need to restart after installing?

To ensure the problem didn't affect self-managed instances and to continue preparing the release, the migration was moved from a post-migration to a regular one in !101658 (merged). Unfortunately, the MR didn't go through due to a failure in the rspec fail-fast job:

Failed examples:
rspec ./spec/migrations/change_public_projects_cost_factor_spec.rb:48 # ChangePublicProjectsCostFactor#down when on SaaS resets the cost factor to 0 only for shared runners that were updated

The problem was related to a schema refresh issue, and a fix was submitted and merged in !101613 (merged), but the MR failed again with a different failure:

  1303) Migrations Validation migration: #<struct ActiveRecord::MigrationProxy name="CleanupOrphansApprovalProjectRules", version=20220411173544, filename="/builds/gitlab-org/gitlab/db/post_migrate/20220411173544_cleanup_orphans_approval_project_rules.rb", scope=""> uses one of the allowed migration classes
        Failure/Error: super(levels&.map { |level| Gitlab::VisibilityLevel.level_value(level) })
        NoMethodError:
          super: no superclass method `restricted_visibility_levels=' for #<ApplicationSetting >

To unblock the release candidate, the commit from !101658 (merged) was cherry-picked directly into the stable branch, and the pipeline succeeded there (screenshot: Screen_Shot_2022-10-20_at_17.05.17).

Problem

Although cherry-picking the commit into the stable branch unblocked the release preparation, it created a divergence between self-managed and SaaS:

  1. Self-managed will execute the migration as a regular one
  2. GitLab.com executed the migration as a post-migration
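
The difference between the two orderings can be sketched in plain Ruby. This is only a toy model, not GitLab or Rails code: the `Database` and `Pod` classes and the `deploy` helper are hypothetical, and it assumes that a running process snapshots the table's columns once at boot, that regular migrations run before the new processes start, and that post-migrations run after. Under those assumptions, it shows why a column added in a post-migration stays invisible to a pod until that pod is restarted.

```ruby
# Toy model only: Database, Pod and deploy are hypothetical names,
# not GitLab or Rails APIs.
COLUMN = "default_compliance_framework_id"

class Database
  attr_reader :columns

  def initialize
    @columns = %w[id name]
  end

  def add_column(name)
    @columns << name
  end
end

# A running process that reads the column list once, at boot, and keeps
# using that snapshot afterwards (assumed behaviour).
class Pod
  def initialize(db)
    @db = db
    boot
  end

  def boot
    @cached_columns = @db.columns.dup
  end
  alias restart boot

  def knows_column?(name)
    @cached_columns.include?(name)
  end
end

# Assumed ordering: a :regular migration runs before the pod boots,
# a :post migration runs after the pod is already serving traffic.
def deploy(migration_kind)
  db = Database.new
  db.add_column(COLUMN) if migration_kind == :regular
  pod = Pod.new(db)
  db.add_column(COLUMN) if migration_kind == :post
  pod
end

regular_pod = deploy(:regular)
post_pod = deploy(:post)

puts regular_pod.knows_column?(COLUMN) # column existed before boot
puts post_pod.knows_column?(COLUMN)    # snapshot predates the column
post_pod.restart
puts post_pod.knows_column?(COLUMN)    # snapshot refreshed by the restart
```

In this model the pod in the regular-migration path never needs a restart, which matches the motivation for moving the migration out of post-migrate for self-managed installs.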

Although this divergence shouldn't pose a problem (I think), we should still fix it. The purpose of this issue is to continue investigating why !101658 (merged) keeps failing and to fix the divergence.

Edited by Mayra Cabrera