Add rake task for copying 'main' database to 'ci' database

1 unresolved thread

What does this MR do and why?

We are working towards making the decomposed database setup a Beta feature. As part of that, it should be easier for self-managed customers to migrate from a single-database setup to the decomposed two-database setup.

This MR adds a rake task that dumps the gitlabhq_production database and imports it into the gitlabhq_production_ci database. This rake task will be used by all installation methods that we support.

This rake task will be called by installation-specific scripts. A first follow-up MR will add the script to the Omnibus package. We can then iterate on this to add scripts for other installation methods.

The task runs several checks before it changes anything:

  • Check whether GitLab is already decomposed
  • Ensure there is enough local disk space for the dump of the main database
  • Ensure the new gitlabhq_production_ci database is accessible and empty
  • Ensure there are no active Background Migrations
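
The checks above could be organised as a small pre-flight runner. This is only a sketch: PreflightError, run_preflight, enough_disk_space? and the 1.25 headroom value are hypothetical names and numbers chosen for illustration, and three of the checks are stubbed with placeholders rather than real database queries.

```ruby
# Hypothetical sketch of the pre-flight phase; none of these names come from the MR.
class PreflightError < StandardError; end

# Assumed headroom: require 25% more free space than the database size.
HEADROOM_FACTOR = 1.25

# True when the local disk can hold a dump of the main database plus headroom.
def enough_disk_space?(database_size_bytes, free_bytes, headroom: HEADROOM_FACTOR)
  free_bytes >= database_size_bytes * headroom
end

# Runs each named check in order and stops at the first one that fails.
def run_preflight(checks)
  checks.each do |description, check|
    raise PreflightError, "Pre-flight check failed: #{description}" unless check.call
  end
  true
end

# Placeholder lambdas stand in for the real checks.
checks = {
  'GitLab is not yet decomposed' => -> { true },
  'enough disk space for the dump' => -> { enough_disk_space?(10 * 1024**3, 50 * 1024**3) },
  'ci database is accessible and empty' => -> { true },
  'no active background migrations' => -> { true }
}
puts run_preflight(checks)
```

Failing fast on the first unmet condition means nothing is dumped or copied unless every precondition holds.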

The dump shares some code with the Backup tasks. This makes it possible to override the default database settings using the same environment variables that we support for Backups. Because variable names such as GITLAB_BACKUP_PGHOST looked odd in this context, I added support for GITLAB_OVERRIDE_PGHOST (and the other GITLAB_OVERRIDE_* variables).
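
As a rough illustration of how the two variable prefixes might be resolved, here is a minimal sketch. The helper name pg_env_overrides, the list of variables, and the precedence (GITLAB_OVERRIDE_* before GITLAB_BACKUP_*) are all assumptions made for this example, not the behaviour confirmed by this MR.

```ruby
# Illustrative subset of PostgreSQL connection variables (assumption).
PG_VARIABLES = %w[PGHOST PGPORT PGUSER PGPASSWORD].freeze

# Builds a hash of PG* settings from prefixed environment variables.
# Assumption: GITLAB_OVERRIDE_* wins over GITLAB_BACKUP_* when both are set.
def pg_env_overrides(env = ENV)
  PG_VARIABLES.each_with_object({}) do |name, result|
    value = env["GITLAB_OVERRIDE_#{name}"] || env["GITLAB_BACKUP_#{name}"]
    result[name] = value if value
  end
end

example_env = {
  'GITLAB_OVERRIDE_PGHOST' => 'db.example.internal',
  'GITLAB_BACKUP_PGPORT' => '5433'
}
puts pg_env_overrides(example_env).inspect
```

Passing the environment in as an argument keeps the helper easy to exercise without touching the real process environment.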

The database dump uses pg_dump -Fd, which creates a separate dump file for each table. This allows us to dump and restore using multiple processes.
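
To make the parallel dump and restore concrete, the sketch below only builds the command lines and does not execute them. The helper names, the job count, and the dump directory are hypothetical; -Fd and --jobs are standard pg_dump and pg_restore options, but the exact arguments used by the rake task may differ.

```ruby
# Builds a directory-format pg_dump command that uses several worker processes.
def dump_command(database:, directory:, jobs:)
  ['pg_dump', '-Fd', "--jobs=#{jobs}", "--file=#{directory}", database]
end

# Builds the matching pg_restore command into the target database.
def restore_command(database:, directory:, jobs:)
  ['pg_restore', "--jobs=#{jobs}", "--dbname=#{database}", directory]
end

# Hypothetical dump location and job count, for illustration only.
dump_dir = '/tmp/decomposition_dump'
puts dump_command(database: 'gitlabhq_production', directory: dump_dir, jobs: 4).join(' ')
puts restore_command(database: 'gitlabhq_production_ci', directory: dump_dir, jobs: 4).join(' ')
```

The directory format is what makes the --jobs option usable for the dump, since each table is written to its own file.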

Related to #368729 (closed)

How to set up and validate locally

Using GDK:

  1. Modify database.yml: comment out the ci section (so Rails uses a single database)
  2. Drop and recreate the ci database: gdk psql -c "DROP DATABASE gitlabhq_development_ci"; gdk psql -c "CREATE DATABASE gitlabhq_development_ci"
  3. Run the rake task: bundle exec rake gitlab:db:decomposition:migrate
  4. Verify that the CI database now also has projects. The two counts should match: gdk psql -d gitlabhq_development -c "SELECT COUNT(*) FROM projects"; gdk psql -d gitlabhq_development_ci -c "SELECT COUNT(*) FROM projects"
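
The comparison in step 4 could also be scripted. The following is a sketch under assumptions: row_count and counts_match? are hypothetical helpers, the -tAc flags are borrowed from the psql call in the rake task's own code, and the runner argument exists only so the helper can be exercised without a GDK.

```ruby
require 'open3'

# Returns the row count of `table` in `database` by shelling out to `gdk psql`.
# `runner` is injectable so the sketch can be tried without a running GDK.
def row_count(database, table, runner: ->(args) { Open3.capture2e(*args) })
  args = ['gdk', 'psql', '-d', database, '-tAc', "SELECT COUNT(*) FROM #{table}"]
  output, status = runner.call(args)
  raise "psql failed for #{database}: #{output}" unless status.success?

  Integer(output.strip)
end

# True when the table holds the same number of rows in both databases.
def counts_match?(table, runner: ->(args) { Open3.capture2e(*args) })
  row_count('gitlabhq_development', table, runner: runner) ==
    row_count('gitlabhq_development_ci', table, runner: runner)
end
```

Inside a GDK checkout, counts_match?('projects') would then be expected to return true after the rake task has run.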

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Rutger Wessels

Activity

  • Thong Kuah removed review request for @tkuah

    • Resolved by Thong Kuah

      @tkuah Based on your comments, I am starting to wonder whether we should split this migration script into two parts: a migration part and a post-migration part. This would let the administrator do some manual work in between. The process could then be:

      1. Admin runs the migrate task (which performs the pre-flight checks and copies the data)
      2. Pause. Admin will:
        • Manually review the result of the copy (i.e., do the database sizes match?)
        • Edit the /etc/gitlab/gitlab.rb file (we can display the required changes instead of updating the file with a script)
      3. Admin runs the post-migrate bash script:
        • gitlab-ctl reconfigure
        • gitlab-rake gitlab:db:lock_writes
        • gitlab-rails runner "Feature.enable(:execute_background_migrations) && Feature.enable(:execute_batched_migrations_on_schedule)"
      4. Admin can then decide to restart GitLab

      What do you think? Administrators would then have a 'mid-migration' moment that allows them to verify the database copy and, in case of problems, cancel the migration process and restart GitLab.
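
The post-migrate part proposed above could be sketched as a sequence of commands that stops at the first failure. This is an illustration only: the proposal mentions a bash script, while this sketch uses Ruby, and run_post_migrate and its runner argument are hypothetical. The example call is a dry run that prints the commands instead of executing them.

```ruby
# The three post-migrate commands listed in the proposal above.
POST_MIGRATE_STEPS = [
  ['gitlab-ctl', 'reconfigure'],
  ['gitlab-rake', 'gitlab:db:lock_writes'],
  ['gitlab-rails', 'runner',
   'Feature.enable(:execute_background_migrations) && ' \
   'Feature.enable(:execute_batched_migrations_on_schedule)']
].freeze

# Runs each step in order; raises as soon as one step reports failure.
# The default runner uses Kernel#system, which returns true on exit status 0.
def run_post_migrate(steps = POST_MIGRATE_STEPS, runner: ->(cmd) { system(*cmd) })
  steps.each do |cmd|
    raise "Post-migrate step failed: #{cmd.join(' ')}" unless runner.call(cmd)
  end
  true
end

# Dry run: print the commands without executing anything.
run_post_migrate(runner: ->(cmd) { puts cmd.join(' '); true })
```

Stopping at the first failed step leaves the administrator at a known point, which fits the idea of being able to cancel and restart GitLab.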

  • Rutger Wessels added 348 commits

    • 09ea3765...35254336 - 336 commits from branch master
    • 35254336...fb8483f5 - 2 earlier commits
    • 77d967f5 - Add check for single db setup
    • 2499c84c - Add check for ensuring ci database has been created
    • 1ba30512 - Add check for ensuring ci database is empty
    • a9a2faf4 - More generic message if ci database is not found
    • cf6f7905 - Add rake task for migrating single DB to two db setup
    • 5e036c89 - Allow overrriding backup location
    • 97bd27f6 - Randomize temporary backup location name
    • 285008bf - Add bash script for migrating to decomposed setup
    • df573c2b - Support GITLAB_OVERRIDE_* environment variables
    • 7678d76b - Allow configuring backup base location

  • Rutger Wessels added 484 commits

    • 7678d76b...ae3b4d35 - 469 commits from branch master
    • ae3b4d35...f2455e71 - 5 earlier commits
    • 8197f686 - More generic message if ci database is not found
    • 3d7221ae - Add rake task for migrating single DB to two db setup
    • 8348a959 - Allow overrriding backup location
    • 35a2cab1 - Randomize temporary backup location name
    • 2b54a982 - Add bash script for migrating to decomposed setup
    • 982dc19a - Support GITLAB_OVERRIDE_* environment variables
    • 6c1e9a19 - Allow configuring backup base location
    • fc92e0fd - Allow administrator to edit config
    • cf0153a4 - Moved background migration check out of shell script
    • c3bf31be - Remove bash scripts

  • Rutger Wessels changed title from Omnibus: Add script for migrating to decomposed database setup to Add rake task for copying 'main' database to 'ci' database

  • Rutger Wessels changed the description

  • Rutger Wessels changed the description

  • Rutger Wessels requested review from @terrichu and @tkuah

  • Thong Kuah approved this merge request

  • Thong Kuah requested review from @jarka and removed review request for @tkuah

  • Rutger Wessels added 146 commits

    • c3bf31be...943a51dc - 130 commits from branch master
    • 943a51dc...64110428 - 6 earlier commits
    • 1f3c5680 - Add rake task for migrating single DB to two db setup
    • e3fba61a - Allow overrriding backup location
    • a8777e18 - Randomize temporary backup location name
    • 52a0ca22 - Add bash script for migrating to decomposed setup
    • 604f0375 - Support GITLAB_OVERRIDE_* environment variables
    • 1c8b5940 - Allow configuring backup base location
    • e929bd1f - Allow administrator to edit config
    • ce87e259 - Moved background migration check out of shell script
    • d5c4079c - Remove bash scripts
    • e51376fd - Remove typo

  • Rutger Wessels removed review request for @jarka

  • Rutger Wessels requested review from @jarka

  • Terri Chu removed review request for @terrichu

  • Rutger Wessels added 80 commits

    • e51376fd...b42e77e3 - 62 commits from branch master
    • b42e77e3...bb3953ea - 8 earlier commits
    • fe025bd5 - Randomize temporary backup location name
    • 9d9c0702 - Add bash script for migrating to decomposed setup
    • b579a246 - Support GITLAB_OVERRIDE_* environment variables
    • 099305d3 - Allow configuring backup base location
    • c0f943cc - Allow administrator to edit config
    • 79d7edca - Moved background migration check out of shell script
    • 94c9876a - Remove bash scripts
    • 2f2a73a2 - Remove typo
    • a866284b - Only create test data for the test that actually duplicates the data
    • a92e4ebb - Extract diskpace headroom factor into a constant

  • Rutger Wessels requested review from @terrichu

  • Terri Chu approved this merge request

  • added database::reviewed label and removed database::review pending label

  • Terri Chu requested review from @Quintasan and removed review request for @terrichu

  • Jarka Košanová removed review request for @jarka

  • Jarka Košanová approved this merge request

  • Rutger Wessels added 623 commits

    • a92e4ebb...1e639b8f - 604 commits from branch master
    • 1e639b8f...8a132034 - 9 earlier commits
    • 0d0cc293 - Add bash script for migrating to decomposed setup
    • 92b84a3c - Support GITLAB_OVERRIDE_* environment variables
    • cad4c7a1 - Allow configuring backup base location
    • 4dfdf901 - Allow administrator to edit config
    • db9d89ad - Moved background migration check out of shell script
    • b8c7ac00 - Remove bash scripts
    • f9631f1d - Remove typo
    • f0e8679e - Only create test data for the test that actually duplicates the data
    • a2487dcc - Extract diskpace headroom factor into a constant
    • e546fc2a - Create directory if it does not exist

  • Rutger Wessels added 614 commits

    • e546fc2a...c916a646 - 595 commits from branch master
    • c916a646...2157b194 - 9 earlier commits
    • d3e92d1b - Add bash script for migrating to decomposed setup
    • 3079892b - Support GITLAB_OVERRIDE_* environment variables
    • 6706ff64 - Allow configuring backup base location
    • 93df3d89 - Allow administrator to edit config
    • bacb7b01 - Moved background migration check out of shell script
    • fa79daab - Remove bash scripts
    • 6c7c3749 - Remove typo
    • c5815a30 - Only create test data for the test that actually duplicates the data
    • e012d506 - Extract diskpace headroom factor into a constant
    • 2a808ac2 - Create directory if it does not exist

  • @Quintasan Can you please have a look? I started a new CI build because one job (danger-local) was failing, but there were no spec failures.

  • Michał Zając resolved all threads

  • Michał Zając approved this merge request

  • added database::approved label and removed database::reviewed label

  • Michał Zając resolved all threads

  • Michał Zając enabled an automatic merge when the pipeline for dff6cac9 succeeds

  • merged

  • Hello @rutgerwessels :wave:

    The database team is looking for ways to improve the database review process and we would love your help!

    If you'd be open to someone on the database team reaching out to you for a chat, or if you'd like to leave some feedback asynchronously, just post a reply to this comment mentioning:

    @gitlab-org/database-team

    And someone will be by shortly!

    Thanks for your help! :heart:

    This message was generated automatically. You're welcome to improve it.

  • Michał Zając mentioned in commit 743cf835

  • added workflow::staging label and removed workflow::canary label

  • mentioned in issue #368729 (closed)

  • mentioned in merge request omnibus-gitlab!7266 (merged)

  • Gabriel Mazetto mentioned in merge request !140122 (merged)

  •     output, status = with_transient_pg_env(ci_config[:pg_env]) do
          psql_args = ["--dbname=#{ci_database_name}", "-tAc", sql]

          Open3.capture2e('psql', *psql_args)
        end

        unless status.success? && output.chomp.to_i == 0
          raise MigrateError,
            "Database '#{ci_database_name}' is not empty"
        end

        true
      end

      def background_migrations_done?
        unfinished_count = Gitlab::Database::BackgroundMigration::BatchedMigration.without_status(:finished).count
    • @rutgerwessels / @Quintasan - what about the finalized status (6) here? I cannot run the migration in Omnibus because 19 of my background migrations have status finalized (6). (Currently trying this on GitLab 17.0.)

    • For now, I have managed to migrate successfully by executing the following statements (IDs identified by filtering migrations with status 6).

      Pre-decomposition, in gitlabhq_production:

      update batched_background_migrations set status = 3 where id in (211, 210, 209, 208, 206, 203, 202, 195, 193, 186, 182, 181, 180, 179, 164, 106, 103, 41, 38)

      Post-decomposition, in gitlabhq_production and gitlabhq_production_ci:

      update batched_background_migrations set status = 6 where id in (211, 210, 209, 208, 206, 203, 202, 195, 193, 186, 182, 181, 180, 179, 164, 106, 103, 41, 38)
    • @swiffer Thanks, good find. The intention is that we don't have running background migrations during the migration. I will create an issue for this.

      Edited by Rutger Wessels
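
To illustrate the issue raised in this thread, here is a minimal sketch of a check that treats both finished and finalized migrations as done. It works on plain integers rather than the real BatchedMigration model. The value 6 for finalized comes from the comment above; the value 3 for finished is inferred from the workaround and is an assumption, as are the helper names. It is not the fix that was eventually implemented.

```ruby
# Status values: 6 = finalized (from the thread), 3 = finished (inferred, assumption).
COMPLETED_STATUSES = [3, 6].freeze

# Counts migrations whose status is neither finished nor finalized.
def unfinished_count(statuses)
  statuses.count { |status| !COMPLETED_STATUSES.include?(status) }
end

# True when no migration is still pending, running, or failed.
def background_migrations_done?(statuses)
  unfinished_count(statuses).zero?
end

puts background_migrations_done?([3, 6, 6, 3])
puts background_migrations_done?([1, 3, 6])
```

With a check of this shape, the 19 finalized migrations reported above would no longer block the task, while any other status still would.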
  • Rutger Wessels resolved all threads

  • Rutger Wessels mentioned in issue #462424

  • mentioned in issue #474638 (closed)
