Skip to content
Snippets Groups Projects

BG migration to fix incorrect job artifacts expire_at on self-managed

Merged Albert requested to merge 355833-migration-to-fix-incorrect-expire-at into master
6 unresolved threads

What does this MR do and why?

Fix remaining incorrect expire_at on self-managed instances that was set by !47723 (merged). This gives instances that may not have fully executed the fix in DestroyBatchService to still have a way to remove the incorrect values.

This MR adds a temporary index, which will be removed later on, tracked in #368979 (closed).

Screenshots or screen recordings

These are strongly recommended to assist reviewers and reduce the time to merge your change.

DB Queries

Create index:

CREATE INDEX tmp_index_ci_job_artifacts_on_id_expire_at_file_type_trace ON ci_job_artifacts USING btree (id, expire_at, file_type) WHERE (((date_part('day'::text, timezone('UTC'::text, expire_at)) = ANY (ARRAY[(21)::double precision, (22)::double precision, (23)::double precision])) AND (date_part('minute'::text, timezone('UTC'::text, expire_at)) = ANY (ARRAY[(0)::double precision, (30)::double precision, (45)::double precision])) AND (date_part('second'::text, timezone('UTC'::text, expire_at)) = (0)::double precision)) OR (file_type = 3));

Batching and update:

SELECT
    "ci_job_artifacts"."id"
FROM
    "ci_job_artifacts"
WHERE
    "ci_job_artifacts"."id" BETWEEN 1 AND 100
ORDER BY
    "ci_job_artifacts"."id" ASC
LIMIT 1

--

SELECT
    "ci_job_artifacts"."id"
FROM
    "ci_job_artifacts"
WHERE
    "ci_job_artifacts"."id" BETWEEN 1 AND 100
    AND "ci_job_artifacts"."id" >= 1
ORDER BY
    "ci_job_artifacts"."id" ASC
LIMIT 1 OFFSET 10

--

UPDATE
    "ci_job_artifacts"
SET
    "expire_at" = NULL
WHERE
    "ci_job_artifacts"."id" BETWEEN 1 AND 100
    AND "ci_job_artifacts"."id" >= 1
    AND "ci_job_artifacts"."id" < 11
    AND (EXTRACT(day FROM timezone('UTC', expire_at)) IN (21, 22, 23)
        AND EXTRACT(minute FROM timezone('UTC', expire_at)) IN (0, 30, 45)
        AND EXTRACT(second FROM timezone('UTC', expire_at)) = 0
        OR "ci_job_artifacts"."file_type" = 3)

--

SELECT
    "ci_job_artifacts"."id"
FROM
    "ci_job_artifacts"
WHERE
    "ci_job_artifacts"."id" BETWEEN 1 AND 100
    AND "ci_job_artifacts"."id" >= 11
ORDER BY
    "ci_job_artifacts"."id" ASC
LIMIT 1 OFFSET 10

--

UPDATE
    "ci_job_artifacts"
SET
    "expire_at" = NULL
WHERE
    "ci_job_artifacts"."id" BETWEEN 1 AND 100
    AND "ci_job_artifacts"."id" >= 11
    AND (EXTRACT(day FROM timezone('UTC', expire_at)) IN (21, 22, 23)
        AND EXTRACT(minute FROM timezone('UTC', expire_at)) IN (0, 30, 45)
        AND EXTRACT(second FROM timezone('UTC', expire_at)) = 0
        OR "ci_job_artifacts"."file_type" = 3)

query plans:

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #355833 (closed)

Edited by Adam Hegyi

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • Author Contributor

    Given that we have a fix in place for customers on a regular version-by-version upgrade path, how do we feel about writing this as a regular migration?

    If I recall correctly, customers executing multi-version upgrades expect to do so with downtime. Since the write operation itself is idempotent, we could design a migration that clears the timestamps with a relatively conserved batched pacing strategy, and if it fails or the whole migration operation times out for some reason, the customer can simply retry the migration to pick up where the failed execution left off.

    There are a few contingencies and a willing to accept failing migrations in this plan, so I'm not sure it's the best one. But I do think it would get us more certainty about the eventual outcome.

    @drew to reply to your comments, I still think a batched background migration is preferred, because it gives option for users who have gone version-by-version up to 15.0 and want to continue with 15.1 and 15.2 upgrade without downtime.

    Those that can accept downtime and would want to go directly from 15.0 to 15.2 can still do so with a batched background migration.

  • Albert added 1 commit

    added 1 commit

    • 3bdc36ca - Add background migration to fix incorrect job artifacts expire_at on self-managed

    Compare with previous version

  • Database migrations

    Migrations included in this change have been executed on gitlab.com data for testing purposes. For details, please see the migration testing pipeline (limited access).

    Migration Type Total runtime Result DB size change
    20220606080509 - AddTmpIndexJobArtifactsIdAndExpireAt Regular 1.9 s :white_check_mark: +0.00 B
    20220606054503 - FixIncorrectJobArtifactsExpireAt Post deploy 1.6 s :white_check_mark: +0.00 B

    Migration: 20220606080509 - AddTmpIndexJobArtifactsIdAndExpireAt

    • Type: Regular
    • Duration: 1.9 s
    • Database size change: +0.00 B

    Migration: 20220606054503 - FixIncorrectJobArtifactsExpireAt

    • Type: Post deploy
    • Duration: 1.6 s
    • Database size change: +0.00 B

    Background migrations


    Other migrations pending on GitLab.com
    Migration Type Total runtime Result DB size change

    Clone Details

    Clone ID Clone Created At Clone Data Timestamp Expected Removal Time
    database-testing-1365961-7781634-main 2022-08-16T08:51:18Z 2022-08-16T08:02:34Z 2022-08-16 20:53:46 +0000

    Artifacts


    Brought to you by gitlab-org/database-team/gitlab-com-database-testing. Epic

  • Albert added 736 commits

    added 736 commits

    • 3bdc36ca...77ad7b23 - 735 commits from branch master
    • d2d882a4 - Add background migration to fix incorrect job artifacts expire_at on self-managed

    Compare with previous version

  • Albert added 1 commit

    added 1 commit

    • b13738c5 - Fixme: Remove Gitlab.com? check for DB pipeline

    Compare with previous version

  • Albert added 2 commits

    added 2 commits

    • feea161a - Add background migration to fix incorrect job artifacts expire_at on self-managed
    • 33753ab6 - Fixme: Remove Gitlab.com? check for DB pipeline

    Compare with previous version

  • Albert requested review from @drew

    requested review from @drew

  • Albert marked this merge request as ready

    marked this merge request as ready

  • Albert
  • Albert changed the description

    changed the description

  • Albert added 2 commits

    added 2 commits

    • bb53406d - Fix incorrect job artifacts expire_at
    • 3c491e69 - Fixme: Remove Gitlab.com? check for DB pipeline

    Compare with previous version

  • Albert changed the description

    changed the description

  • drew stachon approved this merge request

    approved this merge request

  • :wave: @drew, thanks for approving this merge request.

    This is the first time the merge request is approved. To ensure full test coverage, a new pipeline has been started.

    For more info, please refer to the following links:

  • drew stachon unapproved this merge request

    unapproved this merge request

  • drew stachon removed review request for @drew

    removed review request for @drew

  • Albert changed the description

    changed the description

  • Albert added 1758 commits

    added 1758 commits

    Compare with previous version

  • Author Contributor

    @dfrazao-gitlab back to you :ping_pong:

  • Albert mentioned in issue #355833 (closed)

    mentioned in issue #355833 (closed)

  • 🤖 GitLab Bot 🤖 changed milestone to %15.2

    changed milestone to %15.2

    • Resolved by Adam Hegyi

      One more thing:

      I see you shared the query + query plan of the update query. Could you also share the query + query plan where we retrieve the batch?


      https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/10594/commands/38123

       ModifyTable on public.ci_job_artifacts  (cost=0.57..3.61 rows=1 width=189) (actual time=0.020..0.021 rows=0 loops=1)
         Buffers: shared hit=4
         I/O Timings: read=0.000 write=0.000
         ->  Index Scan using tmp_index_ci_job_artifacts_on_id_expire_at on public.ci_job_artifacts  (cost=0.57..3.61 rows=1 width=189) (actual time=0.018..0.018 rows=0 loops=1)
               Index Cond: ((ci_job_artifacts.id >= '2549151000'::bigint) AND (ci_job_artifacts.id <= '2549161000'::bigint))
               Filter: ((date_part('second'::text, timezone('UTC'::text, ci_job_artifacts.expire_at)) = '0'::double precision) AND (date_part('day'::text, timezone('UTC'::text, ci_job_artifacts.expire_at)) = ANY ('{21,22,23}'::double precision[])) AND (date_part('minute'::text, timezone('UTC'::text, ci_job_artifacts.expire_at)) = ANY ('{0,30,45}'::double precision[])))
               Rows Removed by Filter: 0
               Buffers: shared hit=4
               I/O Timings: read=0.000 write=0.000
      

      From the query plan, I see that we are using the condition (ci_job_artifacts.id >= '2549151000'::bigint) AND (ci_job_artifacts.id <= '2549161000'::bigint)) from the index that we added, but we still filter the data from the data returned by the index.

      We need to double-check this.

      Edited by Diogo Frazão
  • I also recommend changing the title to Add batched background migration to fix incorrect job artifacts expire_at on self-managed to avoid confusion with the background migration framework.

  • Diogo Frazão removed review request for @dfrazao-gitlab

    removed review request for @dfrazao-gitlab

  • Feel free to assign me again when you have this ready.

    Thank you

  • Albert added 3938 commits

    added 3938 commits

    Compare with previous version

  • Albert changed title from Add background migration to fix incorrect job artifacts expire_at on self-managed to Add batched background migration to fix incorrect job artifacts expire_at on self-managed

    changed title from Add background migration to fix incorrect job artifacts expire_at on self-managed to Add batched background migration to fix incorrect job artifacts expire_at on self-managed

  • Albert added 3 commits

    added 3 commits

    • 0aac0053 - Fixme: Remove Gitlab.com? check for DB pipeline
    • 2e666c91 - Add trace job artifact to batching scope
    • a93a9b17 - Add migration spec for post migration

    Compare with previous version

  • Albert added 1 commit

    added 1 commit

    • 5f7be734 - Add migration spec for post migration

    Compare with previous version

  • Albert
  • Albert added 4 commits

    added 4 commits

    • d8f69585 - Add trace job artifact to batching scope
    • 5e17d14b - Add migration spec for post migration
    • e8ea5c7e - Add down migration to remove batch migration job
    • cd67385b - Fixme: Remove Gitlab.com? check for DB pipeline

    Compare with previous version

  • drew stachon mentioned in merge request !92931 (merged)

    mentioned in merge request !92931 (merged)

  • 🤖 GitLab Bot 🤖 changed milestone to %15.3

    changed milestone to %15.3

  • requested review from @dfrazao-gitlab

  • Albert removed review request for @dfrazao-gitlab

    removed review request for @dfrazao-gitlab

  • Albert marked this merge request as draft

    marked this merge request as draft

  • Albert added 746 commits

    added 746 commits

    • cd67385b...6810f4df - 739 commits from branch master
    • 4498f020 - Fix incorrect job artifacts expire_at
    • 61be938c - Add trace job artifact to batching scope
    • d5ebf5f6 - Add migration spec for post migration
    • 1d6659bd - Add down migration to remove batch migration job
    • 331b1d05 - Fixme: Remove Gitlab.com? check for DB pipeline
    • ef75c7ac - Temp fix for migration spec helper
    • 4eff1088 - Fix Rubocop warning

    Compare with previous version

  • Albert added 231 commits

    added 231 commits

    • 4eff1088...8f58bc37 - 224 commits from branch master
    • 45ae274d - Fix incorrect job artifacts expire_at
    • 89cb19d1 - Add trace job artifact to batching scope
    • 60a92c8f - Add migration spec for post migration
    • fff95ace - Add down migration to remove batch migration job
    • a4a5f135 - Fixme: Remove Gitlab.com? check for DB pipeline
    • 84d9ef02 - Temp fix for migration spec helper
    • 9fb06dee - Fix Rubocop warning

    Compare with previous version

  • Albert added 2 commits

    added 2 commits

    • cdd529a3 - Fix incorrect job artifacts expire_at
    • 107d84e2 - Fixme: Remove Gitlab.com? check for DB pipeline

    Compare with previous version

  • Albert changed the description

    changed the description

  • Albert requested review from @dfrazao-gitlab

    requested review from @dfrazao-gitlab

  • Albert
  • Albert
  • Albert changed the description

    changed the description

  • Albert added 2 commits

    added 2 commits

    • 90413d66 - Fix incorrect job artifacts expire_at
    • 6baa402f - Fixme: Remove Gitlab.com? check for DB pipeline

    Compare with previous version

  • Albert mentioned in merge request !87523 (closed)

    mentioned in merge request !87523 (closed)

  • Author Contributor

    @ahegyi could you help to review the index please?

  • Albert requested review from @ahegyi

    requested review from @ahegyi

  • Albert marked this merge request as ready

    marked this merge request as ready

  • Albert mentioned in issue #368979 (closed)

    mentioned in issue #368979 (closed)

  • Albert changed the description

    changed the description

  • Albert added 2 commits

    added 2 commits

    • ec132525 - Move where clause to sub batch
    • 1415292c - Fixme: Remove Gitlab.com? check for DB pipeline

    Compare with previous version

  • Albert added 1 commit

    added 1 commit

    • 17162c66 - Add custom batching strategy to skip out of scope records

    Compare with previous version

  • 1 # frozen_string_literal: true
    2
    3 class FixIncorrectJobArtifactsExpireAt < Gitlab::Database::Migration[2.0]
    4 disable_ddl_transaction!
    5
    6 # FIXME: Revert to :gitlab_ci.
    7 # :gitlab_main is required during review because db test pipeline is not configured to use ci
    8 restrict_gitlab_migration gitlab_schema: :gitlab_main
  • Albert added 2 commits

    added 2 commits

    • 1bdca5c2 - Add custom batching strategy to skip out of scope records
    • b95115b2 - Fixme: Remove Gitlab.com? check for DB pipeline

    Compare with previous version

  • Ghost User
  • Ghost User
  • Ghost User
  • Albert added 704 commits

    added 704 commits

    • b95115b2...4934301d - 700 commits from branch master
    • d88375a7 - Fix incorrect job artifacts expire_at
    • 5391ef91 - Move where clause to sub batch
    • 7ea559ab - Add custom batching strategy to skip out of scope records
    • d97559c0 - Fixme: Remove Gitlab.com? check for DB pipeline

    Compare with previous version

  • Albert added 1486 commits

    added 1486 commits

    • d97559c0...a18c1d08 - 1482 commits from branch master
    • 84109556 - Fix incorrect job artifacts expire_at
    • ec74d3d9 - Move where clause to sub batch
    • afc86aa7 - Add custom batching strategy to skip out of scope records
    • 1083e13d - Fixme: Remove Gitlab.com? check for DB pipeline

    Compare with previous version

  • Albert mentioned in merge request !95038 (merged)

    mentioned in merge request !95038 (merged)

  • Albert added 788 commits

    added 788 commits

    • 1083e13d...236ecb26 - 784 commits from branch master
    • d140a096 - Fix incorrect job artifacts expire_at
    • 26a95308 - Move where clause to sub batch
    • befc5f2c - Add custom batching strategy to skip out of scope records
    • ba7dfc39 - Fixme: Remove Gitlab.com? check for DB pipeline

    Compare with previous version

  • Albert added 4 commits

    added 4 commits

    • e1642787 - Fix incorrect job artifacts expire_at
    • 8df0bb01 - Move where clause to sub batch
    • f3c0b54b - Add custom batching strategy to skip out of scope records
    • b2c71cbb - Fixme: Remove Gitlab.com? check for DB pipeline

    Compare with previous version

  • Adam Hegyi
  • Adam Hegyi
  • Adam Hegyi
  • Albert added 825 commits

    added 825 commits

    • b2c71cbb...6f02c1d7 - 819 commits from branch master
    • 80db3465 - Fix incorrect job artifacts expire_at
    • a5d10e14 - Move where clause to sub batch
    • 29fe5724 - Add custom batching strategy to skip out of scope records
    • 37ef6874 - Fixme: Remove Gitlab.com? check for DB pipeline
    • c5370894 - Remove unneeded select distinct ids
    • 834359dd - Check for existing index in migration

    Compare with previous version

  • Albert added 2 commits

    added 2 commits

    • b791f167 - Remove unneeded select distinct ids
    • ab43ee03 - Check for existing index in migration

    Compare with previous version

    • Author Contributor
      Resolved by Albert

      @ahegyi, given we are very close to the 15.3 being released, I'd think we should merge this only in 15.4. That gives us more time to update any documentation if needed. What do you think?

  • 🤖 GitLab Bot 🤖 changed milestone to %15.4

    changed milestone to %15.4

  • Albert added 180 commits

    added 180 commits

    • ab43ee03...4eabac4b - 174 commits from branch master
    • f9f29072 - Fix incorrect job artifacts expire_at
    • 00d232ad - Move where clause to sub batch
    • 3af16057 - Add custom batching strategy to skip out of scope records
    • bd296a00 - Remove unneeded select distinct ids
    • 0d579f02 - Check for existing index in migration
    • b047d31c - Update DB schema for consistency

    Compare with previous version

  • Albert requested review from @drew and @marcel.amirault

    requested review from @drew and @marcel.amirault

  • Albert added 200 commits

    added 200 commits

    • b047d31c...569e2476 - 193 commits from branch master
    • 571e3a0e - Fix incorrect job artifacts expire_at
    • 9c8509b8 - Move where clause to sub batch
    • 03630f32 - Add custom batching strategy to skip out of scope records
    • b8ab267d - Remove unneeded select distinct ids
    • 9837f453 - Check for existing index in migration
    • 416abfb4 - Update DB schema for consistency
    • efa45fc4 - Add version specific upgrade instruction

    Compare with previous version

  • Ghost User
  • Ghost User
  • Ghost User
  • 1 Message
    :book: This merge request adds or changes documentation files. A review from the Technical Writing team before you merge is recommended. Reviews can happen after you merge.

    Documentation review

    The following files require a review from a technical writer:

    • doc/update/index.md

    The review does not need to block merging this merge request. See the:

    Reviewer roulette

    Changes that require review have been detected!

    Please refer to the table below for assigning reviewers and maintainers suggested by Danger in the specified category:

    Category Reviewer Maintainer
    backend Sam Figueroa (@sam.figueroa) (UTC+0, 8 hours behind @alberts-gitlab) Sincheol (David) Kim (@dskim_gitlab) (UTC+9.5, 1.5 hours ahead of @alberts-gitlab)
    database Omar Qunsul (@OmarQunsulGitlab) (UTC+2, 6 hours behind @alberts-gitlab) Krasimir Angelov (@krasio) (UTC+12, 4 hours ahead of @alberts-gitlab)
    ~"migration" No reviewer available No maintainer available

    To spread load more evenly across eligible reviewers, Danger has picked a candidate for each review slot, based on their timezone. Feel free to override these selections if you think someone else would be better-suited or use the GitLab Review Workload Dashboard to find other available reviewers.

    To read more on how to use the reviewer roulette, please take a look at the Engineering workflow and code review guidelines. Please consider assigning a reviewer or maintainer who is a domain expert in the area of the merge request.

    Once you've decided who will review this merge request, assign them as a reviewer! Danger does not automatically notify them for you.

    If needed, you can retry the :repeat: danger-review job that generated this comment.

    Generated by :no_entry_sign: Danger

  • Marcel Amirault approved this merge request

    approved this merge request

  • Marcel Amirault removed review request for @marcel.amirault

    removed review request for @marcel.amirault

  • Albert added 1 commit

    added 1 commit

    • e2b5900a - Add version specific upgrade instruction

    Compare with previous version

  • no schema change; no impact to data warehouse

  • drew stachon
    • Resolved by Albert

      @alberts-gitlab, this is super thorough, excellent work!

      My only concern is about shipping the concurrent index build in db/migrate instead of db/post_migrate. I'm guessing you have them set up that way to be able to ship them at the same time, instead of waiting another week the send the background migration in it's own MR. If database / groupdelivery is cool with this, go for it.

      I'm approving, since all the business logic and Ruby looks both good, and incredibly tested. :clap:

  • drew stachon approved this merge request

    approved this merge request

  • drew stachon removed review request for @drew

    removed review request for @drew

  • Albert added 743 commits

    added 743 commits

    • e2b5900a...343c151a - 735 commits from branch master
    • 34f96994 - Fix incorrect job artifacts expire_at
    • 7f554c52 - Move where clause to sub batch
    • 0698ad20 - Add custom batching strategy to skip out of scope records
    • d831a408 - Remove unneeded select distinct ids
    • 2a54ff3e - Check for existing index in migration
    • efecdf50 - Update DB schema for consistency
    • 807ec61e - Add version specific upgrade instruction
    • ee16a84b - Move index creation to post_migrate

    Compare with previous version

    • Author Contributor

      @ahegyi This is ready now. Could you do a final review please?

    • Thanks, @alberts-gitlab! DB LGTM! :thumbsup:

    • @alberts-gitlab @ahegyi FYI apply_additional_filters is deprecated now, as we no longer need to create custom batching strategy to apply additional filters. We can use the recently introduces scope_to:

      class RemoveBackfilledJobArtifactsExpireAt < BatchedMigrationJob
        scope_to ->(relation) {
          relation.where(EXPIRES_ON_21_22_23_AT_MIDNIGHT_IN_TIMEZONE)
            .or(relation.where(file_type: 3))
        }
      end

      I have a MR to remove apply_additional_filters - !96478 (merged), but it's merge result pipeline failed because this one was merged meanwhile. I'll update the migration and the batching strategy to use the new approach in !96478 (merged).

    • Please register or sign in to reply
  • Adam Hegyi approved this merge request

    approved this merge request

  • added databaseapproved label and removed databasereview pending label

  • Adam Hegyi resolved all threads

    resolved all threads

  • Adam Hegyi changed title from Add batched background migration to fix incorrect job artifacts expire_at on self-managed to BG migration to fix incorrect job artifacts expire_at on self-managed

    changed title from Add batched background migration to fix incorrect job artifacts expire_at on self-managed to BG migration to fix incorrect job artifacts expire_at on self-managed

  • Adam Hegyi enabled an automatic merge when the pipeline for 18df0c12 succeeds

    enabled an automatic merge when the pipeline for 18df0c12 succeeds

  • merged

  • Adam Hegyi mentioned in commit 3e04c65c

    mentioned in commit 3e04c65c

  • added workflowstaging label and removed workflowcanary label

  • 4 disable_ddl_transaction!
    5
    6 INDEX_NAME = 'tmp_index_ci_job_artifacts_on_id_expire_at_file_type_trace'
    7
    8 EXPIRE_AT_ON_22_MIDNIGHT_IN_TIMEZONE_OR_TRACE = <<~SQL
    9 (EXTRACT(day FROM timezone('UTC', expire_at)) IN (21, 22, 23)
    10 AND EXTRACT(minute FROM timezone('UTC', expire_at)) IN (0, 30, 45)
    11 AND EXTRACT(second FROM timezone('UTC', expire_at)) = 0)
    12 OR file_type = 3
    13 SQL
    14
    15 def up
    16 return if Gitlab.com?
    17 return if index_exists_by_name?(:ci_job_artifacts, INDEX_NAME)
    18
    19 add_concurrent_index :ci_job_artifacts, :id,
  • 21 # on the dates 21, 22, 23 of the month will not be deleted.
    22 # https://en.wikipedia.org/wiki/List_of_UTC_time_offsets
    23 EXPIRES_ON_21_22_23_AT_MIDNIGHT_IN_TIMEZONE = <<~SQL
    24 EXTRACT(day FROM timezone('UTC', expire_at)) IN (21, 22, 23)
    25 AND EXTRACT(minute FROM timezone('UTC', expire_at)) IN (0, 30, 45)
    26 AND EXTRACT(second FROM timezone('UTC', expire_at)) = 0
    27 SQL
    28
    29 def perform
    30 each_sub_batch(
    31 operation_name: :update_all
    32 ) do |sub_batch|
    33 sub_batch.where(EXPIRES_ON_21_22_23_AT_MIDNIGHT_IN_TIMEZONE)
    34 .or(sub_batch.where(file_type: 3))
    35 .update_all(expire_at: nil)
    36 end
    • @alberts-gitlab @ahegyi When we use apply_additional_filters, the same filters should be applied here as batching_scope, otherwise the batch generated by the job class will be different (may include more rows) from the batch generated by the batching strategy.

      Yes, it's confusing and error-prone, this is one of the reasons for switching to scope_to - the filters defined this way will be used by both the batching strategy and the job class, and we do not have to define custom batching strategy class.

      We're still fine here, as we apply the filters on the yielded relation (sub_batch). I'll update this as part of !96478 (merged) anyway,

    • Updated with !96478 (2097adef), can you please have a look and confirm all is good?

    • Thanks, @krasio! I'll take a look.

    • Please register or sign in to reply
  • Krasimir Angelov mentioned in merge request !96478 (merged)

    mentioned in merge request !96478 (merged)

  • Albert mentioned in merge request !96644 (merged)

    mentioned in merge request !96644 (merged)

  • 380 380 accordingly, while also consulting the
    381 381 [version-specific upgrade instructions](#version-specific-upgrading-instructions):
    382 382
    383 `8.11.Z` -> `8.12.0` -> `8.17.7` -> `9.5.10` -> `10.8.7` -> [`11.11.8`](#1200) -> `12.0.12` -> [`12.1.17`](#1210) -> [`12.10.14`](#12100) -> `13.0.14` -> [`13.1.11`](#1310) -> [`13.8.8`](#1388) -> [`13.12.15`](#13120) -> [`14.0.12`](#1400) -> [`14.3.6`](#1430) -> [`14.9.5`](#1490) -> [`14.10.Z`](#14100) -> [`15.0.Z`](#1500) -> [latest `15.Y.Z`](https://gitlab.com/gitlab-org/gitlab/-/releases)
    383 `8.11.Z` -> `8.12.0` -> `8.17.7` -> `9.5.10` -> `10.8.7` -> [`11.11.8`](#1200) -> `12.0.12` -> [`12.1.17`](#1210) -> [`12.10.14`](#12100) -> `13.0.14` -> [`13.1.11`](#1310) -> [`13.8.8`](#1388) -> [`13.12.15`](#13120) -> [`14.0.12`](#1400) -> [`14.3.6`](#1430) -> [`14.9.5`](#1490) -> [`14.10.Z`](#14100) -> [`15.0.Z`](#1500) -> [`15.4.0`](#1540) -> [latest `15.Y.Z`](https://gitlab.com/gitlab-org/gitlab/-/releases)
    • @alberts-gitlab - I'm planning the 15.3.5 to 15.4.5 migration of a self hosted GitLab, and I'm a bit surprised/worried by this change (came here from the 15.4 Upgrade instructions). Could you clarify something please?

      Here is my ci_job_artifacts table size, and what I understand would be the scope of this background migration:

      gitlab-db=> SELECT COUNT(*) FROM ci_job_artifacts;
        count   
      ----------
       28377368
      (1 row)
      
      gitlab-db=> SELECT COUNT(*)
      FROM ci_job_artifacts
      WHERE EXTRACT(day FROM timezone('UTC', expire_at)) IN (21, 22, 23)
        AND EXTRACT(minute FROM timezone('UTC', expire_at)) IN (0, 30, 45)
        AND EXTRACT(second FROM timezone('UTC', expire_at)) = 0
         OR (file_type = 3);
        count   
      ----------
       24495566
      (1 row)

      24.5M lines, batch size is 500 and interval is 2 minutes, this would be running for 68 days. Is that correct? (I won't rule out misunderstanding the whole BatchedMigrationJob logic)

      Now, it seems the purpose of this operation is to nullify expire_at of non-trace artifacts when it has been improperly set by some previous migration. But on most of my data, expire_at is already NULL:

      gitlab-db=> SELECT
        file_type,
        CASE WHEN expire_at IS NULL
          THEN 'null'
          ELSE 'not_null'
          END AS expire_at_nullity,
        count(*)
      FROM ci_job_artifacts
      GROUP BY file_type, expire_at_nullity
      ORDER BY count(*) desc;
      
       file_type | expire_at_nullity |  count   
      -----------+-------------------+----------
               3 | null              | 24193110
               1 | not_null          |  1525349
               2 | not_null          |  1525150
               4 | not_null          |   389634
              16 | not_null          |   313331
               3 | not_null          |   302398
               9 | not_null          |    92787
              17 | not_null          |     8798
      ...
      (26 rows)

      And indeed, reducing the migration scope with an expire_at IS NOT NULL (for the file_type = 3 branch) narrows it down to a tiny portion (1.2%) of the planned 24.5M lines:

      gitlab-db=> SELECT COUNT(*)
      FROM ci_job_artifacts
      WHERE EXTRACT(day FROM timezone('UTC', expire_at)) IN (21, 22, 23)
        AND EXTRACT(minute FROM timezone('UTC', expire_at)) IN (0, 30, 45)
        AND EXTRACT(second FROM timezone('UTC', expire_at)) = 0
         OR (file_type = 3 AND expire_at IS NOT NULL);
       count  
      --------
       302456
      (1 row)

      The background migration would then complete in 20 hours. Is that correct?

      Thanks in advance for taking time to confirm or correct this analysis :)

      Edited by Thomas de Grenier de Latour
    • Author Contributor

      @thomasgl-orange The batch will still include file_type=3 with NULL expire_at. https://gitlab.com/gitlab-org/gitlab/-/blob/v15.4.5-ee/lib/gitlab/background_migration/remove_backfilled_job_artifacts_expire_at.rb#L23-32.

      @krasio Is there a way we can backport a patch to this migration so that we can add a condition on expire_at IS NOT NULL?

      My apologies for this oversight. I think we should correct it to improve the performance.

      Edited by Albert
    • @alberts-gitlab This is something we need to check with @gitlab-org/release/managers, but I don't think we backport fixes that far back, unless it's a security fix.

    • Please register or sign in to reply
  • mentioned in issue #416689 (closed)

  • Marius Bobin mentioned in merge request !147443 (merged)

    mentioned in merge request !147443 (merged)

  • Please register or sign in to reply
    Loading