Skip to content

Follow-up from "Destroy merge request diffs in batches"

The following discussions from !96455 (merged) should be addressed:

  • @drew started a discussion:

    We have a BATCH_SIZE = 100 constant in this class, should we use the same one here?

  • @drew started a discussion:

    the yielded object here is just a relation, not any particular IDs of the records in the relation:

       190:       project.merge_requests.each_batch(column: :iid, of: mr_batch_size) do |relation_ids|
       191:         byebug
    => 192:         loop do
       193:           deleted_rows = MergeRequestDiff
       194:             .where(merge_request_id: relation_ids)
       195:             .limit(delete_batch_size)
       196:             .delete_all
    (byebug) relation_ids
    #<ActiveRecord::AssociationRelation [#<MergeRequest id:1 user1/project1!1>]>

    And the queries use an IN (subquery) structure no matter what:

    (byebug) puts MergeRequestDiff.where(merge_request: relation_ids).to_sql
    SELECT "merge_request_diffs".* FROM "merge_request_diffs" WHERE "merge_request_diffs"."merge_request_id" IN (SELECT "merge_requests"."id" FROM "merge_requests" WHERE "merge_requests"."target_project_id" = 1 AND "merge_requests"."iid" >= 1)
    
    (byebug) puts MergeRequestDiff.where(merge_request_id: relation_ids).to_sql
    SELECT "merge_request_diffs".* FROM "merge_request_diffs" WHERE "merge_request_diffs"."merge_request_id" IN (SELECT "merge_requests"."id" FROM "merge_requests" WHERE "merge_requests"."target_project_id" = 1 AND "merge_requests"."iid" >= 1)

    So it might be clearer to not use _id in the naming here?

          project.merge_requests.each_batch(column: :iid, of: mr_batch_size) do |relation|
            loop do
              deleted_rows = MergeRequestDiff
                .where(merge_request: relation)

    WDYT?

Edited by Vasilii Iakliushin