Skip to content

Find way to `dependent: :destroy` without creating transactions

We have several examples of cross-database modifications (eg. executing queries against 2 different databases within the context of a transaction) that are caused by the use of dependent: :destroy. One example is:

class Project
  has_many :builds, class_name: 'Ci::Build', inverse_of: :project, dependent: :destroy # rubocop:disable Cop/ActiveRecordDependent
end

The usual use case for such things is cleaning up data that lives outside of Postgres. One common example being files that live in object storage that relate to builds/artifacts and so on.

When you execute project.destroy it does:

  1. Start a transaction
  2. Loop through all builds calling destroy on each
  3. Destroy the project
  4. Commit the transaction

This code will all work still when we have builds on a separate DB but we will lose the transactional guarantee that any failure during steps 2/3 will mean the entire transaction is rolled back and nothing is deleted. When we move to 2 separate databases we are just going to have to live with possible data inconsistencies like this.

Possible solutions

  1. We are implementing an alternative to Postgres's cascading deletes in #338535 (closed) and there has been some discussion about extending this mechanism to support the destroy case as well (ie. we load the models in our cleanup worker and call #destroy on them allowing them to do whatever cleanup they need) but it was mentioned that this may not work if the parent object no longer exists as data from the parent object may be necessary for the cleanup. We should investigate to see if there is any way to handle these cases. We may still want to implement this as an option for cases where it will work and then fan out more complicated cases to the relevant team to come up with a specific solution to their case
  2. Schedule the cleanup async in sidekiq. This would give us better guarantee that the work will complete eventually (due to sidekiq retries) than doing it sync in the single request context as short timeouts could cause this to lead to partial cleanup regularly
  3. Schedule the cleanup after the first transaction completes

TODO

  1. List all examples that violate cross-db modification
  2. For each example describe high level what the purpose of dependent: :destroy is
  3. Can examples be made async? Do they need a parent object present and if so why?
  4. Can the deletes remain sync but be done after the transaction keeping parent objects in memory? Worst case scenario is partial cleanup? Is this bad? Loose FK constraints will enforce DB cleanup later but there may remain partial cleanup of object storage? Can we even guarantee perfect object storage cleanup today?
  5. Also look into FastDestroyAll and usages and see if it has the same problem
  6. What code paths are calling FastDestroyAll.fast_destroy_all directly and do they actually cross-db boundaries in a transaction?

Examples

  1. app/models/project.rb: has_many :builds, class_name: 'Ci::Build', inverse_of: :project, dependent: :destroy
    1. Fanned out to grouppipeline authoring #340256 (closed)
  2. app/models/project.rb: has_one :ci_cd_settings, class_name: 'ProjectCiCdSetting', inverse_of: :project, autosave: true, dependent: :destroy
    1. Seems to be have been added in gitlab-foss!18745 (diffs) as some kind of workaround for import/export issues which weirdly related to project.update(ci_cd_settings: nil). This appears to not actually be critically necessary for it to be a destroy and operation as there is no destroying needed. I suspect we'll be able to implement a different workaround here.
    2. This is actually redundant. Since the table is called project_ci_cd_settings (and not ci_cd_settings like I thought) then it won't actually be moving to the new CI database. I did actually create an MR to remove this !69947 (closed) but realised it's not necessary. The code is still redundant and we could remove it but it's not a blocker so I'll leave this alone for now.
  3. app/models/user.rb: has_many :triggers, dependent: :destroy, class_name: 'Ci::Trigger', foreign_key: :owner_id
    1. I was able to trace this back to this comment gitlab-foss!9713 (comment 24835526) but the MR gives no indication about why the cascading delete is not sufficient.
    2. MR to fix it !69721 (merged)
  4. User.has_many :builds, dependent: :nullify, class_name: 'Ci::Build' and User.has_many :pipelines, dependent: :nullify, class_name: 'Ci::Pipeline'\
    1. Extracted to #340260 (closed)

FastDestroyAll examples

  1. app/models/project.rb: use_fast_destroy :build_trace_chunks
    1. If using the fast_destroy_all method then this could theoretically have the same problem since it calls delete_all. I've yet to determine the code path that uses it but it does not seem to be the same path of deleting a Project
    2. For the Project.use_fast_destroy :build_trace_chunks case it seems it does not actually have this transaction problem as it doesn't ever actually call delete or delete_all on the ci_build_trace_chunks. Checking the output from Project#destroy I can't see any deletes happening on ci_build_trace_chunks and so I think it seems to be relying on Postgres cascading deletes. The code path is:
      1. use_fast_destroy calls =>
      2. set_callback :destroy, :before
Edited by Dylan Griffith