Deleted Projects::ImportExport::RelationExportUpload records leave behind Upload records

Summary

During the investigation of Geo verification failures on a GitLab Dedicated instance, we found that uploads rows are likely being orphaned when their associated Projects::ImportExport::RelationExportUpload records are deleted. The orphaned uploads rows are bad data that causes errors during data integrity checks.

Background

This issue was discovered during investigation of RFH - Geo - Investigate and remediate upload and repository verification failures.

The customer reported that uploads were not reaching 100% checksummed on the primary site. Investigation revealed 3,241 orphaned uploads rows with checksum failures such as:

Error during verification: The model which owns this Upload is missing. Upload ID#396877, Projects::ImportExport::RelationExportUpload ID#243922
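
To triage many such failures, the upload ID and the missing owner can be extracted from the message text. The following is a minimal sketch in plain Ruby; the helper name and the regular expression are assumptions based only on the sample message above, not part of GitLab.

```ruby
# Hypothetical helper (not GitLab code): extracts identifiers from a
# "model is missing" verification failure message.
MISSING_MODEL_PATTERN =
  /Upload ID#(?<upload_id>\d+), (?<model_type>[\w:]+) ID#(?<model_id>\d+)/

def parse_missing_model_failure(message)
  match = MISSING_MODEL_PATTERN.match(message)
  return nil unless match # message has a different shape

  {
    upload_id: Integer(match[:upload_id]),
    model_type: match[:model_type],
    model_id: Integer(match[:model_id])
  }
end

sample = 'Error during verification: The model which owns this Upload is missing. ' \
         'Upload ID#396877, Projects::ImportExport::RelationExportUpload ID#243922'
p parse_missing_model_failure(sample)
```

Messages that do not match the pattern return nil, so other failure types can be skipped without raising.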

Problem Description

When Projects::ImportExport::RelationExportUpload records are deleted (as intended), their associated uploads rows are not being deleted (unintended). This leaves orphaned uploads rows that reference non-existent rows, causing:

  1. Geo verification failures on the primary site
  2. Inability to reach 100% checksum progress, potentially obscuring actual failures
  3. Application errors when trying to checksum these uploads
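
The orphan condition described above can be sketched outside of Rails as follows. This is an illustration only: UploadRow and orphaned_uploads are hypothetical names, and the rows are plain structs whose fields mirror the model_type and model_id columns of the uploads table.

```ruby
require 'set'

# Hypothetical stand-in for a row in the uploads table.
UploadRow = Struct.new(:id, :model_type, :model_id, keyword_init: true)

# existing_ids_by_type maps a model_type to the Set of owner IDs that still exist.
# An upload is orphaned when its owner ID is not in that set.
def orphaned_uploads(uploads, existing_ids_by_type)
  uploads.reject do |upload|
    existing_ids_by_type.fetch(upload.model_type, Set.new).include?(upload.model_id)
  end
end

type = 'Projects::ImportExport::RelationExportUpload'
uploads = [
  UploadRow.new(id: 1, model_type: type, model_id: 10),
  UploadRow.new(id: 2, model_type: type, model_id: 11) # owner 11 was deleted
]
existing = { type => Set[10] }

p orphaned_uploads(uploads, existing).map(&:id) # prints [2]
```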

Evidence

From the investigation:

  • Total orphaned uploads found: 3,241
  • Model type affected: Projects::ImportExport::RelationExportUpload
  • Uploader: ImportExportUploader
  • Error pattern: NoMethodError: undefined method 'underscore' for NilClass:Class
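
The "for NilClass:Class" part of the error pattern indicates that underscore was called on the class of nil, which is what happens when code asks a missing owner for its class. The sketch below reproduces that shape in plain Ruby. FakeExportUpload and storage_segment are hypothetical names; where exactly this call happens in GitLab is not established by this issue.

```ruby
# Hypothetical stand-in for a model class that responds to `.underscore`.
class FakeExportUpload
  def self.underscore
    'fake_export_upload'
  end
end

# Hypothetical helper that assumes the owning model is always present.
def storage_segment(model)
  model.class.underscore
end

puts storage_segment(FakeExportUpload.new)

begin
  storage_segment(nil) # owner row was deleted, so the lookup returned nil
rescue NoMethodError => e
  # The receiver is the class NilClass, not nil itself.
  puts "#{e.class}: #{e.name} called on #{e.receiver.inspect}"
end
```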

Sample orphaned upload record:

#<Upload id: 343563, size: 105, path: "projects/import_export/relation_export_upload/expo...", checksum: nil, model_id: 198265, model_type: "Projects::ImportExport::RelationExportUpload", uploader: "ImportExportUploader", created_at: "2025-01-02 17:47:19.367923000 +0000", store: 2, mount_point: "export_file", secret: nil, version: 2, uploaded_by_user_id: nil, organization_id: nil, namespace_id: nil, project_id: nil, verification_checksum: nil>

Related Issues

This appears to be similar to previously reported issues. A previous fix in !142246 (merged) was meant to remove the Upload records linked to expired RelationExportUpload records. This issue suggests that either the problem persists, or the bug was already fixed and the cleanup service needs to run again.

Expected Behavior

When a Projects::ImportExport::RelationExportUpload record is deleted, its associated uploads record should also be deleted.
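
The expected invariant can be stated as a small in-memory model: after an owner is deleted, no upload still points at it. ExportStore and its methods are hypothetical and exist only to illustrate the invariant; they do not reflect how GitLab stores or deletes these records.

```ruby
# Hypothetical in-memory model of the two tables involved; not GitLab code.
class ExportStore
  MODEL_TYPE = 'Projects::ImportExport::RelationExportUpload'
  Upload = Struct.new(:id, :model_type, :model_id, keyword_init: true)

  attr_reader :export_upload_ids, :uploads

  def initialize
    @export_upload_ids = []
    @uploads = []
  end

  def add_export_upload(id, upload_id:)
    @export_upload_ids << id
    @uploads << Upload.new(id: upload_id, model_type: MODEL_TYPE, model_id: id)
  end

  # Deleting the owner also deletes every upload that points at it.
  def delete_export_upload(id)
    @export_upload_ids.delete(id)
    @uploads.reject! { |u| u.model_type == MODEL_TYPE && u.model_id == id }
    nil
  end

  # Uploads whose owner no longer exists; should always be empty.
  def orphaned_uploads
    @uploads.reject { |u| @export_upload_ids.include?(u.model_id) }
  end
end

store = ExportStore.new
store.add_export_upload(243922, upload_id: 396877)
store.add_export_upload(198265, upload_id: 343563)
store.delete_export_upload(243922)

p store.uploads.map(&:id)       # prints [343563]
p store.orphaned_uploads.empty? # prints true
```

In ActiveRecord, this kind of invariant is commonly expressed with a dependent: :destroy association. Note that bulk deletes such as delete_all skip callbacks, so the deletion path may be worth checking during the investigation.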

Current Workaround

The following script identifies and deletes orphaned uploads rows, with logging for audit purposes. Run it in the Rails console on the primary site. It provides temporary relief only: it does not address the root cause or prevent new occurrences.

# This snippet deletes rows in the uploads table if the associated parent
# "model" no longer exists.
# This must be run in the Rails console on the primary site.
def delete_orphaned_uploads(dry_run: true, output_file_path: '/tmp/orphaned_uploads.txt', model_types: nil)
  logger = stdout_and_file_logger(output_file_path)

  if dry_run
    logger.info "This is a dry run. Upload rows will only be printed."
  else
    logger.warn "This is NOT A DRY RUN! Upload rows will be deleted from the DB!"

    # Require explicit confirmation before any destructive operation
    print "Are you sure you want to delete these uploads? Type 'yes' to continue: "
    confirmation = $stdin.gets.to_s.chomp # gets returns nil at EOF
    unless confirmation.casecmp?('yes')
      logger.info "Operation cancelled by user."
      return
    end
  end

  orphaned_upload_states = Geo::UploadState.where(
    "(verification_failure LIKE ? OR verification_failure = ?) AND verification_checksum IS NULL",
    'Error during verification: The model which owns this Upload is missing.%',
    "Error during verification: undefined method `underscore' for NilClass:Class"
  )

  uploads = Upload.joins(:upload_state).merge(orphaned_upload_states)

  # Add model_type filtering if specified
  if model_types.present?
    model_types = Array(model_types) # Ensure it's an array
    uploads = uploads.where(model_type: model_types)
    logger.info "Filtering for model_types: #{model_types.join(', ')}"
  end

  total_count = uploads.count
  logger.info "Found #{total_count} uploads with a model that does not exist"

  uploads_deleted = 0
  uploads_failed = 0

  logger.info "#{dry_run ? 'Dry run. Listing' : 'Deleting'} orphaned uploads..."

  uploads.find_each do |upload|
    logger.debug upload.to_json

    if upload.model
      logger.info "The model actually exists for upload ID: #{upload.id}. Skipping."
      next
    end

    unless dry_run
      upload.destroy!
      uploads_deleted += 1
    end
  rescue StandardError => e
    uploads_failed += 1
    logger.error "Failed to delete upload ID: #{upload.id} - #{e.message}"
    logger.debug e.backtrace.join("\n")
  end

  if dry_run
    logger.info "Dry run completed. #{total_count} uploads processed."
  else
    logger.info "Deletion completed. #{uploads_deleted} uploads deleted, #{uploads_failed} failed."
  end
end

def stdout_and_file_logger(log_file, level: Logger::DEBUG)
  FileUtils.mkdir_p(File.dirname(log_file))

  stdout_logger = Logger.new($stdout)
  stdout_logger.level = level
  file_logger = Logger.new(log_file)
  file_logger.level = level

  ActiveSupport::BroadcastLogger.new(stdout_logger, file_logger)
end

After defining the two methods above, do a dry run:

delete_orphaned_uploads(model_types: 'Projects::ImportExport::RelationExportUpload')

After the dry run, examine the output (it is also written to /tmp/orphaned_uploads.txt), then delete the uploads rows:

delete_orphaned_uploads(dry_run: false, model_types: 'Projects::ImportExport::RelationExportUpload')
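
If some deletions fail, the failed upload IDs can be pulled back out of the log for follow-up. This is a minimal sketch that relies only on the "Failed to delete upload ID: ..." message format used in the script above. The failed_deletions helper is hypothetical, and the sample lines assume Ruby's default Logger line format.

```ruby
# Matches the error line written by delete_orphaned_uploads.
FAILED_LINE = /Failed to delete upload ID: (?<id>\d+) - (?<reason>.*)\z/

# Hypothetical helper: returns [upload_id, reason] pairs for each failure line.
def failed_deletions(log_lines)
  log_lines.filter_map do |line|
    match = FAILED_LINE.match(line.chomp)
    [Integer(match[:id]), match[:reason]] if match
  end
end

sample_lines = [
  "I, [2025-01-02T17:47:19.367923 #123]  INFO -- : Deleting orphaned uploads...\n",
  "E, [2025-01-02T17:47:20.000000 #123] ERROR -- : Failed to delete upload ID: 343563 - example failure\n"
]
p failed_deletions(sample_lines)
```

To read from the log file instead of sample lines, pass File.foreach('/tmp/orphaned_uploads.txt') as the argument.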

Impact

  • Customer Impact: Prevents Geo sites from achieving 100% verification, which can obscure actual failures
  • Operational Impact: Requires frequent manual cleanup of orphaned records
  • Scale: Affects GitLab Dedicated customers using import/export functionality

Requested Action

  1. Investigate why uploads rows are not being cleaned up when Projects::ImportExport::RelationExportUpload records are deleted
  2. Identify if this is a regression or if the cleanup mechanism is not working as expected
  3. Implement proper cascade deletion or cleanup mechanism
  4. Consider if the previous fix in MR 142246 needs to be re-run or enhanced

Additional Context

  • GitLab Version: 18.0.3-ee (GitLab Dedicated)
  • Environment: Production
  • Deployment: GitLab Dedicated

If immediate identification or reproduction is not possible, ask Dedicated whether new occurrences are happening and whether logs around those occurrences are available for further investigation.
