Geo - PruneEventLogWorker leaves orphaned entries in the related events table around

Research

During my investigation of repository checksum mismatches on GPRD, I noticed there is about 35514113 geo_repository_updated_events entries leftover, while there are only 11388446 entries in the geo_event_log table:

[ gprd ] production> update_events = Geo::RepositoryUpdatedEvent.count
=> 46902559

[ gprd ] production> events = Geo::EventLog.where.not(repository_updated_event_id: nil).count
=> 11388446

[ gprd ] production> update_events - events
=> 35514113
gitlabhq_production=> \d geo_event_log
                                                  Table "public.geo_event_log"
               Column                |            Type             |                         Modifiers
-------------------------------------+-----------------------------+------------------------------------------------------------
 id                                  | bigint                      | not null default nextval('geo_event_log_id_seq'::regclass)
 created_at                          | timestamp without time zone | not null
 repository_updated_event_id         | bigint                      |
 repository_deleted_event_id         | bigint                      |
 repository_renamed_event_id         | bigint                      |
 repositories_changed_event_id       | bigint                      |
 repository_created_event_id         | bigint                      |
 hashed_storage_migrated_event_id    | bigint                      |
 lfs_object_deleted_event_id         | bigint                      |
 hashed_storage_attachments_event_id | bigint                      |
 job_artifact_deleted_event_id       | bigint                      |
 upload_deleted_event_id             | bigint                      |
Indexes:
    "geo_event_log_pkey" PRIMARY KEY, btree (id)
    "index_geo_event_log_on_repositories_changed_event_id" btree (repositories_changed_event_id)
    "index_geo_event_log_on_repository_created_event_id" btree (repository_created_event_id)
    "index_geo_event_log_on_repository_deleted_event_id" btree (repository_deleted_event_id)
    "index_geo_event_log_on_repository_renamed_event_id" btree (repository_renamed_event_id)
    "index_geo_event_log_on_repository_updated_event_id" btree (repository_updated_event_id)
Foreign-key constraints:
    "fk_176d3fbb5d" FOREIGN KEY (job_artifact_deleted_event_id) REFERENCES geo_job_artifact_deleted_events(id) ON DELETE CASCADE
    "fk_27548c6db3" FOREIGN KEY (hashed_storage_migrated_event_id) REFERENCES geo_hashed_storage_migrated_events(id) ON DELETE CASCADE
    "fk_4a99ebfd60" FOREIGN KEY (repositories_changed_event_id) REFERENCES geo_repositories_changed_events(id) ON DELETE CASCADE
    "fk_86c84214ec" FOREIGN KEY (repository_renamed_event_id) REFERENCES geo_repository_renamed_events(id) ON DELETE CASCADE
    "fk_9b9afb1916" FOREIGN KEY (repository_created_event_id) REFERENCES geo_repository_created_events(id) ON DELETE CASCADE
    "fk_c1f241c70d" FOREIGN KEY (upload_deleted_event_id) REFERENCES geo_upload_deleted_events(id) ON DELETE CASCADE
    "fk_c4b1c1f66e" FOREIGN KEY (repository_deleted_event_id) REFERENCES geo_repository_deleted_events(id) ON DELETE CASCADE
    "fk_d5af95fcd9" FOREIGN KEY (lfs_object_deleted_event_id) REFERENCES geo_lfs_object_deleted_events(id) ON DELETE CASCADE
    "fk_rails_78a6492f68" FOREIGN KEY (repository_updated_event_id) REFERENCES geo_repository_updated_events(id) ON DELETE CASCADE

Bug

The problem here is that geo_repository_updated_events is the parent table. If you delete any entry from geo_repository_updated_events, then the corresponding row in geo_events will get pruned, but not vice versa.

Proposed changes

  • Modify the Geo::PruneEventLogWorker so it deletes rows from the individual event tables
  • Add post-deploy migration that cleans up the orphaned events in the event tables

/cc @stanhu @toon

Edited Jun 14, 2018 by Toon Claes
Assignee Loading
Time tracking Loading