Log or back up records in destructive migrations
Proposal
During a recent incident (gitlab-com/gl-infra/production#6072 (closed)), some records were accidentally deleted and needed to be restored. Part of the deleted data had no backup. As one of the corrective actions, it was proposed to add a rule to our development documentation that any records being deleted should be logged (!77103 (closed)). Valid concerns were raised in the discussion on the merge request (among others, a privacy concern: !77103 (comment 790194601)), and some alternative options were mentioned (specifically, using a deleted_rows table: !77103 (comment 790605236)). Given these concerns, the MR wouldn't be merged as is, so let's discuss the best approach (and track progress) in this issue.
The purpose of this issue is to discuss whether we can log or back up deleted data (temporarily, until the migration is verified) and, if so, which approach would be best.
Possible options
- use structured logging for logging deleted records
  - pros: easy to use, can be used in dry run mode if needed, easy inspection of logged data (if made available in Kibana)
  - cons (!77103 (comment 790194601)): logging data in Kibana might expose sensitive data (this is a major issue; we would have to make sure no sensitive data is exposed), too many operations might be logged, logging would block the transaction (it would have to be done outside of the transaction)
- use a separate universal table for logging (!77103 (comment 790605236))
  - pros: easy to use, maintenance could be automated (automatically delete old records), no concern about exposing sensitive data
  - cons: extra load on the DB, the extra insert would block the transaction (it would have to be done outside of the transaction)
- use a temporary table for each migration (!77103 (comment 790620384))
  - pros: data for each migration would be stored separately (so cleanup would be just a matter of DROP TABLE ...)
  - cons: additional implementation overhead (cannot be used easily: each destructive migration would have to create its own backup table and also take care of cleanup)
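
To make the two table-based options easier to compare, here is a minimal sketch in Python using the standard library sqlite3 module. It is not GitLab's migration API and does not use PostgreSQL; the widgets table, the backup table name, the migration name, and the deleted_rows schema are all hypothetical and chosen only for illustration. Each backup is committed in its own step before the delete, which reflects the point above that the extra insert should happen outside the destructive transaction.

```python
import json
import sqlite3


def backup_to_own_table(conn, table, where, backup_table):
    """Per-migration backup table: copy the rows about to be deleted."""
    # Identifiers cannot be bound as SQL parameters, so this sketch assumes
    # table names and the WHERE clause come from trusted migration code.
    conn.execute(
        f"CREATE TABLE {backup_table} AS SELECT * FROM {table} WHERE {where}"
    )
    conn.commit()


def backup_to_deleted_rows(conn, table, where, migration):
    """Universal table: store each row as JSON in a shared deleted_rows table."""
    # Hypothetical schema; the source only names the table, not its columns.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS deleted_rows ("
        "migration TEXT, source_table TEXT, payload TEXT, "
        "deleted_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    cursor = conn.execute(f"SELECT * FROM {table} WHERE {where}")
    columns = [col[0] for col in cursor.description]
    rows = [
        (migration, table, json.dumps(dict(zip(columns, row))))
        for row in cursor.fetchall()
    ]
    conn.executemany(
        "INSERT INTO deleted_rows (migration, source_table, payload) "
        "VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


def destructive_delete(conn, table, where):
    """The destructive step itself, run only after the backups are committed."""
    cursor = conn.execute(f"DELETE FROM {table} WHERE {where}")
    conn.commit()
    return cursor.rowcount


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE widgets (id INTEGER PRIMARY KEY, name TEXT, stale INTEGER)"
    )
    conn.executemany(
        "INSERT INTO widgets (name, stale) VALUES (?, ?)",
        [("a", 1), ("b", 0), ("c", 1)],
    )
    conn.commit()

    backup_to_own_table(conn, "widgets", "stale = 1", "widgets_backup_example")
    backup_to_deleted_rows(conn, "widgets", "stale = 1", "example_migration")
    deleted = destructive_delete(conn, "widgets", "stale = 1")

    # Restore from the per-migration backup, then clean it up with DROP TABLE.
    conn.execute("INSERT INTO widgets SELECT * FROM widgets_backup_example")
    conn.execute("DROP TABLE widgets_backup_example")
    conn.commit()

    restored = conn.execute("SELECT COUNT(*) FROM widgets").fetchone()[0]
    logged = conn.execute("SELECT COUNT(*) FROM deleted_rows").fetchone()[0]
    conn.close()
    return deleted, restored, logged
```

In this sketch the per-migration table keeps the original columns, so restoring is a plain INSERT ... SELECT, while the universal table trades that convenience for a single schema that any migration can write to and that a periodic job could prune by deleted_at.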