DeleteDiffFilesWorker doesn't work as expected for diffs in external storage
<!--IssueSummary start--> <details> <summary> Everyone can contribute. [Help move this issue forward](https://handbook.gitlab.com/handbook/marketing/developer-relations/contributor-success/community-contributors-workflows/#contributor-links) while earning points, leveling up and collecting rewards. </summary> - [Close this issue](https://contributors.gitlab.com/manage-issue?action=close&projectId=278964&issueIid=241178) </details> <!--IssueSummary end--> ### Summary This worker wipes `MergeRequestDiffFile` records from a `MergeRequestDiff` in some circumstances. However, it was introduced prior to external merge request diff storage. When it is given a merge request diff that is stored externally, it removes the `MergeRequestDiffFile` rows, but it should *also* ensure that the file associated with `MergeRequestDiff#external_diff` is deleted, and that `stored_externally` and `external_diff_store` are set to `false` and `LOCAL`, respectively. ### Steps to reproduce * Run `DeleteDiffFilesWorker` against any merge request diff sat in object storage * Note that the external diff file is still accessible ### What is the current *bug* behavior? Data attached to MergeRequestDiff is not cleaned as expected ### What is the expected *correct* behavior? Data should be removed ### Output of checks This bug happens on GitLab.com ### Possible fixes The actual fix isn't too bad - we should move the logic inside this worker to the `clean!` state transition for the `MergeRequestDiff`, and have it take account of the What to do about pre-existing data is a harder question. When the `MergeRequestDiff` record is removed (say, along with the project), the associated file is destroyed, so it's not *technically* a resource leak. However, I'd feel better if we scanned through all `MergeRequestDiff` records with `state: without_file, stored_externally: true` and cleaned things up eagerly, to free up object storage space.
issue