Add Sidekiq cron job to clean up test data on GitLab stg
Problem
In a recent triage issue, @niskhakova discovered that over 1 million test user accounts had piled up on staging.
We currently have the QA::Tools module for test data cleanup via API calls, which I think is normally scheduled to run in a delete resource pipeline(?). This would normally cover most scenarios, but some resources cannot be removed via APIs (e.g. a top-level group with a paid subscription).
Proposal
Create a Sidekiq schedule to run a test data cleanup job on GitLab stg.
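As a rough illustration of what such a schedule could look like, here is a minimal sketch in plain Ruby. The job name, worker class name, and queue name are hypothetical, and the 6-hour interval is borrowed from the CustomersDot job as an assumption. How the entry would actually be registered depends on how GitLab configures its cron jobs, which is not shown here.

```ruby
# Hypothetical schedule entry: the job name, worker class, and queue are
# placeholders, not existing GitLab settings.
TEST_DATA_CLEANUP_SCHEDULE = {
  'name'  => 'staging_test_data_cleanup',
  'cron'  => '0 */6 * * *', # minute 0 of hours 0, 6, 12, and 18
  'class' => 'Staging::TestDataCleanupWorker',
  'queue' => 'low'
}.freeze

# Returns the schedule only for the staging environment, so the job is
# never registered anywhere else.
def cleanup_schedule_for(environment)
  environment == 'staging' ? TEST_DATA_CLEANUP_SCHEDULE : nil
end
```

Gating on the environment name keeps a destructive job from being scheduled outside staging, which seems like the main risk with this approach.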
1st iteration
In CustomersDot, there is a Sidekiq cron job for test data cleanup that runs on staging every 6 hours, as well as a rake task that can be run in the staging console on demand.
- Add a Sidekiq worker to remove top-level test groups created in the fulfillment test suite.
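A possible shape for that worker, sketched in plain Ruby so it runs on its own. Everything here is an assumption for illustration: the class name, the `fulfillment-test-` path prefix, and the in-memory list of groups standing in for database records. A real worker would include the Sidekiq worker module, use a low-priority queue, and query the database instead.

```ruby
# Stand-in for a group record; a real worker would query the database instead.
TestGroup = Struct.new(:id, :path, :parent_id, keyword_init: true)

# Hypothetical worker. In the application it would include the Sidekiq worker
# module and set a low-priority queue; both are omitted so the sketch runs alone.
class FulfillmentTestGroupCleanupWorker
  # Assumed naming convention for groups created by the fulfillment suite.
  TEST_PATH_PATTERN = /\Afulfillment-test-/

  def initialize(groups)
    @groups = groups
  end

  # Removes matching top-level groups and returns the ids that were removed.
  def perform
    doomed, kept = @groups.partition { |group| removable?(group) }
    @groups.replace(kept)
    doomed.map(&:id)
  end

  private

  # Top-level means no parent; the path must also match the test pattern.
  def removable?(group)
    group.parent_id.nil? && group.path.match?(TEST_PATH_PATTERN)
  end
end
```

Matching on both the missing parent and the path pattern is meant to keep the worker from touching real top-level groups or anything nested under them.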
2nd iteration
- In the long run, maybe we can convert some of the QA::Tools functions to Sidekiq jobs 🤔
Compared to the current cleanup pipeline via APIs 👇
Pros
- removes records directly from the db, which is less expensive than API calls
- bypasses deletion validations, giving more flexibility
- improves pipeline efficiency and requires minimal manual action
- Sidekiq handles logging and retries
Cons
- requires a Sidekiq worker, which should be queued as low priority so it doesn't impact other application workers
- difficult to handle certain reusable test resources and resources left for debugging purposes
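One way the last point might be mitigated is an eligibility check that skips resources flagged as reusable and leaves recent ones alone for debugging. The sketch below is only an illustration: the `keep` flag, the resource shape, and the 3-day grace period are all assumptions, not existing GitLab attributes or agreed values.

```ruby
# Hypothetical resource shape; the keep flag and timestamp are assumptions.
TestResource = Struct.new(:name, :created_at, :keep, keyword_init: true)

# Assumed grace period that leaves recent resources available for debugging.
DEBUG_GRACE_PERIOD_SECONDS = 3 * 24 * 60 * 60

# A resource is eligible for cleanup only if it is not marked as reusable
# and is older than the grace period.
def eligible_for_cleanup?(resource, now: Time.now)
  return false if resource.keep

  (now - resource.created_at) > DEBUG_GRACE_PERIOD_SECONDS
end
```

This still relies on tests marking reusable resources consistently, so it reduces the problem rather than removing it.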
Edited by Chloe Liu