Sync Cloud Connector tokens hourly
What does this MR do and why?
This brings back a change originally introduced in !185454 (merged) that was reverted due to an issue we found with reading job parameters. It is meant to address a problem with Cloud Connector access tokens expiring on SM/Dedicated after 3 consecutive missed syncs with CDot, which we found can happen when Sidekiq workers are rolled over frequently.
We therefore fix both issues together here:
- Pass and read job arguments as JSON, not Ruby symbols
- Set an hourly schedule for SyncServiceTokenWorker and support a force flag to force an upstream request
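As a minimal sketch of why symbol keys are a problem: Sidekiq serializes job arguments to JSON, so a hash enqueued with symbol keys arrives in the job with string keys. The helper name force_requested? is hypothetical and only illustrates reading the flag by its string key; it is not the worker's actual code.

```ruby
require 'json'

# Sidekiq stores job arguments as JSON, so this round trip mimics
# what happens to a hash between enqueueing and execution.
enqueued = { force: true }
received = JSON.parse(JSON.generate(enqueued))

received[:force]   # => nil, the symbol key did not survive
received['force']  # => true

# Hypothetical helper: read the flag via its string key only.
def force_requested?(params)
  params.is_a?(Hash) && params['force'] == true
end
```

With this shape, an argument-less cron run (params is nil) is treated as a non-forced run.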
References
- Related to: https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/issues/26393+
- Original MR: Run sync_service_token_worker randomly once per... (!185454 - merged)
- Revert MR: Revert "Merge branch 'rz-26393-run-SyncServiceT... (!187733 - merged)
How to set up and validate locally
Make sure that you have Duo set up locally, and Puma and Sidekiq are running.
Via Admin UI
- Go to http://gitlab.local:3000/admin/subscription
- Click the retry icon in the subscription; this should trigger the job
- Verify in Sidekiq logs that SyncServiceTokenWorker ran to completion and did not log "extra.cloud_connector_sync_service_token_worker.result":"skipping token refresh"
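The log check can also be scripted. The sketch below assumes the Sidekiq log is written as one JSON object per line and that the result appears under the flat key quoted above; the helper name skipped_refresh? and the log path in the comment are assumptions, so adjust them to your setup.

```ruby
require 'json'

RESULT_KEY = 'extra.cloud_connector_sync_service_token_worker.result'
SKIPPED = 'skipping token refresh'

# Returns true if any JSON log line reports a skipped token refresh.
# Lines that are not valid JSON are ignored.
def skipped_refresh?(log_lines)
  log_lines.any? do |line|
    entry = begin
      JSON.parse(line)
    rescue JSON::ParserError
      nil
    end
    entry.is_a?(Hash) && entry[RESULT_KEY] == SKIPPED
  end
end

# Example call, assuming a log file at this (hypothetical) path:
#   skipped_refresh?(File.foreach('log/sidekiq.log'))
```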
Via console
There are two cases to consider: an argument-less run, as cron would schedule it, and an explicit invocation with arguments.
Cron run (no job arguments):
- Run CloudConnector::SyncServiceTokenWorker.perform_async
- Verify in Sidekiq logs that the job ran to completion and did log "extra.cloud_connector_sync_service_token_worker.result":"skipping token refresh" (unless your access token was missing or expired)
- Expire or delete the token: CloudConnector::ServiceAccessToken.delete_all
- Run CloudConnector::SyncServiceTokenWorker.perform_async
- Verify in Sidekiq logs that the job ran to completion and did not log "extra.cloud_connector_sync_service_token_worker.result":"skipping token refresh"
- Verify that CloudConnector::ServiceAccessToken.last exists and is current
Explicit run (with job arguments):
- Run CloudConnector::SyncServiceTokenWorker.perform_async('force' => true)
- Verify in Sidekiq logs that the job ran to completion and did not log "extra.cloud_connector_sync_service_token_worker.result":"skipping token refresh" (regardless of whether your access token was missing or expired)
- Verify that CloudConnector::ServiceAccessToken.last exists and is current
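The two cases above boil down to a small decision: refresh when forced, or when the token is missing or expired; otherwise skip. The following is a sketch of that rule only, not the worker's actual implementation; the method name refresh_token? and its parameters are hypothetical.

```ruby
# Hypothetical decision helper mirroring the validation steps above.
# params:           job arguments as read from JSON (Hash with string keys) or nil
# token_expires_at: Time, or nil when no token is stored
def refresh_token?(params, token_expires_at, now: Time.now)
  # An explicit 'force' => true always triggers an upstream request.
  return true if params.is_a?(Hash) && params['force'] == true

  # Otherwise refresh only if the token is missing or already expired.
  token_expires_at.nil? || token_expires_at <= now
end
```

Under these assumptions, a cron run with a valid token returns false, which corresponds to the "skipping token refresh" log entry.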
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.