Review use of caching throughout Cloud Connector architecture
Caching has been involved in several recent Cloud Connector incidents, sometimes as the direct cause of the problem and at other times as a factor that reduced or worsened the impact.
This issue is to:
- Complete the table below, listing every place in the Cloud Connector architecture where in-memory or Redis caches are used (or were used and have since been removed after past incidents)
- Review each remaining use of caching and assess whether changes could improve system resilience
Caching area | Current Status | Future Plan |
---|---|---|
Cloud Connector AvailableServices | Removed for Self-Managed since 17.2.4 | n/a |
Public key in-memory caching through CDot/.com OIDC endpoint | In place | https://gitlab.com/gitlab-org/gitlab/-/issues/483041+ |
https://gitlab.com/gitlab-org/gitlab/-/issues/498456+ | Resolved by no longer using the in-memory cache; fixed in 17.5 | n/a |
Any others? | | |
Edited by Paul Phillips