
Review/Corrective action for SystemStackError caused by Gitlab::Instrumentation::RedisClusterValidator

Corrective action for gitlab-com/gl-infra/production#8141 (closed)

Possible cause

We don't know exactly how !105302 (merged) caused this, but @schin1 has some ideas:

  • SystemStackError first caught in https://gitlab.com/gitlab-org/gitlab/-/jobs/3421216645
  • It was resolved by !105302 (5be68b05), which checks the feature flag only for multi-key commands.
  • I suspect that commit !105302 (2bab323b), which moved the feature flag check into SafeRequestStore, outside of Gitlab::Instrumentation::RedisClusterValidator, "unresolved" the recursive-call problem. This is because the feature flag is now checked for ALL Redis calls instead of only for multi-key commands.
  • However, the same specs that failed at the start passed this time. They passed because the specs use :request_store, and the commit introduced SafeRequestStore, which silently hides or prevents the recursive error.

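It is likely that feature flags and Redis call each other in a never-ending cycle: checking a feature flag issues a Redis call, and that Redis call checks the feature flag again. Perhaps we need to ban files in lib/gitlab/instrumentation from using feature flags.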
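A minimal sketch of the suspected cycle, not the actual GitLab code. `SketchFeature`, `SketchRedis`, the `check_all_commands` switch, and the `validate_cluster` flag name are all hypothetical stand-ins, and the sketch assumes that a feature flag lookup reads its state from Redis with a single-key command:

```ruby
# Hypothetical stand-ins; none of these are GitLab's real classes.
module SketchFeature
  # Assumption: a flag lookup reads its state from Redis with a single-key GET.
  def self.enabled?(name, redis)
    redis.call(:get, "feature:#{name}") == "true"
  end
end

class SketchRedis
  MULTI_KEY_COMMANDS = %i[mget mset del].freeze

  def initialize(check_all_commands:)
    @check_all_commands = check_all_commands
    @store = { "feature:validate_cluster" => "true" }
  end

  def call(command, *keys)
    validate(command)
    command == :get ? @store[keys.first] : keys.map { |key| @store[key] }
  end

  private

  # Stand-in for the validator hook that runs on every Redis call.
  def validate(command)
    return unless @check_all_commands || MULTI_KEY_COMMANDS.include?(command)

    # The flag lookup issues a GET, which re-enters #call and #validate.
    SketchFeature.enabled?(:validate_cluster, self)
  end
end

# Runs one multi-key command and reports whether the stack overflowed.
def recursion_outcome(check_all_commands:)
  SketchRedis.new(check_all_commands: check_all_commands).call(:mget, "a", "b")
  :ok
rescue SystemStackError
  :stack_overflow
end
```

In this sketch, checking the flag on every command makes the flag's own GET trigger another flag check, so the call never returns. Restricting the check to multi-key commands breaks the cycle only because the flag lookup itself is a single-key command.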

Relevant MRs:

Proposal

  1. How can we prevent this from happening again? -- specs added
  2. Re-introduce !105302 (merged) without a feature flag.
  3. Investigate which files cannot use feature flags.
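One possible way to start on item 3 is a simple scan for feature flag calls in a directory. This is only a sketch: `feature_flag_usages` and `FEATURE_FLAG_CALL` are hypothetical names, not existing GitLab tooling, and the pattern only catches direct `Feature.enabled?` / `Feature.disabled?` calls, not indirect ones through helpers.

```ruby
# Hypothetical lint helper; not part of GitLab's tooling.
FEATURE_FLAG_CALL = /\bFeature\.(enabled|disabled)\?/

# Returns "path:line" strings for every direct feature flag call under dir.
def feature_flag_usages(dir)
  Dir.glob(File.join(dir, "**", "*.rb")).sort.flat_map do |path|
    File.foreach(path).with_index(1).filter_map do |line, number|
      "#{path}:#{number}" if line.match?(FEATURE_FLAG_CALL)
    end
  end
end
```

A spec could then assert that `feature_flag_usages("lib/gitlab/instrumentation")` returns an empty array, which would fail as soon as a feature flag check is added to those files.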

/cc @schin1 @smcgivern

Edited by Sylvester Chin