GitLab Experiment variant caching has errors
Problem
An issue raised in https://gitlab.com/gitlab-org/gitlab/-/issues/348496#note_856625881 is causing a small number of 500 errors, which appear to be triggered by some underlying issue inside GitLab Experiment. The exceptions raised can be seen in https://sentry.gitlab.net/gitlab/gitlabcom/?query=is%3Aunresolved+free_indicator
We believe this has something to do with the caching mechanism in GitLab Experiment, which seems to affect us only when using variant in the Experiment class and/or when using multiple variants.
We are currently only seeing errors that point back to the new_project_sast_enabled experiment, which meets both of the conditions above.
Immediate actions
We will take each action only as needed: execute the first action and, if the issue is resolved, stop there. The actions are listed in order of increasing impact/effort.
- Disable new_project_sast_enabled and clear experiment cache for all experiments on .com. @jejacks0n
  - impact: only the new_project_sast_enabled experiment, as it will turn it off and send it down the control path.
- Execute a cleanup/promotion of the new_project_sast_enabled experiment via MR. @jejacks0n to prep and anyone to help it through review if he is offline.
  - impact: only new_project_sast_enabled, as it will remove the experiment and promote the 'winner' to production.
- Turn off all experimentation on .com via making this method return false.
  - Circuit breaker to be created to enable ability to turn off GLEX experiments - @dstull !81834 (merged)
  - Document the circuit breaker @dstull #354107 (closed)
  - impact: all experiments running on .com will be disabled, sending everything down the control paths.
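As a rough illustration of the circuit breaker idea in the last action, here is a minimal Ruby sketch. It assumes a single global switch that every experiment consults before assigning a variant. The names KillSwitch, SketchExperiment, assigned_variant, and the candidate variant names are hypothetical and chosen for illustration only; they are not the API of GitLab Experiment or of the merged MR.

```ruby
# Hypothetical stand-in for a global ops switch; not GitLab's Feature API.
module KillSwitch
  @experiments_on = true

  class << self
    def experiments_on?
      @experiments_on
    end

    # Trip the breaker: every experiment falls back to control.
    def trip!
      @experiments_on = false
    end

    def reset!
      @experiments_on = true
    end
  end
end

# Hypothetical experiment class; the real gem's interface differs.
class SketchExperiment
  attr_reader :name

  def initialize(name, variants)
    @name = name
    @variants = variants
  end

  # The one method that decides whether experimentation runs at all.
  def enabled?
    KillSwitch.experiments_on?
  end

  # Returns :control whenever the breaker is tripped; otherwise picks a
  # variant deterministically from the bytes of the context key.
  def assigned_variant(context_key)
    return :control unless enabled?

    @variants[context_key.bytes.sum % @variants.size]
  end
end

experiment = SketchExperiment.new(
  "new_project_sast_enabled",
  [:control, :candidate_a, :candidate_b] # hypothetical variant names
)

KillSwitch.trip!
puts experiment.assigned_variant("user-1")
```

With the breaker tripped, every call to assigned_variant returns :control regardless of the context, which matches the stated impact of sending everything down the control paths. Because the check runs before any variant lookup, a sketch like this would also bypass whatever caching sits behind variant resolution.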
Long term action
Figure out why this is happening and fix it.