Provide approach that utilizes operational feature flags to toggle and specify experiment rollout. (!67377) · Merge requests · GitLab.org / GitLab

What does this MR do?

I initially identified and fixed an issue, but have since pulled in !66676 (merged) to resolve an issue that was blocking this spike.

This is a spike to use aspects of our Unleash implementation (without using the Unleash client directly) and to utilize the operational feature flags interface to manage toggling and rollout of gitlab-experiment experiments.

The intent here is to open a couple of discussions.

The first is directed at the section::growth PM team, to discuss the potential of having:
1. insights into which experiments are rolled out, and where.
2. a history of rollout and ramp up changes.
3. the capability that operational feature flags have for being mentioned in issues.
4. ~~an opportunity to engage in more effective contributions by dogfooding aspects of the unleash server implementation.~~
The second is to discuss what this might mean in terms of infradev and stability, and how we can expose knowledge of which changes have been made and when. It's also a chance to discuss the performance implications of this, and how we might go about caching (and invalidating the cache for) these Unleash::FeatureToggle instances.
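To make the caching part of that discussion a bit more concrete, here's a minimal sketch of one possible shape: a small in-memory cache with a time-to-live and explicit invalidation. Everything in it is hypothetical (ToggleCache, its ttl: and clock: arguments, and the idea of invalidating on flag edits are assumptions for illustration, not something in this MR or in GitLab), and a real version would more likely sit on top of the existing Rails cache.

```ruby
# Hypothetical sketch only: ToggleCache is not a GitLab class.
class ToggleCache
  Entry = Struct.new(:value, :expires_at)

  # ttl is in seconds. The clock is injectable so expiry can be tested
  # without sleeping.
  def initialize(ttl:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl = ttl
    @clock = clock
    @entries = {}
  end

  # Returns the cached value for name, or runs the block (the expensive
  # lookup) and keeps its result until the TTL elapses.
  def fetch(name)
    entry = @entries[name]
    return entry.value if entry && entry.expires_at > @clock.call

    value = yield
    @entries[name] = Entry.new(value, @clock.call + @ttl)
    value
  end

  # Intended to be called when a flag is edited, so that the next fetch
  # reloads it instead of serving a stale value.
  def invalidate(name)
    @entries.delete(name)
  end
end
```

The trade-off to discuss is the usual one: a longer TTL means fewer queries but a slower reaction to rollout changes, unless edits reliably trigger invalidate.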

Usage Documentation:

Let's define an experiment we can use for this:

$ rails g gitlab:experiment example control red blue

This will generate an app/experiments/example_experiment.rb and spec for us. Let's open this file and take a look. Ultimately, we're going to tell our experiment that it's going to use operational feature flags, and that we want to use a round robin rollout strategy internally, so the file will look like this:

# frozen_string_literal: true

class ExampleExperiment < ApplicationExperiment
  # The module that will utilize the operational feature flags that can
  # be managed from the gitlab.com interface.
  include OperationalFeatureFlags

  # We take a really predictable rollout approach here, assigning blue
  # first, then red, and then blue again, for every context we determine
  # experiment inclusion for.
  default_rollout :round_robin

  def control_behavior
    :control
  end

  def blue_behavior
    :blue
  end

  def red_behavior
    :red
  end
end

Note: Rollout logic will eventually take on more of the Unleash variant logic that we need to backfill in our server implementation -- but since we already have some of our own rollout strategies, we can use them until we can invest more in our own Unleash server / management capabilities here.
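For anyone unfamiliar with the idea, here's a rough sketch of what a round robin rollout does conceptually. This is not the gitlab-experiment implementation: RoundRobinAssigner, the in-memory counter, and the blue/red variant order are all assumptions made for illustration.

```ruby
# Hypothetical sketch of round robin assignment. Variants are handed
# out in order, and each context keeps the variant it was first given.
class RoundRobinAssigner
  def initialize(variants)
    @variants = variants
    @counter = 0
    @assigned = {}
  end

  # Returns the remembered variant for context_key, or assigns the next
  # variant in the cycle the first time this context is seen.
  def assign(context_key)
    @assigned[context_key] ||= begin
      variant = @variants[@counter % @variants.size]
      @counter += 1
      variant
    end
  end
end

assigner = RoundRobinAssigner.new(%i[blue red])
assigner.assign("context-1") # => :blue
assigner.assign("context-2") # => :red
assigner.assign("context-3") # => :blue
assigner.assign("context-1") # => :blue (remembered, not reassigned)
```

Remembering the assignment per context is what keeps repeated calls stable. A real implementation would need shared storage for the counter and the assignments rather than instance variables.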

Now, we can go into our development server and create a gitlab-org/gitlab project if we don't already have one. This is important, as it's the project we'll need to manage feature flags from. Once you've created that project, you should be able to access http://localhost:3000/gitlab-org/gitlab/-/feature_flags -- and from this interface we can define the operational feature flags we want to use for experiments.

Before we do that though, let's look at some things in the rails console:

include Gitlab::Experiment::Dsl
experiment(:example).enabled?
# => false

We can see that the experiment isn't considered enabled yet. We also see that it does a pretty gnarly query, but this is for our caching discussion later.

Ok, let's go ahead and create that operational feature flag now so we can use it. We need to name it the same way we've named our experiment class, so the name will be example_experiment in this case. Using the interface, we can define and set up the strategies something like what I've outlined in:

Note: The feature flag strategy hasn't specified an environment, which means it will be active on "All Environments" -- but since we're only using local records, in this case that really just means "All Local Environments". We could at some point target gprd-cny and gprod independently though, so there's some usefulness in discussing that.

So we've set a 50% rollout percentage. You can set this up in a lot of ways, but for our examples, this will give us a way to verify a couple concepts.

After we've created the operational feature flag, we can see it in the list, with a toggle switch and some details about how we've defined the rollout strategies.

Ok, let's go back into our console so we can verify a couple things. The first thing we want to verify is that the experiment is now considered enabled. We've created the feature flag and have specified a strategy, so it will be considered enabled. If no strategy is provided, the experiment will still be considered enabled, but would have no inclusions (a 100% control experiment).

experiment(:example).enabled?
# => true
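The rules described above can be summarized in a small sketch. Flag here is a hypothetical stand-in and not a GitLab model, and treating a switched-off toggle as "disabled" is an assumption on my part rather than something verified in this spike.

```ruby
# Hypothetical stand-in for an operational feature flag record.
Flag = Struct.new(:active, :strategies, keyword_init: true)

# Summarizes how an experiment is treated for a given flag:
#   :disabled     - no flag exists (or, by assumption, it's toggled off)
#   :control_only - enabled, but no strategy means no inclusions
#   :rolling_out  - enabled with at least one strategy
def rollout_state(flag)
  return :disabled if flag.nil? || !flag.active
  return :control_only if flag.strategies.empty?

  :rolling_out
end

rollout_state(nil)                                              # => :disabled
rollout_state(Flag.new(active: true, strategies: []))           # => :control_only
rollout_state(Flag.new(active: true, strategies: [:percent]))   # => :rolling_out
```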

And if we investigate further, we can see that we get different results when we give it different contexts, and that we consistently get the same results for the same contexts.

experiment(:example, foo: 1).run
# => :blue
experiment(:example, foo: 2).run
# => :control
# ... more contexts
experiment(:example, foo: 5).run
# => :red
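If you're wondering how a percentage rollout can give the same answer for the same context every time, the usual trick is to hash the context into a stable bucket. Here's a minimal sketch of that idea. It uses SHA-256 purely as a stand-in (the real Unleash and gitlab-experiment code have their own hashing and key formats), and bucket_for and included? are hypothetical helper names.

```ruby
require "digest"

# Maps an experiment name and context to a stable bucket in 0...100.
# The same inputs always produce the same bucket.
def bucket_for(experiment_name, context)
  # Sort the pairs so the key doesn't depend on hash ordering.
  serialized = context.map { |key, value| "#{key}=#{value}" }.sort.join("&")
  digest = Digest::SHA256.hexdigest("#{experiment_name}:#{serialized}")
  digest[0, 8].to_i(16) % 100
end

# A context is included when its bucket falls below the rollout percentage.
def included?(experiment_name, context, percentage)
  bucket_for(experiment_name, context) < percentage
end

included?("example_experiment", { foo: 1 }, 50) # same answer on every call
```

With this approach, ramping from 50% to 60% only adds contexts (those in buckets 50 through 59) and never moves anyone who was already included.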

There we go. We have an experiment that's managed via the operational feature flags of the gitlab-org/gitlab project. We can change the project that we look in for these feature flags, but I've just chosen this one as a start.

Does this MR meet the acceptance criteria?

Conformity

I have included changelog trailers, or none are needed. (Does this MR need a changelog?)
I have added/updated documentation, or it's not needed. (Is documentation required?)
I have properly separated EE content from FOSS, or this MR is FOSS only. (Where should EE code go?)
I have added information for database reviewers in the MR description, or it's not needed. (Does this MR have database related changes?)
I have self-reviewed this MR per code review guidelines.
This MR does not harm performance, or I have asked a reviewer to help assess the performance impact. (Merge request performance guidelines)
I have followed the style guides.
This change is backwards compatible across updates, or this does not apply.

Availability and Testing

I have added/updated tests following the Testing Guide, or it's not needed. (Consider all test levels. See the Test Planning Process.)
I have tested this MR in all supported browsers, or it's not needed.
I have informed the Infrastructure department of a default or new setting change per definition of done, or it's not needed.

Security

Does this MR contain changes to processing or storing of credentials or tokens, authorization and authentication methods or other items described in the security review guidelines? If not, then delete this Security section.

Label as security and @ mention @gitlab-com/gl-security/appsec
The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
Security reports checked/validated by a reviewer from the AppSec team

Edited Aug 09, 2021 by Jeremy Jackson
