Use GitLab feature flag feature
Investigate using https://gitlab.com/help/user/project/operations/feature_flags that was just released.
Initial configuration
- gitlab.yml/gitlab.rb? ApplicationSetting? or Environment variables?
- Unleash.app_name should be distinguished by dev.gitlab.org, staging.gitlab.com, canary.gitlab.com, gitlab.com
- Which GitLab project should be set as config.url? A project in ops.gitlab.net, gitlab.com, others?
Flexibility - Gate vs Strategy
- How can we support per-project/group/user gating with Unleash? i.e. Feature.enabled?(:a_feature, project)
- How can we support a rollout strategy with project/group/user context?
  - Flipper has the concept of a Gate to control flags per condition, whereas Unleash has the concept of a Strategy to control flags per condition.
  - Gate usage statistics today: 41 flags without context, 45 flags with one of project_id, group_id or user_id.
  - Today, we encourage employees to use a gate with one of the parameters project, group or user: https://docs.gitlab.com/ee/development/feature_flags/development.html
  - Should we allow users to define any strategies in the GitLab Feature Flag system?
  - Should we have a custom strategy? e.g. https://gitlab.com/snippets/1890628
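To make the Gate vs Strategy question concrete, here is a minimal sketch of what a per-project strategy could look like. It only mirrors the general shape of an Unleash strategy (a name plus an is_enabled? check that receives parameters and a context). The class name, the strategy name, the projectIds parameter and the Hash-based context are all assumptions for illustration, not existing unleash-client or GitLab code.

```ruby
# Hypothetical sketch: a per-project strategy. Class name, strategy name,
# the "projectIds" parameter and the Hash context are illustrative only.
class ProjectIdStrategy
  PARAM = 'projectIds'

  def name
    'projectWithId'
  end

  # params:  strategy parameters configured on the server side,
  #          e.g. { 'projectIds' => '1, 42' }
  # context: per-call context, here a Hash such as { project_id: 42 }
  def is_enabled?(params = {}, context = nil)
    return false unless params.is_a?(Hash) && params[PARAM].is_a?(String)
    return false if context.nil?

    project_id = context[:project_id]
    return false if project_id.nil?

    allowed = params[PARAM].split(',').map(&:strip)
    allowed.include?(project_id.to_s)
  end
end

strategy = ProjectIdStrategy.new
params = { 'projectIds' => '1, 42' }
strategy.is_enabled?(params, { project_id: 42 }) # => true
strategy.is_enabled?(params, { project_id: 7 })  # => false
```

With something like this, Feature.enabled?(:a_feature, project) could translate the project into a context and let the strategy decide, which is roughly what a Flipper gate does today.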
Optimization
Fetching
- ETag caching does not seem to work yet. With it, unleash-client can skip re-fetching unchanged data, which reduces network I/O.
  - Unleash-client fetches flag values every 15 seconds by default (polling). => It's supported by default.
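A rough sketch of how ETag-based polling saves work, assuming the server answers 304 when the client already holds the current ETag. The transport is injected as a lambda so the example runs without a network. FlagFetcher and the transport signature are made up for illustration; a real client would send an HTTP GET with an If-None-Match header.

```ruby
# Hypothetical sketch of ETag-based polling. FlagFetcher and the transport
# lambda are illustrative; they are not part of unleash-client.
class FlagFetcher
  attr_reader :flags

  def initialize(transport)
    @transport = transport # ->(etag) { [status, etag, body] }
    @etag = nil
    @flags = {}
  end

  # Returns true when new flag data was stored, false when the server
  # answered 304 Not Modified and the cached flags were kept.
  def poll
    status, etag, body = @transport.call(@etag)
    return false if status == 304

    @etag = etag
    @flags = body
    true
  end
end

# Fake server: replies 304 when the client already holds the current ETag.
server_etag = 'v1'
server_flags = { 'new_navigation_bar' => true }
transport = lambda do |client_etag|
  if client_etag == server_etag
    [304, server_etag, nil]
  else
    [200, server_etag, server_flags]
  end
end

fetcher = FlagFetcher.new(transport)
fetcher.poll # => true  (first poll downloads the flags)
fetcher.poll # => false (ETag matches, no body transferred)
```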
Reading
- Flipper supports L1/L2 cache. Does Unleash-ruby-client need to support it?
  - A recent incident in which FF contributed to performance degradation => production#928 (comment 187441674)
  - Each Unleash.is_enabled? call walks through the strategies, and the computed values are not cached yet. i.e. If the same flag is read multiple times in a single thread, the computation happens every time.
  - Should unleash-client cache the computed flag in Gitlab::SafeRequestStore (per-request global ivar)?
  - Flipper automatically memoizes requested flag statuses (flip.memoize = true, maybe per-request memoization)
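As a sketch of the per-request caching idea, the wrapper below memoizes computed flag values in a store that lives for one request. A plain Hash stands in for Gitlab::SafeRequestStore and a lambda stands in for the strategy walk, so MemoizedFlags and its arguments are assumptions made for this example only.

```ruby
# Hypothetical sketch: memoize computed flag values for one request.
# A Hash stands in for Gitlab::SafeRequestStore; the evaluator lambda
# stands in for the strategy walk done on each Unleash.is_enabled? call.
class MemoizedFlags
  def initialize(store, evaluator)
    @store = store         # per-request storage, cleared between requests
    @evaluator = evaluator # ->(flag_name) { true or false }
  end

  def enabled?(flag_name)
    key = [:feature_flag, flag_name]
    # fetch with a block also caches a computed `false`, unlike `||=`.
    @store.fetch(key) { @store[key] = @evaluator.call(flag_name) }
  end
end

evaluations = 0
evaluator = lambda do |_flag_name|
  evaluations += 1
  false
end

flags = MemoizedFlags.new({}, evaluator)
3.times { flags.enabled?(:feature_a) }
evaluations # => 1
```

The key point is that disabled flags get cached too; a naive `||=` would re-evaluate every flag that resolves to false.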
Control/Chatops vs UI
- GitLab as an Unleash server doesn't offer a public API yet, so employees cannot change flag values via chatops.
- For a quick win, should we allow users to update strategies to any values (via a public API)? e.g. /chatops run feature set new_navigation_bar 25 --dev (See more https://docs.gitlab.com/ee/development/feature_flags/controls.html) => https://gitlab.com/gitlab-org/gitlab-ee/issues/9566
  - GitLab as an Unleash server allows employees to change the flag via the UI instead.
  - This UI doesn't allow you to set gate parameters (see above).
Monitoring/Logging
- Monitor unleash-client health. Create a new log file to monitor unleash-client's activity, system failure, etc. It should use structured logging, which can be viewed/indexed in ElasticSearch and Kibana.
  - Grafana https://dashboards.gitlab.net/d/000000126/grape-endpoints?orgId=1&var-action=Grape%23GET%20%2Fapi%2Ffeature_flags%2Funleash%2F:project_id%2Fclient%2Ffeatures&var-database=influxdb-01-inf-gprd
  - Exception tracking is Sentry, as always.
Resiliency/Fallback plan
- If unleash-client puts pressure on production and an SRE decides we should turn it off immediately, how can we turn it off and fall back to the existing behavior? Feature.enabled?(:unleash_server_enabled) seems necessary.
- When the polling thread of unleash-client dies, how can we recover it without restarting the entire Rails fleet?
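One way the kill switch could work, sketched under the assumption that the switch itself is always read from the existing Flipper backend. Both backends are passed in as lambdas so the example runs standalone. FlagRouter is a hypothetical name, and in GitLab the lambdas would wrap the existing Feature.enabled? and Unleash.is_enabled? calls.

```ruby
# Hypothetical sketch of the kill switch. FlagRouter is an illustrative
# name; the two lambdas stand in for the Flipper and Unleash backends.
class FlagRouter
  KILL_SWITCH = :unleash_server_enabled

  def initialize(flipper:, unleash:)
    @flipper = flipper
    @unleash = unleash
  end

  def enabled?(flag_name)
    # The switch is read from the existing backend, so it keeps working
    # even when the Unleash path is the thing being turned off.
    return @flipper.call(flag_name) unless @flipper.call(KILL_SWITCH)

    @unleash.call(flag_name)
  end
end

flipper_flags = { unleash_server_enabled: true, feature_a: false }
unleash_flags = { feature_a: true }
router = FlagRouter.new(
  flipper: ->(name) { flipper_flags.fetch(name, false) },
  unleash: ->(name) { unleash_flags.fetch(name, false) }
)

router.enabled?(:feature_a) # => true  (served by Unleash)
flipper_flags[:unleash_server_enabled] = false
router.enabled?(:feature_a) # => false (falls back to Flipper)
```

This does not address the dead polling thread question; it only covers the immediate turn-off path.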
Evaluation plan
- https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/7666
  - Environments: dev.gitlab.org, staging.gitlab.com, canary.gitlab.com and gitlab.com
Documentation/Education
- Update documentation to encourage employees to use Unleash-FF for their features https://docs.gitlab.com/ee/development/feature_flags/development.html
- Update documentation about how to control Unleash-FF https://docs.gitlab.com/ee/development/feature_flags/controls.html
- Update rubocop rule to ban Flipper.enabled?.
- Provide helper methods for rspec i.e. stub_feature_flags
HA
- As long as it stays within GitLab-Rails, it's automatically HA.
Geo
- If the master and slaves point to the same GitLab instance as their Unleash server, updated values are automatically synchronized across all nodes (because of polling).
On-prem/Omnibus GitLab
- Where does it store FF values? In the production DB?
- Create a hidden project for the control panel?
- Provide a console command (via gitlab-rails console) to control their flags e.g. https://docs.gitlab.com/ee/administration/job_traces.html#enabling-live-trace
Transition period
- How do we handle existing Flipper-FF? Should we migrate?
How the system checks a feature on/off with unleash
sequenceDiagram
participant postgres
participant unleash server
participant unleash client
participant global var
participant local storage
participant FeatureA
loop Polling every 15 sec
unleash client->>unleash server: Request flags api/v4/feature_flags/unleash/:id
activate unleash server
unleash server->>postgres: Retrieving flag data from DB
activate postgres
postgres-->>unleash server: Return flag data
deactivate postgres
unleash server-->>unleash client: Return flag data
deactivate unleash server
unleash client->>local storage: Write flag data as backup file
unleash client->>global var: Write flag data in memory
end
Note left of FeatureA: FeatureA checks if the flag is on
FeatureA->>unleash client: Unleash.is_enabled?(:feature_a)
activate unleash client
unleash client->>global var: Read flag data from memory
activate global var
global var-->>unleash client: Return flag data
deactivate global var
unleash client->>unleash client: Evaluate with strategies
unleash client-->>FeatureA: Return true/false
deactivate unleash client
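The flow in the diagram above can be modelled in-process as a rough sketch: one polling iteration writes the fetched flags to a backup file and to memory, and each check reads memory only. TinyFlagClient and its method names are made up for this example and do not match unleash-client internals. The fetcher lambda stands in for the request to the Unleash server, and only the default and userWithId strategies are handled.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical in-process model of the diagram. Names are illustrative.
class TinyFlagClient
  def initialize(fetcher, backup_path)
    @fetcher = fetcher         # -> { { 'flag_name' => [strategy, ...] } }
    @backup_path = backup_path # the "local storage" in the diagram
    @flags = {}                # the "global var" in the diagram
    @mutex = Mutex.new
  end

  # One polling iteration: fetch, write the backup file, update memory.
  def poll_once
    data = @fetcher.call
    File.write(@backup_path, JSON.generate(data))
    @mutex.synchronize { @flags = data }
  end

  # Reads from memory only; no network or DB access on the read path.
  def enabled?(name, context = {})
    strategies = @mutex.synchronize { @flags[name.to_s] }
    return false if strategies.nil?

    strategies.any? { |strategy| evaluate(strategy, context) }
  end

  private

  def evaluate(strategy, context)
    case strategy['name']
    when 'default'
      true
    when 'userWithId'
      ids = strategy.dig('parameters', 'userIds').to_s.split(',').map(&:strip)
      ids.include?(context[:user_id].to_s)
    else
      false # unknown strategies are treated as disabled in this sketch
    end
  end
end

Dir.mktmpdir do |dir|
  fetcher = lambda do
    { 'feature_a' => [{ 'name' => 'userWithId',
                        'parameters' => { 'userIds' => '1,2' } }] }
  end
  client = TinyFlagClient.new(fetcher, File.join(dir, 'flags.json'))
  client.poll_once
  client.enabled?(:feature_a, { user_id: 2 }) # => true
  client.enabled?(:feature_a, { user_id: 9 }) # => false
end
```

A real client would call poll_once from a background thread every 15 seconds; that loop is left out here to keep the example deterministic.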
How the system checks a feature on/off with flipper
sequenceDiagram
participant flipper
participant ThreadCache
participant redis
participant postgres
participant FeatureA
Note left of FeatureA: FeatureA checks if the flag is on
FeatureA->>flipper:Feature.enabled?(:feature_a)
activate flipper
flipper->>ThreadCache:Try to read flag data
activate ThreadCache
ThreadCache-->>flipper:Return flag data if exist
deactivate ThreadCache
opt if flag data is not found
flipper->>redis:Try to read flag data
activate redis
redis-->>flipper:Return flag data if exist
deactivate redis
flipper->>ThreadCache: Cache flag data
end
opt if flag data is not found
flipper->>postgres:Try to read flag data
activate postgres
postgres-->>flipper:Return flag data if exist
deactivate postgres
flipper->>ThreadCache: Cache flag data
flipper->>redis: Cache flag data
end
flipper->>flipper: Evaluate with gate
flipper-->>FeatureA: Return true/false
deactivate flipper
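The read path in the Flipper diagram above is a layered read-through cache. Here is a minimal sketch of that lookup order, with plain Hashes standing in for the thread cache, Redis and PostgreSQL. LayeredFlagStore is a hypothetical name, not Flipper's adapter API, and gate evaluation is left out.

```ruby
# Hypothetical sketch of the read-through lookup in the diagram. Plain
# Hashes stand in for the thread cache, Redis and PostgreSQL.
class LayeredFlagStore
  def initialize(thread_cache:, redis:, postgres:)
    @thread_cache = thread_cache
    @redis = redis
    @postgres = postgres
  end

  # Returns the stored flag data, filling the faster layers on a miss.
  # Simplification: a flag missing everywhere is cached as nil.
  def read(name)
    return @thread_cache[name] if @thread_cache.key?(name)

    if @redis.key?(name)
      value = @redis[name]
    else
      value = @postgres[name]
      @redis[name] = value
    end
    @thread_cache[name] = value
  end
end

thread_cache = {}
redis = {}
store = LayeredFlagStore.new(
  thread_cache: thread_cache,
  redis: redis,
  postgres: { feature_a: true }
)

store.read(:feature_a)        # => true (read from postgres)
redis.key?(:feature_a)        # => true (now cached in redis)
thread_cache.key?(:feature_a) # => true (and in the thread cache)
```

Compared with the Unleash flow, the first read of a flag can reach PostgreSQL, but later reads in the same thread never leave the process.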
Slack
f_feature_flag
Edited by Orit Golowinski