[Feature flag] Enable geo_job_artifact_replication

Summary

This issue tracks the rollout of replication and verification of Job Artifacts using the Geo self-service framework on production. The feature is currently behind the geo_job_artifact_replication feature flag.

Owners

Stakeholders

Expectations

What are we expecting to happen?

When is the feature viable?

What might happen if this goes wrong?

What can we monitor to detect problems with this?

Consider mentioning checks for 5xx errors or other anomalies like an increase in redirects (302 HTTP response status)

What can we check for monitoring production after rollouts?

Consider adding links to check for Sentry errors, Production logs for 5xx, 302s, etc.

Rollout Steps

Rollout on non-production environments

  • Ensure that the feature MRs have been deployed to non-production environments.
    • /chatops run auto_deploy status <merge-commit-of-your-feature>
  • Enable the feature globally on non-production environments.
    • /chatops run feature set <feature-flag-name> true --dev (not enabling: dev is CE only)
    • /chatops run feature set <feature-flag-name> true --staging
  • Verify that the feature works as expected. Posting the QA result in this issue is preferable.

Specific rollout on production

  • Ensure that the feature MRs have been deployed to both production and canary.
    • /chatops run auto_deploy status <merge-commit-of-your-feature>
  • If you're using project-actor, you must enable the feature on these entries:
    • /chatops run feature set --project=gitlab-org/gitlab <feature-flag-name> true
    • /chatops run feature set --project=gitlab-org/gitlab-foss <feature-flag-name> true
    • /chatops run feature set --project=gitlab-com/www-gitlab-com <feature-flag-name> true
  • If you're using group-actor, you must enable the feature on these entries:
    • /chatops run feature set --group=gitlab-org <feature-flag-name> true
    • /chatops run feature set --group=gitlab-com <feature-flag-name> true
  • If you're using user-actor, you must enable the feature on these entries:
    • /chatops run feature set --user=<your-username> <feature-flag-name> true
  • Verify that the feature works on the specific entries. Posting the QA result in this issue is preferable.
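The command templates in the rollout steps above all carry the same <feature-flag-name> placeholder. A minimal sketch of filling it in for this issue's flag, assuming nothing beyond plain string substitution: the helper expand_flag and the constants below are hypothetical names used for illustration, not part of chatops. Other placeholders, such as <your-username>, are deliberately left untouched.

```ruby
# Hypothetical helper for illustration only: substitutes the flag name from
# this issue into the checklist's chatops command templates.
FLAG_NAME = 'geo_job_artifact_replication'

# A few templates copied from the checklist above.
TEMPLATES = [
  '/chatops run feature set <feature-flag-name> true --staging',
  '/chatops run feature set --project=gitlab-org/gitlab <feature-flag-name> true',
  '/chatops run feature set --group=gitlab-org <feature-flag-name> true',
  '/chatops run feature set --user=<your-username> <feature-flag-name> true'
].freeze

# Replaces only the <feature-flag-name> placeholder; anything else stays as-is.
def expand_flag(template, flag_name)
  template.gsub('<feature-flag-name>', flag_name)
end

EXPANDED_COMMANDS = TEMPLATES.map { |template| expand_flag(template, FLAG_NAME) }

# Print one ready-to-paste command per line.
puts EXPANDED_COMMANDS
```

The printed lines can be pasted into Slack one at a time. Which actor-scoped commands apply depends on the actor the feature actually uses.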

Preparation before global rollout

  • Check if the feature flag change needs to be accompanied with a change management issue. Cross link the issue here if it does.
  • Ensure that you or a representative in development can be available for at least 2 hours after feature flag updates in production. If a different developer will be covering, or an exception is needed, please inform the oncall SRE by using the @sre-oncall Slack alias.
  • Ensure that documentation has been updated (More info).
  • Announce on the feature issue an estimated time this will be enabled on GitLab.com.
  • Notify #support_gitlab-com and your team channel (see the dev docs for guidance on when this is necessary).

Release Geo support of Cool Widgets

  • In the rollout issue you created when creating the feature flag, modify the Roll Out Steps:

    • Cross out any steps related to testing on production GitLab.com, because Geo is not running on production GitLab.com at the moment.
    • Add a step to Test replication and verification of Cool Widgets on a non-GDK-deployment. For example, using GitLab Environment Toolkit.
    • Add a step to Ping the Geo PM and EM to coordinate testing. For example, you might add steps to generate Cool Widgets, and then a Geo engineer may take it from there.
  • In ee/config/feature_flags/development/geo_cool_widget_replication.yml, set default_enabled: true

  • In ee/app/replicators/geo/cool_widget_replicator.rb, delete the self.replication_enabled_by_default? method:

    module Geo
      class CoolWidgetReplicator < Gitlab::Geo::Replicator
        ...
        # REMOVE THIS LINE IF IT IS NO LONGER NEEDED
        extend ::Gitlab::Utils::Override
    
        ...
        # REMOVE THIS METHOD
        def self.replication_enabled_by_default?
          false
        end
        # REMOVE THIS METHOD
    
        ...
      end
    end
  • In ee/app/graphql/types/geo/geo_node_type.rb, remove the feature_flag option for the released type:

    field :cool_widget_registries, ::Types::Geo::CoolWidgetRegistryType.connection_type,
          null: true,
          resolver: ::Resolvers::Geo::CoolWidgetRegistriesResolver,
          description: 'Find Cool Widget registries on this Geo node',
          feature_flag: :geo_cool_widget_replication # REMOVE THIS LINE
  • Run bundle exec rake gitlab:graphql:compile_docs after the step above to regenerate the GraphQL docs.

  • Add a row for Cool Widgets to the Data types table in Geo data types support

  • Add a row for Cool Widgets to the Limitations on replication/verification table in Geo data types support. If the row already exists, update it to show that Replication and Verification are released in the current version.

  • Remove overridden registry_consistency_worker_enabled?

  • Check if we are affected by #334550 (comment 887807869)
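For the replicator step in the checklist above, deleting the self.replication_enabled_by_default? override means the class falls back to whatever its parent defines. A minimal sketch of that mechanism with stand-in classes: BaseReplicator, UnreleasedWidgetReplicator and ReleasedWidgetReplicator are hypothetical names, and the parent returning true is an assumption about Gitlab::Geo::Replicator, not something stated in this issue.

```ruby
# Stand-in for Gitlab::Geo::Replicator.
# Assumption: the real base class enables replication by default.
class BaseReplicator
  def self.replication_enabled_by_default?
    true
  end
end

# Before release: the override keeps replication off unless the flag is set.
class UnreleasedWidgetReplicator < BaseReplicator
  def self.replication_enabled_by_default?
    false
  end
end

# After release: the override has been deleted, so the inherited default applies.
class ReleasedWidgetReplicator < BaseReplicator
end

puts UnreleasedWidgetReplicator.replication_enabled_by_default?
puts ReleasedWidgetReplicator.replication_enabled_by_default?
```

If the override was the only method marked with override in the class, the extend ::Gitlab::Utils::Override line is no longer needed either, which is what the first comment in the snippet above refers to.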

Release the feature

After the feature has been deemed stable, the clean up should be done as soon as possible to permanently enable the feature and reduce complexity in the codebase.

You can either create a follow-up issue for Feature Flag Cleanup or use the checklist below in this same issue.

  • Clean up the feature flag from all environments by running these chatops commands in the #production channel:
    • /chatops run feature delete <feature-flag-name> --dev
    • /chatops run feature delete <feature-flag-name> --staging
    • /chatops run feature delete <feature-flag-name>
  • Close this rollout issue.

Rollback Steps

  • This feature can be disabled by running the following Chatops command:
    • /chatops run feature set <feature-flag-name> false
Edited by Valery Sizov