[Feature flag] Enable `deprecate_vulnerability_occurrence_pipelines`
Summary
This issue tracks the production rollout of the deprecation of the vulnerability_occurrence_pipelines table, which is currently behind the deprecate_vulnerability_occurrence_pipelines feature flag.
Owners
- Most appropriate Slack channel to reach out to: #g_govern_threat_insights_eng
- Best individual to reach out to: @wandering_person
Expectations
What are we expecting to happen?
- We expect the read traffic on the vulnerability_occurrence_pipelines table to reduce to near zero.
  - This is because this flag toggles the application code to make use of code paths that do not rely on this table
  - This is to simulate dropping the table, which is the ultimate goal
  - It is possible there will still be some traffic, as we are still writing to the table in case we need to roll back
- We expect to see no increase in Sentry alarms related to this flag
  - On a previous iteration of this flag, toggling it on resulted in several SBOM-related queries timing out
  - We have updated the FF: true versions of those queries to avoid timeouts on this go-around
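The expectation above (reads move off the table while writes continue so that rollback stays cheap) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not GitLab's actual implementation: the `Store` class, its fields, and the `enabled_flags` set are hypothetical stand-ins, and the "new" data source is a placeholder for whatever the flag-on code paths really query.

```python
from dataclasses import dataclass, field

FLAG = "deprecate_vulnerability_occurrence_pipelines"


@dataclass
class Store:
    """Toy in-memory stand-in for the two read paths (hypothetical)."""

    # Models the vulnerability_occurrence_pipelines table.
    legacy_rows: list = field(default_factory=list)
    # Models whatever data the flag-on code paths read instead (assumption).
    new_rows: list = field(default_factory=list)
    # Counts reads that still hit the legacy table.
    legacy_reads: int = 0

    def record(self, finding_id: int, pipeline_id: int) -> None:
        # Writes go to both places regardless of the flag,
        # so turning the flag back off finds up-to-date legacy data.
        self.legacy_rows.append((finding_id, pipeline_id))
        self.new_rows.append((finding_id, pipeline_id))

    def pipelines_for(self, finding_id: int, enabled_flags: set) -> list:
        if FLAG in enabled_flags:
            rows = self.new_rows  # flag on: the legacy table is not read
        else:
            self.legacy_reads += 1  # flag off: existing code path
            rows = self.legacy_rows
        return [p for f, p in rows if f == finding_id]
```

With the flag in `enabled_flags`, `legacy_reads` stays flat while `record` keeps populating both lists, which mirrors "reads near zero, some write traffic remains".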
What can go wrong and how would we detect it?
This table is mostly used during SBOM ingestion. As mentioned above, on a previous iteration of this feature flag we saw query timeouts in these sbom services. We detected those through Sentry errors.
The data impact should be low: once the flag is toggled back off, re-run pipelines and new pipeline runs go back to the existing code paths.
That said, any errors related to vulnerability management can damage confidence and trust in our security products. So the impact to customer trust can be quite large.
The places we will be watching are:
- Sentry Query
- Database statistics for vulnerability_occurrence_pipelines
  - https://dashboards.gitlab.net/goto/kKVr_xeIg?orgId=1
  - grafana json to rebuild dashboard if needed: grafanadash.json
- SBOM and Vulnerability Ingestion jobs
- General Error budgeting dashboards
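For the database statistics item, one way to reason about "read traffic near zero" is to compare two snapshots of the table's cumulative counters. The sketch below assumes the counters come from PostgreSQL's `pg_stat_user_tables` view (columns `seq_scan`, `idx_scan`, `n_tup_ins`, `n_tup_upd`, `n_tup_del`); how the linked dashboard actually sources its data is not stated here, and the function names and threshold are hypothetical.

```python
from typing import Dict

# Cumulative counters from pg_stat_user_tables (assumed data source).
READ_COUNTERS = ("seq_scan", "idx_scan")
WRITE_COUNTERS = ("n_tup_ins", "n_tup_upd", "n_tup_del")


def traffic_delta(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return how many scans and row writes happened between two snapshots."""
    reads = sum(after[c] - before[c] for c in READ_COUNTERS)
    writes = sum(after[c] - before[c] for c in WRITE_COUNTERS)
    return {"reads": reads, "writes": writes}


def reads_near_zero(before: Dict[str, int], after: Dict[str, int],
                    threshold: int = 0) -> bool:
    """True when the number of new scans is at or below the threshold."""
    return traffic_delta(before, after)["reads"] <= threshold
```

Writes are reported separately because they are expected to continue while the flag is on; only the read delta should flatten.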
Rollout Steps
Note: Please make sure to run the chatops commands in the Slack channel that gets impacted by the command.
Rollout on non-production environments
- Verify the MR with the feature flag is merged to master and has been deployed to non-production environments with /chatops run auto_deploy status 8a3069d9f6c348f7688869c2ffdb4edac2e4cbe4
- Enable the feature globally on non-production environments with /chatops run feature set deprecate_vulnerability_occurrence_pipelines true --dev --pre --staging --staging-ref
- Verify that the feature works as expected.
  - update: #450802 (comment 2079264016)
- If the feature flag causes end-to-end tests to fail, disable the feature flag on staging to avoid blocking deployments.
  - See the #qa-staging Slack channel and look for the following messages:
    - test kicked off: Feature flag deprecate_vulnerability_occurrence_pipelines has been set to true on **gstg**
    - test result: This pipeline was triggered due to toggling of deprecate_vulnerability_occurrence_pipelines feature flag
For assistance with end-to-end test failures, please reach out via the #test-platform Slack channel. Note that end-to-end test failures on staging-ref don't block deployments.
Specific rollout on production
For visibility, all /chatops commands that target production should be executed in the #production Slack channel
and cross-posted (with the command results) to the responsible team's Slack channel.
- Ensure that the feature MRs have been deployed to both production and canary with /chatops run auto_deploy status 8a3069d9f6c348f7688869c2ffdb4edac2e4cbe4
- Depending on the type of actor you are using, pick one of these options:
  - For project-actor: /chatops run feature set --project=gitlab-org/gitlab,gitlab-org/gitlab-foss,gitlab-com/www-gitlab-com deprecate_vulnerability_occurrence_pipelines true
- Verify that the feature works for the specific actors.
Preparation before global rollout
- Set a milestone to this rollout issue to signal for enabling and removing the feature flag when it is stable.
- Check if the feature flag change needs to be accompanied with a change management issue. Cross link the issue here if it does.
- Ensure that you or a representative in development can be available for at least 2 hours after feature flag updates in production. If a different developer will be covering, or an exception is needed, please inform the oncall SRE by using the @sre-oncall Slack alias.
- Leave a comment on the epic announcing estimated time when this feature flag will be enabled on GitLab.com.
- Notify the #support_gitlab-com Slack channel and your team channel (more guidance when this is necessary in the dev docs).

Slack message for posting:
Hello, team :waves:

I will be enabling the `deprecate_vulnerability_occurrence_pipelines` feature flag ([issue](https://gitlab.com/gitlab-org/gitlab/-/issues/450802)) in production for the following `project` actors:

* `gitlab-org/gitlab`
* `gitlab-org/gitlab-foss`
* `gitlab-com/www-gitlab-com`

This FF switches all code paths that previously relied on the `vulnerability_occurrence_pipelines` table for their database queries to alternative code paths that accomplish the queries without using that table. This is in preparation to drop the `vulnerability_occurrence_pipelines` table.

*Things to look out for*

This can potentially impact [vulnerability and dependency management features](https://docs.gitlab.com/ee/user/application_security/secure_your_application.html), especially SBOM related features, such as the [dependency list](https://docs.gitlab.com/ee/user/application_security/dependency_list/)
Global rollout on production
For visibility, all /chatops commands that target production should be executed in the #production Slack channel
and cross-posted (with the command results) to the responsible team's Slack channel (#g_govern_threat_insights_eng).
- Incrementally roll out the feature on production environment.
  - Between every step wait for at least 15 minutes and monitor the appropriate graphs on https://dashboards.gitlab.net.
  - Perform actor-based rollout: /chatops run feature set deprecate_vulnerability_occurrence_pipelines 10 --actors
  - Perform actor-based rollout: /chatops run feature set deprecate_vulnerability_occurrence_pipelines 25 --actors
  - Perform actor-based rollout: /chatops run feature set deprecate_vulnerability_occurrence_pipelines 50 --actors
  - Perform actor-based rollout: /chatops run feature set deprecate_vulnerability_occurrence_pipelines 75 --actors
  - Perform actor-based rollout: /chatops run feature set deprecate_vulnerability_occurrence_pipelines 100 --actors
- Enable the feature globally on production environment: /chatops run feature set deprecate_vulnerability_occurrence_pipelines true
- Observe appropriate graphs on https://dashboards.gitlab.net and verify that services are not affected.
- Leave a comment on the epic announcing that the feature has been globally enabled.
- Wait for at least one day for the verification term.
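The incremental rollout above is a fixed sequence of commands with a minimum wait between steps, so it can be written down as data. The following is a minimal sketch that only builds the command strings listed in this issue and a lower bound on elapsed time; it does not run anything in Slack. The helper names are hypothetical, and it assumes the final global enable also counts as a step that needs the 15-minute wait before it.

```python
FLAG = "deprecate_vulnerability_occurrence_pipelines"
PERCENTAGES = (10, 25, 50, 75, 100)  # actor percentages from the checklist
WAIT_MINUTES = 15                    # minimum wait between steps


def rollout_commands(flag: str = FLAG, percentages=PERCENTAGES) -> list:
    """Build the chatops commands in the order the checklist lists them."""
    steps = [f"/chatops run feature set {flag} {p} --actors" for p in percentages]
    # Final step: enable globally (assumed to follow the 100% actor step).
    steps.append(f"/chatops run feature set {flag} true")
    return steps


def minimum_wait_minutes(steps: list, wait: int = WAIT_MINUTES) -> int:
    """Lower bound on elapsed time: one wait between consecutive steps."""
    return wait * max(len(steps) - 1, 0)


if __name__ == "__main__":
    commands = rollout_commands()
    for number, command in enumerate(commands, start=1):
        print(f"{number}. {command}")
    print(f"Minimum waiting time: {minimum_wait_minutes(commands)} minutes")
```

Under that assumption the six steps imply at least 75 minutes of waiting, which is worth weighing against the 2-hour availability window mentioned in the preparation checklist.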
Release the feature
After the feature has been deemed stable, the clean up should be done as soon as possible to permanently enable the feature and reduce complexity in the codebase.
You can either create a follow-up issue for Feature Flag Cleanup or use the checklist below in this same issue.
- Create a merge request to remove the deprecate_vulnerability_occurrence_pipelines feature flag. Ask for review/approval/merge as usual. The MR should include the following changes:
  - Remove all references to the feature flag from the codebase.
  - Remove the YAML definitions for the feature from the repository.
  - Create a changelog entry.
- Clean up the feature flag from all environments by running this chatops command in the #production channel: /chatops run feature delete deprecate_vulnerability_occurrence_pipelines --dev --pre --staging --staging-ref --production
- Close this rollout issue.
Rollback Steps
- This feature can be disabled on production by running the following Chatops command:
  /chatops run feature set deprecate_vulnerability_occurrence_pipelines false
- Disable the feature flag on non-production environments:
  /chatops run feature set deprecate_vulnerability_occurrence_pipelines false --dev --pre --staging --staging-ref
- Delete feature flag from all environments:
  /chatops run feature delete deprecate_vulnerability_occurrence_pipelines --dev --pre --staging --staging-ref --production