[Feature flag] Rollout of `malicious_packages_dependency_list_filtering`
Summary
This issue tracks the rollout of the malware filter argument on the Dependency GraphQL APIs on production, which is currently behind the malicious_packages_dependency_list_filtering feature flag.
- Feature flag name: `malicious_packages_dependency_list_filtering`
- Feature flag type: `wip`
- Default enabled: `false`
- Introduced in: !235800 (merged)
- Feature issue: #587758
- Parent epic: &20573
Owners
- Team: group::security infrastructure
- Most appropriate Slack channel to reach out to: `#g_security-infrastructure`
- Best individual to reach out to: @bala.kumar
Expectations
What are we expecting to happen?
- When the flag is disabled (default): the `malware` argument on `Query.project.dependencies` and `Query.group.dependencies` raises `Gitlab::Graphql::Errors::ArgumentError` with the message `"The malware filter is not available."`.
- When the flag is enabled: the argument is accepted by the resolver. Until the finder integration follow-up lands, the argument is a no-op and dependencies are returned unfiltered.
- Once the finder integration is implemented in a follow-up MR, enabling the flag will activate actual filtering of dependencies by malware status (CWE-506 / `GLAM-*` identifier prefix), gated by the SSCS add-on license check.
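The first two expectations can be sketched in plain Ruby. This is a minimal, self-contained illustration and not the resolver code from the MR: `FakeFeature`, `MalwareFilterArgumentError`, and `resolve_dependencies` are hypothetical stand-ins for GitLab's `Feature` API, `Gitlab::Graphql::Errors::ArgumentError`, and the resolver, and the body of `validate_malware_filter!` is an assumption based only on the behaviour described above.

```ruby
# Hypothetical stand-in for GitLab's Feature API so the sketch runs on its own.
module FakeFeature
  @enabled = {}

  def self.enable(name)
    @enabled[name] = true
  end

  def self.disable(name)
    @enabled.delete(name)
  end

  def self.enabled?(name)
    @enabled.fetch(name, false)
  end
end

# Hypothetical stand-in for Gitlab::Graphql::Errors::ArgumentError.
class MalwareFilterArgumentError < StandardError; end

FLAG = :malicious_packages_dependency_list_filtering

# Assumed shape of the guard: reject the argument while the flag is off.
def validate_malware_filter!(args)
  return unless args.key?(:malware)
  return if FakeFeature.enabled?(FLAG)

  raise MalwareFilterArgumentError, 'The malware filter is not available.'
end

# Until the finder integration lands, the argument is accepted but ignored,
# so the dependencies come back unfiltered.
def resolve_dependencies(all_dependencies, args)
  validate_malware_filter!(args)
  all_dependencies
end
```

With the flag off, `resolve_dependencies(deps, { malware: true })` raises; after `FakeFeature.enable(FLAG)` the same call returns `deps` unchanged, which is the no-op behaviour expected before filtering is wired in.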
What can go wrong and how would we detect it?
- The GraphQL schema introspection picks up the new experimental argument: detectable via existing GraphQL schema snapshot specs.
- Once filtering is wired in, the PG join on identifiers may hit performance limits (&17619). Monitor:
  - SQL slow log entries on `sbom_occurrences` joins.
  - Apdex / latency on the dependency list endpoints in Grafana.
  - Sidekiq / web error rates on `Resolvers::Sbom::DependenciesResolver` and `Resolvers::Sbom::DependencyAggregationResolver`.
- Rollback by disabling the flag immediately restores prior behaviour (no schema breakage since the argument is `experiment`).
Rollout Steps
Note: Please make sure to run the chatops commands in the Slack channel that gets impacted by the command.
Rollout on non-production environments
- Verify the MR with the feature flag is merged to `master` and has been deployed to non-production environments with `/chatops gitlab run auto_deploy status <merge-commit-of-your-feature>`
- Deploy the feature flag at a percentage (recommended percentage: 50%) with `/chatops gitlab run feature set malicious_packages_dependency_list_filtering 50 --actors --dev --pre --staging --staging-ref`
- Monitor that the error rates did not increase (repeat with a different percentage as necessary).
- Enable the feature globally on non-production environments with `/chatops gitlab run feature set malicious_packages_dependency_list_filtering true --dev --pre --staging --staging-ref`
- Verify that the feature works as expected on `staging-canary`.
Before production rollout
- Ensure the finder integration follow-up has been merged and deployed before enabling on production groups.
- Coordinate with @dpisek and the frontend (#587762 (closed)) team before flipping on customer-facing groups.
Specific rollout on production
For visibility, all /chatops commands that target production must be executed in the #production Slack channel and cross-posted (with the command results) to the responsible team's Slack channel.
- Enable for `gitlab-org` and `gitlab-com` first: `/chatops gitlab run feature set --group=gitlab-org,gitlab-com malicious_packages_dependency_list_filtering true`
- Verify that the feature works for the specific actors via GraphiQL on `Query.project.dependencies(malware: true)` and `Query.group.dependencies(malware: true)`.
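For verification outside GraphiQL, the request body can be built programmatically. The sketch below only constructs the JSON payload and sends nothing. The helper name `malware_dependencies_payload`, the example path `gitlab-org/gitlab`, and the selected `name` field are assumptions for illustration; only the `dependencies(malware: true)` argument comes from this issue.

```ruby
require 'json'

# Builds the JSON body for a GraphQL POST; it does not send the request.
# `scope` is :project or :group; `full_path` is the namespace path to query.
# The selected `name` field is an assumption; select whichever fields you need.
def malware_dependencies_payload(scope, full_path)
  raise ArgumentError, "unsupported scope: #{scope}" unless %i[project group].include?(scope)

  query = <<~GRAPHQL
    query($fullPath: ID!) {
      #{scope}(fullPath: $fullPath) {
        dependencies(malware: true) {
          nodes { name }
        }
      }
    }
  GRAPHQL

  JSON.generate({ query: query, variables: { fullPath: full_path } })
end

# Example: print the body for a project-level check.
puts malware_dependencies_payload(:project, 'gitlab-org/gitlab')
```

While the flag is disabled for the actor, the response to such a query should carry the "The malware filter is not available." error; once enabled, it should return dependencies without an error.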
Preparation before global rollout
- Set a milestone to this rollout issue to signal for enabling and removing the feature flag when it is stable.
- Ensure documentation is in place for the `malware` argument once filtering is wired up.
- Notify the `#support_gitlab-com` Slack channel and `#g_security-infrastructure`.
Global rollout on production
- Incrementally roll out the feature on production:
  - `/chatops gitlab run feature set malicious_packages_dependency_list_filtering 25 --actors`
  - `/chatops gitlab run feature set malicious_packages_dependency_list_filtering 50 --actors`
  - `/chatops gitlab run feature set malicious_packages_dependency_list_filtering 100 --actors`
  - Between every step wait for at least 15 minutes and monitor the appropriate graphs on https://dashboards.gitlab.net.
- After the feature has been 100% enabled, wait for at least one day before releasing the feature.
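The `--actors` steps rely on percentage-of-actors gating: each actor is hashed into a stable bucket, so an actor enabled at 25% stays enabled at 50% and 100%. The sketch below illustrates that idea with a CRC32 hash, in the style of the Flipper library. It is a simplified assumption, not GitLab's exact implementation, and the `Project:<id>` actor IDs are hypothetical.

```ruby
require 'zlib'

# Finer-grained buckets than whole percents (assumed value for illustration).
SCALING_FACTOR = 1_000

# Returns true when the actor falls inside the enabled percentage.
# Hashing the flag name together with the actor ID gives each actor a
# stable bucket that does not change between rollout steps.
def actor_enabled?(flag_name, actor_id, percentage)
  bucket = Zlib.crc32("#{flag_name}#{actor_id}") % (100 * SCALING_FACTOR)
  bucket < percentage * SCALING_FACTOR
end

flag = 'malicious_packages_dependency_list_filtering'
actors = (1..1_000).map { |id| "Project:#{id}" } # hypothetical actor IDs

# Count how many sample actors each rollout step would enable.
[25, 50, 100].each do |pct|
  enabled = actors.count { |actor| actor_enabled?(flag, actor, pct) }
  puts "#{pct}% step: #{enabled} of #{actors.size} sample actors enabled"
end
```

Because the bucket depends only on the flag name and the actor, raising the percentage only ever adds actors. The same property means a rollback to a lower percentage removes the most recently added actors first.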
Release the feature
- Create a merge request to remove the `malicious_packages_dependency_list_filtering` feature flag. The MR should:
  - Remove all references to the feature flag from the codebase (including `validate_malware_filter!` in `Resolvers::Sbom::DependencyInterfaceResolver`).
  - Remove the YAML definition `config/feature_flags/wip/malicious_packages_dependency_list_filtering.yml`.
  - Remove the `experiment: { milestone: '19.0' }` marker from the `malware` argument once stable.
- Close the feature issue.
- Once the cleanup MR has been deployed to production, clean up the feature flag from all environments: `/chatops gitlab run feature delete malicious_packages_dependency_list_filtering --dev --pre --staging --staging-ref --production`
- Close this rollout issue.
Rollback Steps
- This feature can be disabled on production by running: `/chatops gitlab run feature set malicious_packages_dependency_list_filtering false`
- Disable the feature flag on non-production environments: `/chatops gitlab run feature set malicious_packages_dependency_list_filtering false --dev --pre --staging --staging-ref`
- Delete the feature flag from all environments: `/chatops gitlab run feature delete malicious_packages_dependency_list_filtering --dev --pre --staging --staging-ref --production`