You need to sign in or sign up before continuing.
Add timeout support in Projects::ContainerRepository::DeleteTagsService#execute
-
Review changes -
-
Download -
Patches
-
Plain diff

What does this MR do?

MR 1 of #208193 (comment 362910703) MR plan:
Put a timeout around DeleteTagsService#execute but only under those conditions:
- The GitLab Registry is in use. (This is what gitlab.com uses)
- The feature flag
container_registry_expiration_policies_throttling
is enabled - Application Setting
container_registry_delete_tags_service_timeout
set to anything else thannil
or0

Design choices

- For a better understanding how this service is used, see the schema in #208193 (comment 362910703).
- This service function is really simple: it receives a list of tags, it will loop on them and sending the appropriate request to the Container Registry to delete them.
- The main issue is that these requests to the Container Registry can be really slow.
- We can easily imagine what happens when the service receives let's say 100K tags to delete. Well, let's summarize it with
- One upcoming feature is having Container Registry Cleanup Policies open to all projects. The policies are executed (with a background worker) under a given schedule and each execution will use this service to delete a given list of tags.
- To avoid resources starvation, this MR puts an execution time limit to this service.
- When the limit is hit, the loop is stopped.
- As such, we can have a partial execution which is an accepted outcome.
- The default value of 250 seconds has been selected so that we are way below from the 300 seconds threshold that is applied for background workers (see gitlab-com/gl-infra/scalability#461 (comment 370427582))
- To have a better control of this execution time limit value, it is implemented as an application setting (
container_registry_delete_tags_service_timeout
)- The corresponding UI has been added to the container settings in the admin UI (located at
/admin/application_settings/ci_cd
) - This way, we can update this value directly from the UI
- The corresponding UI has been added to the container settings in the admin UI (located at
- As stated above, this MR is part of a larger effort to limit resources usage during Container Registry Cleanup Policies execution.
- This being a deep change from the current behavior, all the limits we are implementing in the larger effort are gated behind a feature flag:
container_registry_expiration_policies_throttling
. This provides us a safety switch between non throttled executions (current implementation) <-> throttled executions. This switch can be used in the case of a production issue. - This MR being the first of the larger effort, it is the one introducing the feature flag.
- This being a deep change from the current behavior, all the limits we are implementing in the larger effort are gated behind a feature flag:
- To be complete in this MR, the corresponding documentation has been updated with a
Caution
block to indicate that Cleanup Policies can be partially executed.

Database Migrations

Up
$ rails db:migrate:up VERSION=20200710113437
== 20200710113437 AddContainerRegistryDeleteTagsServiceTimeoutToApplicationSettings: migrating
-- add_column(:application_settings, :container_registry_delete_tags_service_timeout, :integer, {:default=>250, :null=>false})
-> 0.0054s
== 20200710113437 AddContainerRegistryDeleteTagsServiceTimeoutToApplicationSettings: migrated (0.0054s)
Down
$ rails db:migrate:down VERSION=20200710113437
== 20200710113437 AddContainerRegistryDeleteTagsServiceTimeoutToApplicationSettings: reverting
-- remove_column(:application_settings, :container_registry_delete_tags_service_timeout)
-> 0.0165s
== 20200710113437 AddContainerRegistryDeleteTagsServiceTimeoutToApplicationSettings: reverted (0.0166s)

Screenshots

The admin UI:

Does this MR meet the acceptance criteria?

Conformity
-
Changelog entry -
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides - [-] Separation of EE specific content
Availability and Testing
- [-] Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- [-] Tested in all supported browsers
- [-] Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- [-] Label as security and @ mention
@gitlab-com/gl-security/appsec
- [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- [-] Security reports checked/validated by a reviewer from the AppSec team
Edited by David Fernandez
Merge request reports
Compare and
- version 18342b00b6
- version 174595c293
- version 167747cbb3
- version 15f53c3d3f
- version 14e980afaa
- version 136ec1026d
- version 126b6e9f0b
- version 111ca0086d
- version 102ac8ef57
- version 93eb048f6
- version 81e7a519c
- version 720a83ed5
- version 699b7eedd
- version 5d86118c0
- version 4e37b5f24
- version 3dd1049c5
- version 2ca2deb94
- version 1e098981c
- master (base)
- latest versionf3ccd73c1 commit,
- version 18342b00b62 commits,
- version 174595c2931 commit,
- version 167747cbb31 commit,
- version 15f53c3d3f3 commits,
- version 14e980afaa2 commits,
- version 136ec1026d1 commit,
- version 126b6e9f0b5 commits,
- version 111ca0086d4 commits,
- version 102ac8ef573 commits,
- version 93eb048f62 commits,
- version 81e7a519c1 commit,
- version 720a83ed51 commit,
- version 699b7eedd3 commits,
- version 5d86118c02 commits,
- version 4e37b5f243 commits,
- version 3dd1049c52 commits,
- version 2ca2deb942 commits,
- version 1e098981c1 commit,
Compare changes
- Side-by-side
- Inline
Files
20Loading