Skip to content

Create related package events during cleanup policy execution

David Fernandez requested to merge 276479-add-expiration-policies-events into master

🍍 Context

In an upcoming gradual architectural migration that we will do on the Container Registry, we will need a way to quickly identify the container repositories that have been recently completely cleaned up by cleanup policies. This is tracked by #276479 (closed).

On the other hand, recording the activity of a cleanup policy execution (start, end and stop), could be useful for:

  • debugging
  • logging/monitoring
  • unlock the possibility of having a cleanup policy activity tracker on the UI

Right now, the only tracing feature we have for cleanup policies are:

  • The expiration_policy_started_at attribute on the container repository.
  • The expiration_policy_cleanup_status attribute on the container repository.

That's nice but it's only related to the last execution and doesn't even record when was the last time that a cleanup has been successfully executed.

🗺 Workers / Services schema

Here is an overview (simplified) of the current rails worker / backend components.

graph TD
parent([ContainerExpirationPolicyWorker]) --feature flag off--> ceps(ContainerExpirationPolicyService)
parent --feature flag on--> cep_ccrw([ContainerExpirationPolicies::CleanupContainerRepositoryWorker])
cep_ccrw --> cep_cleanup_service(ContainerExpirationPolicies::CleanupService)
cep_cleanup_service --> cts
ceps --> ccrw([CleanupContainerRepositoryWorker])
api[[API::ProjectContainerRepositories<br />API endpoint]] --> ccrw
ccrw --> cts(Projects::ContainerRepository::CleanupTagsService)
cts --> dts(Projects::ContainerRepository::DeleteTagsService)
dts --> crc(ContainerRegistry::Client)

Here is the general flow regarding the tracking aspects:

  • The worker ContainerExpirationPolicyWorker handles the expiration_policy_started_at updates.
  • The service ContainerExpirationPolicies::CleanupService handles the expiration_policy_started_at and expiration_policy_cleanup_status updates.
  • Both workers handles the expiration_policy_cleanup_status updates.

As you can see, the updates are all over the place.

Also note that the tracking attributes are only updated when the cleanup is executed within the cleanup policy context. In particular, when the API uses the CleanupContainerRepositoryWorker, we should not see any updates in the container repository attributes.

🤔 What does this MR do?

  • Update the Packages::Event table with a foreign key belongs_to to ContainerRepository.
  • Centralize all the tracking aspects (including updates on expiration_policy_started_at and expiration_policy_started_at) of a cleanup policy execution in a single service: ContainerExpirationPolicies::TrackingService. This service is in charge of:
    • Updating the tracking attributes (expiration_policy_started_at and expiration_policy_started_at) of the container repository given the action passed as a parameter.
    • Record a new Packages::Event with the proper parameters.
    • Record a new ::Gitlab::Tracking event with the proper parameters.
  • Cleanup code in charge of updating tracking attributes of a container repository.
    • Two workers were previously handling this.
    • Remove the ContainerExpirationPolicies::CleanupService. With the removal of the tracking aspect, this service became an empty "shell" that would just call an other service(Projects::ContainerRepository::CleanupTagsService). We don't really need this kind of indirection that doesn't provide any value.

📈 Packages::Event growth

Here are some Kibana dashboards to have a sense of how many events we're going to insert in Packages::Event:

  • start events: dashboard. Peak at ~9K per day
  • end and stop events: dashboard. Peak at ~9K per day

Screenshots (strongly suggested)

n / a

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team

💽 Database

Migration up

== 20201202154739 AddContainerRepositoryIdToPackagesEvents: migrating =========
-- add_column(:packages_events, :container_repository_id, :bigint, {:null=>true})
   -> 0.0011s
== 20201202154739 AddContainerRepositoryIdToPackagesEvents: migrated (0.0011s) 

== 20201202162154 AddForeignKeyContainerRepositoryIdToPackagesEvents: migrating 
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:packages_events, :container_repository_id, {:name=>"index_packages_events_on_container_repository_id", :algorithm=>:concurrently})
   -> 0.0016s
-- execute("SET statement_timeout TO 0")
   -> 0.0002s
-- add_index(:packages_events, :container_repository_id, {:name=>"index_packages_events_on_container_repository_id", :algorithm=>:concurrently})
   -> 0.0063s
-- execute("RESET ALL")
   -> 0.0002s
-- transaction_open?()
   -> 0.0000s
-- foreign_keys(:packages_events)
   -> 0.0025s
-- execute("ALTER TABLE packages_events\nADD CONSTRAINT fk_fb29ba361a\nFOREIGN KEY (container_repository_id)\nREFERENCES container_repositories (id)\nON DELETE CASCADE\nNOT VALID;\n")
   -> 0.0009s
-- execute("ALTER TABLE packages_events VALIDATE CONSTRAINT fk_fb29ba361a;")
   -> 0.0012s
== 20201202162154 AddForeignKeyContainerRepositoryIdToPackagesEvents: migrated (0.0175s) 

Migration down

== 20201202162154 AddForeignKeyContainerRepositoryIdToPackagesEvents: reverting 
-- foreign_keys(:packages_events)
   -> 0.0033s
-- remove_foreign_key(:packages_events, {:column=>:container_repository_id})
   -> 0.0026s
-- transaction_open?()
   -> 0.0000s
-- indexes(:packages_events)
   -> 0.0021s
-- execute("SET statement_timeout TO 0")
   -> 0.0001s
-- remove_index(:packages_events, {:algorithm=>:concurrently, :name=>"index_packages_events_on_container_repository_id"})
   -> 0.0021s
-- execute("RESET ALL")
   -> 0.0002s
== 20201202162154 AddForeignKeyContainerRepositoryIdToPackagesEvents: reverted (0.0141s) 

== 20201202154739 AddContainerRepositoryIdToPackagesEvents: reverting =========
-- remove_column(:packages_events, :container_repository_id, :bigint, {:null=>true})
   -> 0.0009s
== 20201202154739 AddContainerRepositoryIdToPackagesEvents: reverted (0.0051s) 
Edited by David Fernandez

Merge request reports