Create partitioned CI component usage table
2 unresolved threads
2 unresolved threads
Compare changes
There are no changes yet
No changes between version 2 and version 2
This MR adds a migration to create p_catalog_resource_component_usages
. This table is used to record unique projects that include a component in their pipeline(s) for each day.
This data will later enable users to sort components (or component projects) by popularity. We require a daily metric because we will need to query the number of unique projects for the last 30-days (rolling window).
Note: Our existing analytics instrumentation services don't readily support our requirements for both GitLab.com
and self-managed instances. Thus we determined that a custom data table would be the simplest to implement. (Ref: !144932 (comment 1783721915))
Table/model notes:
component_id
-used_by_project_id
-used_date
.
used_date
is the format YYYY-mm-dd
. We do not need the time parameters as we're only interested in the number of unique projects per day.monthly
would be the most appropriate partitioning strategy.project_id
instead of used_by_project_id
because the "owner" of the data should be the project that owns the component rather than the one using it. (See !145881 (comment 1798736129).)used_by_project_id
does not have an FK constraint because we'll want to keep historical usage data on a component even if the associated used_by_project is deleted.Resolves Part 1 of 2 of Task Create component usage data table and tracking ... (#443380 - closed). Parent issue: #440382 (closed).
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Related to #440382 (closed)
No changes between version 2 and version 2