Change catalog resource last_30_day_usage_count_updated_at default
What does this MR do and why?
Background context
In !155001 (merged), we introduced a new aggregator worker that collects the last 30-day usage count for each row in catalog_resources daily. It works as expected; however, we observed that it reprocesses the data more often than necessary because new catalog resources are added throughout the day (ref: #452545 (comment 1951203360)).
Why it happens
The aggregator is executed on each job run unless it meets the stop condition, done_processing?, which checks if all catalog resources have been "processed" for today:
min_updated_at = TARGET_MODEL.minimum(:last_30_day_usage_count_updated_at)
return true unless min_updated_at
min_updated_at >= today.to_time
When the stop condition is satisfied for today, we want to skip executing the aggregator until tomorrow. However, when a new catalog resource is added, its default last_30_day_usage_count_updated_at value is currently set to 1970-01-01, which causes done_processing? to return false, so the aggregator is executed again. This rerun is redundant because we only aggregate usage data from yesterday or older, so the usage_count of a new catalog resource is always 0; i.e. there's no need to reprocess the data.
This MR
In this MR, we change the default value of last_30_day_usage_count_updated_at from 1970-01-01 to NOW(). With this, done_processing? will be unaffected by newly added catalog resources, which allows us to avoid the redundant processing described above.
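To illustrate the effect of the two defaults on the stop condition, here is a minimal plain-Ruby sketch. It is not the worker's actual code: the method takes an array of timestamps as a hypothetical stand-in for the last_30_day_usage_count_updated_at column, and the dates are made up for the example.

```ruby
# Hypothetical stand-in for the stop condition. `timestamps` plays the role of
# the last_30_day_usage_count_updated_at values of all catalog resources, and
# `today` is the start of the current day.
def done_processing?(timestamps, today)
  min_updated_at = timestamps.min
  return true unless min_updated_at # no rows at all: nothing to process

  min_updated_at >= today
end

today = Time.utc(2024, 6, 18) # illustrative date

# Every existing row has already been processed shortly after midnight.
processed = [Time.utc(2024, 6, 18, 0, 5), Time.utc(2024, 6, 18, 0, 6)]

# A resource created this afternoon, under each default.
old_default = Time.utc(1970, 1, 1)          # previous column default
new_default = Time.utc(2024, 6, 18, 14, 30) # what NOW() would yield at insert time

puts done_processing?(processed, today)                 # all rows processed today
puts done_processing?(processed + [old_default], today) # new row with the old default
puts done_processing?(processed + [new_default], today) # new row with the NOW() default
```

With the old default, the single new row drags the minimum back to 1970 and the condition fails for the rest of the day; with NOW(), the new row's timestamp already falls within today, so the minimum is unchanged relative to the stop condition.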
Database Notes
- When the column (last_30_day_usage_count_updated_at) was first introduced, its default was set to 1970-01-01 as a way to identify rows that weren't yet updated for the first time. We no longer require this, so it's okay to change.
- A multi-release approach is not necessary since we don't explicitly write the old default value to last_30_day_usage_count_updated_at anywhere.
- Whether the old or new default value is set, it has no negative impact; this change is considered safe.
Resolves #467555 (closed)
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Migrations
Up
main: == [advisory_lock_connection] object_id: 127940, pg_backend_pid: 73940
main: == 20240617210449 ChangeCatalogResourcesLast30DayUsageCountUpdatedAtDefault: migrating
main: -- change_column_default(:catalog_resources, :last_30_day_usage_count_updated_at, #<Proc:0x000000012f8b1908 /Users/leaminn/gitlab/gitlab-development-kit/gitlab/db/migrate/20240617210449_change_catalog_resources_last30_day_usage_count_updated_at_default.rb:7 (lambda)>)
main: -> 0.0156s
main: == 20240617210449 ChangeCatalogResourcesLast30DayUsageCountUpdatedAtDefault: migrated (0.0185s)
main: == [advisory_lock_connection] object_id: 127940, pg_backend_pid: 73940
ci: == [advisory_lock_connection] object_id: 128300, pg_backend_pid: 73944
ci: == 20240617210449 ChangeCatalogResourcesLast30DayUsageCountUpdatedAtDefault: migrating
ci: -- change_column_default(:catalog_resources, :last_30_day_usage_count_updated_at, #<Proc:0x0000000173f71df8 /Users/leaminn/gitlab/gitlab-development-kit/gitlab/db/migrate/20240617210449_change_catalog_resources_last30_day_usage_count_updated_at_default.rb:7 (lambda)>)
ci: -> 0.0026s
ci: == 20240617210449 ChangeCatalogResourcesLast30DayUsageCountUpdatedAtDefault: migrated (0.0097s)
ci: == [advisory_lock_connection] object_id: 128300, pg_backend_pid: 73944
Down
main: == [advisory_lock_connection] object_id: 127560, pg_backend_pid: 74684
main: == 20240617210449 ChangeCatalogResourcesLast30DayUsageCountUpdatedAtDefault: reverting
main: -- change_column_default(:catalog_resources, :last_30_day_usage_count_updated_at, "1970-01-01")
main: -> 0.0161s
main: == 20240617210449 ChangeCatalogResourcesLast30DayUsageCountUpdatedAtDefault: reverted (0.0190s)
main: == [advisory_lock_connection] object_id: 127560, pg_backend_pid: 74684
ci: == [advisory_lock_connection] object_id: 127560, pg_backend_pid: 75091
ci: == 20240617210449 ChangeCatalogResourcesLast30DayUsageCountUpdatedAtDefault: reverting
ci: -- change_column_default(:catalog_resources, :last_30_day_usage_count_updated_at, "1970-01-01")
ci: -> 0.0156s
ci: == 20240617210449 ChangeCatalogResourcesLast30DayUsageCountUpdatedAtDefault: reverted (0.0225s)
ci: == [advisory_lock_connection] object_id: 127560, pg_backend_pid: 75091
Related to #467555 (closed)