`ignored_columns` are being kept around even after the date for safe removal has passed
Premise
We recommend the use of ignored_columns
(Docs) for columns that are to be dropped sometime in the future. This usage makes sure that Rails does not SELECT
these specific columns when making queries. (Explanation)
It's definition is along the lines of
class User < ApplicationRecord
include IgnorableColumns
ignore_column :updated_at, remove_with: '12.7', remove_after: '2020-01-22'
end
Based on the above definition, the column is safe to drop after '2020-01-22' (and this ignore_column
definition can also be dropped afterwards).
However, there currently are ignore_column
definitions in our models for which the remove_after
date has already passed, but the definition (or both the definition and the column) has not been removed yet.
Study
To study how many such columns exists in our models currently, I ran the following code in Rails console
Zeitwerk::Loader.eager_load_all
set = []
ApplicationRecord.descendants.each do |model|
if model.respond_to? :ignored_columns_details
model.ignored_columns_details.each do |key, value|
set << { model: model.name, column: key, passed: value.safe_to_remove? }
end
end
end
Result:
[{:model=>"Ci::JobArtifact", :column=>:original_filename, :passed=>false},
{:model=>"Ci::Trigger", :column=>:ref, :passed=>true},
{:model=>"Ci::Minutes::NamespaceMonthlyUsage", :column=>:new_amount_used, :passed=>false},
{:model=>"Ci::Minutes::ProjectMonthlyUsage", :column=>:new_amount_used, :passed=>false},
{:model=>"Geo::JobArtifactRegistry", :column=>:success, :passed=>true},
{:model=>"ApplicationSetting", :column=>:elasticsearch_shards, :passed=>true},
{:model=>"ApplicationSetting", :column=>:elasticsearch_replicas, :passed=>true},
{:model=>"ApplicationSetting", :column=>:static_objects_external_storage_auth_token, :passed=>true},
{:model=>"ApplicationSetting", :column=>:user_email_lookup_limit, :passed=>true},
{:model=>"Namespace", :column=>:tmp_project_id, :passed=>true},
{:model=>"Group", :column=>:tmp_project_id, :passed=>true},
{:model=>"Namespaces::ProjectNamespace", :column=>:tmp_project_id, :passed=>true},
{:model=>"Namespaces::UserNamespace", :column=>:tmp_project_id, :passed=>true},
{:model=>"Iteration", :column=>:project_id, :passed=>false},
{:model=>"UserPreference", :column=>:experience_level, :passed=>true},
{:model=>"Vulnerabilities::Finding", :column=>:uuid_convert_string_to_uuid, :passed=>false},
{:model=>"MergeRequests::ExternalStatusCheck", :column=>:external_approval_rule_id, :passed=>true},
{:model=>"Analytics::DevopsAdoption::EnabledNamespace", :column=>:last_recorded_at, :passed=>true},
{:model=>"MergeRequestDiffCommit", :column=>:author_name, :passed=>true},
{:model=>"MergeRequestDiffCommit", :column=>:author_email, :passed=>true},
{:model=>"MergeRequestDiffCommit", :column=>:committer_name, :passed=>true},
{:model=>"MergeRequestDiffCommit", :column=>:committer_email, :passed=>true},
{:model=>"PlanLimits", :column=>:ci_max_artifact_size_running_container_scanning, :passed=>true},
{:model=>"ResourceMilestoneEvent", :column=>:reference, :passed=>true},
{:model=>"ResourceMilestoneEvent", :column=>:reference_html, :passed=>true},
{:model=>"ResourceMilestoneEvent", :column=>:cached_markdown_version, :passed=>true},
{:model=>"Namespace::Detail", :column=>:free_user_cap_over_limt_notified_at, :passed=>false},
{:model=>"RequirementsManagement::Requirement", :column=>:created_at, :passed=>false},
{:model=>"RequirementsManagement::Requirement", :column=>:updated_at, :passed=>false},
{:model=>"RequirementsManagement::Requirement", :column=>:author_id, :passed=>false},
{:model=>"RequirementsManagement::Requirement", :column=>:cached_markdown_version, :passed=>false},
{:model=>"RequirementsManagement::Requirement", :column=>:state, :passed=>false},
{:model=>"RequirementsManagement::Requirement", :column=>:title, :passed=>false},
{:model=>"RequirementsManagement::Requirement", :column=>:title_html, :passed=>false},
{:model=>"RequirementsManagement::Requirement", :column=>:description, :passed=>false},
{:model=>"RequirementsManagement::Requirement", :column=>:description_html, :passed=>false},
{:model=>"Sbom::Source", :column=>:fingerprint, :passed=>false},
{:model=>"Analytics::CycleAnalytics::Aggregation", :column=>:last_full_run_processed_records, :passed=>true},
{:model=>"Analytics::CycleAnalytics::Aggregation", :column=>:last_full_run_runtimes_in_seconds, :passed=>true},
{:model=>"Analytics::CycleAnalytics::Aggregation", :column=>:last_full_run_issues_updated_at, :passed=>true},
{:model=>"Analytics::CycleAnalytics::Aggregation", :column=>:last_full_run_mrs_updated_at, :passed=>true},
{:model=>"Analytics::CycleAnalytics::Aggregation", :column=>:last_full_run_issues_id, :passed=>true},
{:model=>"Analytics::CycleAnalytics::Aggregation", :column=>:last_full_run_merge_requests_id, :passed=>true},
{:model=>"Gitlab::BackgroundMigration::BackfillIntegrationsEnableSslVerification::Integration", :column=>:template, :passed=>true},
{:model=>"Gitlab::BackgroundMigration::BackfillIntegrationsEnableSslVerification::Integration", :column=>:type, :passed=>true},
{:model=>"Gitlab::BackgroundMigration::BackfillIntegrationsEnableSslVerification::Integration", :column=>:properties, :passed=>true},
{:model=>"Analytics::DevopsAdoption::Snapshot", :column=>:segment_id, :passed=>true},
{:model=>"Vulnerabilities::Finding::Evidence", :column=>:summary, :passed=>true}]
From the result:
Across 25 unique models, 48 unique columns are to be dropped, and for 32 of them, the remove_after
date has already been passed.
Some of these dates are set to the year 2020
, but it still remains to be cleaned up.
The usage also appears to be inconsistent: In some models, the ignore_column
definition still remains, but the column appears to have been removed from the database. In some other models, both the definition and the column still remains even after the date has passed.
Ideating solutions:
The major problem appears to be the fact that there isn't a reminder for engineers to actually get back to dropping the column AND removing the usage of ignore_columns
from the model.
One way to setup such a reminder would be to follow the same process as when we define a feature flag: It asks as to setup a rollout issue, and then it becomes our responsibility to clean up the feature flag and close the rollout issue afterwards.
Similarly, could we setup an issue for the removal of ignore_column
and tag it along with the definition, so like:
ignore_column :updated_at, remove_with: '12.7', remove_after: '2020-01-22', removal_issue: 'https://gitlab.com/gitlab-org/gitlab/-/issues/1234'
We can fail the CI if ignore_column
is defined without a removal_issue
, thus making it's use mandatory.
Would such an approach help?