Investigate gitlab:doctor:secrets rake task speed up improvements

Problem

The gitlab:doctor:secrets rake task currently iterates through all rows of every table that has encrypted columns.

The rake task is implemented at lib/tasks/gitlab/doctor/secrets.rake.

As GitLab usage grows, this approach becomes much less scalable, and validating secrets becomes slower and slower. Here is an internal example.

Proposals

As noticed by @mbobin, we're using find_each instead of each_batch. each_batch could be more efficient: it doesn't force an ORDER BY, and it can be more memory efficient.
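For context, the batching idea behind each_batch can be sketched in plain Ruby: walk the ids in fixed-size chunks while remembering only the last id seen, and hand each chunk back to the caller before any records are loaded. This is an in-memory simulation (no database, illustrative names only), not the real EachBatch implementation:

```ruby
# Keyset-style batching simulated over an in-memory id list.
# The real EachBatch yields ActiveRecord relations, so the caller
# can add filters to each batch before any rows are materialized.
ids = (1..10).to_a

def each_batch(ids, of:)
  last_id = 0
  loop do
    batch = ids.select { |id| id > last_id }.first(of)
    break if batch.empty?
    yield batch # caller decides what to do before loading rows
    last_id = batch.last
  end
end

batches = []
each_batch(ids, of: 4) { |b| batches << b }
# batches is [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10]]
```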

Additionally, for Ci::Build records, usually the biggest table, we don't need to test decrypting CI_JOB_TOKEN columns that are NULL, and these columns get nullified after every job finishes. So we might be iterating through many records we don't really need to check.
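The filtering this implies is simple: skip any row whose encrypted token column is already NULL. A minimal sketch, simulated over in-memory hashes (the :token_encrypted column name is an assumption for illustration; the real script derives column names from the model):

```ruby
# In ActiveRecord terms this is roughly: relation.where.not(token_encrypted: nil)
rows = [
  { id: 1, token_encrypted: 'ciphertext' },
  { id: 2, token_encrypted: nil }, # finished job, token nullified
  { id: 3, token_encrypted: nil }
]

to_check = rows.reject { |row| row[:token_encrypted].nil? }
# only row 1 needs a decryption check
```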

It's worth mentioning that this rake task detects encrypted attributes by asking Rails which models have them. This is good because it's future-proof: if the CI_JOB_TOKEN column ever gets renamed, or if another encrypted column gets added, the script would still detect it. So we should not hard-code expectations on column names.
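The generic column-name lookup used in the proposal below can be sketched in isolation: ask the attr_encrypted configuration for the storage column, and fall back to a "#{name}_encrypted" naming convention. The hash literal here stands in for Model.attr_encrypted_attributes; the :token/:secret attribute names are invented for illustration:

```ruby
# Stand-in for Model.attr_encrypted_attributes.to_h
attr_encrypted_attributes = {
  token: { attribute: 'token_encrypted' } # explicitly configured storage column
}

def storage_column(config, name)
  # Prefer the configured column; otherwise assume the naming convention
  config.dig(name, :attribute) || "#{name}_encrypted"
end

configured = storage_column(attr_encrypted_attributes, :token)  # "token_encrypted"
fallback   = storage_column(attr_encrypted_attributes, :secret) # "secret_encrypted"
```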

Here's an initial proposal presented by @mbobin to attempt to only iterate through rows whose encrypted columns are not NULL, in a generic way:

diff --git a/app/models/application_record.rb b/app/models/application_record.rb
index 78274f1d258b..955f6db6c742 100644
--- a/app/models/application_record.rb
+++ b/app/models/application_record.rb
@@ -11,6 +11,7 @@ class ApplicationRecord < ActiveRecord::Base
   include ResetOnColumnErrors
   include HasCheckConstraints
   include IgnorableColumns
+  include EachBatch
 
   self.abstract_class = true
 
diff --git a/lib/gitlab/doctor/secrets.rb b/lib/gitlab/doctor/secrets.rb
index 43f61c85d0e8..e6712e5a4f84 100644
--- a/lib/gitlab/doctor/secrets.rb
+++ b/lib/gitlab/doctor/secrets.rb
@@ -53,11 +53,17 @@ def check_model_attributes(models_with_attributes)
 
         models_with_attributes.each do |model, attributes|
           failures_per_row = Hash.new { |h, k| h[k] = [] }
+          encrypted_column_names = attributes.map do |name|
+            model.attr_encrypted_attributes.to_h.dig(name, :attribute) || "#{name}_encrypted"
+          end
 
           with_skipped_callbacks_for(model) do
-            model.find_each do |data|
-              attributes.each do |att|
-                failures_per_row[data.id] << att unless valid_attribute?(data, att)
+            model.each_batch do |batch|
+              batch = encrypted_column_names.map { |name| batch.where.not(name => nil) }.reduce(:or)
+              batch.each do |data|
+                attributes.each do |att|
+                  failures_per_row[data.id] << att unless valid_attribute?(data, att)
+                end
               end
             end
           end
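The reduce(:or) line in the diff above builds one filter out of several where.not clauses: a row is kept if ANY of its encrypted columns is non-NULL. Its selection semantics can be checked in plain Ruby (column names here are illustrative):

```ruby
# ActiveRecord shape: columns.map { |c| batch.where.not(c => nil) }.reduce(:or)
columns = [:token_encrypted, :secret_encrypted]

rows = [
  { id: 1, token_encrypted: 'a', secret_encrypted: nil },
  { id: 2, token_encrypted: nil, secret_encrypted: nil }, # nothing to decrypt
  { id: 3, token_encrypted: nil, secret_encrypted: 'b' }
]

# Keep a row if any encrypted column has a value to validate
candidates = rows.select { |row| columns.any? { |c| !row[c].nil? } }
# rows 1 and 3 survive; row 2 is skipped entirely
```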

We should test whether this change actually improves performance, which can be validated either by testing it on a database dump or by seeding 600k rows into the Ci::Build partitioned tables.
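Before a full benchmark, the potential win can be estimated by counting how many rows each approach would actually have to decrypt. A toy sketch over an in-memory "table" (the 1-in-100 non-NULL ratio is invented purely for illustration, not real data):

```ruby
# Toy model of a builds table where most finished jobs have a nullified token.
rows = Array.new(10_000) do |i|
  { id: i, token_encrypted: (i % 100).zero? ? 'ciphertext' : nil }
end

checked_today    = rows.size                                      # find_each visits every row
checked_filtered = rows.count { |r| !r[:token_encrypted].nil? }   # NULL rows skipped up front
# 10_000 rows visited today vs 100 rows that actually need a check
```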

cc @distribution-deploy