Skip to content

Add group importer usage ping

George Koltsov requested to merge georgekoltsov/add-group-import-usage-ping into master

What does this MR do?

This MR adds usage ping to Group Import distinct by user. In order to do that a new user_id column is required, as well as a data migration for existing records.

Based on https://docs.gitlab.com/ee/development/database/add_foreign_key_to_existing_column.html#data-migration-to-fix-existing-records If we can easily find “invalid” records by doing a simple database query and the record count is not that high, then the data migration can be executed within a Rails migration. I am adding data migration in the same migration, as dataset is small (less than 1k records) -- see https://gitlab.slack.com/archives/CLJMDRD8C/p1599480187004300 for confirmation.

Follow up NOT NULL constraint validation issue: #250351

Similar MRs !40382 (merged) !40130 (merged)

Migration output

== 20200907092610 AddUserIdToGroupImportStates: migrating =====================
-- column_exists?(:group_import_states, :user_id)
   -> 0.0018s
-- add_column(:group_import_states, :user_id, :bigint)
   -> 0.0016s
-- transaction_open?()
   -> 0.0000s
-- foreign_keys(:group_import_states)
   -> 0.0035s
-- execute("ALTER TABLE group_import_states\nADD CONSTRAINT fk_8053b3ebd6\nFOREIGN KEY (user_id)\nREFERENCES users (id)\nON DELETE CASCADE\nNOT VALID;\n")
   -> 0.0026s
-- execute("ALTER TABLE group_import_states VALIDATE CONSTRAINT fk_8053b3ebd6;")
   -> 0.0040s
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:group_import_states, :user_id, {:where=>"user_id IS NOT NULL", :name=>"index_group_import_states_on_user_id", :algorithm=>:concurrently})
   -> 0.0022s
-- add_index(:group_import_states, :user_id, {:where=>"user_id IS NOT NULL", :name=>"index_group_import_states_on_user_id", :algorithm=>:concurrently})
   -> 0.0036s
== 20200907092610 AddUserIdToGroupImportStates: migrated (0.0274s) ============

== 20200907092715 AddNotNullConstraintToUserOnGroupImportStates: migrating ====
-- transaction_open?()
   -> 0.0000s
-- execute("ALTER TABLE group_import_states\nADD CONSTRAINT check_cda75c7c3f\nCHECK ( user_id IS NOT NULL )\nNOT VALID;\n")
   -> 0.0004s
== 20200907092715 AddNotNullConstraintToUserOnGroupImportStates: migrated (0.0081s) 

== 20200909161624 CleanupGroupImportStatesWithNullUserId: migrating ===========
== 20200909161624 CleanupGroupImportStatesWithNullUserId: migrated (2.8860s) ==

Query Explains

Without timerange

Distinct Count

https://explain.depesz.com/s/4UHv

 Aggregate  (cost=4.67..4.68 rows=1 width=8) (actual time=0.080..0.081 rows=1 loops=1)
   Buffers: shared hit=10
   ->  Index Only Scan using index_group_import_states_on_user_id on public.group_import_states  (cost=0.12..4.67 rows=3 width=8) (actual time=0.027..0.027 rows=0 loops=1)
         Index Cond: ((group_import_states.user_id >= 1) AND (group_import_states.user_id <= 10000))
         Heap Fetches: 0
         Buffers: shared hit=4
Min

https://explain.depesz.com/s/tDnP

Max

https://explain.depesz.com/s/RmxL

With timerange

Distinct Count

https://explain.depesz.com/s/oD4D

 Aggregate  (cost=6.18..6.19 rows=1 width=8) (actual time=0.025..0.026 rows=1 loops=1)
   Buffers: shared hit=1
   ->  Index Scan using index_group_import_states_on_user_id on public.group_import_states  (cost=0.12..6.18 rows=1 width=8) (actual time=0.011..0.011 rows=0 loops=1)
         Index Cond: ((group_import_states.user_id >= 1) AND (group_import_states.user_id <= 10000))
         Filter: ((group_import_states.created_at >= '2020-08-10 13:10:31.577033+00'::timestamp with time zone) AND (group_import_states.created_at <= '2020-09-07 13:10:31.577103+00'::timestamp with time zone))
         Rows Removed by Filter: 0
         Buffers: shared hit=1
Min

https://explain.depesz.com/s/ZUwT

Max

https://explain.depesz.com/s/U7iM

Screenshots

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by George Koltsov

Merge request reports