Add group importer usage ping
What does this MR do?
This MR adds usage ping to Group Import distinct by user. In order to do that a new user_id
column is required, as well as a data migration for existing records.
Based on https://docs.gitlab.com/ee/development/database/add_foreign_key_to_existing_column.html#data-migration-to-fix-existing-records If we can easily find “invalid” records by doing a simple database query and the record count is not that high, then the data migration can be executed within a Rails migration.
I am adding data migration in the same migration, as dataset is small (less than 1k records) -- see https://gitlab.slack.com/archives/CLJMDRD8C/p1599480187004300 for confirmation.
Follow up NOT NULL constraint validation issue: #250351
Similar MRs !40382 (merged) !40130 (merged)
Migration output
== 20200907092610 AddUserIdToGroupImportStates: migrating =====================
-- column_exists?(:group_import_states, :user_id)
-> 0.0018s
-- add_column(:group_import_states, :user_id, :bigint)
-> 0.0016s
-- transaction_open?()
-> 0.0000s
-- foreign_keys(:group_import_states)
-> 0.0035s
-- execute("ALTER TABLE group_import_states\nADD CONSTRAINT fk_8053b3ebd6\nFOREIGN KEY (user_id)\nREFERENCES users (id)\nON DELETE CASCADE\nNOT VALID;\n")
-> 0.0026s
-- execute("ALTER TABLE group_import_states VALIDATE CONSTRAINT fk_8053b3ebd6;")
-> 0.0040s
-- transaction_open?()
-> 0.0000s
-- index_exists?(:group_import_states, :user_id, {:where=>"user_id IS NOT NULL", :name=>"index_group_import_states_on_user_id", :algorithm=>:concurrently})
-> 0.0022s
-- add_index(:group_import_states, :user_id, {:where=>"user_id IS NOT NULL", :name=>"index_group_import_states_on_user_id", :algorithm=>:concurrently})
-> 0.0036s
== 20200907092610 AddUserIdToGroupImportStates: migrated (0.0274s) ============
== 20200907092715 AddNotNullConstraintToUserOnGroupImportStates: migrating ====
-- transaction_open?()
-> 0.0000s
-- execute("ALTER TABLE group_import_states\nADD CONSTRAINT check_cda75c7c3f\nCHECK ( user_id IS NOT NULL )\nNOT VALID;\n")
-> 0.0004s
== 20200907092715 AddNotNullConstraintToUserOnGroupImportStates: migrated (0.0081s)
== 20200909161624 CleanupGroupImportStatesWithNullUserId: migrating ===========
== 20200909161624 CleanupGroupImportStatesWithNullUserId: migrated (2.8860s) ==
Query Explains
Without timerange
Distinct Count
https://explain.depesz.com/s/4UHv
Aggregate (cost=4.67..4.68 rows=1 width=8) (actual time=0.080..0.081 rows=1 loops=1)
Buffers: shared hit=10
-> Index Only Scan using index_group_import_states_on_user_id on public.group_import_states (cost=0.12..4.67 rows=3 width=8) (actual time=0.027..0.027 rows=0 loops=1)
Index Cond: ((group_import_states.user_id >= 1) AND (group_import_states.user_id <= 10000))
Heap Fetches: 0
Buffers: shared hit=4
Min
https://explain.depesz.com/s/tDnP
Max
https://explain.depesz.com/s/RmxL
With timerange
Distinct Count
https://explain.depesz.com/s/oD4D
Aggregate (cost=6.18..6.19 rows=1 width=8) (actual time=0.025..0.026 rows=1 loops=1)
Buffers: shared hit=1
-> Index Scan using index_group_import_states_on_user_id on public.group_import_states (cost=0.12..6.18 rows=1 width=8) (actual time=0.011..0.011 rows=0 loops=1)
Index Cond: ((group_import_states.user_id >= 1) AND (group_import_states.user_id <= 10000))
Filter: ((group_import_states.created_at >= '2020-08-10 13:10:31.577033+00'::timestamp with time zone) AND (group_import_states.created_at <= '2020-09-07 13:10:31.577103+00'::timestamp with time zone))
Rows Removed by Filter: 0
Buffers: shared hit=1
Min
https://explain.depesz.com/s/ZUwT
Max
https://explain.depesz.com/s/U7iM
Screenshots
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry - [-] Documentation (if required)
-
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides - [-] Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. -
Tested in all supported browsers -
Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
-
Label as security and @ mention @gitlab-com/gl-security/appsec
-
The MR includes necessary changes to maintain consistency between UI, API, email, or other methods -
Security reports checked/validated by a reviewer from the AppSec team