Add Direct Transfer Stats API

George Koltsov requested to merge georgekoltsov/dt-stats-api into master

What does this MR do and why?

This MR adds a Direct Transfer Stats API for each imported entity (either a group or a project). It is the second iteration of !134536 (closed).

It:

  • Stores 3 DB counters:
    • source_objects_count - how many items exist on the source (pulled from the export_relations API)
    • fetched_objects_count - how many items we extracted and read (downloaded the ndjson file and read its lines)
    • imported_objects_count - how many items we successfully inserted into the DB
  • Stores preliminary data in Redis, then persists it to the DB once the tracker is finished/failed
  • Serves counters from Redis until the keys expire, then falls back to the DB
  • Shows stats only for file relations (ones that are extracted from ndjson files)
  • Skips empty trackers (where the source/fetched/imported counters are all 0)
  • Skips skipped trackers
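
The read path described in the list above can be sketched roughly as follows. This is an illustration only, not the code in this MR: `ExpiringStore`, `Tracker`, `counters_for`, `entity_stats`, and the `stats:<relation>` key format are all hypothetical names, and an in-memory hash with expiry stands in for Redis so the sketch runs without any services.

```ruby
# Hypothetical sketch: none of these names come from the MR itself.
# An in-memory store with per-key expiry stands in for Redis.
class ExpiringStore
  def initialize(clock: -> { Time.now.to_f })
    @clock = clock
    @data = {}
  end

  # Store a value that becomes unreadable after `ttl` seconds.
  def write(key, value, ttl:)
    @data[key] = [value, @clock.call + ttl]
  end

  # Return the value, or nil when the key is missing or expired.
  def read(key)
    value, expires_at = @data[key]
    return nil if value.nil? || @clock.call >= expires_at

    value
  end
end

# Stand-in for a tracker row with the three counter columns.
Tracker = Struct.new(:relation, :status, :source_objects_count,
                     :fetched_objects_count, :imported_objects_count,
                     keyword_init: true)

# Prefer the cached counters; fall back to the persisted columns
# once the cache entry has expired.
def counters_for(tracker, store)
  store.read("stats:#{tracker.relation}") || {
    source: tracker.source_objects_count,
    fetched: tracker.fetched_objects_count,
    imported: tracker.imported_objects_count
  }
end

# Build the stats hash for one entity, omitting skipped trackers
# and trackers whose counters are all zero.
def entity_stats(trackers, store)
  trackers.each_with_object({}) do |tracker, stats|
    next if tracker.status == :skipped

    counters = counters_for(tracker, store)
    next if counters.values.all?(&:zero?)

    stats[tracker.relation] = counters
  end
end
```

Injecting the clock into the store is a design choice made for this sketch so that expiry can be exercised deterministically; a real Redis key would simply expire on its own TTL.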

Mentions #435188 (closed)

Query plans

Migration output
main: == [advisory_lock_connection] object_id: 176680, pg_backend_pid: 27945
main: == 20240118103048 AddObjectCountFieldsToBulkImportTrackers: migrating =========
main: -- add_column(:bulk_import_trackers, :source_objects_count, :integer, {:null=>false, :default=>0})
main:    -> 0.0027s
main: -- add_column(:bulk_import_trackers, :fetched_objects_count, :integer, {:null=>false, :default=>0})
main:    -> 0.0008s
main: -- add_column(:bulk_import_trackers, :imported_objects_count, :integer, {:null=>false, :default=>0})
main:    -> 0.0007s
main: == 20240118103048 AddObjectCountFieldsToBulkImportTrackers: migrated (0.0123s) 

main: == [advisory_lock_connection] object_id: 176680, pg_backend_pid: 27945
ci: == [advisory_lock_connection] object_id: 177000, pg_backend_pid: 27947
ci: == 20240118103048 AddObjectCountFieldsToBulkImportTrackers: migrating =========
ci: -- add_column(:bulk_import_trackers, :source_objects_count, :integer, {:null=>false, :default=>0})
ci:    -> 0.0025s
ci: -- add_column(:bulk_import_trackers, :fetched_objects_count, :integer, {:null=>false, :default=>0})
ci:    -> 0.0009s
ci: -- add_column(:bulk_import_trackers, :imported_objects_count, :integer, {:null=>false, :default=>0})
ci:    -> 0.0007s
ci: == 20240118103048 AddObjectCountFieldsToBulkImportTrackers: migrated (0.0207s) 

ci: == [advisory_lock_connection] object_id: 177000, pg_backend_pid: 27947

tracker.update! query
tracker.update!(source_objects_count: 56, fetched_objects_count: 56, imported_objects_count: 56)


  TRANSACTION (0.4ms)  BEGIN /*application:console,db_config_name:main,console_hostname:gk-m1.local,console_username:georgekoltsov,line:(pry):12:in `__pry__'*/
  BulkImports::Tracker Exists? (1.1ms)  SELECT 1 AS one FROM "bulk_import_trackers" WHERE "bulk_import_trackers"."relation" = 'BulkImports::Common::Pipelines::EntityFinisher' AND "bulk_import_trackers"."id" != 4632 AND "bulk_import_trackers"."bulk_import_entity_id" = 181 LIMIT 1 /*application:console,db_config_name:main,console_hostname:gk-m1.local,console_username:georgekoltsov,line:(pry):12:in `__pry__'*/
  BulkImports::Tracker Update (1.1ms)  UPDATE "bulk_import_trackers" SET "updated_at" = '2024-01-18 11:33:02.578129', "source_objects_count" = 56, "fetched_objects_count" = 56, "imported_objects_count" = 56 WHERE "bulk_import_trackers"."id" = 4632 /*application:console,db_config_name:main,console_hostname:gk-m1.local,console_username:georgekoltsov,line:(pry):12:in `__pry__'*/
  TRANSACTION (0.4ms)  COMMIT /*application:console,db_config_name:main,console_hostname:gk-m1.local,console_username:georgekoltsov,line:/lib/gitlab/database.rb:392:in `commit'*/
Update on bulk_import_trackers  (cost=0.15..2.17 rows=0 width=0)
  ->  Index Scan using bulk_import_trackers_pkey on bulk_import_trackers  (cost=0.15..2.17 rows=1 width=26)
        Index Cond: (id = 4632)

Screenshots or screen recordings

image

How to set up and validate locally

  1. Start a new Direct Transfer
  2. Go to /api/v4/bulk_imports/:id/entities and verify that the stats key is present and populated
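
The check in step 2 could also be scripted. The helper below is a minimal sketch, not part of this MR: `entities_without_stats` is a hypothetical name, and it assumes the endpoint returns a JSON array of entity objects, each expected to carry a non-empty `stats` object. The inner shape of `stats` is not documented in this description, so the helper does not inspect it.

```ruby
require "json"

# Hypothetical helper: returns the entities whose "stats" key is
# missing or empty. Assumes the response body is a JSON array of
# entity objects.
def entities_without_stats(response_body)
  JSON.parse(response_body).reject do |entity|
    stats = entity["stats"]
    stats.is_a?(Hash) && !stats.empty?
  end
end
```

Pass it the raw response body from the entities endpoint; an empty result means every returned entity has populated stats. Entities whose trackers were all empty or skipped may legitimately show up in the result, given the skipping rules described earlier.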

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by George Koltsov