
BulkImports: Reduce memory consumption when importing Epics

What does this MR do?

Avoid creating a new pipeline object for every page when the dataset being imported has more than one page. Instead, call BulkImports#run recursively, once per page, for as long as there are more pages to process.
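The idea can be illustrated with a minimal, self-contained sketch. Everything in it (`Page`, `FakeExtractor`, `SketchPipeline`, the cursor values) is a hypothetical stand-in invented for illustration, not the actual BulkImports code; it only shows one object reusing itself across pages instead of a new object being allocated per page.

```ruby
# Hypothetical stand-ins for illustration only; these names are NOT from the
# GitLab codebase.
Page = Struct.new(:records, :next_cursor)

# Pretends to be a paginated remote source: three pages keyed by cursor.
class FakeExtractor
  PAGES = {
    nil  => Page.new([1, 2], "p2"),
    "p2" => Page.new([3, 4], "p3"),
    "p3" => Page.new([5], nil)
  }.freeze

  def extract(cursor)
    PAGES.fetch(cursor)
  end
end

class SketchPipeline
  attr_reader :loaded

  def initialize(extractor)
    @extractor = extractor
    @loaded = []
  end

  # Process one page, then call #run again on the SAME object while the
  # extractor reports another page. No new pipeline object is allocated.
  def run(cursor = nil)
    page = @extractor.extract(cursor)
    page.records.each { |record| @loaded << record }
    run(page.next_cursor) if page.next_cursor
    self
  end
end

pipeline = SketchPipeline.new(FakeExtractor.new)
pipeline.run
```

In this sketch the recursion depth equals the number of pages, so a very large dataset would deepen the call stack; whether that matters for the real pipelines depends on page sizes and is not something this sketch measures.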

Benchmarks

Benchmark run by importing 1 group with 1200 epics.

The difference is small for now (about 50 MB of maximum resident set size), but I think it could become a bigger win as more pipelines are added and datasets grow.

Machine

Processor Name:	8-Core Intel Core i9
Processor Speed:	2.4 GHz
Memory:	32 GB
Rails boot (about 892 MB)
$ /usr/bin/time -l bundle exec rails runner "exit"
       42.71 real        29.35 user         9.27 sys
           892051456  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
              453427  page reclaims
                 247  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                 220  messages sent
                 252  messages received
                  75  signals received
               24381  voluntary context switches
                5581  involuntary context switches
        163393911576  instructions retired
        127900041059  cycles elapsed
           783495168  peak memory footprint
Current approach (master, about 1287 MB)
$ /usr/bin/time -l bundle exec rails runner memory_test.rb
WARNING: Active Record does not support composite primary key.

project_authorizations has composite primary key. Composite primary key is ignored.
      174.90 real        77.49 user         9.86 sys
          1287577600  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
              571603  page reclaims
                 287  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
               23230  messages sent
               19286  messages received
                  76  signals received
               29131  voluntary context switches
                3941  involuntary context switches
        415859888049  instructions retired
        298185855038  cycles elapsed
          1081933824  peak memory footprint
New approach (this branch, about 1237 MB, roughly 50 MB less than master)
$ /usr/bin/time -l bundle exec rails runner memory_test.rb
WARNING: Active Record does not support composite primary key.

project_authorizations has composite primary key. Composite primary key is ignored.
      105.44 real        63.04 user         7.99 sys
          1237643264  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
              578450  page reclaims
                  18  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
               23207  messages sent
               17166  messages received
                  76  signals received
               15078  voluntary context switches
                2882  involuntary context switches
        369338133219  instructions retired
        253160433949  cycles elapsed
          1034203136  peak memory footprint
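The "about 50 MB" figure can be checked directly from the maximum resident set size values reported in the three runs. The short script below does that arithmetic. Treating the Rails boot run as a baseline and subtracting it from the import runs is an assumption about how the numbers are meant to be compared, not something the benchmark states, and each figure comes from a single run.

```ruby
# Maximum resident set size values in bytes, copied from the three
# /usr/bin/time -l runs in this description.
RAILS_BOOT_RSS = 892_051_456
MASTER_RSS     = 1_287_577_600
BRANCH_RSS     = 1_237_643_264

BYTES_PER_MB = 1_000_000.0 # decimal megabytes, matching "about 892 MB"

def to_mb(bytes)
  (bytes / BYTES_PER_MB).round(1)
end

# Direct difference between master and this branch.
saved_bytes = MASTER_RSS - BRANCH_RSS

# Assumption: memory attributable to the import is the run's peak RSS minus
# the peak RSS of a bare Rails boot.
master_overhead = MASTER_RSS - RAILS_BOOT_RSS
branch_overhead = BRANCH_RSS - RAILS_BOOT_RSS
overhead_reduction_pct = (saved_bytes * 100.0 / master_overhead).round(1)

puts "saved:            #{to_mb(saved_bytes)} MB"      # 49.9 MB
puts "master overhead:  #{to_mb(master_overhead)} MB"  # 395.5 MB
puts "branch overhead:  #{to_mb(branch_overhead)} MB"  # 345.6 MB
puts "overhead reduced: #{overhead_reduction_pct}%"    # 12.6%
```

Read this way, the 50 MB saving is small against the total but is about an eighth of the memory used above the Rails boot baseline.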

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Kassio Borges
