Skip to content

GraphQL: Add MOST_ACTIVE_DESC sort order to runners resolver

What does this MR do and why?

NOTE: This MR has no changelog because the fields being added to GraphQL are all marked as alpha. The goal is to allow some frontend code to be added without having users tie us to this exact API for the time being.

This MR adds support for the MOST_ACTIVE_DESC sort order in the runners resolver for Ultimate users, so that we can ask for runners based on the count of running jobs that each is processing. This will be used to build the first panel in the CI runner metrics dashboard, to display the 5 busiest runners and links to their details page.

Part of #377324 (closed) and #416763 (closed)

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.

image

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

  1. Ensure the runner_performance_insights licensed feature is available to you (you need an Ultimate license on the GDK):

    > License.feature_available?(:runner_performance_insights)
    => true
  2. Go to the shell in your GDK gitlab directory and run bundle exec rake "gitlab:seed:runner_fleet". This will seed your GDK with some runners and jobs required for testing this MR.

  3. Run the following command inside the Postgres terminal (gdk psql -d gitlabhq_development_ci):

    INSERT INTO ci_running_builds (build_id, project_id, runner_id, created_at, runner_type, partition_id)
    SELECT ci_builds.id AS build_id, ci_builds.project_id::integer, ci_builds.runner_id,
      ci_builds.created_at::timestamp, runner_type, partition_id
    FROM ci_builds
      INNER JOIN ci_runners ON ci_runners.id = ci_builds.runner_id
    WHERE status = 'running';
  4. On the Rails console, ensure that there are ci_running_builds records for shared runners in the database:

    Ci::RunningBuild.instance_type.count
  5. Open http://gdk.test:3000/-/graphql-explorer

  6. Enter the following query:

    {
      mostActiveRunners: runners(type: INSTANCE_TYPE, first: 5, sort: MOST_ACTIVE_DESC) {
        count
        nodes {
          id
          jobCount
        }
      }
      allRunners: runners(type: INSTANCE_TYPE, first: 10) {
        count
        nodes {
          id
          jobCount
        }
      }
    }

You should see an ordered list of runners in mostActiveRunners, starting from the runner which has most ci_running_builds records associated with it.

  1. Try changing from the type argument from INSTANCE_TYPE to GROUP_TYPE. You should get an error like the following:

    image

  2. Change the type argument back to INSTANCE_TYPE and follow the documentation to simulate a CE instance. You should see a different error this time, as if the enum value did not exist:

    image

Database query plans

SELECT "ci_runners".* FROM "ci_runners" INNER JOIN (
  SELECT "runner_id", ROW_NUMBER() OVER(PARTITION BY "runner_id" ORDER BY "runner_id") as rn
  FROM "ci_running_builds"
) as "limited_builds" ON "limited_builds"."runner_id" = "ci_runners"."id"
                      AND "limited_builds".rn <= 1000 WHERE "ci_runners"."runner_type" = 1 GROUP BY "ci_runners"."id" ORDER BY COUNT(limited_builds.runner_id) DESC NULLS LAST, "ci_runners"."id" DESC LIMIT 5 OFFSET 0;

https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/20303/commands/66373

 Limit  (cost=3768.17..3768.18 rows=1 width=285) (actual time=11.709..11.714 rows=5 loops=1)
   Buffers: shared hit=634
   I/O Timings: read=0.000 write=0.000
   ->  Sort  (cost=3768.17..3768.18 rows=1 width=285) (actual time=11.707..11.711 rows=5 loops=1)
         Sort Key: (count(limited_builds.runner_id)) DESC NULLS LAST, ci_runners.id DESC
         Sort Method: top-N heapsort  Memory: 29kB
         Buffers: shared hit=634
         I/O Timings: read=0.000 write=0.000
         ->  Aggregate  (cost=3768.14..3768.16 rows=1 width=285) (actual time=10.435..11.628 rows=41 loops=1)
               Group Key: ci_runners.id
               Buffers: shared hit=628
               I/O Timings: read=0.000 write=0.000
               ->  Sort  (cost=3768.14..3768.15 rows=1 width=285) (actual time=10.424..10.637 rows=2614 loops=1)
                     Sort Key: ci_runners.id DESC
                     Sort Method: quicksort  Memory: 1444kB
                     Buffers: shared hit=628
                     I/O Timings: read=0.000 write=0.000
                     ->  Hash Join  (cost=68.51..3768.13 rows=1 width=285) (actual time=0.410..7.454 rows=2614 loops=1)
                           Hash Cond: (limited_builds.runner_id = ci_runners.id)
                           Buffers: shared hit=628
                           I/O Timings: read=0.000 write=0.000
                           ->  Subquery Scan on limited_builds  (cost=0.41..3695.64 rows=1672 width=8) (actual time=0.073..5.561 rows=2614 loops=1)
                                 Filter: (limited_builds.rn <= 1000)
                                 Rows Removed by Filter: 0
                                 Buffers: shared hit=553
                                 I/O Timings: read=0.000 write=0.000
                                 ->  WindowAgg  (cost=0.41..3632.95 rows=5015 width=16) (actual time=0.072..5.192 rows=2614 loops=1)
                                       Buffers: shared hit=553
                                       I/O Timings: read=0.000 write=0.000
                                       ->  Index Only Scan using index_ci_running_builds_on_runner_id on public.ci_running_builds  (cost=0.41..3545.19 rows=5015 width=8) (actual time=0.041..3.854 rows=2614 loops=1)
                                             Heap Fetches: 2260
                                             Buffers: shared hit=553
                                             I/O Timings: read=0.000 write=0.000
                           ->  Hash  (cost=67.36..67.36 rows=59 width=277) (actual time=0.309..0.309 rows=64 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 31kB
                                 Buffers: shared hit=75
                                 I/O Timings: read=0.000 write=0.000
                                 ->  Index Scan using index_ci_runners_on_runner_type_and_id on public.ci_runners  (cost=0.43..67.36 rows=59 width=277) (actual time=0.036..0.229 rows=64 loops=1)
                                       Index Cond: (ci_runners.runner_type = 1)
                                       Buffers: shared hit=75
                                       I/O Timings: read=0.000 write=0.000

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Vladimir Shushlin

Merge request reports

Loading