Setup background operation scheduler

What does this MR do and why?

Creates scheduler classes for each database (main, ci, sec), which picks up executable workers and sends off to the orchestrator.

Database Review

Query:

For schedulable_workers(5) across 11 partitions - query.

SELECT
    id,
    PARTITION
FROM ((
          SELECT
              id, PARTITION, created_at
          FROM
              gitlab_partitions_dynamic.background_operation_workers_1 AS background_operation_workers
          WHERE ("background_operation_workers"."status" IN (0, 2))
            AND (on_hold_until IS NULL
              OR on_hold_until < NOW())
          ORDER BY
              "background_operation_workers"."created_at" ASC
          LIMIT 5)
      UNION ALL (
          SELECT
              id, PARTITION, created_at
          FROM
              gitlab_partitions_dynamic.background_operation_workers_2 AS background_operation_workers
          WHERE ("background_operation_workers"."status" IN (0, 2))
            AND (on_hold_until IS NULL
              OR on_hold_until < NOW())
          ORDER BY
              "background_operation_workers"."created_at" ASC
          LIMIT 5)
      ...
      ...
      UNION ALL (
          SELECT
              id, PARTITION, created_at
          FROM
              gitlab_partitions_dynamic.background_operation_workers_11 AS background_operation_workers
          WHERE ("background_operation_workers"."status" IN (0, 2))
            AND (on_hold_until IS NULL
              OR on_hold_until < NOW())
          ORDER BY
              "background_operation_workers"."created_at" ASC
          LIMIT 5)) background_operation_workers
ORDER BY
    "background_operation_workers"."partition" ASC, "background_operation_workers"."created_at" ASC
LIMIT 5;

Query plan

  1. Create 10 more partitions using - query

    DO $$
    BEGIN
      FOR i IN 2..11 LOOP
        EXECUTE format('CREATE TABLE IF NOT EXISTS gitlab_partitions_dynamic.background_operation_workers_%s
          PARTITION OF public.background_operation_workers FOR VALUES IN (%L)', i, i::text);
      END LOOP;
    END
    $$;
  2. Data population: Created 100000 records per partitions, with equal distribution between 5 tables and atleast 20% paused workers (10% with on_hold_until in the past and other 10% with on_hold_until in the future) - query.

    DO $$
    DECLARE
        i int;
        table_names text[] := ARRAY['users', 'projects', 'issues', 'notes', 'namespaces'];
        records_per_partition int := 100000;
        records_per_table int := 20000;
    BEGIN
        FOR i IN 1..11 LOOP
            INSERT INTO public.background_operation_workers (partition, organization_id, user_id, created_at, batch_size, sub_batch_size, interval, job_class_name, batch_class_name, table_name, column_name, gitlab_schema, status, on_hold_until, min_cursor, max_cursor)
            SELECT
                i,
                (i * 1000000) + rc,
                (random() * 1000000)::bigint,
                CURRENT_TIMESTAMP - (random() * interval '30 days'),
                1000,
                100,
                120,
                'BackgroundWorker',
                'BackgroundJob',
                table_names[((rc - 1) / records_per_table) + 1],
                'id',
                'gitlab_main',
                CASE WHEN (rc - 1) % records_per_table < (records_per_table * 0.1)::int THEN
                    2
                WHEN (rc - 1) % records_per_table < (records_per_table * 0.2)::int THEN
                    2
                ELSE
                    ((rc - 1) % records_per_table - (records_per_table * 0.2)::int) % 5
                END,
                CASE WHEN (rc - 1) % records_per_table < (records_per_table * 0.1)::int THEN
                    CURRENT_TIMESTAMP - interval '2 days'
                WHEN (rc - 1) % records_per_table < (records_per_table * 0.2)::int THEN
                    CURRENT_TIMESTAMP + interval '2 days'
                END,
                jsonb_build_array(1),
                jsonb_build_array(1000000)
            FROM
                generate_series(1, records_per_partition) AS rc;
            RAISE NOTICE 'Inserted % records into partition %', records_per_partition, i;
        END LOOP;
    END
    $$;
  3. Query plan - https://console.postgres.ai/gitlab/gitlab-production-main/sessions/44989/commands/138116

    Time: 5.501 ms
      - planning: 4.272 ms
      - execution: 1.229 ms
        - I/O read: 0.000 ms
        - I/O write: 0.000 ms
    
    Shared buffers:
      - hits: 144 (~1.10 MiB) from the buffer pool
      - reads: 0 from the OS file cache, including disk I/O
      - dirtied: 0
      - writes: 0

References

Issue: #578054 (closed)

How to set up and validate locally

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #578054 (closed)

Edited by Prabakaran Murugesan

Merge request reports

Loading