Skip to content

Produce one query per parent record

Michael Kozono requested to merge mk/improve-parallel-copy-org-pg-data into master

What does this MR do and why?

  • To allow us to maximize parallelization of data transfer, produce one query per parent record with scripts/cells/generate-selects-for-organization.
  • Add semicolon to end of all lines
  • Remove unnecessary repetition of table name in all queries

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

How to set up and validate locally

❯ ./scripts/cells/generate-selects-for-organization --help
Usage: generate-selects-for-organization [options]
    -o, --organization-ids IDS       Organization ID(s). Integer, or a comma-separated list of integers.
                                     Required if --root-namespace-ids is not specified.
    -n, --root-namespace-ids IDS     Root namespace ID(s). Integer, or a comma-separated list of integers.
                                     Required if --organization-ids is not specified.
    -p, --psql COMMAND               Default: gdk psql -c
        --debug                      Enable debugging output

Example output for root namespace IDs (truncated, since there are 9000+ lines):

❯ ./scripts/cells/generate-selects-for-organization --root-namespace-ids 1,94
COPY ( SELECT * FROM achievements WHERE namespace_id = 1 ) TO STDOUT WITH (FORMAT CSV, HEADER);
COPY ( SELECT * FROM achievements WHERE namespace_id = 94 ) TO STDOUT WITH (FORMAT CSV, HEADER);
COPY ( SELECT * FROM achievements WHERE namespace_id = 95 ) TO STDOUT WITH (FORMAT CSV, HEADER);
COPY ( SELECT * FROM achievements WHERE namespace_id = 96 ) TO STDOUT WITH (FORMAT CSV, HEADER);
COPY ( SELECT * FROM achievements WHERE namespace_id = 97 ) TO STDOUT WITH (FORMAT CSV, HEADER);
COPY ( SELECT * FROM achievements WHERE namespace_id = 98 ) TO STDOUT WITH (FORMAT CSV, HEADER);
COPY ( SELECT * FROM achievements WHERE namespace_id = 99 ) TO STDOUT WITH (FORMAT CSV, HEADER);
COPY ( SELECT * FROM achievements WHERE namespace_id = 100 ) TO STDOUT WITH (FORMAT CSV, HEADER);
COPY ( SELECT * FROM achievements WHERE namespace_id = 101 ) TO STDOUT WITH (FORMAT CSV, HEADER);
COPY ( SELECT * FROM achievements WHERE namespace_id = 102 ) TO STDOUT WITH (FORMAT CSV, HEADER);

... 9000+ lines later ...

COPY ( SELECT * FROM zoom_meetings WHERE project_id = 36 ) TO STDOUT WITH (FORMAT CSV, HEADER);
COPY ( SELECT * FROM zoom_meetings WHERE project_id = 37 ) TO STDOUT WITH (FORMAT CSV, HEADER);
COPY ( SELECT * FROM zoom_meetings WHERE project_id = 38 ) TO STDOUT WITH (FORMAT CSV, HEADER);
COPY ( SELECT * FROM zoom_meetings WHERE project_id = 39 ) TO STDOUT WITH (FORMAT CSV, HEADER);
COPY ( SELECT * FROM zoom_meetings WHERE project_id = 40 ) TO STDOUT WITH (FORMAT CSV, HEADER);
COPY ( SELECT * FROM zoom_meetings WHERE project_id = 41 ) TO STDOUT WITH (FORMAT CSV, HEADER);
COPY ( SELECT * FROM zoom_meetings WHERE project_id = 42 ) TO STDOUT WITH (FORMAT CSV, HEADER);
COPY ( SELECT * FROM zoom_meetings WHERE project_id = 43 ) TO STDOUT WITH (FORMAT CSV, HEADER);
COPY ( SELECT * FROM zoom_meetings WHERE project_id = 44 ) TO STDOUT WITH (FORMAT CSV, HEADER);
COPY ( SELECT * FROM zoom_meetings WHERE project_id = 45 ) TO STDOUT WITH (FORMAT CSV, HEADER);
Edited by Michael Kozono

Merge request reports