    Add sharding key tracking issues for continuous_integration · 1f400733
    Manoj M J authored and Max Fan committed
    Add sharding key tracking issues for feature category `continuous_integration`.
    
    These tables could not be classified automatically, and will require manual input. Eventually all tables will
    need to be correctly classified, but we understand that this will be complex for some tables and completing these
    will take time. Instead, our goal for this task is to ensure all remaining tables are tracked in an issue, and to
    classify any straightforward cases that our automation may have missed (options 1 and 2 below).
    
    We have assigned a random backend engineer from ~"group::pipeline execution" as the initial DRI for this task, as well as an
    engineering manager for visibility. Please note that we are not requesting a large time commitment: creating one
    issue and linking it for all tables is perfectly acceptable.
    
    When you are finished, please assign to the ~database reviewer/maintainer suggested by Danger.
    
    If you have any questions or concerns, reach out to `#g_tenant-scale`.
    
    For each table, please select one of the following options:
    
    Option 1: This option is best suited to tables whose sharding behaviour is unknown, or that will require
    additional work before a sharding key can be defined.
    
    Replace the `TODO` in the dictionary file with a link to an issue in the gitlab-org/gitlab project.
    
    ```diff
    - sharding_key_issue_url: TODO
    + sharding_key_issue_url: https://gitlab.com/gitlab-org/gitlab/-/issues/1234
    ```
    
    You can create a new issue or link an existing one, and multiple entries can refer to the same issue. These issues will
    be used to track the work remaining on the [progress dashboard](https://cells-progress-tracker-gitlab-org-tenant-scale-g-f4ad96bf01d25f.gitlab.io/sharding_keys).
    
    If you are creating a new issue, you can copy over the following contents to the issue description:
    
    <details><summary>Click to expand</summary>
    
      Issue Title: Set sharding keys for tables in 'group::pipeline execution'
    
      Issue Description:
    
      Sharding keys need to be set for the tables: ci_build_pending_states, ci_build_trace_chunks, ci_build_trace_metadata, ci_builds, ci_partitions, ci_pipeline_chat_data, ci_pipeline_messages, ci_pipeline_schedule_variables, ci_pipelines, ci_pipelines_config, ci_platform_metrics, ci_stages, ci_trigger_requests, p_ci_builds, p_ci_pipeline_variables, p_ci_stages, taggings, tags
    
      This involves choosing one of the following, based on the intended behaviour of the table:
      - **The table is not cell-local**
        - Set `gitlab_schema` to `gitlab_main_clusterwide`.
      - **The table is cell-local and requires a sharding key**
        - Set `gitlab_schema` to `gitlab_main_cell`
        - Add a `sharding_key` or `desired_sharding_key` configuration. If the configuration is known but the chosen key
          doesn't yet meet not-null and foreign key requirements, you can add an exception to
          `allowed_to_be_missing_not_null` or `allowed_to_be_missing_foreign_key` to get the pipeline passing. Please
          link to a follow-up issue in a code comment next to the exception.
        - You may also need to set `allow_cross_joins`, `allow_cross_transactions` and `allow_cross_foreign_keys` if changing
          the schema causes pipeline failures. See [`db/docs/epics.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/db/docs/epics.yml?ref_type=heads#L12-17)
          for an example.
      - **The table is cell-local and does not require a sharding key**
        - Set `gitlab_schema` to `gitlab_main_cell` and
        - Set `exempt_from_sharding` to `true`.
    
      ### Documentation
    
      - [Choosing either the gitlab_main_cell or gitlab_main_clusterwide schema](https://docs.gitlab.com/ee/development/database/multiple_databases.html#choose-either-the-gitlab_main_cell-or-gitlab_main_clusterwide-schema)
      - [Defining a sharding key for all cell-local tables](https://docs.gitlab.com/ee/development/database/multiple_databases.html#defining-a-sharding-key-for-all-cell-local-tables)
      - [Defining a desired_sharding_key to automatically backfill a sharding_key](https://docs.gitlab.com/ee/development/database/multiple_databases.html#define-a-desired_sharding_key-to-automatically-backfill-a-sharding_key)
    
    </details>
    
    Option 2: This option is best suited to tables with an easily identifiable sharding key that will require
    minimal work to define.
    
    Remove `sharding_key_issue_url` from the dictionary file and instead complete the classification for the table.
    This involves choosing one of the following, based on the intended behaviour of the table:
    - **The table is not cell-local**
      - Set `gitlab_schema` to `gitlab_main_clusterwide`.
    - **The table is cell-local and requires a sharding key**
      - Set `gitlab_schema` to `gitlab_main_cell`
      - Add a `sharding_key` or `desired_sharding_key` configuration. If the configuration is known but the chosen key
        doesn't yet meet not-null and foreign key requirements, you can add an exception to
        `allowed_to_be_missing_not_null` or `allowed_to_be_missing_foreign_key` to get the pipeline passing. Please
        link to a follow-up issue in a code comment next to the exception.
      - You may also need to set `allow_cross_joins`, `allow_cross_transactions` and `allow_cross_foreign_keys` if changing
        the schema causes pipeline failures. See [`db/docs/epics.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/db/docs/epics.yml?ref_type=heads#L12-17)
        for an example.
    - **The table is cell-local and does not require a sharding key**
      - Set `gitlab_schema` to `gitlab_main_cell` and
      - Set `exempt_from_sharding` to `true`.
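    
    As a rough illustration, the three classifications above might look like this in a dictionary file. This is a
    minimal sketch, not a copy of any real entry: the table names are hypothetical, only the keys discussed here are
    shown (real dictionary files carry additional fields), and the `project_id: projects` mapping assumes the
    `column: referenced_table` form described in the linked documentation.
    
    ```yaml
    # Hypothetical entries for illustration only; table names are made up.
    
    # Case 1: the table is not cell-local.
    table_name: example_clusterwide_table
    gitlab_schema: gitlab_main_clusterwide
    ---
    # Case 2: the table is cell-local and requires a sharding key.
    # Assumes each row belongs to a project via a `project_id` column.
    table_name: example_project_scoped_table
    gitlab_schema: gitlab_main_cell
    sharding_key:
      project_id: projects
    ---
    # Case 3: the table is cell-local and does not require a sharding key.
    table_name: example_exempt_table
    gitlab_schema: gitlab_main_cell
    exempt_from_sharding: true
    ```
    
    In each case, `sharding_key_issue_url` is removed, as described at the start of this option.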
    
    Documentation:
    
    - [Choosing either the gitlab_main_cell or gitlab_main_clusterwide schema](https://docs.gitlab.com/ee/development/database/multiple_databases.html#choose-either-the-gitlab_main_cell-or-gitlab_main_clusterwide-schema)
    - [Defining a sharding key for all cell-local tables](https://docs.gitlab.com/ee/development/database/multiple_databases.html#defining-a-sharding-key-for-all-cell-local-tables)
    - [Defining a desired_sharding_key to automatically backfill a sharding_key](https://docs.gitlab.com/ee/development/database/multiple_databases.html#define-a-desired_sharding_key-to-automatically-backfill-a-sharding_key)
    
    Related to https://gitlab.com/gitlab-org/gitlab/-/issues/455137
    
    This change was generated by
    [gitlab-housekeeper](https://gitlab.com/gitlab-org/gitlab/-/tree/master/gems/gitlab-housekeeper)
    using the Keeps::AddShardingKeyTrackingIssues keep.
    
    To provide feedback on your experience with `gitlab-housekeeper` please comment in
    <https://gitlab.com/gitlab-org/gitlab/-/issues/442003>.
    
    Changelog: other