Drop CI Trigger Requests

The Problem

Ci::TriggerRequest has used about 70% of its available primary key values. If the int column is exhausted, inserts into the table will be blocked.
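
To put the 70% figure in context, here is a rough headroom calculation. It assumes the id column is a 4-byte signed PostgreSQL integer (maximum 2,147,483,647). The daily insert rate is a made-up placeholder, not a measured value, so treat the result as an illustration only.

```ruby
# Rough headroom estimate for an integer primary key.
# Assumption: `id` is a 4-byte signed PostgreSQL integer.
INT4_MAX = 2**31 - 1 # 2_147_483_647

# Returns how many ids remain once `used_percent` of the key space is consumed.
# Integer arithmetic is used throughout to avoid floating-point rounding.
def remaining_ids(used_percent, max_id: INT4_MAX)
  max_id - (max_id * used_percent / 100)
end

# Returns the approximate number of whole days until the column is exhausted.
# `inserts_per_day` is hypothetical; substitute the real insert rate.
def days_until_exhaustion(used_percent, inserts_per_day)
  remaining_ids(used_percent) / inserts_per_day
end

puts "ids remaining at 70% usage: #{remaining_ids(70)}"
puts "days left at 1_000_000 inserts/day: #{days_until_exhaustion(70, 1_000_000)}"
```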

A Proposal

The model has served no functional purpose since its variables were removed. We should link Ci::Trigger directly to the Ci::Pipeline record and drop the whole model and table from the application.

From &15619 (comment 2188441602):

@drew: I think we should consider dropping this model from the application.

Here's the information it owns:
CREATE TABLE ci_trigger_requests (
    id bigint NOT NULL,
    trigger_id bigint NOT NULL,
    variables text,
    created_at timestamp without time zone,
    updated_at timestamp without time zone,
    commit_id bigint,
    project_id bigint
);
The only piece of information in this table that isn't also stored on the Pipeline is variables, and we explicitly do not store variables in this model anymore:
    # We switched to Ci::PipelineVariable from Ci::TriggerRequest.variables.
    # Ci::TriggerRequest doesn't save variables anymore.
    validates :variables, absence: true
@mbobin @avielle Do either of you know of a justification for keeping this model around? It's linked directly to ci_builds instead of ci_pipelines for a reason that is unknown to me.

Both of these proposals seem totally viable to me:

1. Drop the model, link Ci::Trigger directly to Ci::Build.

2. More aggressively, link Ci::Trigger to Ci::Pipeline, drop Ci::TriggerRequest AND the trigger_request_id FK from Ci::Build.

@mbobin: I'd go with the second option to reduce the width of the builds table. All the jobs from the pipeline inherit the trigger_request relation, even the retries, so we'll store less data if we link the trigger to the pipeline.

@avielle: I don't have a reason to keep it around 💡 Then again, I'm not sure I've ever seen it before 😅

@tianwenchen: Yeah, I can't see why we shouldn't do that. It looks like it's a one-to-one relationship between trigger and pipeline. The prod data shows one exception, but I think it's an invalid record.
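
The one-to-one observation above could be checked with something like the following sketch. It works on in-memory rows shaped like ci_trigger_requests and uses made-up sample data. It also assumes that commit_id holds the id of the pipeline the request created, and it guesses at one possible form the exception might take (a pipeline referenced by two different triggers). The helper names are hypothetical and are not part of the GitLab codebase.

```ruby
# Sketch: find pipelines referenced by more than one trigger.
# Rows mimic ci_trigger_requests; the values are made-up sample data.
# Assumption: commit_id holds the id of the pipeline the request created.
TriggerRequestRow = Struct.new(:id, :trigger_id, :commit_id)

# Returns { pipeline_id => [trigger_id, ...] } for every pipeline that maps
# to more than one distinct trigger. Rows without a pipeline are skipped.
def conflicting_pipelines(rows)
  rows
    .reject { |row| row.commit_id.nil? }
    .group_by(&:commit_id)
    .transform_values { |group| group.map(&:trigger_id).uniq }
    .select { |_pipeline_id, trigger_ids| trigger_ids.size > 1 }
end

# Returns { pipeline_id => trigger_id } for the unambiguous pipelines,
# i.e. the values a backfill could copy onto the pipeline record.
def pipeline_trigger_map(rows)
  conflicts = conflicting_pipelines(rows)
  rows
    .reject { |row| row.commit_id.nil? || conflicts.key?(row.commit_id) }
    .to_h { |row| [row.commit_id, row.trigger_id] }
end

rows = [
  TriggerRequestRow.new(1, 10, 100),
  TriggerRequestRow.new(2, 10, 101),
  TriggerRequestRow.new(3, 11, 102),
  TriggerRequestRow.new(4, 12, 102), # a conflict to inspect by hand
  TriggerRequestRow.new(5, 13, nil)  # no pipeline, so nothing to link
]

p conflicting_pipelines(rows)
p pipeline_trigger_map(rows)
```

Any pipeline returned by conflicting_pipelines would need manual review before a direct trigger-to-pipeline link is backfilled. The rest can be mapped mechanically.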

For the record, I've listed the child items in delivery order. There's theoretically some work in the removals near the end that can be done in parallel, because nothing will be operational at that point and we'll just be dropping unused data from the database. But the GitLab.com Resource Saturation items at the top are the urgent pieces that need to be scheduled first, and we can worry about scheduling the rest after those are closed out.

Edited Dec 06, 2024 by drew stachon