Improve `PipelineProcessService`

changed milestone to %12.2

added database group::cloud connector + 1 deleted label

assigned to @ayufan

added bug::performance maintenance::refactor labels

cc @fabiopitino @erushton @dosuken123 @stanhu

changed the description

If you need an example .gitlab-ci.yml that exhibits bad behavior, see https://gitlab.com/gitlab-org/gitlab-ce/issues/65414#note_198538890.

added devops::systems label

added Deliverable label

removed Deliverable label

PipelineProcessWorker:
  - take an exclusive lease on the given pipeline and skip if processing is already running => this is our de-duplication
  - execute PipelineProcessService:
    - -1. record Time.now
    - 0. read ci_builds.processing + ci_builds.lock_version ->
    - 1. is this a complete status? => process the pipeline from the stages following these builds, plus everything that `needs` them for DAG
    - 2. is this an incomplete status? => it is not relevant
    - 3. update all stages affected by processing,
    - 4. update the pipeline,
    - 5. set `processing=false` with matching `lock_version` on the ones received in 0.
  - if there's any new processing, schedule itself
---
after_transition do
  PipelineProcessWorker.perform_in(500.milliseconds)
end
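
To make steps 0 to 5 above more concrete, here is a rough sketch in plain Ruby. It is only an illustration under assumptions: `Build`, `PipelineProcessSketch`, `touch`, and the block hook in `execute` are hypothetical in-memory stand-ins, not GitLab's actual models or service, and steps 3 and 4 (updating stages and the pipeline) are left out. The point it tries to show is step 5: the `processing` flag is cleared only when `lock_version` still matches the value read in step 0, so a build that changed mid-run gets picked up by the next run.

```ruby
# Hypothetical in-memory stand-in for a ci_builds row.
Build = Struct.new(:id, :stage_idx, :status, :processing, :lock_version,
                   keyword_init: true)

COMPLETED_STATUSES = %w[success failed canceled skipped].freeze

class PipelineProcessSketch
  attr_reader :started_stages

  def initialize(builds)
    @builds = builds
    @started_stages = []
  end

  # Returns true when some build still needs processing afterwards,
  # i.e. the worker should schedule itself again.
  # The optional block is a test hook that simulates a concurrent update.
  def execute
    # Step 0: snapshot builds flagged for processing with their lock_version.
    snapshot = @builds.select(&:processing).map { |b| [b.id, b.lock_version] }

    snapshot.each do |id, _version|
      build = find(id)
      # Steps 1-2: only completed builds unlock the following stage.
      next unless COMPLETED_STATUSES.include?(build.status)

      start_stage(build.stage_idx + 1)
    end

    yield if block_given?

    # Step 5: clear the flag only where lock_version is unchanged.
    snapshot.each do |id, version|
      build = find(id)
      build.processing = false if build.lock_version == version
    end

    @builds.any?(&:processing)
  end

  # Simulates another process changing a build's status.
  def touch(id, status)
    build = find(id)
    build.status = status
    build.processing = true
    build.lock_version += 1
  end

  private

  def find(id)
    @builds.find { |b| b.id == id }
  end

  # Moves 'created' builds of the given stage to 'pending'.
  def start_stage(stage_idx)
    created = @builds.select { |b| b.stage_idx == stage_idx && b.status == 'created' }
    return if created.empty?

    created.each { |b| b.status = 'pending' }
    @started_stages << stage_idx
  end
end
```

If a build is touched while `execute` is running, its `lock_version` no longer matches the snapshot, so `execute` returns true and a follow-up run handles it.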

changed milestone to %12.3

removed milestone

changed milestone to %12.3

added Background Processing label

Note that I think Sidekiq Enterprise's Unique Jobs probably works similarly to checking for an exclusive lease: https://github.com/mperham/sidekiq/wiki/Ent-Unique-Jobs
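
For reference, the lease-based de-duplication could look roughly like the sketch below. This is an assumption-laden illustration, not the real implementation: `LeaseSketch` and `process_pipeline` are hypothetical names, the lease lives in a plain Hash in a single process (a real one would need a shared store and atomic acquisition), and the 60 second timeout is arbitrary.

```ruby
# Hypothetical in-memory lease; not thread-safe, for illustration only.
class LeaseSketch
  def initialize(clock: -> { Time.now.to_f })
    @clock = clock
    @expires_at = {}
  end

  # Returns true if the lease was obtained, false if it is still held.
  def try_obtain(key, ttl)
    now = @clock.call
    return false if @expires_at.key?(key) && @expires_at[key] > now

    @expires_at[key] = now + ttl
    true
  end

  def release(key)
    @expires_at.delete(key)
  end
end

# Processes a pipeline unless another run already holds its lease.
# Appends [:processed, id] or [:skipped, id] to log; returns true if it ran.
def process_pipeline(lease, pipeline_id, log)
  key = "pipeline_process:#{pipeline_id}"

  unless lease.try_obtain(key, 60)
    log << [:skipped, pipeline_id]
    return false
  end

  begin
    yield if block_given? # the actual processing would happen here
    log << [:processed, pipeline_id]
    true
  ensure
    lease.release(key)
  end
end
```

A second call for the same pipeline made while the first is still inside its block is skipped; once the lease is released, the next call runs normally.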

The current plan for this work is:

1. Finish #15144 (closed) via https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/31783,
2. Implement ci_builds.processing,
3. Implement workflow defined in: https://gitlab.com/gitlab-org/gitlab-ce/issues/65538#note_204508591, to close #16170 (closed),
4. Close this issue, as steps 1-3 will resolve it.

We plan to finish that in %12.3, but this is a very big overhaul of the pipeline processing, so it is being done in steps.

@craig-gomes I think that we should move this to the next release, as I am now focused on getting this finished: https://gitlab.com/gitlab-org/gitlab-ee/issues/15144

changed milestone to %12.4

changed milestone to %12.3

changed milestone to %12.4

Large customer in https://gitlab.zendesk.com/agent/tickets/131785 has shared the output of:

SELECT *
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 100;

As described by the customer:

The query you specified does happen quite a few times, but there are others in the list as well. Not sure if the other queries also need to be further optimised.

This customer has reported the issue since upgrading to 12.1.

Another customer also reported issues with this and they are on 11.11: https://gitlab.zendesk.com/agent/tickets/130832 (internal use only)

The customer in Zendesk ticket 130832 has attached the pg_stat_statements output as well.

In their particular case, they have a pipeline with 1000+ jobs running every day. They are observing strange behavior where the status of the pipeline is 15 to 60 seconds behind: the job is already running, but the status of the pipeline/job still shows as pending in the Web UI.

added customer label

added workflow::in dev label

added Deliverable label

mentioned in merge request !16808 (merged)

marked this issue as related to #16170 (closed)
