Skip to content

fast forward merge requests: database transaction timeout in update_merge_sha_metadata causes merge request to get stuck

Summary

Customer reported that a merge request had failed to close, although the change did get merged to the target branch master. This happened on GitLab.com

  • GitLab team members can read more in the ticket

  • Reviewing their MR, we could see the merge commit generated by the merge operation ping back to the system notes on the MR

  • No message #person merged #time ago is logged

  • The merge request shows as open and with no changes or commits

    • this looks to be in part because the developer may have rebased their branch and pushed it
    • Additionally the merge request code would see its changes on the target branch 🤷
  • We think a pipeline should have triggered on the merge request, but either that wasn't triggered or Rails flagged the whole pipeline as skipped. There is a skipped pipeline on the MR, but it's for the wrong commit, so we suspect the code didn't get as far as running the post-merge pipeline.

  • No error messages in Gitaly logs; 100% of the events for this correlation ID are level: info

  • We found a failure in sidekiq MergeWorker JID-44d9064962bd9fa5d8845868: fail: 16.600004 sec (stack trace below)

    json.exception.message: PG::QueryCanceled: ERROR:  canceling statement due to statement timeout
    json.db_duration_s: 15.264
    json.db_main_duration_s: 15
    json.db_ci_duration_s: 0.003
    json.db_primary_duration_s: 15.271

The last call in the stack trace prior to the lib/gitlab/database calls was in update_merge_sha_metadata at merge_request.update!(**data_to_update) if data_to_update.present?

This code changed quite recently in: !97871 (merged)

    def update_merge_sha_metadata(commit_id)
      data_to_update = merge_success_data(commit_id)
      data_to_update[:squash_commit_sha] = source if merge_request.squash_on_merge?

      merge_request.update!(**data_to_update) if data_to_update.present?
    end

    def merge_success_data(commit_id)
      { merge_commit_sha: commit_id }
    end

Steps to reproduce

Cause the database call in merge_request.update to fail

Example Project

What is the current bug behavior?

a database error updating the merge request isn't handled and will cause the merge worker to terminate. the MR is then left in the wrong state.

What is the expected correct behavior?

  • roll back?
  • retry?
  • merge request ends in a recoverable state?

Relevant logs and/or screenshots

lib/gitlab/database/load_balancing/connection_proxy.rb:120:in `block in write_using_load_balancer'
 lib/gitlab/database/load_balancing/load_balancer.rb:115:in `block in read_write'
 lib/gitlab/database/load_balancing/load_balancer.rb:191:in `retry_with_backoff'
 lib/gitlab/database/load_balancing/load_balancer.rb:111:in `read_write'
 lib/gitlab/database/load_balancing/connection_proxy.rb:119:in `write_using_load_balancer'
 lib/gitlab/database/load_balancing/connection_proxy.rb:61:in `block (2 levels) in <class:ConnectionProxy>'
 lib/gitlab/database/load_balancing/connection_proxy.rb:120:in `block in write_using_load_balancer'
 lib/gitlab/database/load_balancing/load_balancer.rb:115:in `block in read_write'
 lib/gitlab/database/load_balancing/load_balancer.rb:191:in `retry_with_backoff'
 lib/gitlab/database/load_balancing/load_balancer.rb:111:in `read_write'
 lib/gitlab/database/load_balancing/connection_proxy.rb:119:in `write_using_load_balancer'
 lib/gitlab/database/load_balancing/connection_proxy.rb:71:in `transaction'
 lib/gitlab/database/load_balancing/connection_proxy.rb:120:in `block in write_using_load_balancer'
 lib/gitlab/database/load_balancing/load_balancer.rb:115:in `block in read_write'
 lib/gitlab/database/load_balancing/load_balancer.rb:191:in `retry_with_backoff'
 lib/gitlab/database/load_balancing/load_balancer.rb:111:in `read_write'
 lib/gitlab/database/load_balancing/connection_proxy.rb:119:in `write_using_load_balancer'
 lib/gitlab/database/load_balancing/connection_proxy.rb:71:in `transaction'
 app/services/merge_requests/merge_service.rb:106:in `update_merge_sha_metadata'
 app/services/merge_requests/merge_service.rb:95:in `commit'
 app/services/merge_requests/merge_service.rb:33:in `block in execute'
 app/models/merge_request.rb:1510:in `in_locked_state'
 app/services/merge_requests/merge_service.rb:32:in `execute'
 app/services/merge_requests/merge_service.rb:20:in `execute'
 app/workers/merge_worker.rb:29:in `perform'
 lib/gitlab/database/load_balancing/sidekiq_server_middleware.rb:26:in `call'
 lib/gitlab/sidekiq_middleware/duplicate_jobs/strategies/until_executed.rb:17:in `perform'
 lib/gitlab/sidekiq_middleware/duplicate_jobs/duplicate_job.rb:58:in `perform'
 lib/gitlab/sidekiq_middleware/duplicate_jobs/server.rb:8:in `call'
 lib/gitlab/sidekiq_middleware/worker_context.rb:9:in `wrap_in_optional_context'
 lib/gitlab/sidekiq_middleware/worker_context/server.rb:19:in `block in call'
 lib/gitlab/application_context.rb:113:in `block in use'
 lib/gitlab/application_context.rb:113:in `use'
 lib/gitlab/application_context.rb:54:in `with_context'
 lib/gitlab/sidekiq_middleware/worker_context/server.rb:17:in `call'
 lib/gitlab/sidekiq_status/server_middleware.rb:7:in `call'
 lib/gitlab/sidekiq_versioning/middleware.rb:9:in `call'
 lib/gitlab/sidekiq_middleware/query_analyzer.rb:7:in `block in call'
 lib/gitlab/database/query_analyzer.rb:37:in `within'

Output of checks

This happened on GitLab.com - GitLab Enterprise Edition 15.5.0-pre 0a1d315d

Possible fixes

Edited by Ben Prescott_