Skip to content

Avoid idling in transaction when fetching source for merge requests

What does this MR do and why?

Previously when a merge request was created, the after_create Rails callback would fire the FetchSourceBranch Gitaly RPC (via ensure_merge_request_diff). However, this call is made inside a database transaction, which may have an idle in transaction timeout.

On GitLab.com, this timeout is set to 30 seconds, so any FetchSourceBranch calls taking longer than 30 seconds causes the database connection to be terminated quietly. When Rails attempts to make another database write (e.g. writing the metrics), the query fails with a 500 error after the RPC completed, usually because a subsequent query ran into an error.

Instead of failing after the idle in transaction, this commit will cause the query to be killed if the total Web request time expires (60 seconds or so). While this isn't ideal, this is better than the alternative because we should never hold open a database transaction for a long Gitaly call. Gitaly team has a few optimizations for FetchSourceBranch in the works, but allowing this RPC to run for up to 60 seconds during a Web request is better than the status quo.

Relates to #336657 (closed)

How to set up and validate locally

  1. Patch Gitaly to delay FetchSourceBranch by 35 seconds:
diff --git a/internal/gitaly/service/repository/fetch.go b/internal/gitaly/service/repository/fetch.go
index 6cb933fd0..6870d1f02 100644
--- a/internal/gitaly/service/repository/fetch.go
+++ b/internal/gitaly/service/repository/fetch.go
@@ -3,6 +3,7 @@ package repository
 import (
 	"context"
 	"errors"
+	"time"
 
 	"github.com/grpc-ecosystem/go-grpc-middleware/logging/logrus/ctxlogrus"
 	"gitlab.com/gitlab-org/gitaly/v14/internal/git"
@@ -21,6 +22,8 @@ func (s *server) FetchSourceBranch(ctx context.Context, req *gitalypb.FetchSourc
 		return nil, helper.ErrInvalidArgument(err)
 	}
 
+	time.Sleep(35 * time.Second)
+
 	targetRepo := s.localrepo(req.GetRepository())
 
 	sourceRepo, err := remoterepo.New(ctx, req.GetSourceRepository(), s.conns)
  1. Configure PostgreSQL idle_in_transaction_session_timeout to be 30000 (30 seconds) (https://www.postgresql.org/docs/12/runtime-config-client.html)
  2. Enable the merge_request_eager_fetch_ref feature flag:
    Feature.enable(:merge_request_eager_fetch_ref)
  3. Create a merge request for one project.

Without the feature flag, you get a 500 error (PG::UnableToSend: no connection to the server) since the next write fails.

With the feature flag, the merge request completes.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Stan Hu

Merge request reports