Fix X-Gitaly-Correlation-Id propagation to Gitaly gRPC calls

Context

Contributes to Eliminate Git-Related Infrastructure Failures i... (gitlab-org/quality/analytics/team#129 - closed).

Follow-up of Add the gitaly_correlation_id to most log entries (!203648 - merged) and Update Workhorse to log X-Gitaly-Correlation-Id... (!203087 - merged).

Runner sets X-Gitaly-Correlation-Id headers in Git requests to enable tracing of job operations from Runner through to Gitaly. However, Workhorse logged these correlation IDs without passing them on to Gitaly gRPC calls, which broke Git operation traceability.

According to the Git traceability diagram, the correlation IDs that Runner configures Git to send should flow through the entire stack for debugging purposes.

What's in this MR?

This MR enables end-to-end Git operation tracing by passing X-Gitaly-Correlation-Id from HTTP requests to Gitaly gRPC calls:

  1. Extract the correlation ID in Git handlers: Modified handleGetInfoRefs in info-refs.go to extract the X-Gitaly-Correlation-Id header and store it in the request context.
  2. Manual gRPC metadata injection: Modified withOutgoingMetadata in gitaly.go to read the correlation ID from the context and add it directly to the gRPC metadata sent to Gitaly.

The approach bypasses the existing labkit correlation interceptor (which was generating its own correlation IDs) and manually controls which correlation ID gets sent to Gitaly.

Technical Details

Before this change:

  • Workhorse logged: "gitaly_correlation_id":"runner-provided-id"
  • Gitaly logged: "correlation_id":"different-generated-id"

After this change:

  • Workhorse logs: "gitaly_correlation_id":"runner-provided-id"
  • Gitaly logs: "correlation_id":"runner-provided-id" (same ID)
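
The before/after comparison can also be expressed as a small check. The sketch below assumes each log entry is a single JSON object carrying the field names shown above. sameCorrelationID is a hypothetical helper for illustration and is not part of this MR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sameCorrelationID reports whether a Workhorse log line and a Gitaly log
// line carry the same, non-empty correlation ID. Both lines are assumed
// to be JSON objects.
func sameCorrelationID(workhorseLine, gitalyLine string) (bool, error) {
	var wh struct {
		ID string `json:"gitaly_correlation_id"`
	}
	var g struct {
		ID string `json:"correlation_id"`
	}
	if err := json.Unmarshal([]byte(workhorseLine), &wh); err != nil {
		return false, fmt.Errorf("workhorse line: %w", err)
	}
	if err := json.Unmarshal([]byte(gitalyLine), &g); err != nil {
		return false, fmt.Errorf("gitaly line: %w", err)
	}
	return wh.ID != "" && wh.ID == g.ID, nil
}

func main() {
	before, _ := sameCorrelationID(
		`{"gitaly_correlation_id":"runner-provided-id"}`,
		`{"correlation_id":"different-generated-id"}`)
	after, _ := sameCorrelationID(
		`{"gitaly_correlation_id":"runner-provided-id"}`,
		`{"correlation_id":"runner-provided-id"}`)
	fmt.Println(before, after) // prints: false true
}
```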

Steps to reproduce locally

  1. Start GDK with a public project
  2. Test correlation ID propagation:
    CORRELATION_ID="test-$(date +%s)"
    curl -H "X-Gitaly-Correlation-Id: $CORRELATION_ID" \
         "http://localhost:3000/your-project.git/info/refs?service=git-upload-pack" > /dev/null
    
    # Check workhorse logs
    timeout 3 gdk tail gitlab-workhorse | grep "$CORRELATION_ID"
    
    # Check gitaly logs  
    timeout 3 gdk tail gitaly | grep "$CORRELATION_ID"
  3. Both logs should show the same correlation ID

Below is an example on my local machine (screenshot, so that we can see the correlation ID in red):

Screenshot_2025-09-17_at_15.26.41

Edited by David Dieulivol
