Reinstate custom CA support after fixing production incident

Summary

We need to reinstate the custom Certificate Authority (CA) support that was implemented in !134 (merged) but had to be reverted due to production incident gitlab-com/gl-infra/production#20741 (closed).

Background

The original feature was requested in #23 (closed) to support setups where the Gitaly server is configured with TLS and is served from a hostname whose certificate is signed by a custom certificate authority.

The implementation went through several iterations:

Production Incident Details

The incident occurred after the chart deployment pipeline (started ~15:07 UTC, finished ~16:05 UTC) and manifested as TLS certificate verification failures:

Post "https://internal-gateway.gprd.gitlab.net:11443/api/v4/internal/search/zoekt/a7b9bcb1-53fe-404b-aa97-2bdb0c0df13b/heartbeat": tls: failed to verify certificate: x509: certificate signed by unknown authority

Key observations:

  • Errors started appearing around 15:15 UTC in Zoekt indexer logs
  • The failure indicates the signing CA was not recognized for the internal gateway service
  • The necessary CA certificates were apparently not populated correctly
  • This suggests the custom CA implementation in MR !134 (merged) may have interfered with the default CA bundle or certificate handling

Root Cause Analysis Needed

Before reinstating, we need to:

  1. Analyze why the custom CA implementation broke internal certificate verification
    • Did it override the default CA bundle?
    • Did it prevent system CAs from being loaded?
    • Was there a volume mount conflict?
  2. Identify the specific code in MR !134 (merged) that caused the internal gateway certificate verification to fail
  3. Develop a fix that:
    • Adds custom CAs without replacing the system CA bundle
    • Ensures internal GitLab.com certificates continue to work
    • Maintains backward compatibility
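
One possible shape for the "add, don't replace" requirement in step 3 is sketched below. This is not the code from MR !134 and is not a confirmed fix; the function names `newCertPool` and `newTLSConfig` are hypothetical, and the sketch assumes the custom CA bundles are already available as PEM bytes (for example, read from mounted files). It starts from the system pool and only appends to it, so CAs that were trusted before remain trusted.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
)

// newCertPool returns a copy of the system CA pool with the certificates from
// each PEM bundle in customPEM appended. The name and signature are
// hypothetical; they are not taken from MR !134.
func newCertPool(customPEM ...[]byte) (*x509.CertPool, error) {
	// SystemCertPool returns a copy, so appending to it does not change
	// what other callers in the process see.
	pool, err := x509.SystemCertPool()
	if err != nil {
		return nil, fmt.Errorf("loading system CA pool: %w", err)
	}
	for i, bundle := range customPEM {
		// AppendCertsFromPEM reports false when no certificate could be parsed.
		if !pool.AppendCertsFromPEM(bundle) {
			return nil, fmt.Errorf("custom CA bundle %d: no valid PEM certificates found", i)
		}
	}
	return pool, nil
}

// newTLSConfig builds a client TLS config that trusts both the system CAs
// and the supplied custom CAs.
func newTLSConfig(customPEM ...[]byte) (*tls.Config, error) {
	pool, err := newCertPool(customPEM...)
	if err != nil {
		return nil, err
	}
	return &tls.Config{RootCAs: pool, MinVersion: tls.VersionTLS12}, nil
}
```

With no custom bundles, `newTLSConfig()` yields a config backed only by the system pool, which is how backward compatibility could be kept when the feature is not configured. Failing loudly on an unparsable bundle is a design choice of this sketch, not a requirement from the issue.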

Acceptance Criteria

  • Root cause of the certificate verification failure is identified and documented
  • A solution is developed that adds custom CAs while preserving system CAs
  • Custom CA support is reinstated with the fix applied
  • Testing is performed to ensure:
    • Custom CA functionality works as intended
    • Internal GitLab.com certificate verification continues to work
    • No regression in existing certificate handling
  • The feature works as described in the original issue #23 (closed)
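
The testing criteria above could be exercised with a check along the following lines. It is a sketch under stated assumptions: two throwaway self-signed CAs stand in for the internal GitLab.com CA and a customer's custom CA, an in-memory pool stands in for the system bundle, and all names (`newTestCA`, `verify`, `checkPools`) are hypothetical. It also shows that a pool built only from the custom CA fails with the same kind of unknown-authority error seen in the incident, which is one candidate failure mode, not an established root cause.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// newTestCA creates a throwaway self-signed CA and returns it parsed and PEM-encoded.
func newTestCA(name string) (*x509.Certificate, []byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: name},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	if err != nil {
		return nil, nil, err
	}
	return cert, pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}

// verify checks whether cert chains to a root in pool.
func verify(pool *x509.CertPool, cert *x509.Certificate) error {
	_, err := cert.Verify(x509.VerifyOptions{Roots: pool})
	return err
}

// checkPools compares an additive pool with a replaced pool.
func checkPools() error {
	internalCA, _, err := newTestCA("stand-in internal CA")
	if err != nil {
		return err
	}
	customCA, customPEM, err := newTestCA("stand-in custom CA")
	if err != nil {
		return err
	}
	base := x509.NewCertPool() // stand-in for the system bundle
	base.AddCert(internalCA)

	additive := base.Clone() // base CAs plus the custom CA
	replaced := x509.NewCertPool() // custom CA only
	if !additive.AppendCertsFromPEM(customPEM) || !replaced.AppendCertsFromPEM(customPEM) {
		return errors.New("custom CA PEM could not be parsed")
	}
	// The additive pool must trust both the pre-existing and the custom CA.
	for _, ca := range []*x509.Certificate{internalCA, customCA} {
		if err := verify(additive, ca); err != nil {
			return fmt.Errorf("additive pool rejected %q: %w", ca.Subject.CommonName, err)
		}
	}
	// The replaced pool no longer trusts the pre-existing CA.
	var unknown x509.UnknownAuthorityError
	if err := verify(replaced, internalCA); !errors.As(err, &unknown) {
		return fmt.Errorf("replaced pool: want unknown-authority error, got %v", err)
	}
	return nil
}
```

`checkPools()` returns nil when the additive pool accepts both CAs and the replaced pool rejects the pre-existing one. A real regression test would additionally need to cover the actual system bundle and the deployed volume mounts, which this in-memory check does not touch.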

Customer Impact

This feature is needed by customers using:

  • Self-signed certificates for Gitaly servers
  • Custom certificate authorities in their infrastructure
  • TLS-enabled Gitaly setups behind custom-signed hostnames

Without this feature, customers cannot use Zoekt with Gitaly servers whose TLS certificates are signed by a custom or self-signed CA.

Related Issues/MRs

Edited by Dmitry Gruzd