
Created Aug 03, 2020 by Andrew Newdigate (@andrewn), Maintainer

Add client GRPC logging for Gitaly Ruby calls

At present, there are several hosts in the GitLab.com Gitaly fleet with very bad SLIs, particularly for gitalyruby access.

An example of this type of issue is infrastructure#10953 (comment 389606586)

One of the big problems I'm finding in diagnosing this issue is that, from a metrics point of view, gitalyruby is fairly opaque.

We do have GRPC client metrics for communicating with gitalyruby, but not much more.

In investigating issues such as infrastructure#10953 (comment 389606586), it would be really helpful to be able to know if the error rates come from a single Gitalyruby process or all of the processes simultaneously. At present, it's not possible to know.

In order to investigate further, we need one of the following:

  1. Distributed Tracing enabled on GitLab.com: &210
    1. Gitalyruby is already instrumented for Distributed Tracing. This would allow us to understand which processes are affected
    2. I'm unsure of when this will be delivered
  2. Gitalyruby Request Logging
    1. Optionally configure a GRPC client logger in Go that writes logs alongside the main Gitaly access logs
    2. Downside: more logging
    3. Upside: easy to do
  3. Additional logging metrics
    1. Unfortunately, with 50+ Gitaly servers, we would be hard-pressed to increase the cardinality of these metrics

Proposal

I propose we implement option 2: add the ability to enable client-side GRPC request logging in Gitaly for calls to Gitalyruby:

Add a client-side GRPC interceptor to GRPC calls from gitaly-go to gitaly-ruby.

We should make sure that the logs include the child process ID and the correlation ID, and that they set gitaly-ruby as the type.

Edited Oct 08, 2020 by Rachel Nienaber