Skip to content

Log duration of ProcessBookkeepingService#execute

Dylan Griffith requested to merge log-duration-in-bulk-indexer into master

What does this MR do?

At present there is no easy way to correlate the time taken to the number of records being processed. We can see log statements when it starts, when it ends as well as the corresponding sidekiq logs but the logs containing the counts don't contain the duration so it would be really tricky to use correlation_id to match things up and figure out the duration.

This will make it easy to plot the bulk_execution_duration vs. records_count in Kibana and therefore allow us to make some predictions about whether or not we can scale up the records_count.

This related to a previous conversation in !28511 (comment 323151440) .

We'd probably like to add even more metadata to these log statements so we can understand DB time, CPU time etc but I wanted to make the smallest change first.

That being said I would love to know if there is a better way of orchestrating this method so that I can get a log statement with all the metadata I want. Particularly I just want a single log with the records_count then all other instrumentation diagnostics we get from normal sidekiq logs (like db_duration and others). Perhaps there may be some way to do this if my class can somehow call out and inject extra metadata into the Gitlab::SidekiqLogging::StructuredLogger but I couldn't find any way to do that looking at the code.

Screenshots

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team

Merge request reports