Fix job duration counting

Tomasz Maczukin requested to merge fix-job-duration-counting into master

What does this MR do?

Fixes how Runner tracks the duration of a job.

Why was this MR needed?

We recently added a mechanism for showing a job's duration in the debug list (!992 (merged)) and a histogram metric that tracks job durations (!1025 (merged)). However, there is a problem with the duration value. It is calculated as the difference between the current time and the job's startedAt, which is set only after execution of the job has started. While this does show how long the job definition has been executing, it doesn't cover the whole execution. For example, when the Docker executor is used, Runner needs to pull images, spin up the containers, and join them together, which may take significant time for big images. That time is not tracked by the new metric. From Runner's perspective, the duration we're interested in is the total time spent handling the job: from the moment it was received until it succeeded or failed.

Another problem is that before startedAt is set, its value is zero, so duration shows values like duration=2562047h47m16.854775807s, which is annoying when using the /debug/jobs/list?v=2 URL. And if a job fails in the preparation stage (e.g. the Docker image chosen for the job is not available), that duration is added to the histogram, which can already be seen on the graphs:

Screenshot_2018-09-26_Grafana_-_CI

This MR fixes this problem. It renames startedAt to createdAt, which is the value we're actually interested in, and moves its initialization to the moment the Build is created. As a result, the duration is tracked properly from the very beginning of the job's lifetime.

Are there points in the code the reviewer needs to double check?

Does this MR meet the acceptance criteria?

  • Documentation created/updated
  • Added tests for this feature/bug
  • In case of conflicts with master - branch was rebased

What are the relevant issue numbers?