Don't record request duration for failing requests

Coming from gitlab-org/gitlab!62091 (comment 580655929) and the discussion in the team call:

We currently use two different metrics for measuring request apdex, which have slightly different implementations:

Gitlab::Metrics::RequestMiddleware: Used for SLIs

This implementation tracks the duration of all requests that did not raise, so it includes "manually" rendered 5xx errors.

The scoring for service availability looks like this:

	Fast	Slow
Success	2/2	1/2
Handled error	1/2	0/2
Unhandled server error	0/1	0/1

Gitlab::Metrics::Transaction: Used for stage group error budgets.

This implementation tracks the duration of all requests, regardless of the status.

The scoring for the error budget looks like this:

	Fast	Slow
Success	2/2	1/2
Error	1/2	0/2
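The two scoring tables above can be reproduced with a small sketch. It assumes each cell reads as good measurements over total measurements for one request: one error-rate measurement per request, plus one apdex measurement whenever a duration is recorded. That reading is inferred from the tables rather than stated in this issue, and the method names and the 1-second threshold are hypothetical, not taken from the GitLab codebase.

```ruby
# Hypothetical sketch: the names and the 1-second threshold are illustrative
# assumptions, not taken from the GitLab codebase.
APDEX_THRESHOLD_SECONDS = 1.0

# Each component is a [good, total] pair of measurements for one request.
def error_component(status)
  [status >= 500 ? 0 : 1, 1]
end

def apdex_component(duration)
  [duration <= APDEX_THRESHOLD_SECONDS ? 1 : 0, 1]
end

# Sums the components that were actually recorded (nil means "not measured").
def combine(*components)
  components.compact.transpose.map(&:sum)
end

# Mirrors the service availability table: no duration is recorded when the
# request raised, so only the error component counts in that case.
def availability_score(status:, duration:, raised: false)
  apdex = raised ? nil : apdex_component(duration)
  combine(error_component(status), apdex)
end

# Mirrors the error budget table: the duration is recorded for every request.
def error_budget_score(status:, duration:)
  combine(error_component(status), apdex_component(duration))
end
```

Under these assumptions, a fast "manually" rendered 500 scores `[1, 2]` in both functions, while a 500 that raised scores `[0, 1]` for availability only.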

Proposal

In the short term (without changing these metrics), we do not want to take an apdex measurement for failing requests, where a failing request is anything resulting in a 5xx status code. This definition of failing matches the current situation. If we also stopped measuring durations for 4xx requests in this iteration, we'd be taking a bunch of very fast requests out of the apdex. This could trigger alerts, so we'd need to tread carefully. A better way to do this would be to introduce new metrics and switch our SLIs over to those (#1099 (closed)).
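To make the concern about 4xx requests concrete, here is a minimal sketch that computes an apdex ratio over a made-up sample under different exclusion rules. The sample data, the 1-second threshold, and the `apdex` helper are all illustrative assumptions, not real traffic or GitLab code.

```ruby
# Illustrative sample only: [status, duration_in_seconds] pairs, not real traffic.
SAMPLE_REQUESTS = [
  [200, 0.3], [200, 1.8], [200, 0.4],
  [404, 0.05], [404, 0.04], [401, 0.03],
  [500, 2.5], [503, 0.1]
].freeze

# Hypothetical helper: fraction of measured requests at or under the threshold.
def apdex(requests, threshold)
  return nil if requests.empty?

  fast = requests.count { |_status, duration| duration <= threshold }
  fast.fdiv(requests.size)
end

threshold = 1.0 # assumed threshold in seconds
measure_all = apdex(SAMPLE_REQUESTS, threshold)
skip_5xx = apdex(SAMPLE_REQUESTS.reject { |status, _| status >= 500 }, threshold)
skip_4xx_and_5xx = apdex(SAMPLE_REQUESTS.reject { |status, _| status >= 400 }, threshold)

puts measure_all.round(2)      # 0.75
puts skip_5xx.round(2)         # 0.83
puts skip_4xx_and_5xx.round(2) # 0.67
```

In this sample, skipping only 5xx requests leaves the fast 4xx responses in the measurement, while also skipping 4xx removes them and pulls the ratio down, which is the kind of shift that could trigger alerts.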

So for error budgets and availability, we want the following scoring:

	Fast	Slow
Success	2/2	1/2
Error	0/1	0/1
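The proposed scoring can be sketched as a single function. As an assumption, each cell is read as good measurements over total measurements for one request: a failing (5xx) request contributes only a failed error-rate measurement and no apdex measurement. The method name and the 1-second threshold are hypothetical.

```ruby
# Hypothetical sketch: the name and the 1-second threshold are illustrative
# assumptions, not taken from the GitLab codebase.
APDEX_THRESHOLD_SECONDS = 1.0

# Returns a [good, total] pair for a single request under the proposed scoring.
def proposed_score(status:, duration:)
  # Failing request (5xx): one failed error-rate measurement, no apdex measurement.
  return [0, 1] if status >= 500

  # Any other request: one good error-rate measurement plus one apdex measurement.
  fast = duration <= APDEX_THRESHOLD_SECONDS
  [1 + (fast ? 1 : 0), 2]
end
```

With this sketch, a 4xx response is still scored like a success, so its duration keeps counting toward the apdex in this iteration.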

Edited May 20, 2021 by Bob Van Landuyt