When QA is failing, we get notified too late
Unless a release manager is actively monitoring pipelines (which we want to move away from anyway), we do not immediately notice failing QA. Failing tests are retried, so QA jobs take longer, and on top of that we've configured our pipelines to retry QA 3 times. Slower tests plus pipeline retries can turn a 20-minute QA pipeline into a situation where we don't know anything is wrong until we receive a QA failure notification multiple hours later.
Consider adding a notification that alerts us to the first QA failure, so the release manager can get involved early, watch subsequent failures, and engage Quality for help as needed. This should reduce the time it takes us to discover failures, which in turn should reduce the time it takes to resolve them.
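
To make the proposal concrete, here is a minimal sketch of the "alert on first failure" logic in Python. It is an illustration only, not how our pipelines are wired today: `QAJobResult`, `FirstFailureNotifier`, and the `send` callback are hypothetical names, and the way job results reach this code (webhook, polling, or something else) is left open.

```python
from dataclasses import dataclass
from typing import Callable, Set


@dataclass(frozen=True)
class QAJobResult:
    """Hypothetical record of one QA job attempt."""
    pipeline_id: int
    attempt: int   # 1 for the first run, 2 and up for retries
    passed: bool


class FirstFailureNotifier:
    """Sends one alert per pipeline, on the first failed QA attempt seen."""

    def __init__(self, send: Callable[[str], None]) -> None:
        # `send` is an assumed callback, e.g. something that posts to chat.
        self._send = send
        self._alerted: Set[int] = set()

    def handle(self, result: QAJobResult) -> bool:
        """Return True if an alert was sent for this result."""
        if result.passed or result.pipeline_id in self._alerted:
            return False
        self._alerted.add(result.pipeline_id)
        self._send(
            f"QA failed on attempt {result.attempt} of pipeline "
            f"{result.pipeline_id}; retries may follow."
        )
        return True
```

Usage would look roughly like `notifier = FirstFailureNotifier(print)` followed by `notifier.handle(QAJobResult(pipeline_id=1, attempt=1, passed=False))`. Later failed attempts for the same pipeline stay silent, so the release manager gets one early signal instead of one message per retry.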