`qa-master` notified failure due to timeout but `gitlab-qa` pipeline passed

`qa-master` received a failure notification even though the `gitlab-qa` pipeline actually passed. This happened because the `schedule:package-and-qa` job timed out, which could be due to either the `omnibus-gitlab-mirror` or the `gitlab-qa` pipeline running longer than expected.

The easiest quick fix is to increase the timeout for the `schedule:package-and-qa` job, based on the average duration of successful jobs that we have in Periscope, but that would add to the overall pipeline duration.
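
As a rough illustration of deriving a timeout from the average duration, here is a minimal Python sketch. The function name `suggest_timeout_minutes`, the `headroom` factor, and the sample durations are all hypothetical; the real numbers would have to come from the successful-job data in Periscope.

```python
import math
from statistics import mean


def suggest_timeout_minutes(durations_s, headroom=1.25):
    """Suggest a job timeout in whole minutes.

    durations_s: durations (in seconds) of past successful jobs.
    headroom: hypothetical safety factor applied on top of the average.
    """
    if not durations_s:
        raise ValueError("need at least one successful job duration")
    # Average duration, scaled by the headroom, rounded up to a full minute.
    return math.ceil(mean(durations_s) * headroom / 60)


# Made-up sample: three successful runs of 50, 60 and 70 minutes.
print(suggest_timeout_minutes([3000, 3600, 4200]))  # prints 75
```

A larger `headroom` makes spurious timeout failures less likely, at the cost of a longer wait when the job genuinely hangs.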

If we keep the same timeout, I'm not sure how we can determine whether the downstream pipelines have completed and passed. We could make the notify job poll the downstream pipelines, but the result would still be uncertain because the notify job can run before the downstream pipelines have completed.
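
To make the polling idea concrete, here is a minimal Python sketch under stated assumptions. `wait_for_pipelines` and `all_passed` are hypothetical helpers, `fetch_status` is a caller-supplied function that returns a pipeline's status string (for example by querying the GitLab pipelines API), and the set of terminal statuses is assumed rather than taken from this issue. Polling does not remove the uncertainty described above: a pipeline that has not finished by the deadline is reported as `"timeout"` instead of passed or failed.

```python
import time

# Assumed set of statuses after which a pipeline no longer changes.
TERMINAL_STATUSES = {"success", "failed", "canceled", "skipped"}


def wait_for_pipelines(pipeline_ids, fetch_status, timeout_s=3600.0,
                       interval_s=30.0, sleep=time.sleep, clock=time.monotonic):
    """Poll downstream pipelines until each finishes or the deadline passes.

    fetch_status(pipeline_id) must return the current status string.
    Returns {pipeline_id: final status, or "timeout" if still unfinished}.
    """
    pending = set(pipeline_ids)
    results = {}
    deadline = clock() + timeout_s
    while pending:
        for pid in sorted(pending):
            status = fetch_status(pid)
            if status in TERMINAL_STATUSES:
                results[pid] = status
        # Drop every pipeline that reached a terminal status in this round.
        pending.difference_update(results)
        if not pending:
            break
        if clock() >= deadline:
            # Still unfinished: record that explicitly rather than guessing.
            for pid in pending:
                results[pid] = "timeout"
            break
        sleep(interval_s)
    return results


def all_passed(results):
    """True only if at least one pipeline was checked and all succeeded."""
    return bool(results) and all(s == "success" for s in results.values())
```

The notify job could then report a failure only when `all_passed` is false, and treat `"timeout"` as its own case so that a slow but passing pipeline is not announced as broken.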

/cc @gl-quality/eng-prod thoughts?

Reference from Slack:

Todo:

- Pass through `TOP_UPSTREAM_SOURCE_REF` to `omnibus-gitlab-mirror`: gitlab-org/gitlab!22263 (merged)
- Pass through `TOP_UPSTREAM_SOURCE_REF` to `gitlab-qa` and notify Slack on `omnibus-gitlab-mirror` failure: gitlab-org/omnibus-gitlab!3823 (merged)
- Notify Slack on `gitlab-qa` failure: gitlab-org/gitlab-qa!361 (merged)
- Clean up notification code in gitlab-org/gitlab: gitlab-org/gitlab!22508 (merged)

Edited Jan 07, 2020 by Albert Salim