Staging alerting and SLA discussion

added staging label

mentioned in issue on-call-handovers#118 (closed)

What team should be DRI for staging?

From the application side of things, this should probably be the Delivery team, in my opinion. Delivery can bring in the necessary engineers to help fix situations caused by an application failure. Keep in mind, though, that staging is used for more than testing the application prior to shipping it to production. Examples include the Gitaly Praefect and ZFS file storage testing that has occurred in the past, and also minor changes in Chef or Terraform that may impact the entire environment. Delivery is not responsible for these types of infrastructure changes/testing. As such, I would vote that Infrastructure own the staging... infrastructure. Delivery would own the application specifically.

What should be the SLA for staging?

GitLab depends on staging for quite a bit. It would be wise to set an SLA, though I believe we can lessen the burden this imposes during weekends.

I agree. We had an initiative stall a year ago that would have "refreshed" the staging environment, and I think it fell apart because there wasn't formally defined ownership. I'm not sure what the priority is, but it's absolutely worth a discussion. I've placed an item for discussion on next week's DNA agenda.

mentioned in issue production#1429 (closed)

Given the updates to create S2 incidents when staging deployments are blocked, and other work by the Delivery team, I'm thinking we can close this. @ahanselka and @amyphillips - would you agree?

(closing for now, but re-open if you disagree)

closed

added workflow-infraCancelled label

added workflow-infraDone label and removed workflow-infraCancelled label