Product Planning Error Budget Investigation
Summary
Product Planning error budgets remain in the red. This issue investigates the primary contributors to budget spend. The current 7-day budget stands at 99.90% against a target of 99.95%.
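To put the gap between those two figures in perspective, the following is a minimal sketch of the arithmetic. It assumes the percentages express the share of operations that met their Apdex and error thresholds over the window; the function names are illustrative and not part of any GitLab tooling.

```python
# Figures taken from the summary above.
TARGET = 0.9995   # 99.95% target
CURRENT = 0.9990  # 99.90% measured over the last 7 days


def allowed_failure_ratio(target):
    """Share of operations that may miss their thresholds without overspending."""
    return 1.0 - target


def budget_spent_fraction(current, target):
    """Observed failure ratio relative to the allowed one (1.0 means exactly on budget)."""
    return (1.0 - current) / (1.0 - target)


allowed = allowed_failure_ratio(TARGET)
spent = budget_spent_fraction(CURRENT, TARGET)

print(f"Allowed failure ratio: {allowed:.4%}")
print(f"Budget consumed: {spent:.0%}")
```

Under that assumption, a 0.05 percentage point shortfall means roughly twice the allowed budget has been spent.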
Contributing Factors
The top 5 contributing endpoints for Apdex-related issues are:
Endpoint (json.meta.caller_id) | Request urgency (json.request_urgency) | Target duration in seconds (json.target_duration_s) | Count | Operations over specified threshold (Apdex) |
---|---|---|---|---|
GraphqlController#execute | low | 5 | 355190 | 6137 |
GET /api/:version/groups/:id/epics | low | 5 | 76559 | 2397 |
Groups::EpicsController#show | default | 1 | 142447 | 2256 |
Groups::EpicsController#index | default | 1 | 27199 | 1086 |
GET /api/:version/groups/:id/epics/:eventable_id/resource_label_events | low | 5 | 5855 | 508 |
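The table is ordered by absolute number of slow operations. Ranking the same rows by the share of operations over the threshold can help separate high-traffic endpoints from endpoints that are slow more often. The sketch below does that, assuming the Count column is the total number of operations for each endpoint.

```python
# (endpoint, count, operations over threshold), copied from the table above.
rows = [
    ("GraphqlController#execute", 355190, 6137),
    ("GET /api/:version/groups/:id/epics", 76559, 2397),
    ("Groups::EpicsController#show", 142447, 2256),
    ("Groups::EpicsController#index", 27199, 1086),
    ("GET /api/:version/groups/:id/epics/:eventable_id/resource_label_events", 5855, 508),
]


def slow_ratio(count, over_threshold):
    """Share of operations that exceeded the Apdex threshold."""
    return over_threshold / count


# Highest share of slow operations first.
ranked = sorted(rows, key=lambda r: slow_ratio(r[1], r[2]), reverse=True)

for endpoint, count, over in ranked:
    print(f"{slow_ratio(count, over):6.2%}  {endpoint}")
```

By this measure the resource_label_events endpoint ranks first despite its low volume, and Groups::EpicsController#index is slow proportionally more often than Groups::EpicsController#show.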
Investigation
Create a discussion for each endpoint being investigated, and add findings and next steps to this section.
GraphqlController#execute
GET /api/:version/groups/:id/epics
Groups::EpicsController#show
Findings
- Most likely due to cache misses on the group-level issue, MR, and epic counts in the sidebar (see #367868 (comment 1026555132)).
Proposal
- Create a GraphQL endpoint for retrieving group-level counts and update frontend to remove counts from the page-load.
- [Quick] Increase cache retention timings.
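As a rough illustration of why longer retention reduces misses, here is a minimal sketch of a time-to-live cache with a simulated clock. It is not GitLab's caching code; the `TTLCache` class, the cache key, the count value, and the load interval are all hypothetical.

```python
class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock):
        self.ttl_seconds = ttl_seconds
        self.clock = clock          # callable returning the current time in seconds
        self._store = {}            # key -> (value, expires_at)
        self.misses = 0

    def fetch(self, key, compute):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]         # still fresh: serve the cached value
        self.misses += 1            # missing or expired: recompute (the slow path)
        value = compute()
        self._store[key] = (value, now + self.ttl_seconds)
        return value


def simulate(ttl_seconds, page_loads=60, seconds_between_loads=60):
    """Count cache misses for one page load per minute over an hour."""
    now = [0]
    cache = TTLCache(ttl_seconds, clock=lambda: now[0])
    for _ in range(page_loads):
        cache.fetch("group:1:epic_count", lambda: 42)  # hypothetical key and value
        now[0] += seconds_between_loads
    return cache.misses


short_ttl_misses = simulate(ttl_seconds=300)
long_ttl_misses = simulate(ttl_seconds=3600)
print(short_ttl_misses, long_ttl_misses)
```

The trade-off is staleness: a longer retention period means the sidebar counts can lag further behind the real values.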
Groups::EpicsController#index
Findings
- Most likely the same root cause as Groups::EpicsController#show: cache misses for group-level sidebar counts.
- The incidence of slowness is higher than on other group-level endpoints, so there may be other factors (see #367868 (comment 1026555132)).
Proposal
- Create a GraphQL endpoint for retrieving group-level counts and update frontend to remove counts from the page-load.
- [Quick] Increase cache retention timings.
GET /api/:version/groups/:id/epics/:eventable_id/resource_label_events
Edited by John Hope