Logs not available in ELK for a number of indexes
Please note: if the incident relates to sensitive data or is security related, consider labeling this issue with security and marking it confidential.
Summary
A brief summary of what happened. Try to make it as executive-friendly as possible.
- Service(s) affected :
- Team attribution :
- Minutes downtime or degradation :
Timeline
2019-07-12
- 14:00 UTC - Noticed that some logs were not up to date in Kibana
- 15:10 UTC - MR 5 to reduce the rollover threshold from 150M to 30M docs
- 18:19 UTC - Killed the stuck workhorse logs pubsubbeat with kill -9
2019-07-13
- 06:00 UTC - Workhorse logs caught up
https://dashboards.gitlab.net/d/USVj3qHmk/logging?orgId=1&from=now-2d&to=now
Affected indices:
- rails
- workhorse
- (...)
Working notes - 2 MRs to make the rollover API trigger more often:
- https://ops.gitlab.net/gitlab-com/gl-infra/gitlab-restore/esc-tools/merge_requests/5 - change docs count to 30M from 150M
- https://ops.gitlab.net/gitlab-com/gl-infra/gitlab-restore/esc-tools/merge_requests/6 - add nginx logs to rollover API
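For context, the docs-count change in the MRs above corresponds to the `max_docs` condition of the Elasticsearch rollover API (`POST /<alias>/_rollover`). The following is only a minimal sketch of what such a request looks like, not the actual esc-tools code. The alias names and the `build_rollover_request` helper are hypothetical and used purely for illustration.

```python
import json
from typing import Tuple


def build_rollover_request(alias: str, max_docs: int) -> Tuple[str, str]:
    """Build the path and JSON body for an Elasticsearch rollover call.

    The rollover only happens if the current write index behind `alias`
    already holds at least `max_docs` documents.
    """
    if max_docs <= 0:
        raise ValueError("max_docs must be positive")
    path = f"/{alias}/_rollover"
    body = json.dumps({"conditions": {"max_docs": max_docs}})
    return path, body


if __name__ == "__main__":
    # Hypothetical alias names; the real ones live in the esc-tools config.
    for alias in ("example-rails", "example-workhorse", "example-nginx"):
        path, body = build_rollover_request(alias, 30_000_000)
        print("POST", path, body)
```

With a lower threshold the condition is met sooner, so each periodic call to the rollover API is more likely to create a new index.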