Should Prometheus Kubernetes service discovery only scrape ready pods?
Prometheus jobs that use Kubernetes service discovery scrape pods whether or not they are ready. This can cause scrape failures at pod startup and shutdown: https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/9106#note_342341798
This could be mitigated by only keeping discovered targets whose pods are ready. Prometheus exposes a discovery meta label that could be used for this (__meta_kubernetes_pod_ready): https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config
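
As a rough sketch of what this might look like, the relabel rule below keeps only targets whose pod reports ready. The job name is hypothetical and `role: pod` is an assumption; our actual jobs may use a different role and already carry other relabel rules, so this would need to be adapted rather than copied.

```yaml
scrape_configs:
  # Hypothetical job name, for illustration only.
  - job_name: example-kubernetes-pods
    kubernetes_sd_configs:
      - role: pod  # assumption: pod-based discovery
    relabel_configs:
      # Keep a target only if its pod is ready; Prometheus sets
      # __meta_kubernetes_pod_ready to "true" or "false".
      - source_labels: [__meta_kubernetes_pod_ready]
        action: keep
        regex: "true"
```

With a rule like this, a pod that is not ready is dropped from the target list at relabeling time, so it would not show up as a failed scrape. The trade-off is that its metrics are not collected while it is unready.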
Of course, pod readiness doesn't necessarily imply metrics readiness, but the two should be correlated.
This came out of a discussion with @mwasilewski-gitlab and @bjk-gitlab, but we think it needs a broad ping: @gitlab-com/gl-infra