Add a scrapeConfig for stackdriver metrics using GCE discovery
Split from #2680 (closed)
As part of our goal to scrape the same Prometheus Jobs that the VMs do from the Prometheus remote agents. On this particular issue will be focus in on the stackdriver metrics
Original configuration in chef for the job, (see chef role https://gitlab.com/gitlab-com/gl-infra/chef-repo/-/blob/master/roles/gprd-infra-prometheus-server.json)
"stackdriver": {
"static_configs": [
{
"labels": {
"environment": "gprd",
"stage": "main",
"shard": "default",
"tier": "inf",
"type": "monitoring"
},
"targets": [
"cloudsql.googleapis.com/database/cpu",
"cloudsql.googleapis.com/database/disk",
"cloudsql.googleapis.com/database/instance_state",
"cloudsql.googleapis.com/database/memory",
"cloudsql.googleapis.com/database/postgresql/transaction_count",
"cloudsql.googleapis.com/database/state",
"cloudsql.googleapis.com/database/up",
"compute.googleapis.com/firewall",
"compute.googleapis.com/instance/disk/throttled_read_",
"compute.googleapis.com/instance/disk/throttled_write_",
"compute.googleapis.com/instance/integrity",
"compute.googleapis.com/instance/uptime",
"compute.googleapis.com/mirroring",
"compute.googleapis.com/nat",
"container.googleapis.com",
"file.googleapis.com",
"loadbalancing.googleapis.com/https/backend_latencies",
"loadbalancing.googleapis.com/https/backend_request_",
"loadbalancing.googleapis.com/https/backend_response_",
"loadbalancing.googleapis.com/https/request_",
"loadbalancing.googleapis.com/https/response_",
"loadbalancing.googleapis.com/l3/external/egress_",
"loadbalancing.googleapis.com/l3/external/ingress_",
"loadbalancing.googleapis.com/l3/internal/egress_",
"loadbalancing.googleapis.com/l3/internal/ingress_",
"logging.googleapis.com",
"monitoring.googleapis.com",
"networking.googleapis.com/vm_flow/egress_bytes_count",
"networking.googleapis.com/vm_flow/ingress_bytes_count",
"pubsub.googleapis.com/topic/byte_cost",
"pubsub.googleapis.com/topic/send_message_operation_count",
"pubsub.googleapis.com/subscription",
"router.googleapis.com/nat",
"storage.googleapis.com",
"vpn.googleapis.com"
]
}
],
"scrape_interval": "60s",
"scrape_timeout": "45s",
"relabel_configs": [
{
"source_labels": [
"__address__"
],
"target_label": "__param_collect"
},
{
"source_labels": [
"__address__"
],
"target_label": "metric_prefix"
},
{
"replacement": "sd-exporter-01-inf-gprd.c.gitlab-production.internal",
"target_label": "fqdn"
},
{
"replacement": "sd-exporter-01-inf-gprd.c.gitlab-production.internal:9255",
"target_label": "instance"
},
{
"target_label": "__address__",
"replacement": "sd-exporter-01-inf-gprd.c.gitlab-production.internal:9255"
}
]
},
This issue will be about add the relevant ScrapeConfig
using the GCE discovery to scrape the stackdriver exporter metrics across all VM's that aren't kubernetes. We might need to ensure that there is a GCE label that covers properly this VMs
At the moment, there is not a clear place to write this scrape configuration, I will follow up with a ticket that will allow us to give home to this and other similar configurations.
You can add the relevant configuration to the helm chart https://gitlab.com/gitlab-com/gl-infra/charts/-/tree/main/gitlab/prometheus-agent