Enable Stackdriver tracing in Thanos
Thanos Store
- Enable tracing sampling for
- Use workload identity to give the thanos-store pod access to write traces in GCP
  - Create Google service account 👉 https://ops.gitlab.net/gitlab-com/gl-infra/config-mgmt/-/merge_requests/3062
  - Give prometheus-sa tracing permissions as shown in https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/14255#note_718193860
    - pre, gstg 👉 https://ops.gitlab.net/gitlab-com/gl-infra/config-mgmt/-/merge_requests/3103 / https://ops.gitlab.net/gitlab-com/gl-infra/config-mgmt/-/merge_requests/3104
    - ops, org-ci 👉 https://ops.gitlab.net/gitlab-com/gl-infra/config-mgmt/-/merge_requests/3105
    - gprd 👉 https://ops.gitlab.net/gitlab-com/gl-infra/config-mgmt/-/merge_requests/3123
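For the sampling step, a Thanos component reads its tracing settings from a small YAML document. The sketch below writes one for the Stackdriver tracer. It is an illustration only: the real change here is made through the Tanka deployments, the function name `write_tracing_config` is hypothetical, and the project ID, service name and sample factor are placeholder values that do not come from this issue. `sample_factor` is understood to mean roughly one sampled request in every N.

```shell
# Sketch only: writes a Thanos tracing config for the Stackdriver tracer.
# All argument values used below are illustrative assumptions.
write_tracing_config() {
  out_file="$1"
  project_id="$2"
  service_name="$3"
  sample_factor="$4"
  cat > "$out_file" <<EOF
type: STACKDRIVER
config:
  service_name: "${service_name}"
  project_id: "${project_id}"
  sample_factor: ${sample_factor}
EOF
}

# Placeholder values: a hypothetical project and a 1-in-16 sampling rate.
write_tracing_config "${TMPDIR:-/tmp}/tracing.yaml" my-gcp-project thanos-store 16
cat "${TMPDIR:-/tmp}/tracing.yaml"
```

The resulting file would be passed to the component with `--tracing.config-file`.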
Thanos Query
As we saw in https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/14255#note_718493473, we are still seeing some permission denied errors for thanos-query. Right now thanos-query has its own k8s service account, as seen below:
steve@bastion-01-inf-ops.c.gitlab-ops.internal:~$ kubectl -n monitoring get po thanos-query-597b9975cd-zf27w -o json | jq '.spec.serviceAccount'
"thanos-query"
steve@bastion-01-inf-ops.c.gitlab-ops.internal:~$ kubectl -n monitoring get sa thanos-query -o json
{
  "apiVersion": "v1",
  "kind": "ServiceAccount",
  "metadata": {
    "annotations": {
      "kubectl.kubernetes.io/last-applied-configuration": "xxxxx"
    },
    "creationTimestamp": "2021-03-23T09:40:59Z",
    "labels": {
      "app.kubernetes.io/component": "query-layer",
      "app.kubernetes.io/instance": "thanos-query",
      "app.kubernetes.io/name": "thanos-query",
      "app.kubernetes.io/version": "v0.20.1",
      "tanka.dev/environment": "92677f0e9d5842e8a17289b4f8259fc07037b53f69fabc67"
    },
    "name": "thanos-query",
    "namespace": "monitoring",
    "resourceVersion": "317795172",
    "uid": "09a5316b-38de-4804-8f74-28aa59a06daf"
  },
  "secrets": [
    {
      "name": "thanos-query-token-t659c"
    }
  ]
}
As we can see, the thanos-query k8s service account doesn't have any workload identity attached to it: it is missing the iam.gke.io/gcp-service-account annotation that the prometheus service account has.
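The same check can be done without reading the full JSON. The sketch below asks kubectl for just the workload identity annotation and prints `none` when it is absent. The helper name `workload_identity_for` is hypothetical, and it assumes kubectl access to the cluster.

```shell
# Sketch only: print the GCP service account bound to a k8s service account
# through the iam.gke.io/gcp-service-account annotation, or "none" if unset.
workload_identity_for() {
  wi_namespace="$1"
  wi_ksa="$2"
  # Dots inside the annotation key are escaped for kubectl's JSONPath syntax.
  wi_gsa="$(kubectl -n "$wi_namespace" get sa "$wi_ksa" \
    -o jsonpath='{.metadata.annotations.iam\.gke\.io/gcp-service-account}')"
  echo "${wi_gsa:-none}"
}

# Usage (requires cluster access):
#   workload_identity_for monitoring thanos-query
#   workload_identity_for monitoring prometheus
```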
- Create a thanos-query-sa GCP service account inside of the gitlab-ops GCP project 👉 https://ops.gitlab.net/gitlab-com/gl-infra/config-mgmt/-/merge_requests/3114
- Create a roles/iam.workloadIdentityUser binding for thanos-query, following the workload identity setup 👉 https://ops.gitlab.net/gitlab-com/gl-infra/config-mgmt/-/merge_requests/3114
- Give thanos-query-sa the roles/cloudtrace.agent role inside of the gitlab-ops GCP project 👉 https://ops.gitlab.net/gitlab-com/gl-infra/config-mgmt/-/merge_requests/3111
- Update the annotation on the thanos-query k8s service account to use workload identity 👉 gitlab-com/gl-infra/k8s-workloads/tanka-deployments!220 (merged)
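The four steps above are applied through the linked config-mgmt and tanka-deployments merge requests. For readers who want to see what they amount to, here is a rough command-line equivalent. It is a sketch under assumptions, not what was actually run: the helper name `print_workload_identity_steps` is hypothetical, and it assumes the GKE cluster lives in the gitlab-ops project, so that its workload pool is `gitlab-ops.svc.id.goog`. The function only prints the commands and does not execute them.

```shell
# Sketch only: print gcloud/kubectl commands roughly equivalent to the steps above.
# Nothing is executed; the output is meant for review.
print_workload_identity_steps() {
  project="$1"    # GCP project holding the service account (assumed to also host the cluster)
  namespace="$2"  # k8s namespace of the workload
  ksa="$3"        # k8s service account name
  gsa="$4"        # GCP service account name
  gsa_email="${gsa}@${project}.iam.gserviceaccount.com"
  cat <<EOF
gcloud iam service-accounts create ${gsa} --project=${project}
gcloud iam service-accounts add-iam-policy-binding ${gsa_email} --project=${project} --role=roles/iam.workloadIdentityUser --member="serviceAccount:${project}.svc.id.goog[${namespace}/${ksa}]"
gcloud projects add-iam-policy-binding ${project} --member="serviceAccount:${gsa_email}" --role=roles/cloudtrace.agent
kubectl annotate serviceaccount ${ksa} --namespace ${namespace} iam.gke.io/gcp-service-account=${gsa_email}
EOF
}

print_workload_identity_steps gitlab-ops monitoring thanos-query thanos-query-sa
```

The second command lets the k8s service account impersonate the GCP service account, the third grants permission to write traces, and the last adds the annotation that was missing from the k8s service account.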
Thanos Query frontend
In https://nonprod-log.gitlab.net/goto/db8b12638cbfdd6e23db85f31eb2d83c we are seeing that thanos-query-frontend is not able to upload traces, similar to what we had with thanos-query.
- Create a thanos-query-frontend-sa GCP service account inside of the gitlab-ops GCP project 👉 https://ops.gitlab.net/gitlab-com/gl-infra/config-mgmt/-/merge_requests/3117
- Create a roles/iam.workloadIdentityUser binding for thanos-query-frontend, following the workload identity setup 👉 https://ops.gitlab.net/gitlab-com/gl-infra/config-mgmt/-/merge_requests/3117
- Give thanos-query-frontend-sa the roles/cloudtrace.agent role inside of the gitlab-ops GCP project 👉 https://ops.gitlab.net/gitlab-com/gl-infra/config-mgmt/-/merge_requests/3118
- Update the annotation on the thanos-query-frontend k8s service account to use workload identity 👉 gitlab-com/gl-infra/k8s-workloads/tanka-deployments!221 (merged)
Follow ups
Edited by Steve Xuereb