fluentbit requesting a large percentage of cpu resources on k8s nodes
As part of the investigation into https://gitlab.com/gitlab-com/gl-infra/mstaff/-/issues/359 I was looking into CPU utilization on the k8s nodes. One thing that stood out were these outofcpu errors.

UPDATE: This is not the reason for the outofcpu errors.
This error means that the kube-scheduler cannot find any node on which to schedule a pod, and represents a failed attempt to add more capacity. The only reasons this would happen are if we are unable to add new nodes, or if we have pods that are using more CPU than their requests specify.
If it is the latter, fluentbit-gke stands out, using more than 100% of its CPU requests.
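For reference, a quick way to surface these scheduling failures and see how much CPU is already requested per node (a sketch using standard kubectl commands, not output from this investigation):

```
# Pending pods and recent FailedScheduling events (where outofcpu typically surfaces)
kubectl get pods --all-namespaces --field-selector status.phase=Pending
kubectl get events --all-namespaces --field-selector reason=FailedScheduling --sort-by=.lastTimestamp

# Requested vs allocatable CPU/memory per node
kubectl describe nodes | grep -A 7 'Allocated resources'
```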
We are already using fluentd for logging, which sends logs to Pub/Sub. fluentbit is used by GKE to send logs to Stackdriver, most of which are forwarded to object storage for archival.
From the docs on log throughput:

> The dedicated Logging agent provides at least 100 KiB per second log throughput per node for system and workload logs. If a node is underutilized, then depending on the type of log load (for example, text or structured log entries, very few containers on the node or many containers), the dedicated logging agent might provide throughput as much as 500 KiB per second or more. Additionally, in clusters with GKE control plane version 1.23.13-gke.1000 or later, the Logging agent allows for throughput as high as 10 MiB per second on nodes that have at least 2 unused CPU cores. Be aware, however, that at higher throughputs, some logs may be lost.
Our log rate per node averages around 500 KiB/s and spikes as high as 2 MiB/s.
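As a rough cross-check, per-node log throughput can be approximated by sampling how fast the container log files grow on a node. This is only a sketch, assuming shell access to the node and the standard /var/log/pods path, and it ignores log rotation during the sample window:

```
# Rough per-node log throughput: sample growth of /var/log/pods over 60s
# (rotation during the window can skew the number, so treat it as an estimate)
BEFORE=$(du -sk /var/log/pods | awk '{print $1}')
sleep 60
AFTER=$(du -sk /var/log/pods | awk '{print $1}')
echo "~$(( (AFTER - BEFORE) / 60 )) KiB/s"
```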
The CPU requests for fluentbit are very low:
```
$ kubectl get pods --all-namespaces -l k8s-app=fluentbit-gke -o custom-columns=\
NAMESPACE:.metadata.namespace,\
NAME:.metadata.name,\
CPU_REQUEST:.spec.containers[*].resources.requests.cpu,\
CPU_LIMIT:.spec.containers[*].resources.limits.cpu,\
MEMORY_REQUEST:.spec.containers[*].resources.requests.memory,\
MEMORY_LIMIT:.spec.containers[*].resources.limits.memory
NAMESPACE     NAME                  CPU_REQUEST   CPU_LIMIT   MEMORY_REQUEST     MEMORY_LIMIT
kube-system   fluentbit-gke-22bpf   50m,50m,5m    1           100Mi,100Mi,30Mi   250Mi,250Mi,30Mi
kube-system   fluentbit-gke-26bzx   50m,50m,5m    1           100Mi,100Mi,30Mi   250Mi,250Mi,30Mi
...
```
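To see how far actual usage exceeds those requests, something like the following should work to compare live per-container CPU usage against the 50m/50m/5m requests shown above (a sketch; assumes node metrics are available to kubectl top, which they are by default on GKE):

```
# Per-container CPU usage for the fluentbit-gke pods
kubectl top pods -n kube-system -l k8s-app=fluentbit-gke --containers
```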
- Should we look into exclusion filters for log ingestion?
- Is fluentbit the reason for the outofcpu errors we are seeing for scheduling? Given that we have only one of these pods scheduled per node, I was thinking maybe not. It doesn't look like it is.
- Can we bump the CPU requests to around 300m? This would mean deploying a custom fluentbit (see the sketch after this list): https://cloud.google.com/knowledge/kb/deploy-a-google-kubernetes-engine-custom-fluent-bit-000004839
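If we go the custom fluent-bit route, the request bump itself is just a resources change on our own DaemonSet. A minimal sketch, assuming a self-managed DaemonSet and container both named fluent-bit in a logging namespace (all hypothetical names; the GKE-managed fluentbit-gke DaemonSet can't be patched persistently because GKE reconciles it, which is why the custom deployment is needed):

```
# Bump the CPU request on a self-managed fluent-bit DaemonSet to 300m
# (names are placeholders; adjust to whatever the custom deployment uses)
kubectl -n logging patch daemonset fluent-bit -p '
{
  "spec": {
    "template": {
      "spec": {
        "containers": [
          {
            "name": "fluent-bit",
            "resources": {
              "requests": {"cpu": "300m"}
            }
          }
        ]
      }
    }
  }
}'
```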