Fluentd cannot ship logs to management cluster

Summary

While deploying a workload cluster with logging enabled, I noticed the following warnings in the fluentd pod logs:


2024-08-23 13:47:42 +0000 [warn]: #0 [clusterflow:cattle-logging-system:all-logs:clusteroutput:cattle-logging-system:loki] failed to flush the buffer. retry_times=7 next_retry_time=2024-08-23 13:49:37 +0000 chunk="62059fb40989708e6fe433838efa70d2" error_class=SocketError error="Failed to open TCP connection to loki.sylva:443 (getaddrinfo: Name does not resolve)"
  2024-08-23 13:47:42 +0000 [warn]: #0 suppressed same stacktrace
2024-08-23 13:49:37 +0000 [warn]: #0 [clusterflow:cattle-logging-system:all-logs:clusteroutput:cattle-logging-system:loki] failed to flush the buffer. retry_times=8 next_retry_time=2024-08-23 13:54:14 +0000 chunk="62059fb40989708e6fe433838efa70d2" error_class=SocketError error="Failed to open TCP connection to loki.sylva:443 (getaddrinfo: Name does not resolve)"
  2024-08-23 13:49:37 +0000 [warn]: #0 suppressed same stacktrace

Logs are not sent to the management cluster. The problem appears to be DNS-related: loki.sylva is not resolved correctly.

To fix it, I changed the CoreDNS configuration by editing the rke2-coredns-rke2-coredns configmap and adding the correct IP for the management cluster ingress, similar to what we did in the coredns unit.
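
To confirm the resolution failure before the configmap change, and to verify the fix afterwards, a small check like the following could be run from a pod in the workload cluster. This is only a sketch: the `resolve` helper and its `resolver` parameter are hypothetical names introduced here for illustration, and it assumes a Python interpreter is available where it runs. The hostname and port come from the log lines above.

```python
import socket


def resolve(hostname, port, resolver=socket.getaddrinfo):
    """Return the sorted unique IP addresses for hostname, or [] if lookup fails.

    `resolver` is injectable (hypothetical parameter) so the lookup can be
    replaced in tests; by default it uses the system resolver, which is the
    same getaddrinfo path that fails in the fluentd log.
    """
    try:
        infos = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Same failure class as "getaddrinfo: Name does not resolve"
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve("loki.sylva", 443)` should return an empty list while the problem is present, and the management cluster ingress IP once the CoreDNS configmap has been updated.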

cc: @mihai.zaharia @feleouet @baburciu
