Use short DNS name to reach neuvector-svc-controller-api in the neuvector-assign-fedamin-role job

What does this MR do and why?

Replace the URL used in the neuvector-assign-fedamin-role job with a shorter one. Since the target Service lives in the same namespace ("neuvector"), the Service name alone is enough.
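For illustration, the change amounts to something like the following (the `controllerApiUrl` key, scheme, and port below are placeholders for illustration, not the actual job definition):

```yaml
# Hypothetical values fragment illustrating the change.
# Before: full in-cluster FQDN; with ndots:5 the resolver walks the
# whole search list before this name resolves.
# controllerApiUrl: "https://neuvector-svc-controller-api.neuvector.svc.cluster.local:10443"

# After: bare Service name; the first search suffix
# (neuvector.svc.cluster.local) resolves it on the first query.
controllerApiUrl: "https://neuvector-svc-controller-api:10443"
```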

Context of bug

Consider a CAPO deployment where nodes are spawned on an OpenStack subnet that has no DNS servers configured.

The neuvector-assign-fedamin-role job runs in an Alpine container. Alpine uses musl libc, whose resolver behaves slightly differently from the GNU glibc found in Debian, for example.

The NeuVector Pod's /etc/resolv.conf is:

search neuvector.svc.cluster.local svc.cluster.local cluster.local nova.local
nameserver 100.73.0.10
options ndots:5
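The search/ndots mechanics above can be sketched as follows. This is a simplified model of glibc-style search-list processing, not actual resolver code; real resolvers differ in edge cases:

```python
# Values taken from the Pod's /etc/resolv.conf shown above.
SEARCH = ["neuvector.svc.cluster.local", "svc.cluster.local",
          "cluster.local", "nova.local"]
NDOTS = 5

def candidate_names(name: str, search=SEARCH, ndots=NDOTS):
    """Return the queries a glibc-style resolver tries, in order."""
    if name.endswith("."):          # trailing dot: absolute name, no search
        return [name]
    candidates = []
    absolute_first = name.count(".") >= ndots
    if absolute_first:              # enough dots: try the literal name first
        candidates.append(name)
    candidates += [f"{name}.{suffix}" for suffix in search]
    if not absolute_first:          # otherwise the literal name is tried last
        candidates.append(name)
    return candidates

# The long name has only 4 dots (< ndots:5), so every search suffix
# is tried before the name itself: 5 candidates per address family.
long_name = "neuvector-svc-controller-api.neuvector.svc.cluster.local"
print(candidate_names(long_name))

# The short name also goes through the search list, but its very first
# candidate is already the in-cluster FQDN, so it resolves immediately.
print(candidate_names("neuvector-svc-controller-api"))
```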

On an OpenStack network without DNS servers, we observed the following DNS requests when the kube job (Alpine) tries to resolve neuvector-svc-controller-api.neuvector.svc.cluster.local; we ran the same test with a Debian container (glibc) for comparison:

| DNS request type | Name | Alpine (musl libc) | Debian (glibc) | CoreDNS response |
| --- | --- | --- | --- | --- |
| AAAA | neuvector-svc-controller-api.neuvector.svc.cluster.local.neuvector.svc.cluster.local | 📨 | 📨 | no such name |
| AAAA | neuvector-svc-controller-api.neuvector.svc.cluster.local.svc.cluster.local | 📨 | 📨 | no such name |
| AAAA | neuvector-svc-controller-api.neuvector.svc.cluster.local.cluster.local | 📨 | 📨 | no such name |
| AAAA | neuvector-svc-controller-api.neuvector.svc.cluster.local.nova.local | 📨 | 📨 | refused |
| AAAA | neuvector-svc-controller-api.neuvector.svc.cluster.local | 🚫 | 📨 | no error (but no IPv6) |
| A | neuvector-svc-controller-api.neuvector.svc.cluster.local.neuvector.svc.cluster.local | 📨 | 📨 | no such name |
| A | neuvector-svc-controller-api.neuvector.svc.cluster.local.svc.cluster.local | 📨 | 📨 | no such name |
| A | neuvector-svc-controller-api.neuvector.svc.cluster.local.cluster.local | 📨 | 📨 | no such name |
| A | neuvector-svc-controller-api.neuvector.svc.cluster.local.nova.local | 📨 | 📨 | refused |
| A | neuvector-svc-controller-api.neuvector.svc.cluster.local | 🚫 | 📨 | no error + IPv4 |

📨 = request sent, 🚫 = request not sent

The musl libc resolver does not send any further AAAA or A requests after receiving a Refused response, which leads to a DNS resolution error. The glibc resolver sends one last request, treating the name as an FQDN, and it succeeds.

With glibc, 10 requests and 10 responses are needed to resolve the name.
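The difference can be modelled as a toy simulation. This is an assumption-laden sketch based on the behaviour described above, not the actual musl or glibc source:

```python
# Toy model: CoreDNS answers NXDOMAIN for the wrong search suffixes,
# REFUSED for *.nova.local (no upstream DNS server), and NOERROR for
# the real FQDN.
BASE = "neuvector-svc-controller-api.neuvector.svc.cluster.local"
CANDIDATES = [
    BASE + ".neuvector.svc.cluster.local",
    BASE + ".svc.cluster.local",
    BASE + ".cluster.local",
    BASE + ".nova.local",
    BASE,  # tried last: the name has fewer dots than ndots:5
]
RESPONSES = {name: "NXDOMAIN" for name in CANDIDATES}
RESPONSES[BASE + ".nova.local"] = "REFUSED"
RESPONSES[BASE] = "NOERROR"

def resolve(stop_on_refused: bool):
    """Walk the candidate list; return (resolved_name, queries_sent)."""
    sent = []
    for name in CANDIDATES:
        sent.append(name)
        rcode = RESPONSES[name]
        if rcode == "NOERROR":
            return name, sent
        if rcode == "REFUSED" and stop_on_refused:
            return None, sent  # musl-like: give up after REFUSED
    return None, sent

musl_result, musl_sent = resolve(stop_on_refused=True)
glibc_result, glibc_sent = resolve(stop_on_refused=False)
print(musl_result, len(musl_sent))    # None 4  -> resolution fails
print(glibc_result, len(glibc_sent))  # the FQDN after 5 queries
```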

With the shorter name "neuvector-svc-controller-api", only 2 requests are needed in our case:

| DNS request type | Name | Alpine (musl libc) | Debian (glibc) | CoreDNS response |
| --- | --- | --- | --- | --- |
| AAAA | neuvector-svc-controller-api.neuvector.svc.cluster.local | 📨 | 📨 | no error (but no IPv6) |
| A | neuvector-svc-controller-api.neuvector.svc.cluster.local | 📨 | 📨 | no error + IPv4 |

An alternative solution would be to tune the ndots option per Pod:

spec:
  dnsConfig:
    options:
      - name: ndots
        value: "4"

The result is identical to using the short name.
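A quick way to see why ndots:4 helps: the long name contains 4 dots, so lowering ndots to 4 makes the resolver try it as an absolute name first (a sketch, assuming glibc-style ndots handling):

```python
# A name with at least `ndots` dots is tried as an absolute name first,
# so the very first A/AAAA query pair hits the real record and no
# search suffix is ever tried.
name = "neuvector-svc-controller-api.neuvector.svc.cluster.local"
for ndots in (5, 4):
    absolute_first = name.count(".") >= ndots
    print(f"ndots={ndots}: name tried as FQDN first -> {absolute_first}")
# ndots=5 -> False (search list walked first), ndots=4 -> True
```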

For reference, the following table shows all DNS requests/responses sent/received when nodes are spawned on a subnet with DNS servers:

| DNS request type | Name | Alpine (musl libc) | Debian (glibc) | CoreDNS response | External DNS response |
| --- | --- | --- | --- | --- | --- |
| AAAA | neuvector-svc-controller-api.neuvector.svc.cluster.local.neuvector.svc.cluster.local | 📨 | 📨 | no such name | |
| AAAA | neuvector-svc-controller-api.neuvector.svc.cluster.local.svc.cluster.local | 📨 | 📨 | no such name | |
| AAAA | neuvector-svc-controller-api.neuvector.svc.cluster.local.cluster.local | 📨 | 📨 | no such name | |
| AAAA | neuvector-svc-controller-api.neuvector.svc.cluster.local.nova.local | 📨 | 📨 | forwarded | no such name |
| AAAA | neuvector-svc-controller-api.neuvector.svc.cluster.local | 📨 | 📨 | no error (but no IPv6) | |
| A | neuvector-svc-controller-api.neuvector.svc.cluster.local.neuvector.svc.cluster.local | 📨 | 📨 | no such name | |
| A | neuvector-svc-controller-api.neuvector.svc.cluster.local.svc.cluster.local | 📨 | 📨 | no such name | |
| A | neuvector-svc-controller-api.neuvector.svc.cluster.local.cluster.local | 📨 | 📨 | no such name | |
| A | neuvector-svc-controller-api.neuvector.svc.cluster.local.nova.local | 📨 | 📨 | forwarded | no such name |
| A | neuvector-svc-controller-api.neuvector.svc.cluster.local | 📨 | 📨 | no error + IPv4 | |

Test coverage

Manually tested on CAPO deployments:

  • nodes spawned on a subnet with DNS servers
  • nodes spawned on a subnet without DNS servers

CI configuration

Below you can choose test deployment variants to run in this MR's CI.

Click to open the CI configuration

Legend:

| Icon | Meaning | Available values |
| --- | --- | --- |
| ☁️ | Infra Provider | capd, capo, capm3 |
| 🚀 | Bootstrap Provider | kubeadm (alias kadm), rke2, okd, ck8s |
| 🐧 | Node OS | ubuntu, suse, na, leapmicro |
| 🛠️ | Deployment Options | light-deploy, dev-sources, ha, misc, maxsurge-0, logging, no-logging, cilium |
| 🎬 | Pipeline Scenarios | Available scenario list and description |
| 🟢 | Enabled units | Any available unit name; by default applies to both management and workload clusters. Can be prefixed with mgmt: or wkld: to apply only to a specific cluster type |
| 🏗️ | Target platform | Can be used to select a specific deployment environment (e.g. real-bmh for capm3) |
  • 🎬 preview ☁️ capd 🚀 kadm 🐧 ubuntu

  • 🎬 preview ☁️ capo 🚀 rke2 🐧 suse

  • 🎬 preview ☁️ capm3 🚀 rke2 🐧 ubuntu

  • ☁️ capd 🚀 kadm 🛠️ light-deploy 🐧 ubuntu

  • ☁️ capd 🚀 rke2 🛠️ light-deploy 🐧 suse

  • ☁️ capo 🚀 rke2 🐧 suse

  • ☁️ capo 🚀 rke2 🐧 leapmicro

  • ☁️ capo 🚀 rke2 🐧 ubuntu 🟢 neuvector

  • ☁️ capo 🚀 kadm 🐧 ubuntu 🟢 neuvector

  • ☁️ capo 🚀 kadm 🐧 ubuntu 🟢 neuvector,mgmt:harbor

  • ☁️ capo 🚀 rke2 🎬 rolling-update 🛠️ ha 🐧 ubuntu

  • ☁️ capo 🚀 kadm 🎬 wkld-k8s-upgrade 🐧 ubuntu

  • ☁️ capo 🚀 rke2 🎬 rolling-update-no-wkld 🛠️ ha 🐧 suse

  • ☁️ capo 🚀 rke2 🎬 sylva-upgrade 🛠️ ha 🐧 ubuntu

  • ☁️ capo 🚀 rke2 🎬 sylva-upgrade-from-1.6.x 🛠️ ha,misc 🐧 ubuntu

  • ☁️ capo 🚀 rke2 🛠️ ha,misc 🐧 ubuntu

  • ☁️ capo 🚀 rke2 🛠️ ha,misc,openbao 🐧 suse

  • ☁️ capo 🚀 rke2 🐧 suse 🎬 upgrade-from-prev-tag

  • ☁️ capm3 🚀 rke2 🐧 suse 🟢 neuvector

  • ☁️ capm3 🚀 kadm 🐧 ubuntu 🟢 neuvector

  • ☁️ capm3 🚀 ck8s 🐧 ubuntu

  • ☁️ capm3 🚀 kadm 🎬 rolling-update-no-wkld 🛠️ ha,misc 🐧 ubuntu

  • ☁️ capm3 🚀 rke2 🎬 wkld-k8s-upgrade 🛠️ ha 🐧 suse

  • ☁️ capm3 🚀 kadm 🎬 rolling-update 🛠️ ha 🐧 ubuntu

  • ☁️ capm3 🚀 rke2 🎬 upgrade-from-prev-release-branch 🛠️ ha 🐧 suse

  • ☁️ capm3 🚀 rke2 🛠️ misc,ha 🐧 suse

  • ☁️ capm3 🚀 rke2 🎬 sylva-upgrade 🛠️ ha,misc 🐧 suse

  • ☁️ capm3 🚀 kadm 🎬 rolling-update 🛠️ ha 🐧 suse

  • ☁️ capm3 🚀 ck8s 🎬 rolling-update 🛠️ ha 🐧 ubuntu

  • ☁️ capm3 🚀 rke2|okd 🎬 no-update 🐧 ubuntu|na

  • ☁️ capm3 🚀 rke2 🐧 suse 🎬 upgrade-from-release-1.5

  • ☁️ capm3 🚀 rke2 🐧 suse 🎬 upgrade-to-main

Global config for deployment pipelines

  • autorun pipelines
  • allow failure on pipelines
  • record sylvactl events

Notes:

  • Enabling autorun makes deployment pipelines run automatically, without human interaction
  • Disabling allow failure makes deployment pipelines mandatory for pipeline success
  • If both autorun and allow failure are disabled, deployment pipelines need manual triggering but block the pipeline

Be aware: after a configuration change, the pipeline is not triggered automatically. Please run it manually (by clicking the run pipeline button in the Pipelines tab) or push new code.

Closes #3584 (closed)

Edited by Alexandre Seitz
