Use short DNS name to reach neuvector-svc-controller-api in neuvector-assign-fedadmin-role job
Summary
On an RKE2/CAPO deployment with the neuvector unit enabled, the deployment sometimes fails: the kustomization/neuvector-assign-fedadmin-role never reconciles.
Steps to reproduce
Deploy an RKE2/CAPO cluster with values like:
cluster:
  capi_providers:
    infra_provider: capo
    bootstrap_provider: cabpr
  capo:
    os_image_selector:
      os: ubuntu
      hardened: false
    ssh_key_name: xxxx
    network_id: xxxxx
    flavor_name: m1.xlarge
    rootVolume:
      diskSize: 100
      volumeType: ceph_ssd
  machine_deployments:
    md0:
      replicas: 1
  control_plane_replicas: 3
units:
  harbor:
    enabled: true
  monitoring:
    enabled: false
  neuvector:
    enabled: true
openstack:
  external_network_id: xxxxx
  storageClass:
    name: "cinder-ceph-ssd"
    type: "ceph_ssd"
registry_mirrors: ......
What is the current bug behavior?
Kustomization/neuvector-assign-fedadmin-role never reconciles.
A name resolution problem occurs in the corresponding job.
For an unknown reason, the resolv.conf automatically generated in the pod inherits the nova.local domain (OpenStack: CAPO) in its search list, which leads to a resolution error when curl tries to resolve neuvector-svc-controller-api.neuvector.svc.cluster.local. However, the nslookup command is able to resolve the same name.
debug-alex:/$ cat /etc/resolv.conf
search neuvector.svc.cluster.local svc.cluster.local cluster.local nova.local
nameserver 100.73.0.10
options ndots:5
debug-alex:/$
What is the expected correct behavior?
Kustomization/neuvector-assign-fedadmin-role should reconcile.
Relevant logs and/or screenshots
Possible fixes
Use the short DNS name, neuvector-svc-controller-api, since the kube-job pod and the targeted API are in the same namespace.
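To illustrate why the short name avoids the nova.local search domain, here is a minimal Python sketch of how a resolver that follows the resolv.conf(5) rules for `search` and `ndots` would order its queries. The helper names `parse_resolv_conf` and `candidate_names` are hypothetical, and this only models the documented search-list behavior: it does not reproduce what curl or its resolver library actually does in the job image, which is not established in this issue.

```python
# Sketch only: models the search/ndots rules from resolv.conf(5).
# Function names are illustrative, not part of any real library.

RESOLV_CONF = """\
search neuvector.svc.cluster.local svc.cluster.local cluster.local nova.local
nameserver 100.73.0.10
options ndots:5
"""


def parse_resolv_conf(text):
    """Extract the search list and the ndots value from resolv.conf content."""
    search, ndots = [], 1  # ndots defaults to 1 when not set
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "search":
            search = fields[1:]
        elif fields[0] == "options":
            for opt in fields[1:]:
                if opt.startswith("ndots:"):
                    ndots = int(opt.split(":", 1)[1])
    return search, ndots


def candidate_names(name, search, ndots):
    """Return the names to query, in order, for a given input name."""
    if name.endswith("."):
        return [name]  # trailing dot: absolute name, search list is skipped
    expanded = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        return [name] + expanded  # enough dots: try the name as-is first
    return expanded + [name]  # too few dots: walk the search list first


if __name__ == "__main__":
    search, ndots = parse_resolv_conf(RESOLV_CONF)
    for name in (
        "neuvector-svc-controller-api",
        "neuvector-svc-controller-api.neuvector.svc.cluster.local",
    ):
        print(name)
        for candidate in candidate_names(name, search, ndots):
            print("  ->", candidate)
```

Under these rules, the first candidate for the short name is already neuvector-svc-controller-api.neuvector.svc.cluster.local, so a successful lookup never reaches nova.local. The fully qualified form contains only four dots, which is below ndots:5, so it is tried against every search domain, including nova.local, before being queried as-is.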