In a workload cluster, one MD OpenStackMachine no longer has its primary vNIC
My management cluster controls eight WCs. My 'free5g6-cluster' has one CP and two MDs, created 19 days ago. The first MD OpenStackMachine lost its primary interface (ens3 on opensuse-15-5-plain-rke2-1-28-8) on the 'provider-network-case-rfp-orch-test-oam-extended' network.
CAPO tries to reconcile but does not properly handle the error at "openstackmachine_controller.go:473".
ubuntu@rfp-test-bootstrap:~/my-deployment/sylva-core$ k get clusters -A |grep 5g6
wc-free5g6 free5g6-cluster Provisioned 19d
ubuntu@rfp-test-bootstrap:~/my-deployment/sylva-core$ k get openstackmachine -n wc-free5g6
NAME CLUSTER INSTANCESTATE READY PROVIDERID MACHINE AGE
free5g6-cluster-cp-760e5049a3-4h5zt free5g6-cluster ACTIVE true openstack:///ddd658b4-4a36-47e2-835f-3ade9638a77e free5g6-cluster-control-plane-4h4j8 19d
free5g6-cluster-md-md0-7f29028a04-2khtx free5g6-cluster ACTIVE true openstack:///db857449-7ef9-461c-b278-cd501402f38b free5g6-cluster-md0-2d2tg-hqgpz 19d
free5g6-cluster-md-md0-7f29028a04-bth8k free5g6-cluster ACTIVE true openstack:///7454172e-e4fb-41e8-ba3f-78d36b4d69c5 free5g6-cluster-md0-2d2tg-nclqx 19d
ubuntu@rfp-test-bootstrap:~/my-deployment/sylva-core$ openstack server list |grep free5g6-cluster-md-md0-7f29028a04
| 7454172e-e4fb-41e8-ba3f-78d36b4d69c5 | free5g6-cluster-md-md0-7f29028a04-bth8k | ACTIVE | n2-net=192.168.2.123; n3-net=192.168.3.19; n4-net=192.168.4.35; n6-net=192.168.6.138; provider-network-case-rfp-orch-test-oam-extended=172.20.143.40 | N/A (booted from volume) | B1.xlarge |
| db857449-7ef9-461c-b278-cd501402f38b | free5g6-cluster-md-md0-7f29028a04-2khtx | ACTIVE | n2-net=192.168.2.216; n3-net=192.168.3.69; n4-net=192.168.4.33; n6-net=192.168.6.95 | N/A (booted from volume) | B1.xlarge |
ubuntu@rfp-test-bootstrap:~/my-deployment/sylva-core$ openstack server show free5g6-cluster-md-md0-7f29028a04-2khtx |grep -i creat
| created | 2024-04-24T13:32:10Z
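The two `openstack server list` rows above differ only in the provider network entry. As a rough way to spot that kind of difference, here is a minimal sketch that compares the network names in two Nova "addresses" strings. The function names `net_names` and `missing_networks` are hypothetical helpers, not part of any OpenStack tool, and the sketch assumes the `name=ip; name=ip` layout shown in the listing.

```shell
# Print the network names found in a Nova "addresses" string, one per line.
net_names() {
  printf '%s\n' "$1" | tr ';' '\n' | sed -e 's/^ *//' -e 's/=.*$//' | sort
}

# Print the networks present in the first string but absent from the second.
missing_networks() {
  net_names "$1" | grep -Fxv "$(net_names "$2")"
}

# Address strings copied from the server list output in this report.
healthy='n2-net=192.168.2.123; n3-net=192.168.3.19; n4-net=192.168.4.35; n6-net=192.168.6.138; provider-network-case-rfp-orch-test-oam-extended=172.20.143.40'
suspect='n2-net=192.168.2.216; n3-net=192.168.3.69; n4-net=192.168.4.33; n6-net=192.168.6.95'

missing_networks "$healthy" "$suspect"
# prints: provider-network-case-rfp-orch-test-oam-extended
```

The address strings could also be taken from `openstack server show <server> -c addresses -f value`, although the exact formatting of that field may vary between client versions.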
When logging in to the problematic VM, the ens3 vNIC no longer has a valid IPv4 address:
ubuntu@rfp-test-bootstrap:~/my-deployment/sylva-core$ !ssh
ssh -i ~/private/DALAnsibleKP2.pem node-admin@172.20.143.40
Last login: Mon May 13 14:04:22 2024 from 172.20.99.197
Have a lot of fun...
node-admin@free5g6-cluster-md-md0-7f29028a04-bth8k:~> ssh -i /tmp/DALAnsibleKP2.pem node-admin@192.168.4.33
Last login: Mon May 13 14:04:24 2024 from 192.168.4.35
Have a lot of fun...
node-admin@free5g6-cluster-md-md0-7f29028a04-2khtx:~> ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:33:80:3c brd ff:ff:ff:ff:ff:ff
altname enp0s3
inet6 fe80::f816:3eff:fe33:803c/64 scope link
valid_lft forever preferred_lft forever
3: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8950 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:4e:55:c7 brd ff:ff:ff:ff:ff:ff
altname enp0s4
inet 192.168.2.216/24 brd 192.168.2.255 scope global ens4
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:fe4e:55c7/64 scope link
valid_lft forever preferred_lft forever
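To make the same check on other nodes without reading the whole listing, the following sketch filters `ip a` style output and prints the interfaces that carry no `inet` (IPv4) line. The function name `ifaces_without_ipv4` is a hypothetical helper, and the sample input is a shortened copy of the output above.

```shell
# Read `ip a` style output on stdin and print interfaces with no IPv4 address.
ifaces_without_ipv4() {
  awk '
    # A line such as "2: ens3: <...>" starts a new interface section.
    /^[0-9]+: / {
      if (name != "" && !has4) print name
      name = $2
      sub(/:$/, "", name)
      has4 = 0
      next
    }
    # An "inet" line means the current interface has an IPv4 address.
    $1 == "inet" { has4 = 1 }
    END { if (name != "" && !has4) print name }
  '
}

ifaces_without_ipv4 <<'EOF'
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    inet 127.0.0.1/8 scope host lo
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast state UP group default qlen 1000
    inet6 fe80::f816:3eff:fe33:803c/64 scope link
3: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8950 qdisc pfifo_fast state UP group default qlen 1000
    inet 192.168.2.216/24 brd 192.168.2.255 scope global ens4
EOF
# prints: ens3
```

On a live node the same function can be fed directly with `ip a | ifaces_without_ipv4`.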
The CAPO logs then report 'VM does not exist', although the VM exists and can be properly rebooted at the Nova level. Strangely, some of these log lines refer to another Kubernetes namespace (namespace="wc-rim5"). Could there be a problem with the namespace context?
ubuntu@rfp-test-bootstrap:~/my-deployment/sylva-core$ k logs -n capo-system capo-controller-manager-6469cb4d47-qgrbd | grep free5g6-cluster-md-md0-7f29028a04-2khtx
I0513 01:29:39.515530 1 openstackmachine_controller.go:329] "Reconciling Machine" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="wc-free5g6/free5g6-cluster-md-md0-7f29028a04-2khtx" namespace="wc-free5g6" name="free5g6-cluster-md-md0-7f29028a04-2khtx" reconcileID="1e525794-d458-4860-b3e4-1ad9b769d66c" openStackMachine="free5g6-cluster-md-md0-7f29028a04-2khtx" machine="free5g6-cluster-md0-2d2tg-hqgpz" cluster="free5g6-cluster" openStackCluster="free5g6-cluster"
°°°
I0513 08:03:46.026107 1 openstackmachine_controller.go:473] "Machine does not exist, creating Machine" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="wc-rim5/rim5-cluster-md-md0-2fd17ee6b9-qd9kl" namespace="wc-rim5" reconcileID="b5aca506-9b68-4bae-af47-3666f3de3784" openStackMachine="rim5-cluster-md-md0-2fd17ee6b9-qd9kl" machine="rim5-cluster-md0-x2q8n-85dzt" cluster="rim5-cluster" openStackCluster="rim5-cluster" name="free5g6-cluster-md-md0-7f29028a04-2khtx"
> controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="wc-free5g6/free5g6-cluster-md-md0-7f29028a04-2khtx" namespace="wc-free5g6" name="free5g6-cluster-md-md0-7f29028a04-2khtx" reconcileID="239293a3-36c3-4e84-b164-092611696a20"
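In the "Machine does not exist" line above, the `OpenStackMachine=` key points at a wc-rim5 object while the trailing `name=` key carries the free5g6 machine name, which is why the grep on the free5g6 name matched it. A minimal sketch for isolating such lines is given below. The function name `mismatched_log_lines` is a hypothetical helper, and it assumes the `key="value"` log layout shown in this report.

```shell
# Read controller log lines on stdin and print those where the object name in
# OpenStackMachine="namespace/name" differs from the value of the name= key.
mismatched_log_lines() {
  awk '
    {
      obj = ""; name = ""
      # Value of OpenStackMachine="..." (capital O, so openStackMachine= is skipped).
      if (match($0, /OpenStackMachine="[^"]*"/))
        obj = substr($0, RSTART + 18, RLENGTH - 19)
      # Value of name="..." (leading space, so namespace= is skipped).
      if (match($0, / name="[^"]*"/))
        name = substr($0, RSTART + 7, RLENGTH - 8)
      # Keep only the part after the last "/" of namespace/name.
      n = split(obj, parts, "/")
      if (n > 0) obj = parts[n]
      if (obj != "" && name != "" && obj != name) print
    }
  '
}

# Possible usage against the controller pod from this report:
#   kubectl logs -n capo-system capo-controller-manager-6469cb4d47-qgrbd | mismatched_log_lines
```

Lines selected this way only show that the two keys disagree; whether that reflects a shared logger context or a real cross-namespace reconcile still has to be confirmed in the controller code.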
ubuntu@rfp-test-bootstrap:~/my-deployment/sylva-core$ k -n capo-system get pods capo-controller-manager-6469cb4d47-qgrbd -o yaml |grep image:
image: registry.gitlab.com/sylva-projects/sylva-elements/container-images/sandbox-registry/capi-openstack-controller-amd64:0.9.0-75ffe73f
image: registry.gitlab.com/sylva-projects/sylva-elements/container-images/sandbox-registry/capi-openstack-controller-amd64:0.9.0-75ffe73f
ubuntu@rfp-test-bootstrap:~/my-deployment/sylva-core$ k describe openstackmachine -n wc-free5g6 free5g6-cluster-md-md0-7f29028a04-2khtx
Name: free5g6-cluster-md-md0-7f29028a04-2khtx
Namespace: wc-free5g6
Labels: cluster.x-k8s.io/cluster-name=free5g6-cluster
cluster.x-k8s.io/deployment-name=free5g6-cluster-md0
cluster.x-k8s.io/set-name=free5g6-cluster-md0-2d2tg
machine-template-hash=1390215715-2d2tg
Annotations: cluster.x-k8s.io/cloned-from-groupkind: OpenStackMachineTemplate.infrastructure.cluster.x-k8s.io
cluster.x-k8s.io/cloned-from-name: free5g6-cluster-md-md0-7f29028a04
API Version: infrastructure.cluster.x-k8s.io/v1alpha8
Kind: OpenStackMachine
Metadata:
Creation Timestamp: 2024-04-24T13:31:36Z
Finalizers:
openstackmachine.infrastructure.cluster.x-k8s.io
Generation: 2
Owner References:
API Version: cluster.x-k8s.io/v1beta1
Block Owner Deletion: true
Controller: true
Kind: Machine
Name: free5g6-cluster-md0-2d2tg-hqgpz
UID: 6eb33d90-44d7-431c-b200-41135775ba90
Resource Version: 91722870
UID: 57f66399-698d-427f-8d2d-5181e1167d46
Spec:
Cloud Name: capo_cloud
Flavor: B1.xlarge
Identity Ref:
Kind: Secret
Name: free5g6-cluster-capo-cloud-config
Image UUID: e54f02c2-fbc3-4914-be6a-d8c9231a803e
Instance ID: db857449-7ef9-461c-b278-cd501402f38b
Ports:
Description: primary
Network:
Id: bd9fe5ed-a260-4879-9014-902cf455d40d
Profile:
Description: n2
Network:
Id: f7bdfa79-b041-4ada-82bb-798c0959c124
Profile:
Vnic Type: normal