Data on Longhorn disks are not cleaned during the Ironic cleaning phase

The disk data is not cleaned when a baremetal node is provisioned for the first time in a cluster.
If a baremetal node is used in cluster foo and is later deprovisioned and reprovisioned in cluster bar, the data that cluster foo created on the Longhorn disk is still accessible once the node is reprovisioned in cluster bar:

kubectl get bmh 
NAME                    STATE         CONSUMER                            ONLINE   ERROR   AGE
mgmt-29755967-r640-10   provisioned   mgmt-29755967-cp-46fc33d0d3-n4zdq   true             5h14m <<<<<<<<<<<<
mgmt-29755967-r640-12   provisioned   mgmt-29755967-cp-46fc33d0d3-bltsh   true             5h14m
mgmt-29755967-r640-13   provisioned   mgmt-29755967-md0-4fsg5-dff94       true             5h14m
mgmt-29755967-r640-9    provisioned   mgmt-29755967-cp-46fc33d0d3-gsrpv   true             5h14m


mgmt-29755967-r640-10:/var/longhorn/disks/disk_by-path_pci-0000:18:00.0-scsi-0:0:2:0/replicas # ls -l
total 4900
drwx------ 2 root root 4096 Aug  9 09:46 pvc-0007c3e7-c9e7-4602-8a3d-a219283e94d9-7f329272
drwx------ 2 root root 4096 Aug 29 13:31 pvc-000ab522-f484-4f7f-8c05-8c582f5d902c-a1f84a51
drwx------ 2 root root 4096 Jul 18 04:53 pvc-000bd7a2-d553-44dc-9baa-ac7bb18622f7-dd1062d5
drwx------ 2 root root 4096 Aug  9 04:50 pvc-001d7588-b3ad-4fb2-9192-31d18d5f25f5-e25d6aee
drwx------ 2 root root 4096 Aug  1 05:00 pvc-0023c8d8-292e-4f80-8d76-32d44a20bb5d-02c96fe0
drwx------ 2 root root 4096 Aug 23 01:52 pvc-003f672e-852a-4dcf-a548-8bae9676d3e9-3320cac4
drwx------ 2 root root 4096 Jul 24 05:02 pvc-007bc0c1-8d69-416f-bf06-11424cd5b307-4b172e50
drwx------ 2 root root 4096 Aug  9 04:50 pvc-00850aaf-906d-4c22-8517-a767374e6ae3-4ecb0908
drwx------ 2 root root 4096 Aug 29 13:20 pvc-008b1dab-6d42-4483-9d90-c12d60347208-487804f2
drwx------ 2 root root 4096 Jul 26 04:54 pvc-01053eaa-0f37-425a-9af4-33949ea8a11a-38fcabab
drwx------ 2 root root 4096 Jul 19 05:14 pvc-01d34358-ead9-4c53-938d-8c11018fee71-4131013e
drwx------ 2 root root 4096 Jul 23 04:53 pvc-01d446d6-a2ae-45bc-acac-c536c19b15ec-43e7460c
drwx------ 2 root root 4096 Sep  7 05:01 pvc-0211acd3-8778-4edd-ba29-dce1a60ba877-90fd9b1c
drwx------ 2 root root 4096 Aug 14 05:30 pvc-024cb1e7-d8b9-41cb-87dd-e211e723c6f3-4a9df778
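
The replica directories above still carry timestamps that predate the current provisioning, which suggests the filesystem on that Longhorn disk was never wiped and recreated. A quick way to double-check this on the node is to list the signatures still present on the backing device (a sketch: the by-path device below is inferred from the Longhorn disk directory name, and wipefs -n only reports signatures without erasing anything):

# Hypothetical check on mgmt-29755967-r640-10
blkid /dev/disk/by-path/pci-0000:18:00.0-scsi-0:0:2:0
wipefs -n /dev/disk/by-path/pci-0000:18:00.0-scsi-0:0:2:0

If the Ironic metadata erase had actually run, these signatures would have been destroyed and Longhorn would have had to start from an empty disk.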

However, Ironic does provide a cleaning mechanism, and it appears to be enabled:

[conductor]
automated_clean = true
[deploy]
erase_devices_metadata_priority = 10
erase_devices_priority = 0
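
With these settings, erase_devices_priority = 0 disables the full disk shred step, while erase_devices_metadata_priority = 10 keeps the much faster metadata erase enabled, so an automated clean should at least wipe the partition tables and filesystem signatures of every disk. As a side note, one way to verify that the step itself works on this hardware would be to run it as a manual clean (a sketch, assuming the OpenStack baremetal CLI can reach this Ironic API; the node has to be moved to the manageable state first):

# Hypothetical manual clean, reusing the node UUID from the conductor log below
openstack baremetal node manage a4e3a31c-b9b3-4f6d-9122-bd88302d33aa
openstack baremetal node clean a4e3a31c-b9b3-4f6d-9122-bd88302d33aa \
    --clean-steps '[{"interface": "deploy", "step": "erase_devices_metadata"}]'
openstack baremetal node provide a4e3a31c-b9b3-4f6d-9122-bd88302d33aa

Here is the conductor log that was captured when the node went through the "provide" action:
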
2024-09-10 04:21:13.233 1 DEBUG ironic.common.states [None req-c27ed036-931f-46ce-893d-69db63893c86 ironic - - - - -] Entering new state 'cleaning' in response to event 'provide' on_enter /usr/lib/python3.11/site-packages/ironic/common/states.py:366[00m
2024-09-10 04:21:13.259 1 INFO ironic.conductor.task_manager [None req-c27ed036-931f-46ce-893d-69db63893c86 ironic - - - - -] Node a4e3a31c-b9b3-4f6d-9122-bd88302d33aa moved to provision state "cleaning" from state "manageable"; target provision state is "available"[00m
2024-09-10 04:21:13.260 1 INFO eventlet.wsgi.server [None req-c27ed036-931f-46ce-893d-69db63893c86 ironic - - - - -] ::ffff:100.100.0.1 "PUT /v1/nodes/a4e3a31c-b9b3-4f6d-9122-bd88302d33aa/states/provision HTTP/1.1" status: 202  len: 364 time: 0.1301506[00m
2024-09-10 04:21:13.260 1 DEBUG ironic.conductor.cleaning [None req-c27ed036-931f-46ce-893d-69db63893c86 ironic - - - - -] Starting automated cleaning for node a4e3a31c-b9b3-4f6d-9122-bd88302d33aa do_node_clean /usr/lib/python3.11/site-packages/ironic/conductor/cleaning.py:45[00m
2024-09-10 04:21:13.263 1 INFO eventlet.wsgi.server [None req-60599d8a-f5f6-433a-a780-0d3bc10de971 - - - - - -] ::ffff:100.100.0.1 "GET /v1/ HTTP/1.1" status: 200  len: 2879 time: 0.0014503[00m
2024-09-10 04:21:13.362 1 INFO eventlet.wsgi.server [None req-5a945bc0-10f6-4a11-bd51-28d0132a34b8 ironic - - - - -] ::ffff:100.100.0.1 "GET /v1/drivers HTTP/1.1" status: 200  len: 3609 time: 0.0968921[00m
2024-09-10 04:21:13.460 1 INFO eventlet.wsgi.server [None req-62aa17dc-0c9c-481c-bab3-76784a38e07b ironic - - - - -] ::ffff:100.100.0.1 "GET /v1/nodes/a4e3a31c-b9b3-4f6d-9122-bd88302d33aa HTTP/1.1" status: 200  len: 4204 time: 0.0953605[00m
2024-09-10 04:21:13.541 1 DEBUG ironic.common.states [None req-c27ed036-931f-46ce-893d-69db63893c86 ironic - - - - -] Exiting old state 'cleaning' in response to event 'done' on_exit /usr/lib/python3.11/site-packages/ironic/common/states.py:360[00m
2024-09-10 04:21:13.542 1 DEBUG ironic.common.states [None req-c27ed036-931f-46ce-893d-69db63893c86 ironic - - - - -] Entering new state 'available' in response to event 'done' on_enter /usr/lib/python3.11/site-packages/ironic/common/states.py:366[00m
2024-09-10 04:21:13.566 1 INFO eventlet.wsgi.server [None req-1c87e063-b846-44ba-890e-5f502bdfdde7 ironic - - - - -] ::ffff:100.100.0.1 "GET /v1/ports?fields=node_uuid&node_uuid=a4e3a31c-b9b3-4f6d-9122-bd88302d33aa HTTP/1.1" status: 200  len: 2984 time: 0.1052303[00m
2024-09-10 04:21:13.646 1 INFO ironic.conductor.task_manager [None req-c27ed036-931f-46ce-893d-69db63893c86 ironic - - - - -] Node a4e3a31c-b9b3-4f6d-9122-bd88302d33aa moved to provision state "available" from state "cleaning"; target provision state is "None"[00m
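
Note that in the log above the node moves from "cleaning" to "available" in roughly 0.4 seconds, far too fast for the IPA ramdisk to have booted and executed erase_devices_metadata, so no clean step appears to have actually run. One possible explanation (an assumption to verify, not something shown by the excerpt) is that the BareMetalHost resources set spec.automatedCleaningMode, which overrides the conductor-level automated_clean setting on a per-node basis; this can be checked with something like:

kubectl get bmh -o custom-columns='NAME:.metadata.name,CLEANING:.spec.automatedCleaningMode'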

Docs:

  • https://static.opendev.org/docs/ironic/6.2.4/deploy/cleaning.html#enabling-automated-cleaning
  • https://taikun.cloud/ocp-admin-guide/node-cleaning/
  • https://github.com/metal3-io/baremetal-operator/issues/626