[backport-1.4] Fix: Update BusyBox image tag to 1.36.0 for nfs-ganesha test jobs

What does this MR do and why?

This MR is a backport of !5536 (merged)

Closes #2914 (closed)

Background:

The nfs-ganesha test jobs (write-data-from-pod-1, write-data-from-pod-2, read-data-from-pod-3) were failing in the pipeline with ErrImagePull errors. Investigation showed that pulling the previously used image tag busybox:1.37.0 fails containerd's size validation on cluster nodes ("failed size validation: 9535 != 9534"), most likely because of a corrupted container image blob on an intermediate registry mirror. As a result, the job pods never start.

Changes:

Updated BusyBox image from 1.37.0 → 1.36.0 in all nfs-ganesha Job manifests.

This ensures the jobs can pull the image successfully on Sylva cluster nodes on the CI platforms impacted by the blob corruption.
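As a minimal sketch of how one might confirm that no manifest still references the failing tag, the snippet below scans manifest text for `image:` lines. The function name `stale_image_refs` and the sample Job fragment are hypothetical illustrations, not taken from the sylva-core repository; only the job name, namespace, container name, and image tags come from this MR.

```python
import re

# Matches YAML lines such as "image: busybox:1.36.0" or "- image: 'busybox:1.36.0'".
IMAGE_LINE = re.compile(r"^\s*(?:-\s*)?image:\s*[\"']?([^\"'\s#]+)", re.MULTILINE)


def stale_image_refs(manifest_text, stale="busybox:1.37.0"):
    """Return every image reference in manifest_text that equals `stale`."""
    return [img for img in IMAGE_LINE.findall(manifest_text) if img == stale]


# Hypothetical Job manifest fragment (assumed layout, for illustration only).
SAMPLE = """\
apiVersion: batch/v1
kind: Job
metadata:
  name: write-data-from-pod-1
  namespace: nfs-ganesha
spec:
  template:
    spec:
      containers:
        - name: test-container
          image: busybox:1.36.0
      restartPolicy: Never
"""

if __name__ == "__main__":
    # An empty list means the manifest no longer uses the failing tag.
    print(stale_image_refs(SAMPLE))
```

A plain text scan is used here instead of a YAML parser so that the sketch has no third-party dependencies; it would not catch image references built from templated values.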

Failed pipeline and job:

https://gitlab.com/sylva-projects/sylva-core/-/pipelines/2046588669

https://gitlab.com/sylva-projects/sylva-core/-/jobs/11061482130#L2862

The jobs summary reports the failed nfs-ganesha jobs:

NAMESPACE                NAME                                        STATUS     COMPLETIONS   DURATION   AGE     CONTAINERS                             IMAGES                                                                                   SELECTOR
nfs-ganesha              read-data-from-pod-3                        Failed     0/1           12m        12m     test-container                         busybox:1.37.0                                                                           batch.kubernetes.io/controller-uid=5e8b6066-4167-4a6d-b7b5-af502c1391a2
nfs-ganesha              write-data-from-pod-1                       Failed     0/1           12m        12m     test-container                         busybox:1.37.0                                                                           batch.kubernetes.io/controller-uid=37e36837-b88e-47ff-aa62-1b8b8bb65735
nfs-ganesha              write-data-from-pod-2                       Failed     0/1           12m        12m     test-container                         busybox:1.37.0                                                                           batch.kubernetes.io/controller-uid=982b3c93-d0bb-4188-99b8-1af3a0d9a4f0

The events show a FailedPrecondition error during the image pull:

2025-09-18T07:29:13Z	2025-09-18T07:32:20Z	kubelet-mgmt-2046588669-rke2-capo-md0-plln9-dw6g7	Pod	nfs-ganesha	read-data-from-pod-3-w5wtq	5	Failed	Error: ErrImagePull
2025-09-18T07:29:14Z	2025-09-18T07:32:20Z	kubelet-mgmt-2046588669-rke2-capo-md0-plln9-z2555	Pod	nfs-ganesha	write-data-from-pod-1-smdpd	5	Failed	"Failed to pull image ""busybox:1.37.0"": rpc error: code = FailedPrecondition desc = failed to pull and unpack image ""docker.io/library/busybox:1.37.0"": failed commit on ref ""index-sha256:d82f458899c9696cb26a7c02d5568f81c8c8223f8661bb2a7988b269c8b9051e"": ""index-sha256:d82f458899c9696cb26a7c02d5568f81c8c8223f8661bb2a7988b269c8b9051e"" failed size validation: 9535 != 9534: failed precondition"
2025-09-18T07:29:14Z	2025-09-18T07:32:20Z	kubelet-mgmt-2046588669-rke2-capo-md0-plln9-z2555	Pod	nfs-ganesha	write-data-from-pod-1-smdpd	5	Failed	Error: ErrImagePull
2025-09-18T07:29:13Z	2025-09-18T07:32:22Z	kubelet-mgmt-2046588669-rke2-capo-md0-plln9-vqvsm	Pod	nfs-ganesha	write-data-from-pod-2-l46mg	5	Failed	"Failed to pull image ""busybox:1.37.0"": rpc error: code = FailedPrecondition desc = failed to pull and unpack image ""docker.io/library/busybox:1.37.0"": failed commit on ref ""index-sha256:d82f458899c9696cb26a7c02d5568f81c8c8223f8661bb2a7988b269c8b9051e"": ""index-sha256:d82f458899c9696cb26a7c02d5568f81c8c8223f8661bb2a7988b269c8b9051e"" failed size validation: 9535 != 9534: failed precondition"
2025-09-18T07:29:13Z	2025-09-18T07:32:22Z	kubelet-mgmt-2046588669-rke2-capo-md0-plln9-vqvsm	Pod	nfs-ganesha	write-data-from-pod-2-l46mg	5	Failed	Error: ErrImagePull

Pulling the image manually on a node gives the same error:

management-cluster-cp-bcd91aed86-8grn9:/home/node-admin # crictl pull busybox:1.37.0
E0919 03:31:49.522102   14042 log.go:32] "PullImage from image service failed" err="rpc error: code = FailedPrecondition desc = failed to pull and unpack image \"docker.io/library/busybox:1.37.0\": failed commit on ref \"index-sha256:d82f458899c9696cb26a7c02d5568f81c8c8223f8661bb2a7988b269c8b9051e\": \"index-sha256:d82f458899c9696cb26a7c02d5568f81c8c8223f8661bb2a7988b269c8b9051e\" failed size validation: 9535 != 9534: failed precondition" image="busybox:1.37.0"
FATA[0001] pulling image: rpc error: code = FailedPrecondition desc = failed to pull and unpack image "docker.io/library/busybox:1.37.0": failed commit on ref "index-sha256:d82f458899c9696cb26a7c02d5568f81c8c8223f8661bb2a7988b269c8b9051e": "index-sha256:d82f458899c9696cb26a7c02d5568f81c8c8223f8661bb2a7988b269c8b9051e" failed size validation: 9535 != 9534: failed precondition
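To spot this failure mode quickly in events or `crictl` output, the message can be parsed for the affected reference and the two sizes. The following is a rough sketch that assumes the message format shown above; the names `SIZE_ERR` and `parse_size_validation` are hypothetical. The log does not say which of the two numbers is the expected size, so the fields are named neutrally as `left` and `right`.

```python
import re

# Assumed format, copied from the log above:
#   "<ref>" failed size validation: <left> != <right>
# \W+ tolerates the plain, backslash-escaped, and doubled quotes seen in the pasted output.
SIZE_ERR = re.compile(
    r"(?P<ref>[A-Za-z]+-sha256:[0-9a-f]+)\W+failed size validation: "
    r"(?P<left>\d+) != (?P<right>\d+)"
)


def parse_size_validation(message):
    """Return ref and both sizes from a size-validation error, or None if absent."""
    m = SIZE_ERR.search(message)
    if m is None:
        return None
    left, right = int(m.group("left")), int(m.group("right"))
    return {"ref": m.group("ref"), "left": left, "right": right, "delta": left - right}


# Excerpt of the FATA line from the manual crictl pull above.
SAMPLE_LINE = (
    'failed commit on ref '
    '"index-sha256:d82f458899c9696cb26a7c02d5568f81c8c8223f8661bb2a7988b269c8b9051e": '
    '"index-sha256:d82f458899c9696cb26a7c02d5568f81c8c8223f8661bb2a7988b269c8b9051e" '
    'failed size validation: 9535 != 9534: failed precondition'
)

if __name__ == "__main__":
    print(parse_size_validation(SAMPLE_LINE))
```

For the line above, the two sizes differ by one, which fits the suspicion of a slightly altered index blob on a mirror.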

After updating the image tag, the jobs complete successfully:

NAMESPACE             NAME                                        STATUS     COMPLETIONS   DURATION   AGE     CONTAINERS              IMAGES                                                                               SELECTOR
nfs-ganesha           read-data-from-pod-3                        Complete   1/1           13s        3m54s   test-container          busybox:1.36.0                                                                         batch.kubernetes.io/controller-uid=78d0d0c1-e6b7-448b-be03-74f7e7054272
nfs-ganesha           write-data-from-pod-1                       Complete   1/1           8s         3m51s   test-container          busybox:1.36.0                                                                         batch.kubernetes.io/controller-uid=8b8048ff-5568-47e7-9277-2a62c1e06087
nfs-ganesha           write-data-from-pod-2                       Complete   1/1           4s         3m45s   test-container          busybox:1.36.0                                                                         batch.kubernetes.io/controller-uid=18d62c26-ac13-4149-81a2-3560078ce331

Related reference(s)

Test coverage

CI configuration

Below you can choose test deployment variants to run in this MR's CI.


Legend:

Icon | Meaning | Available values
☁️ | Infra Provider | capd, capo, capm3
🚀 | Bootstrap Provider | kubeadm (alias kadm), rke2, okd, ck8s
🐧 | Node OS | ubuntu, suse, na, leapmicro
🛠️ | Deployment Options | light-deploy, dev-sources, ha, misc, maxsurge-0, logging, no-logging, openbao
🎬 | Pipeline Scenarios | Available scenario list and description
  • ☁️ capo 🚀 rke2 🛠️ ha,misc 🐧 ubuntu

Global config for deployment pipelines

  • autorun pipelines
  • allow failure on pipelines
  • record sylvactl events

Notes:

  • Enabling autorun makes deployment pipelines run automatically, without human interaction.
  • Disabling allow failure makes deployment pipelines mandatory for pipeline success.
  • If both autorun and allow failure are disabled, deployment pipelines need manual triggering but still block the pipeline.

Be aware: after a configuration change, the pipeline is not triggered automatically. Run it manually (by clicking the run pipeline button in the Pipelines tab) or push new code.
