upgrade issues around RKE2 k8s metrics server

I observed the following while working on !3090 (closed), in job https://gitlab.com/sylva-projects/sylva-core/-/jobs/8340196863:

  • workload cluster namespace deletion was stuck on:
NAME              STATUS        AGE
rke2-capm3-virt   Terminating   137m
Namespace 'rke2-capm3-virt' deletion did not complete (.status below)
conditions:
  - lastTransitionTime: "2024-11-13T00:21:10Z"
    message: 'Discovery failed for some groups, 1 failing: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: stale GroupVersion discovery: metrics.k8s.io/v1beta1'
    reason: DiscoveryFailed
    status: "True"
    type: NamespaceDeletionDiscoveryFailure
  • this is due to the kube-system/rke2-metrics-server Service having no endpoints

  • this is because the selector of the Service does not match the pod labels (commands to confirm this are sketched after this list):

    • selector is:

      selector:
        app: rke2-metrics-server
        app.kubernetes.io/instance: rke2-metrics-server
        app.kubernetes.io/name: rke2-metrics-server

      the selector was last updated at 2024-11-12T23:06:26Z (rke2-metrics-server chart version rke2-metrics-server-3.12.003)

    • labels of the only rke2-metrics-server pod:

      labels:
        app: rke2-metrics-server
        pod-template-hash: 544c8c66fc
        release: rke2-metrics-server

      the app.kubernetes.io/instance and app.kubernetes.io/name labels are missing, so the selector cannot match this pod

      this pod is from the previous version of rke2-metrics-server (the Deployment still carries the chart: rke2-metrics-server-2.11.100-build2023051513 label)

  • I observed that the upgrade of rke2-metrics-server had failed -- in the logs of the kube-system/helm-install-rke2-metrics-server-276wj pod (inspection commands are sketched after this list):

Release status is 'failed' and failure policy is 'abort', not 'reinstall'; waiting for operator intervention
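
For reference, a minimal diagnostic sketch of the checks behind the observations above (assuming a kubeconfig pointing at the affected RKE2 workload cluster; object names are taken from this job):

# why the namespace deletion is stuck: discovery failure on the aggregated metrics API
kubectl get namespace rke2-capm3-virt -o jsonpath='{.status.conditions}'
kubectl get apiservice v1beta1.metrics.k8s.io

# the Service backing the aggregated API has no endpoints because no pod matches its selector
kubectl -n kube-system get endpoints rke2-metrics-server
kubectl -n kube-system get service rke2-metrics-server -o jsonpath='{.spec.selector}'
kubectl -n kube-system get pods -l app=rke2-metrics-server --show-labels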
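
To dig into the failed release itself, the usual inspection points would be the following (a sketch only; the HelmChart object name is an assumption derived from the release name, and I have not checked what the recommended recovery path is for RKE2's helm-controller):

# history and current status of the Helm release managed by helm-controller
helm -n kube-system history rke2-metrics-server
helm -n kube-system status rke2-metrics-server

# the HelmChart custom resource driving the install job, and the job logs quoted above
kubectl -n kube-system get helmchart rke2-metrics-server -o yaml
kubectl -n kube-system logs job/helm-install-rke2-metrics-server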

Summary:

  • the Helm upgrade failed (I don't know the reason)
  • the rke2-metrics-server Service was updated with a new pod selector, using labels that pods of the previous release do not have
  • the rke2-metrics-server Deployment was not updated, and hence the rke2-metrics-server pods still don't have the new labels

Possibilities for resolution:

  • understand why the Helm release status is failed
  • an upstream fix (the selector change could and should be done in a way that does not disrupt the Service)
  • something to live-adjust the pod labels, or the Service's pod selector, to allow a smoother transition (e.g. a Kyverno policy; a rough sketch follows this list)
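
On the last point, a rough and untested sketch of what such a transition helper could look like as a Kyverno "mutate existing" policy; the policy name is made up, the label set is copied from the new selector above, and the exact fields should be double-checked against the Kyverno version deployed in the cluster:

# sketch only, not tested: add the labels expected by the new Service selector
# to the existing rke2-metrics-server pods so the Service regains endpoints
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: rke2-metrics-server-selector-transition   # made-up name
spec:
  mutateExistingOnPolicyUpdate: true   # also patch pods that already exist
  rules:
    - name: add-new-selector-labels
      match:
        any:
          - resources:
              kinds:
                - Pod
              namespaces:
                - kube-system
              selector:
                matchLabels:
                  app: rke2-metrics-server
      mutate:
        targets:
          - apiVersion: v1
            kind: Pod
            namespace: kube-system
        patchStrategicMerge:
          metadata:
            labels:
              app.kubernetes.io/instance: rke2-metrics-server
              app.kubernetes.io/name: rke2-metrics-server

A one-shot manual equivalent would be to label the existing pods directly, e.g. kubectl -n kube-system label pod -l app=rke2-metrics-server app.kubernetes.io/instance=rke2-metrics-server app.kubernetes.io/name=rke2-metrics-server, until the upgrade goes through cleanly.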