Unresolvable PodMonitor reconciliation loop

Summary

StackGres operator logs show a continuous PodMonitor reconciliation loop that is never resolved.

Current Behaviour

The stackgres-operator pod logs the same two messages roughly once a minute (see the log excerpt below).

Steps to reproduce

Install the StackGres operator from Helm chart version 1.17.4.

Check the stackgres-operator pod logs.

Expected Behaviour

The PodMonitor should be reconciled once, and the operator should not repeatedly attempt to create it again.

Possible Solution

Unknown.

Environment

  • StackGres version: 1.17.4 (release name: stackgres-operator)

  • Kubernetes version: client v1.34.1, Kustomize v5.7.1, server v1.33.5-eks-3cfe0ce

  • Cloud provider or hardware configuration: AWS EKS

Relevant logs and/or screenshots

stackgres-operator pod log:

2025-11-18 12:10:33,015 INFO  [io.st.op.conciliation] (SGConfig-ReconciliationLoop) SGConfig stackgres.stackgres-operator it's not up to date. Reconciling
2025-11-18 12:10:33,015 INFO  [io.st.op.conciliation] (SGConfig-ReconciliationLoop) Creating PodMonitor stackgres.stackgres-collector
2025-11-18 12:11:33,281 INFO  [io.st.op.conciliation] (SGConfig-ReconciliationLoop) SGConfig stackgres.stackgres-operator it's not up to date. Reconciling
2025-11-18 12:11:33,281 INFO  [io.st.op.conciliation] (SGConfig-ReconciliationLoop) Creating PodMonitor stackgres.stackgres-collector
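
To put a number on the repeat interval, here is a minimal sketch that parses the timestamps of the "Creating PodMonitor" lines and prints the gap between consecutive attempts. The function name `creation_intervals` is made up for illustration, and the sketch assumes every log line starts with a 23-character timestamp in the format shown above.

```python
from datetime import datetime

# The two "Creating PodMonitor" lines copied from the operator log above.
LOG = """\
2025-11-18 12:10:33,015 INFO  [io.st.op.conciliation] (SGConfig-ReconciliationLoop) Creating PodMonitor stackgres.stackgres-collector
2025-11-18 12:11:33,281 INFO  [io.st.op.conciliation] (SGConfig-ReconciliationLoop) Creating PodMonitor stackgres.stackgres-collector
"""


def creation_intervals(log_text, marker="Creating PodMonitor"):
    """Return the seconds elapsed between consecutive lines containing marker."""
    stamps = []
    for line in log_text.splitlines():
        if marker in line:
            # Assumption: the first 23 characters are "YYYY-MM-DD HH:MM:SS,mmm".
            stamps.append(datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f"))
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


intervals = creation_intervals(LOG)
print(intervals)
```

For the excerpt above this gives a gap of about 60 seconds, consistent with a periodic reconciliation cycle rather than a tight retry loop.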

PodMonitor CR stackgres-collector:

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  creationTimestamp: '2025-11-17T15:58:57Z'
  generation: 1
  labels:
    app: StackGresConfig
    release: prometheus
    stackgres.io/config-name: stackgres-operator
    stackgres.io/config-namespace: stackgres
    stackgres.io/config-uid: 29578d81-26d8-453a-8cef-4baa401110fe
  name: stackgres-collector
  namespace: stackgres
  ownerReferences:
    - apiVersion: stackgres.io/v1
      blockOwnerDeletion: true
      controller: true
      kind: SGConfig
      name: stackgres-operator
      uid: 29578d81-26d8-453a-8cef-4baa401110fe
  resourceVersion: '43936'
  uid: 51f4de0c-0087-4b7c-b1be-751b727f169e
  selfLink: >-
    /apis/monitoring.coreos.com/v1/namespaces/stackgres/podmonitors/stackgres-collector
spec:
  namespaceSelector:
    matchNames:
      - stackgres
  podMetricsEndpoints:
    - honorLabels: true
      honorTimestamps: true
      path: /metrics
      port: prom-http
      scheme: https
      tlsConfig:
        ca:
          secret:
            key: tls.crt
            name: stackgres-collector
        serverName: stackgres-collector
  selector:
    matchLabels:
      app: StackGresConfig
      stackgres.io/collector: 'true'
      stackgres.io/config-name: stackgres-operator
      stackgres.io/config-uid: 29578d81-26d8-453a-8cef-4baa401110fe

stackgres-collector pod:

apiVersion: v1
kind: Pod
metadata:
  name: stackgres-collector-74d5dc9684-j29zr
  generateName: stackgres-collector-74d5dc9684-
  namespace: stackgres
  uid: 0f329d9f-59f1-4479-b5c3-accccddebb13
  resourceVersion: '467415'
  generation: 1
  creationTimestamp: '2025-11-18T11:38:34Z'
  labels:
    app: StackGresConfig
    pod-template-hash: 74d5dc9684
    stackgres.io/collector: 'true'
    stackgres.io/config-name: stackgres-operator
    stackgres.io/config-uid: 29578d81-26d8-453a-8cef-4baa401110fe
  ownerReferences:
    - apiVersion: apps/v1
      kind: ReplicaSet
      name: stackgres-collector-74d5dc9684
      uid: 59c90cee-67de-40c9-9540-47938714d3b3
      controller: true
      blockOwnerDeletion: true
  selfLink: /api/v1/namespaces/stackgres/pods/stackgres-collector-74d5dc9684-j29zr
status:
  phase: Running
  conditions:
    - type: PodReadyToStartContainers
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2025-11-18T11:38:52Z'
    - type: Initialized
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2025-11-18T11:38:34Z'
    - type: Ready
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2025-11-18T11:38:53Z'
    - type: ContainersReady
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2025-11-18T11:38:53Z'
    - type: PodScheduled
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2025-11-18T11:38:34Z'
  hostIP: 172.31.93.173
  hostIPs:
    - ip: 172.31.93.173
  podIP: 172.31.93.2
  podIPs:
    - ip: 172.31.93.2
  startTime: '2025-11-18T11:38:34Z'
  containerStatuses:
    - name: stackgres-collector
      state:
        running:
          startedAt: '2025-11-18T11:38:52Z'
      lastState: {}
      ready: true
      restartCount: 0
      image: quay.io/ongres/otel-collector:v0.136.0-build-6.44
      imageID: >-
        quay.io/ongres/otel-collector@sha256:4f8f74da6325d7397b65fd2c819d0539741a74b93ea282d40cea9f44974ccc0a
      containerID: >-
        containerd://c9591bf441c8429c2da4558fa3f738d929287027dad906c814a4520eae64a77d
      started: true
      allocatedResources:
        cpu: 250m
        memory: 1Gi
      resources:
        limits:
          cpu: '1'
          memory: 4Gi
        requests:
          cpu: 250m
          memory: 1Gi
      volumeMounts:
        - name: collector-certs
          mountPath: /etc/operator/certs
          readOnly: true
          recursiveReadOnly: Disabled
        - name: collector-config
          mountPath: /etc/collector
          readOnly: true
          recursiveReadOnly: Disabled
        - name: collector-scripts
          mountPath: /usr/local/bin/start-otel-collector.sh
          readOnly: true
          recursiveReadOnly: Disabled
        - name: kube-api-access-xtkg9
          mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          readOnly: true
          recursiveReadOnly: Disabled
  qosClass: Burstable
spec:
  volumes:
    - name: collector-certs
      secret:
        secretName: stackgres-operator-collector-certs
        defaultMode: 420
        optional: false
    - name: collector-scripts
      configMap:
        name: stackgres-collector
        items:
          - key: start-otel-collector.sh
            path: start-otel-collector.sh
        defaultMode: 420
        optional: false
    - name: collector-config
      configMap:
        name: stackgres-collector
        items:
          - key: config.yaml
            path: config.yaml
        defaultMode: 420
        optional: false
    - name: kube-api-access-xtkg9
      projected:
        sources:
          - serviceAccountToken:
              expirationSeconds: 3607
              path: token
          - configMap:
              name: kube-root-ca.crt
              items:
                - key: ca.crt
                  path: ca.crt
          - downwardAPI:
              items:
                - path: namespace
                  fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.namespace
        defaultMode: 420
  containers:
    - name: stackgres-collector
      image: quay.io/ongres/otel-collector:v0.136.0-build-6.44
      command:
        - /bin/bash
        - '-e'
        - /usr/local/bin/start-otel-collector.sh
      ports:
        - name: prom-http
          containerPort: 9464
          protocol: TCP
        - name: oltp-port
          containerPort: 4317
          protocol: TCP
      env:
        - name: HOME
          value: /tmp
        - name: OPERATOR_VERSION
          value: 1.17.4
        - name: COLLECTOR_CONFIG_PATH
          value: /etc/collector/config.yaml
      resources:
        limits:
          cpu: '1'
          memory: 4Gi
        requests:
          cpu: 250m
          memory: 1Gi
      volumeMounts:
        - name: collector-certs
          readOnly: true
          mountPath: /etc/operator/certs
        - name: collector-config
          readOnly: true
          mountPath: /etc/collector
        - name: collector-scripts
          readOnly: true
          mountPath: /usr/local/bin/start-otel-collector.sh
          subPath: start-otel-collector.sh
        - name: kube-api-access-xtkg9
          readOnly: true
          mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      livenessProbe:
        httpGet:
          path: /
          port: 13133
          scheme: HTTP
        initialDelaySeconds: 5
        timeoutSeconds: 10
        periodSeconds: 30
        successThreshold: 1
        failureThreshold: 3
      readinessProbe:
        httpGet:
          path: /
          port: 13133
          scheme: HTTP
        timeoutSeconds: 1
        periodSeconds: 2
        successThreshold: 1
        failureThreshold: 3
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      imagePullPolicy: IfNotPresent
      securityContext:
        runAsUser: 1000
        runAsGroup: 1000
        runAsNonRoot: true
  restartPolicy: Always
  terminationGracePeriodSeconds: 30
  dnsPolicy: ClusterFirst
  serviceAccountName: stackgres-collector
  serviceAccount: stackgres-collector
  nodeName: i-0e02cbe72916c766c
  shareProcessNamespace: true
  securityContext:
    runAsNonRoot: true
    fsGroup: 1000
  schedulerName: default-scheduler
  tolerations:
    - key: node.kubernetes.io/not-ready
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 300
    - key: node.kubernetes.io/unreachable
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 300
  priority: 0
  enableServiceLinks: true
  preemptionPolicy: PreemptLowerPriority
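
As a sanity check on the two manifests above, the following sketch tests whether the PodMonitor's `selector.matchLabels` would select the collector pod and whether the endpoint port name exists on the pod. The label and port values are copied by hand from the YAML above. The helper `selector_matches` is a hypothetical name, and it only models plain `matchLabels` equality, not `matchExpressions`.

```python
def selector_matches(match_labels, pod_labels):
    """True if every key/value pair in match_labels is present in pod_labels."""
    return all(pod_labels.get(key) == value for key, value in match_labels.items())


# spec.selector.matchLabels from the PodMonitor above.
podmonitor_match_labels = {
    "app": "StackGresConfig",
    "stackgres.io/collector": "true",
    "stackgres.io/config-name": "stackgres-operator",
    "stackgres.io/config-uid": "29578d81-26d8-453a-8cef-4baa401110fe",
}

# metadata.labels from the stackgres-collector pod above.
pod_labels = {
    "app": "StackGresConfig",
    "pod-template-hash": "74d5dc9684",
    "stackgres.io/collector": "true",
    "stackgres.io/config-name": "stackgres-operator",
    "stackgres.io/config-uid": "29578d81-26d8-453a-8cef-4baa401110fe",
}

# Port named in podMetricsEndpoints, and the named container ports on the pod.
endpoint_port = "prom-http"
pod_port_names = {"prom-http", "oltp-port"}

labels_ok = selector_matches(podmonitor_match_labels, pod_labels)
port_ok = endpoint_port in pod_port_names
print(f"selector matches pod: {labels_ok}, endpoint port present: {port_ok}")
```

Both checks pass for the values shown, so a selector or port mismatch does not appear to explain the repeated "Creating PodMonitor" messages. The PodMonitor's creationTimestamp (2025-11-17) also shows that the resource already exists while the operator keeps logging that it is creating it.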