SGDbOps may get stuck running in some cases

Summary

When an attempt to update the status of the SGDbOps fails because of a timeout, the operation remains stuck: it either never starts running or stays in the running state forever.

Current Behavior

When the Job associated with the SGDbOps fails, the status of the SGDbOps is never changed.

Steps to reproduce

  1. Create a cluster
  2. Create a restart SGDbOps
  3. After the SGDbOps has started running create a NetworkPolicy that blocks connections from the SGDbOps Pod to the Kubernetes API.
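
For step 3, a minimal sketch of one way to build such a NetworkPolicy, written in Python so the manifest can be printed as JSON and passed to `kubectl apply -f -`. It relies on assumptions that are not part of this report: the policy denies all egress from the selected Pod (which also cuts it off from the Kubernetes API, but is broader than strictly needed), the Pod is selected through the `job-name` label that the Kubernetes Job controller adds to Pods it creates, the namespace and Job name are hypothetical placeholders, and the cluster's network plugin enforces NetworkPolicy.

```python
import json


def deny_egress_policy(namespace: str, job_name: str) -> dict:
    """Build a NetworkPolicy manifest that denies all egress from one Job's Pods.

    Both arguments are placeholders: the real Job name created for the
    SGDbOps has to be looked up in the cluster.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            "name": f"block-egress-{job_name}",
            "namespace": namespace,
        },
        "spec": {
            # Assumption: Pods are matched by the "job-name" label set by
            # the Job controller.
            "podSelector": {"matchLabels": {"job-name": job_name}},
            # Declaring the Egress policy type with no egress rules denies
            # every outgoing connection from the selected Pods.
            "policyTypes": ["Egress"],
            "egress": [],
        },
    }


def render(namespace: str, job_name: str) -> str:
    """Serialize the manifest as JSON, which kubectl accepts like YAML."""
    return json.dumps(deny_egress_policy(namespace, job_name), indent=2)
```

Calling `print(render("default", "my-dbops-job"))` with the actual namespace and Job name yields a manifest that can be applied once the SGDbOps is running.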

Expected Behavior

When the Job associated with the SGDbOps fails, the status of the SGDbOps has to be changed to failed.

Possible Solution

The operator should detect a failed Job associated with the SGDbOps and, if the status of the SGDbOps does not reflect the status of the Job, update the SGDbOps status accordingly.
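
The check described above could look roughly like the following sketch. It is only an illustration of the decision logic, not the operator's actual code: the function name, the plain-string status values (`"Running"`, `"Failed"`), and the boolean `job_failed` input are hypothetical simplifications of whatever the operator reads from the Job and the SGDbOps resources.

```python
from typing import Optional

# Hypothetical status value used for this sketch only.
FAILED = "Failed"


def reconcile_dbops_status(job_failed: bool,
                           dbops_status: Optional[str]) -> Optional[str]:
    """Return the status to write on the SGDbOps, or None if nothing changes.

    job_failed:   whether the Job associated with the SGDbOps has failed.
    dbops_status: the status currently recorded on the SGDbOps
                  (None when no status was ever written).
    """
    # The Job failed but the SGDbOps does not say so: the Pod could not
    # report its own failure, so the operator has to record it.
    if job_failed and dbops_status != FAILED:
        return FAILED
    # Either the Job has not failed or the status already reflects it.
    return None
```

With this shape, a reconciliation loop would call the function for every SGDbOps that has an associated Job and patch the resource only when the result is not `None`, which covers both the "never running" case (no status yet) and the "forever running" case.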

Environment

  • StackGres version: 1.1.0
  • Kubernetes version: *
  • Cloud provider or hardware configuration:
Edited by Matteo Melli