Use SGObjectStorage CRD for Postgres base backup and WAL storage

The proposed CRD is a small evolution of the existing SGBackupConfig.storage subobject. The main goal of extracting it into a separate CRD is to make it reusable and referenceable from different contexts.

This would make it possible to support clusters initialized from a Postgres backup and WAL archive. With a CRD that references this archive, two use cases would be fully supported:

  • A user wants to have a cluster replicating from the archive, and there's an existing CRD representing that archive within the same K8s cluster (and namespace). In this case, a reference to the existing CRD should be enough.
  • A user wants to have a cluster replicating from the archive, on a different K8s cluster (i.e. a DR scenario). In this case, a CRD local to the destination K8s cluster needs to be created, but it can have the same or similar contents as the source CRD, which can easily be copied and then referenced.

The proposed name for this new CRD is SGObjectStorage.

More precisely, this proposal implies the following changes:

  1. Extract the storage subobject from the SGBackupConfig CRD and "upgrade" it to a full CRD called SGObjectStorage. (See #1564 (closed))
  2. Change the SGBackupConfig to include a new property, .spec.sgObjectStorage, that will become a reference to the new CRD.
  3. Add a new property to the new CRD, called SGObjectStorage.mode, which will be an enum with possible values RW, RO. This would be useful for standby clusters, which might be provided with more restricted user access credentials.
  4. Create the section .spec.configurations.backups in the SGCluster CRD: an array (limited to a single element) that will replace the field .spec.configurations.sgBackupConfig. Each element includes a reference to an SGObjectStorage plus the other properties of SGBackupConfig.
  5. Deprecate the field .spec.configurations.sgBackupConfig in the SGCluster CRD and forbid creating new SGClusters that use it.
  6. Deprecate SGBackupConfig CRD and forbid creation of SGBackupConfig CRs.
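
As a rough sketch of changes 1 to 3, the new resource and the reference from SGBackupConfig might look like the following. The storage fields mirror the existing SGBackupConfig.storage subobject for S3; the apiVersion of SGObjectStorage, the placement of `mode` under `spec`, and all resource, bucket and secret names are assumptions made for illustration only.

```yaml
# Hypothetical SGObjectStorage, modeled on SGBackupConfig.storage.
# apiVersion and the location of `mode` are assumptions, not decided here.
apiVersion: stackgres.io/v1beta1
kind: SGObjectStorage
metadata:
  name: backup-storage            # illustrative name
  namespace: my-namespace         # illustrative namespace
spec:
  mode: RW                        # proposed enum: RW or RO
  type: s3
  s3:
    bucket: my-backups            # illustrative bucket
    region: us-east-1
    awsCredentials:
      secretKeySelectors:
        accessKeyId:
          name: aws-creds         # illustrative Secret
          key: accessKeyId
        secretAccessKey:
          name: aws-creds
          key: secretAccessKey
---
# SGBackupConfig referencing the storage by name instead of embedding it.
apiVersion: stackgres.io/v1
kind: SGBackupConfig
metadata:
  name: backup-config
  namespace: my-namespace
spec:
  baseBackups:
    cronSchedule: "0 5 * * *"
    retention: 5
  sgObjectStorage: backup-storage # new property from change 2
```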

Note that this scheme also allows for creating standby clusters that process WALs from an object storage that is not the original source, but a replica of it in another region (e.g. an S3 bucket replicated to another S3 bucket in another region). In that case the source and destination CRDs will differ only in the bucket name and region (and potentially the user credentials).
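
To illustrate the replicated-bucket case, the two resources could differ as sketched below. Each document would be created in a different K8s cluster; the apiVersion, the `mode` placement, and every name, bucket and region are hypothetical.

```yaml
# Source K8s cluster: read-write access to the primary bucket.
apiVersion: stackgres.io/v1beta1   # assumed apiVersion
kind: SGObjectStorage
metadata:
  name: archive
spec:
  mode: RW
  type: s3
  s3:
    bucket: pg-archive-us          # illustrative bucket
    region: us-east-1
    awsCredentials:
      secretKeySelectors:
        accessKeyId: {name: aws-rw, key: accessKeyId}
        secretAccessKey: {name: aws-rw, key: secretAccessKey}
---
# Destination (DR) K8s cluster: read-only access to the replicated bucket.
# Only bucket, region, mode and credentials differ from the source.
apiVersion: stackgres.io/v1beta1   # assumed apiVersion
kind: SGObjectStorage
metadata:
  name: archive
spec:
  mode: RO
  type: s3
  s3:
    bucket: pg-archive-eu          # illustrative replica bucket
    region: eu-west-1
    awsCredentials:
      secretKeySelectors:
        accessKeyId: {name: aws-ro, key: accessKeyId}
        secretAccessKey: {name: aws-ro, key: secretAccessKey}
```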

Also note that this issue doesn't deal with how to modify an SGCluster to support standby cluster creation. That will be addressed in a separate issue (see #866 (closed)).

Ideally, the operator should validate access credentials when the CRD is created (i.e., verify that it can read, or read and write, the object storage). This would improve on the current user experience, where an invalid SGBackupConfig can be created.

Proposed change to SGCluster:

spec:
  configurations:
    backupPath: <string>
    sgBackupConfig: <string> # unchanged, but deprecated and mutually exclusive with the `backups` field
    backups: # array limited to at most 1 element for now; the array form allows multiple backup configurations in the future, so that backups may be stored on multiple storages
    - path: <string>
      compression: <string>
      cronSchedule: <string>
      performance:
        maxDiskBandwitdh: <integer>
        maxNetworkBandwitdh: <integer>
        uploadDiskConcurrency: <integer>
      retention: <integer>
      sgObjectStorage: <string> # name of an SGObjectStorage in the same namespace

Acceptance criteria:

  • Implement a mutating webhook in order to migrate from SGBackupConfig + backupPath to SGObjectStorage + new backups section in SGCluster
  • Implement a validating webhook in order to disallow usage of SGBackupConfig in SGCluster
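
The migration performed by the mutating webhook could be pictured as the before/after fragments below. This is only a sketch: the names and values are illustrative, other required SGCluster fields are omitted, and how the webhook obtains or names the SGObjectStorage it references is not specified in this issue.

```yaml
# Before: SGCluster fragment using the deprecated fields.
apiVersion: stackgres.io/v1
kind: SGCluster
metadata:
  name: my-cluster
spec:
  configurations:
    backupPath: sgbackups/my-cluster     # illustrative path
    sgBackupConfig: backup-config        # deprecated reference
---
# After: the same cluster as the webhook might rewrite it.
apiVersion: stackgres.io/v1
kind: SGCluster
metadata:
  name: my-cluster
spec:
  configurations:
    backups:                             # at most 1 element
    - path: sgbackups/my-cluster         # taken from backupPath
      cronSchedule: "0 5 * * *"          # copied from the SGBackupConfig
      retention: 5                       # copied from the SGBackupConfig
      sgObjectStorage: backup-storage    # hypothetical SGObjectStorage name
```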
Edited by Jorge Solorzano