
stackgres restore fails to start cluster

Recovering an sgcluster (v1.5) from an S3 bucket backup fails: the restored sgcluster does not start.

Steps to reproduce:

1. Make sure that a backup of the existing cluster exists.
2. Delete the cluster (helm uninstall). The sgbackup resource is still there after the uninstall:

   NAME                         CLUSTER   MANAGED   STATUS
   sgtest-2023-09-19-01-00-05   sgtest    true      Completed

3. Create a cluster with the same name/config, with an added initialData section:

 initialData:                       
    restore:    
      fromBackup:
        name: sgtest-2023-09-19-01-00-05
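
For context, here is a minimal sketch of where that section would sit in a full SGCluster manifest. Only the initialData block and the cluster name come from this report; the instance count, Postgres version, and volume size are placeholder assumptions, not the values used in the failing cluster.

```yaml
# Hypothetical SGCluster manifest; only metadata.name and the
# initialData block are taken from this report.
apiVersion: stackgres.io/v1
kind: SGCluster
metadata:
  name: sgtest              # same name as the deleted cluster
spec:
  instances: 1              # assumption: placeholder value
  postgres:
    version: '15'           # assumption: placeholder value
  pods:
    persistentVolume:
      size: 5Gi             # assumption: placeholder value
  initialData:
    restore:
      fromBackup:
        # name of the sgbackup resource that survived the uninstall
        name: sgtest-2023-09-19-01-00-05
```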

The patroni container fails to start:

      /bin/sh                                                                                                                                                                   
      -ex                                                                                                                                                                       
      /usr/local/bin/start-patroni.sh                                                                                                                                           
    State:          Waiting                                                                                                                                                     
      Reason:       CrashLoopBackOff                                                                                                                                            
    Last State:     Terminated                                                                                                                                                  
      Reason:       Error                                                                                                                                                       
      Exit Code:    1                                                                                                                                                           
      Started:      Tue, 19 Sep 2023 13:59:29 +0800                                                                                                                             
      Finished:     Tue, 19 Sep 2023 13:59:32 +0800                                                                                                                             
    Ready:          False    

/usr/local/bin/start-patroni.sh 
2023-09-19 06:33:37,996 INFO: Lock owner: None; I am sgtest-0
2023-09-19 06:33:38,221 INFO: trying to bootstrap a new cluster
2023-09-19 06:33:38,221 INFO: Running custom bootstrap script: exec-with-env "restore" -- /etc/patroni/recovery-from-backup
INFO: 2023/09/19 06:33:38.490194 Selecting the backup with name base_000000010000000000000002...
ERROR: 2023/09/19 06:33:38.501927 Backup 'base_000000010000000000000002' does not exist.
2023-09-19 06:33:38,505 INFO: removing initialize key after failed attempt to bootstrap the cluster
Traceback (most recent call last):
  File "/usr/bin/patroni", line 8, in <module>
	sys.exit(main())
  File "/usr/lib/python3.9/site-packages/patroni/main.py", line 144, in main
	return patroni_main()
  File "/usr/lib/python3.9/site-packages/patroni/main.py", line 136, in patroni_main
	abstract_main(Patroni, schema)
  File "/usr/lib/python3.9/site-packages/patroni/daemon.py", line 181, in abstract_main
	controller.run()
  File "/usr/lib/python3.9/site-packages/patroni/main.py", line 106, in run
	super(Patroni, self).run()
  File "/usr/lib/python3.9/site-packages/patroni/daemon.py", line 126, in run
	self._run_cycle()
  File "/usr/lib/python3.9/site-packages/patroni/main.py", line 109, in _run_cycle
	logger.info(self.ha.run_cycle())
  File "/usr/lib/python3.9/site-packages/patroni/ha.py", line 1770, in run_cycle
	info = self._run_cycle()
  File "/usr/lib/python3.9/site-packages/patroni/ha.py", line 1592, in _run_cycle
	return self.post_bootstrap()
  File "/usr/lib/python3.9/site-packages/patroni/ha.py", line 1483, in post_bootstrap
	self.cancel_initialization()
  File "/usr/lib/python3.9/site-packages/patroni/ha.py", line 1476, in cancel_initialization
	raise PatroniFatalException('Failed to bootstrap cluster')
patroni.exceptions.PatroniFatalException: 'Failed to bootstrap cluster'                              

Acceptance Criteria

The sgcluster is expected to recover from the backup and start successfully.

Edited by Vladimir M