Node is not recovered after failover

After a failover, StackGres (SG) does not recover one of the nodes:

anthony@anthony-HP-EliteBook-840-G4:~/Trabajo/OnGres/ongres_repo/cope/deploy/pgbouncer_upgrade$ kubectl exec -it -n sg-dev mminventory-pro-dev-1 -c patroni -- bash
bash-4.4$ patronictl list
+ Cluster: mminventory-pro-dev (6863135752587546694) --+--------------+----+-----------+
|         Member        |         Host        |  Role  |    State     | TL | Lag in MB |
+-----------------------+---------------------+--------+--------------+----+-----------+
| mminventory-pro-dev-0 | 10.192.147.101:7433 |        |   running    |  7 |         0 |
| mminventory-pro-dev-1 | 10.192.146.100:7433 | Leader |   running    |  7 |           |
| mminventory-pro-dev-2 | 10.192.146.226:7433 |        | start failed |    |   unknown |

PostgreSQL logs from pod mminventory-pro-dev-2:

bash-4.4$ tail -f postgres-25.csv
2020-09-11 14:25:47.850 UTC,,,1795,,5f5b88eb.703,4,,2020-09-11 14:25:47 UTC,,0,LOG,00000,"database system is shut down",,,,,,,,,""
2020-09-11 14:25:59.285 UTC,,,1846,,5f5b88f7.736,1,,2020-09-11 14:25:59 UTC,,0,LOG,00000,"ending log output to stderr",,"Future log output will go to log destination ""csvlog"".",,,,,,,""
2020-09-11 14:25:59.289 UTC,,,1849,,5f5b88f7.739,1,,2020-09-11 14:25:59 UTC,,0,LOG,00000,"database system was shut down in recovery at 2020-09-11 06:15:09 UTC",,,,,,,,,""
2020-09-11 14:25:59.290 UTC,,,1849,,5f5b88f7.739,2,,2020-09-11 14:25:59 UTC,,0,LOG,00000,"entering standby mode",,,,,,,,,""
2020-09-11 14:25:59.290 UTC,,,1850,"[local]",5f5b88f7.73a,1,"",2020-09-11 14:25:59 UTC,,0,LOG,00000,"connection received: host=[local]",,,,,,,,,""
2020-09-11 14:25:59.290 UTC,"postgres","postgres",1850,"[local]",5f5b88f7.73a,2,"",2020-09-11 14:25:59 UTC,,0,FATAL,57P03,"the database system is starting up",,,,,,,,,""
2020-09-11 14:25:59.290 UTC,,,1849,,5f5b88f7.739,3,,2020-09-11 14:25:59 UTC,,0,FATAL,XX000,"requested timeline 7 is not a child of this server's history","Latest checkpoint is at 16/F6000028 on timeline 6, but in the history of the requested timeline, the server forked off from that timeline at 16/F50016B8.",,,,,,,,""
2020-09-11 14:25:59.291 UTC,,,1846,,5f5b88f7.736,2,,2020-09-11 14:25:59 UTC,,0,LOG,00000,"startup process (PID 1849) exited with exit code 1",,,,,,,,,""
2020-09-11 14:25:59.291 UTC,,,1846,,5f5b88f7.736,3,,2020-09-11 14:25:59 UTC,,0,LOG,00000,"aborting startup due to startup process failure",,,,,,,,,""
2020-09-11 14:25:59.342 UTC,,,1846,,5f5b88f7.736,4,,2020-09-11 14:25:59 UTC,,0,LOG,00000,"database system is shut down",,,,,,,,,""
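
The FATAL line above can be checked with a little arithmetic: an LSN written as HI/LO is a 64-bit position whose two halves are hexadecimal. The sketch below is only an illustration; the helper names lsn_to_int and wal_past_fork are hypothetical and not part of PostgreSQL or Patroni. It computes how far the node's latest checkpoint on timeline 6 lies past the point where timeline 7 forked off.

```shell
# Convert a PostgreSQL LSN of the form HI/LO (both halves hex) to an integer.
lsn_to_int() {
  hi=${1%%/*}
  lo=${1##*/}
  echo $(( (0x$hi << 32) + 0x$lo ))
}

# Print how many bytes of WAL the checkpoint ($1) lies past the fork point ($2).
# A positive result means the node's checkpoint is beyond the fork point,
# which is the condition the FATAL message reports.
wal_past_fork() {
  echo $(( $(lsn_to_int "$1") - $(lsn_to_int "$2") ))
}

# Values taken from the log line above: checkpoint, then fork point.
wal_past_fork 16/F6000028 16/F50016B8   # prints 16771440
```

The result is positive (roughly 16 MB), so the checkpoint on mminventory-pro-dev-2 is ahead of the fork point and the node cannot follow timeline 7 from its current state.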

Timelines

# master

postgres=# select substring(pg_walfile_name(pg_current_wal_lsn()), 1, 8);
 substring 
-----------
 00000007
(1 row)

# replica

bash-4.4$ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
      systemid       | timeline |   xlogpos   |  dbname  
---------------------+----------+-------------+----------
 6863135752587546694 |        7 | 17/26444380 | postgres
(1 row)

Solution: reinitialize the failed node using patronictl:

patronictl reinit mminventory-pro-dev mminventory-pro-dev-2
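
To find which members need this, the `patronictl list` table can be filtered for the "start failed" state. This is a rough sketch, assuming the column layout shown earlier in this issue (Member, Host, Role, State, ...), which may differ in other Patroni versions. The function name reinit_commands is hypothetical. It only prints the commands and does not run them, since reinit discards the member's data directory.

```shell
# Read `patronictl list` table output on stdin and print one
# `patronictl reinit` command per member whose State is "start failed".
# $1 is the cluster name. Assumes State is the 4th column of the table.
reinit_commands() {
  awk -F'|' -v cluster="$1" '
    NF >= 5 {
      member = $2; state = $5
      gsub(/^ +| +$/, "", member)   # trim padding around the cell
      gsub(/^ +| +$/, "", state)
      if (state == "start failed")
        print "patronictl reinit " cluster " " member
    }'
}
```

Inside the patroni container it could be used as `patronictl list | reinit_commands mminventory-pro-dev`, reviewing the printed commands before running any of them.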

Versions: Kubernetes 1.16 and StackGres 0.9.

Edited Sep 14, 2020 by Matteo Melli