Add a guard clause to the redis cookbook to prevent `gitlab-ctl reconfigure` from being called when chef-client runs.
This way we can re-enable chef-client on the redis (and database) hosts, solving this problem.
We still need a way of actually updating the application on these hosts, though.
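A minimal sketch of the guard idea (the marker-file path and helper name are assumptions for illustration, not the actual cookbook code): the recipe checks for a skip-marker on the host and only lets a chef-client run trigger `gitlab-ctl reconfigure` when the marker is absent.

```ruby
# Hypothetical guard helper; the marker-file path is an assumption,
# not necessarily the real omnibus-gitlab convention.
SKIP_MARKER = '/etc/gitlab/skip-auto-reconfigure'.freeze

# Returns true when chef-client may run `gitlab-ctl reconfigure`.
def reconfigure_allowed?(marker = SKIP_MARKER)
  !File.exist?(marker)
end

# In the cookbook, the execute resource would then carry a guard, e.g.:
#
#   execute 'gitlab-ctl reconfigure' do
#     not_if { ::File.exist?(SKIP_MARKER) }
#   end
```

With this in place, dropping the marker file onto redis1/redis2 would stop automated runs from reconfiguring (and thereby restarting) Redis, while leaving the rest of the chef-client run intact.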
Previous issue:
Redis shouldn't restart automatically
2016-11-18 redis1 and redis2 lost sync
Timeline:
- 2016-11-18 00:09:28 UTC - redis1 received SIGTERM
- 2016-11-18 00:10:06 UTC - redis2 lost connection
We had to resync redis2 manually.
Right before all this, @eReGeBe deployed 8.14 RC3; then we lost PostgreSQL replication sync, and Redis followed. Was this all a coincidence?
How did the SIGTERM event happen? My first guess is that a chef-client run caused Redis to restart, which would be poor behavior.
redis1 logs:
```
2016-11-18_00:09:28.87444 redis1 redis: 53369:signal-handler (1479427768) Received SIGTERM scheduling shutdown...
2016-11-18_00:09:28.97236 redis1 redis: 53369:M 18 Nov 00:09:28.972 # User requested shutdown...
2016-11-18_00:09:28.97260 redis1 redis: 53369:M 18 Nov 00:09:28.972 * Saving the final RDB snapshot before exiting.
2016-11-18_00:10:05.14815 redis1 redis: 53369:M 18 Nov 00:10:05.148 * DB saved on disk
2016-11-18_00:10:05.14832 redis1 redis: 53369:M 18 Nov 00:10:05.148 * Removing the pid file.
2016-11-18_00:10:05.15107 redis1 redis: 53369:M 18 Nov 00:10:05.151 # Redis is now ready to exit, bye bye...
```
redis2 logs:
```
2016-11-18_00:10:06.50786 redis2 redis: 30878:S 18 Nov 00:10:06.507 # Connection with master lost.
2016-11-18_00:10:06.50816 redis2 redis: 30878:S 18 Nov 00:10:06.508 * Caching the disconnected master state.
2016-11-18_00:10:07.48812 redis2 redis: 30878:S 18 Nov 00:10:07.488 * Connecting to MASTER 10.x.x.x:6379
2016-11-18_00:10:07.48845 redis2 redis: 30878:S 18 Nov 00:10:07.488 * MASTER <-> SLAVE sync started
2016-11-18_00:10:07.48902 redis2 redis: 30878:S 18 Nov 00:10:07.489 * Non blocking connect for SYNC fired the event.
2016-11-18_00:10:07.50454 redis2 redis: 30878:S 18 Nov 00:10:07.504 * Master replied to PING, replication can continue...
2016-11-18_00:10:07.57944 redis2 redis: 30878:S 18 Nov 00:10:07.579 * Trying a partial resynchronization (request 461a73a5f2a4f34b3f31e7277a2270d2d6b84a13:179724561806).
2016-11-18_00:10:07.60806 redis2 redis: 30878:S 18 Nov 00:10:07.608 # Unexpected reply to PSYNC from master: -LOADING Redis is loading the dataset in memory
2016-11-18_00:10:07.60822 redis2 redis: 30878:S 18 Nov 00:10:07.608 * Discarding previously cached master state.
2016-11-18_00:10:07.60844 redis2 redis: 30878:S 18 Nov 00:10:07.608 * Retrying with SYNC...
2016-11-18_00:10:07.63160 redis2 redis: 30878:S 18 Nov 00:10:07.631 # MASTER aborted replication with an error: LOADING Redis is loading the dataset in memory
```