Replication lag increased during migrations of 11.0 RC5 deploy
During the deploy of 11.0 RC5, we had the following issues with migrations:
- Migration `20180408143354 RenameUsersRssTokenToFeedToken`:
  - Adding the column timed out: https://gitlab.slack.com/archives/C101F3796/p1528359556000050
  - Concurrent column rename caused high replication lag (one secondary, postgres-03, dropped out of the LB)
  - Updated `users` table in batches of 1,000 (with around 2,400,000 users)
- Updated
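
To make the batch numbers above concrete, here is a minimal sketch of a batched update that pauses while replication lag is above a threshold. It is not the code from the migration: `batch_ranges`, `run_batches`, `update_batch`, `get_lag_seconds`, and `max_lag_seconds` are all hypothetical names, the offset-based batching is a simplification, and the threshold is a placeholder rather than an answer to what lag is acceptable.

```python
import time
from typing import Callable, Iterator, Tuple


def batch_ranges(total_rows: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (offset, limit) pairs that cover total_rows in chunks of batch_size."""
    for offset in range(0, total_rows, batch_size):
        yield offset, min(batch_size, total_rows - offset)


def run_batches(
    total_rows: int,
    batch_size: int,
    update_batch: Callable[[int, int], None],   # hypothetical: applies one batch
    get_lag_seconds: Callable[[], float],       # hypothetical: reads secondary lag
    max_lag_seconds: float,                     # placeholder threshold
    sleep: Callable[[float], None] = time.sleep,
    poll_interval: float = 1.0,
) -> int:
    """Apply every batch, waiting before each one until lag is at or below the threshold."""
    batches = 0
    for offset, limit in batch_ranges(total_rows, batch_size):
        # Back off while the secondaries are behind, instead of piling on more writes.
        while get_lag_seconds() > max_lag_seconds:
            sleep(poll_interval)
        update_batch(offset, limit)
        batches += 1
    return batches
```

With the figures from this deploy, `batch_ranges(2_400_000, 1_000)` yields 2,400 batches, so even a short wait per batch adds up; how the lag is measured and which threshold to use are left open here.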
MR for the migration is here: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/17783
@marin asked during the call: "What is an acceptable replication lag and when do we need to take (what) action if it's too high?"
Edited by Stan Hu