LDAP group sync clears membership if it temporarily cannot find the group in LDAP
Zendesk: https://gitlab.zendesk.com/agent/tickets/90532
In https://gitlab.com/gitlab-org/gitlab-ee/blob/02c153502619f088b0f4909d827862f7ebe4727e/ee/lib/ee/gitlab/ldap/sync/proxy.rb#L38-42, we try to find an LDAP group by CN. If for some reason we cannot find that group (network connection fails, timeout occurs, etc.) then we log a warning that says we're skipping that group.
In reality, we don't skip it at all. On the next line we return an empty array, which is treated no differently from a group without any members. As a result, group sync removes all existing members of the group.
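To make the failure mode concrete, here is a minimal sketch of the pattern described above. It is not the actual `proxy.rb` code: `member_dns_for_group`, `members_to_remove`, and the hash standing in for the LDAP directory are all hypothetical names made up for illustration.

```ruby
require 'logger'

# Hypothetical stand-in for the lookup by CN. `directory` is a plain hash
# used in place of LDAP; a missing key models both "group does not exist"
# and "lookup failed" (timeout, dropped connection, ...).
def member_dns_for_group(cn, directory, logger)
  members = directory[cn]

  if members.nil?
    logger.warn("Cannot find LDAP group with CN '#{cn}'. Skipping")
    return [] # not actually skipped: the caller sees "a group with no members"
  end

  members
end

# Hypothetical sync step: anyone in GitLab who is not in the LDAP result
# gets removed.
def members_to_remove(current_members, ldap_member_dns)
  current_members - ldap_member_dns
end
```

With an empty result, `members_to_remove(['uid=a', 'uid=b'], [])` returns both members, so a failed lookup and a genuinely empty group produce the same removals.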
The customer hit this problem when the group query occasionally took longer than the timeout. Their temporary workaround is to increase the timeout.
This problem is tricky, and it is not obvious where to fix it. Ultimately, the underlying problem is really https://gitlab.com/gitlab-org/gitlab-ee/issues/174: we shouldn't fail silently when a timeout occurs. An argument against 'fixing' this bug inside group sync is the case where a group really is deleted from LDAP. In that case we would want to revoke the membership on the GitLab side, because we want LDAP to be the single source of truth.
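One possible shape for a fix, sketched under the assumption that the lower layer can tell a failed query apart from a group that really does not exist (which is what the linked issue is about). Every name here is hypothetical: `LdapLookupFailed`, `member_dns_for`, `sync_group`, and the `search` callable, which is assumed to return an array of member DNs, return `nil` for a group that does not exist, and raise on timeouts or connection errors.

```ruby
require 'timeout'

# Hypothetical error used to signal "we could not ask LDAP", as opposed to
# "LDAP answered and the group is gone or empty".
class LdapLookupFailed < StandardError; end

def member_dns_for(cn, search)
  members = search.call(cn)
  # LDAP answered: a missing group legitimately has no members.
  members.nil? ? [] : members
rescue Timeout::Error, IOError, SystemCallError => e
  # LDAP did not answer: surface the failure instead of returning [].
  raise LdapLookupFailed, "LDAP lookup for '#{cn}' failed: #{e.message}"
end

def sync_group(cn, current_members, search)
  ldap_members = member_dns_for(cn, search)
  { remove: current_members - ldap_members, skipped: false }
rescue LdapLookupFailed
  # Really skip the group this time; membership is left untouched.
  { remove: [], skipped: true }
end
```

In this sketch a timeout leaves membership alone, while a group deleted from LDAP still has its members revoked, which keeps LDAP as the single source of truth.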