Unsafe use - redis needs safer atomic operations cluster-wide for test/set operations.

Unsafe use - redis needs safer atomic operations cluster-wide for test/set operations.

Example - file: lib/

file: lib/workhorse.rb

      def set_key_and_notify(key, value, expire: nil, overwrite: true)
        Gitlab::Redis.with do |redis|
          result = redis.set(key, value, ex: expire, nx: !overwrite)
          if result
            redis.publish(NOTIFICATION_CHANNEL, "#{key}=#{value}")
            value
          else
            redis.get(key)
          end
        end
      end

Context:

  1. The current gitlab.com redis cluster is using sentinel with single-master and multiple slaves
  2. there are edge cases where, during failover, the atomic operations will not produce the intended result. In the example above, consider what happens when a failover occurs after ".set", and before a competing ".set" from another process. In this case, two notifications can be sent instead of one.
  3. propose to support:
    1. deployment: redis_cluster with multiple master + failover slaves (this is a deployment consideration) - have a look at this fairly thorough discussion: Redis HA evolution Discussion
    2. application: implement redlock algorithm in the application to support atomicity even during failover scenarios.

/cc @ernstvn - there is a deployment aspect of this for the infrastructure team to truly load-balance the redis cluster with multiple masters

see related: https://gitlab.com/gitlab-com/infrastructure/issues/1575, gitlab-ce#30392

Assignee Loading
Time tracking Loading