Improve observability on cross-slot operations in application
RedisClusterValidator is currently used in the application to catch cross-slot errors in test and development. We need to better understand the current state of the application with respect to uncaught cross-slot errors (missed in test cases and silently ignored in staging/production) before we can migrate to a Redis Cluster setup with confidence.
Current list of pending improvements
- Track errors in staging and production via logging (exclude any operations in an allow_cross_slot_commands block)
- Catch cross-slot errors in pipelined operations and transactions (MULTI)
- Add cross-slot operation counts in allow_cross_slot_commands to log output
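
To make the three items above concrete, here is a minimal sketch of how they could fit together. It is not the actual RedisClusterValidator code: `KeySlot`, `CrossSlotTracker` and `log_entries` are hypothetical names, only `allow_cross_slot_commands` comes from this issue. The slot calculation follows the Redis Cluster rule (CRC16 of the key, or of its hash tag, modulo 16384). The sketch also assumes every argument after the command name is a key, which holds for commands like MGET or DEL but not in general, and it keeps its state in instance variables, so it is not thread-safe.

```ruby
# Hypothetical sketch: only allow_cross_slot_commands is a name from the text above.
module KeySlot
  SLOT_COUNT = 16_384

  module_function

  # CRC16, XMODEM variant (polynomial 0x1021, initial value 0).
  def crc16(data)
    crc = 0
    data.each_byte do |byte|
      crc ^= byte << 8
      8.times do
        crc <<= 1
        crc ^= 0x1021 unless (crc & 0x10000).zero?
        crc &= 0xFFFF
      end
    end
    crc
  end

  # Hash only the first non-empty {...} hash tag when the key has one.
  def for_key(key)
    start = key.index('{')
    stop = start && key.index('}', start + 1)
    tag = (stop && stop > start + 1) ? key[(start + 1)...stop] : key
    crc16(tag) % SLOT_COUNT
  end
end

class CrossSlotTracker
  attr_reader :log_entries

  def initialize
    @log_entries = []
    @allowed_count = nil # non-nil only while inside allow_cross_slot_commands
  end

  # commands: a single command, or everything queued in one pipeline / MULTI.
  # Assumption for brevity: every argument after the command name is a key.
  def validate(commands)
    slots = commands.flat_map { |cmd| cmd.drop(1).map { |key| KeySlot.for_key(key) } }.uniq
    return if slots.size <= 1

    if @allowed_count
      @allowed_count += 1 # allowed: count it instead of reporting it
    else
      @log_entries << { event: 'cross_slot', commands: commands.map(&:first) }
    end
  end

  # Counts cross-slot operations inside the block and logs the total once.
  def allow_cross_slot_commands
    previous = @allowed_count
    @allowed_count = 0
    result = yield
    @log_entries << { event: 'allowed_cross_slot', count: @allowed_count }
    result
  ensure
    @allowed_count = previous
  end
end

tracker = CrossSlotTracker.new
tracker.validate([[:mget, '{user:1}:a', '{user:1}:b']]) # shared hash tag, one slot
tracker.validate([[:get, 'foo'], [:get, 'bar']])        # pipeline spanning two slots
tracker.allow_cross_slot_commands { tracker.validate([[:del, 'foo', 'bar']]) }
```

Validating the whole pipeline or transaction as one unit is what catches the second case: each GET is fine on its own, but together they touch two slots. Inside the allow block the violation is only counted, and the total is written to the log once when the block exits.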
Feature flags introduced to control increased observability
* validate_allowed_cross_slow_commands: validating all allowed cross-slot commands means a CRC16 hash is calculated for every key, which increases CPU usage. A feature flag allows a rollback if CPU Apdex is affected.
* log_cross_slot_commands: log volume could increase significantly. This flag lets us toggle the log output.
Having feature flags in the Redis instrumentation layer is a tad risky, since recursive calls could occur: checking a feature flag itself makes Redis lookups, which pass through the same instrumentation.
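
One possible way to contain that recursion is a thread-local re-entrancy guard around the flag check. The sketch below only illustrates the idea and does not describe how the application handles it: `InstrumentedRedis`, `GUARD`, `logged` and the in-memory `@flags` hash are hypothetical stand-ins, and the flag lookup is simulated by issuing a command through the same instrumented `call` path.

```ruby
# Hypothetical sketch: @flags stands in for a flag store that is backed by Redis.
class InstrumentedRedis
  GUARD = :cross_slot_flag_check_in_progress

  attr_reader :logged

  def initialize
    @flags = { log_cross_slot_commands: true }
    @logged = []
  end

  # Every command passes through here, including the ones a flag lookup issues.
  def call(command)
    @logged << command if flag_enabled?(:log_cross_slot_commands)
    :ok
  end

  private

  def flag_enabled?(name)
    # Re-entrant call made by the lookup below: skip the check instead of recursing.
    return false if Thread.current[GUARD]

    Thread.current[GUARD] = true
    begin
      call([:get, "feature_flag:#{name}"]) # the flag lookup is a Redis command too
      @flags.fetch(name, false)
    ensure
      Thread.current[GUARD] = false
    end
  end
end

redis = InstrumentedRedis.new
redis.call([:get, 'foo'])
redis.logged # => [[:get, "foo"]]
```

Without the guard, `call` and `flag_enabled?` would invoke each other until the stack overflows. The trade-off is that commands issued by the flag lookup itself are never validated or logged.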