Log error in ActiveSupport cache via error_handler
What does this MR do and why?
Introduces an error handler into ActiveSupport::Cache::RedisCacheStore instances so that errors are no longer silently swallowed by the store's failsafe handling. This lets us investigate spikes in the gitlab_redis_client_exceptions_total counter that originate from RedisCacheStore, which previously left no error logs to inspect.
The code at https://github.com/rails/rails/blob/v6.1.7.2/activesupport/lib/active_support/cache/redis_cache_store.rb#L59 uses ActiveSupport's logger, but in this MR we opt to use Gitlab::ErrorTracking.log_exception instead.
Refer to https://api.rubyonrails.org/classes/ActiveSupport/Cache/RedisCacheStore.html and https://github.com/rails/rails/blob/main/guides/source/caching_with_rails.md#activesupportcacherediscachestore for examples of error_handler usage.
See gitlab-com/gl-infra/scalability#2070 (comment 1276392962)
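As a rough illustration of the approach described above, the sketch below shows a handler lambda with the keyword arguments (method:, returning:, exception:) that RedisCacheStore passes to error_handler. FakeErrorTracking is a hypothetical stand-in for Gitlab::ErrorTracking so the snippet runs without GitLab or Rails loaded, and the "extra." key prefix simply mirrors the log entry shown in this description. The actual handler in this MR may differ.

```ruby
require 'json'

# Hypothetical stand-in for Gitlab::ErrorTracking (assumption, not the real module).
# It returns the payload instead of writing to log/exceptions_json.log.
module FakeErrorTracking
  def self.log_exception(exception, extra = {})
    payload = {
      'exception.class' => exception.class.name,
      'exception.message' => exception.message
    }
    extra.each { |key, value| payload["extra.#{key}"] = value }
    payload
  end
end

# Handler taking the keyword arguments that RedisCacheStore passes to error_handler.
ERROR_HANDLER = lambda do |method:, returning:, exception:|
  FakeErrorTracking.log_exception(exception, method: method, returning: returning.inspect)
end

# In a Rails app the lambda would be passed at construction time, for example:
#   ActiveSupport::Cache::RedisCacheStore.new(url: 'redis://127.0.0.1:6379',
#                                             error_handler: ERROR_HANDLER)

# Simulate the store reporting a failed call.
logged = ERROR_HANDLER.call(
  method: :increment,
  returning: nil,
  exception: RuntimeError.new('connection refused')
)
puts JSON.generate(logged)
```

Passing returning.inspect rather than the raw value keeps a nil return visible in the log as the string "nil" instead of a JSON null.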
Screenshots or screen recordings
Screenshots are required for UI changes, and strongly recommended for all other merge requests.
How to set up and validate locally
- Update config/redis.cluster_rate_limiting.yml's development key to redis://127.0.0.1:6379 but do not start a Redis server
- Run gdk rails c
- Run Gitlab::Redis::RateLimiting.cache_store.increment('xx')
- Using tail log/exceptions_json.log, we can get:
{"severity":"ERROR","time":"2023-02-13T13:14:53.838Z","correlation_id":null,"exception.class":"Redis::CannotConnectError","exception.message":"Error connecting to Redis on 127.0.0.1:6379 (Errno::ECONNREFUSED)","exception.backtrace":["lib/gitlab/instrumentation/redis_interceptor.rb:10:in `block in call'","lib/gitlab/instrumentation/redis_interceptor.rb:41:in `instrument_call'","lib/gitlab/instrumentation/redis_interceptor.rb:9:in `call'","lib/gitlab/redis/multi_store.rb:129:in `block (2 levels) in \u003cclass:MultiStore\u003e'","(pry):1:in `__pry__'"],"user.username":null,"tags.program":"console","tags.locale":"en","tags.feature_category":null,"tags.correlation_id":null,"extra.method":"increment","extra.returning":"nil"}
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.