Redis-cache tcpdump analysis of mget calls
It's time for another round of Redis traffic analysis, triggered by some of the instrumentation being added in gitlab-org/gitlab!104562 (merged).
Setup
Capturing traffic (receive side only):
iwiedler@redis-cache-02-db-gprd.c.gitlab-production.internal:/var/log/pcap-iwiedler$ sudo tcpdump -G 30 -W 1 -s 65535 tcp dst port 6379 -w redis.pcap -i ens4
Basic processing:
➜ find redis-analysis -name '*.06379.findx' | GITLAB_REDIS_CLUSTER=cache parallel -j0 -n100 ruby ~/code/runbooks/scripts/redis_trace_cmd.rb | sed '/^$/d' > trace.txt
➜ sort -o trace.txt trace.txt
➜ cat trace.txt | grep mget | wc -l
418874
I then applied a few modifications to redis_trace_cmd.rb for further processing.
mget
Argument count distribution
Modification:
puts "#{args.size-1}" if cmd == "mget"
Analysis:
➜ find redis-analysis -name '*.06379.findx' | GITLAB_REDIS_CLUSTER=cache parallel -j0 -n100 ruby ~/code/runbooks/scripts/redis_trace_cmd.rb | sed '/^$/d' > mget.keys.txt
➜ cat mget.keys.txt | log2hist
[1] 16893 ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
[2, 4) 1562 ∎∎∎∎
[4, 8) 1393 ∎∎∎∎
[8, 16) 1075 ∎∎∎
[16, 32) 2957 ∎∎∎∎∎∎∎∎∎
[32, 64) 446 ∎
[64, 128) 1362 ∎∎∎∎
[128, 256) 89
[256, 512) 161
[512, 1024) 45
[1024, 2048) 0
[2048, 4096) 0
[4096, 8192) 1
The majority of calls have only a single argument. But then we have plenty with ~20 args, and quite a few with ~100. Then 45 calls at the top with between 512 and 1023 keys. And finally a single chonker that made an mget on 5905 keys.
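For anyone without log2hist at hand, the bucketing can be approximated in a few lines of Ruby. This is only a sketch of the idea (power-of-two buckets over the per-call key counts), not how log2hist itself is implemented; the helper names `log2_buckets` and `format_buckets` are made up for this example, and unlike the output above it omits empty buckets and the bars.

```ruby
# Group integer counts into power-of-two buckets: [1], [2, 4), [4, 8), ...
# Returns a hash mapping each bucket's lower bound to the number of values in it.
def log2_buckets(counts)
  buckets = Hash.new(0)
  counts.each do |n|
    next if n < 1 # an mget always carries at least one key; ignore anything else
    lower = 1 << (n.bit_length - 1) # largest power of two <= n
    buckets[lower] += 1
  end
  buckets.sort.to_h
end

# Render buckets as "[lower, upper) count" lines (hypothetical helper).
def format_buckets(buckets)
  buckets.map do |lower, count|
    label = lower == 1 ? "[1]" : "[#{lower}, #{lower * 2})"
    "#{label} #{count}"
  end
end

# Made-up sample values, chosen to hit a few of the buckets discussed above.
sample = [1, 1, 3, 20, 100, 600, 5905]
puts format_buckets(log2_buckets(sample))
```

On the real data, the input would be the integers from mget.keys.txt, one per line.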
Key pattern distribution
Modification:
data = {
  time: ts.iso8601(9),
  cmd: cmd,
  src_host: src_host,
  keys: keys,
  patterns: keys.map { |key| RedisTrace::KeyPattern.filter_key(key).gsub(' ', '_') },
  patterns_uniq: keys.map { |key| RedisTrace::KeyPattern.filter_key(key).gsub(' ', '_') }.sort.uniq,
}
puts data.to_json
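To make the `patterns` and `patterns_uniq` fields concrete, here is a rough standalone sketch. `sketch_filter_key` is a hypothetical stand-in for `RedisTrace::KeyPattern.filter_key`: the real rules live in the runbooks repo and are not shown here, so the two substitutions below are assumptions chosen only to resemble the placeholders seen in the output (`$LONGHASH`, `$NUMBER`). The sample keys are made up.

```ruby
require 'json'

# HYPOTHETICAL stand-in for RedisTrace::KeyPattern.filter_key.
# Both substitution rules are assumptions for illustration only.
def sketch_filter_key(key)
  key
    .gsub(/\b[0-9a-f]{32,}\b/, '$LONGHASH') # long hex digests (assumed rule)
    .gsub(/\d+/, '$NUMBER')                  # numeric IDs (assumed rule)
end

# Build the two pattern fields, computing the per-key patterns only once.
def pattern_fields(keys)
  patterns = keys.map { |key| sketch_filter_key(key).gsub(' ', '_') }
  { patterns: patterns, patterns_uniq: patterns.sort.uniq }
end

# Made-up keys that differ only in their hash and numeric parts.
keys = [
  "cache:gitlab:Class:tag:#{'ab12' * 16}:projects/123-456",
  "cache:gitlab:Class:tag:#{'cd34' * 16}:projects/789-1011",
]
puts pattern_fields(keys).to_json
```

`patterns` keeps one entry per key, so it can be used for per-pattern counts, while `patterns_uniq` collapses each call to the set of distinct patterns it touched.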
Analysis:
➜ find redis-analysis -name '*.06379.findx' | GITLAB_REDIS_CLUSTER=cache parallel -j0 -n100 ruby ~/code/runbooks/scripts/redis_trace_cmd.rb | sed '/^$/d' > trace.mget.json
➜ cat trace.mget.json | jq | head
➜ cat trace.mget.json | jq 'select(.patterns_uniq|length > 1)|.patterns_uniq'
➜ cat trace.mget.json | jq -sc 'sort_by(.keys|length)[]' > trace.mget.sorted.json
➜ cat trace.mget.sorted.json | jq 'select(.keys|length >= 100)|select(.patterns_uniq|length > 1)|[(.keys|length), .src_host, .patterns_uniq]'
Patterns for the huge outlier:
➜ tail -n1 trace.mget.sorted.json | jq '.patterns_uniq'
[
  "cache:gitlab:Class:tag:$LONGHASH:projects/$NUMBER-$NUMBER"
]
Outliers with more than 500 keys:
➜ cat trace.mget.sorted.json | jq 'select(.keys|length >= 500)|[(.keys|length), .patterns_uniq]'
[
  600,
  [
    "cache:gitlab:avatar:$PATTERN",
    "cache:gitlab:exists?:$PATTERN",
    "cache:gitlab:has_visible_content?:$PATTERN",
    "cache:gitlab:readme_path:$PATTERN",
    "cache:gitlab:root_ref:$PATTERN"
  ]
]
[
  600,
  [
    "cache:gitlab:avatar:$PATTERN",
    "cache:gitlab:exists?:$PATTERN",
    "cache:gitlab:has_visible_content?:$PATTERN",
    "cache:gitlab:readme_path:$PATTERN",
    "cache:gitlab:root_ref:$PATTERN"
  ]
]
[
  625,
  [
    "cache:gitlab:avatar:$PATTERN",
    "cache:gitlab:exists?:$PATTERN",
    "cache:gitlab:has_visible_content?:$PATTERN",
    "cache:gitlab:readme_path:$PATTERN",
    "cache:gitlab:root_ref:$PATTERN"
  ]
]
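The output above lists which patterns each large call touches, but not how the keys split across them. A possible follow-up, sketched here in Ruby, is to tally the non-unique `patterns` field per call. It assumes records shaped like the JSON emitted by the modification above (`keys` and `patterns` arrays); `large_mget_summaries` is a name invented for this example, and it needs Ruby 2.7+ for `filter_map` and `tally`.

```ruby
require 'json'

# For each JSON-lines record with at least min_keys keys, report the key count
# and how many keys fall under each pattern.
def large_mget_summaries(lines, min_keys: 500)
  lines.filter_map do |line|
    record = JSON.parse(line)
    next if record['keys'].size < min_keys # nil is dropped by filter_map

    { key_count: record['keys'].size, per_pattern: record['patterns'].tally }
  end
end

# Made-up records, just to show the shape of the result.
sample = [
  { keys: %w[a b c d], patterns: %w[p q p q] }.to_json,
  { keys: %w[a], patterns: %w[p] }.to_json,
]
large_mget_summaries(sample, min_keys: 2).each { |summary| puts summary.to_json }
```

Against the real trace, `File.foreach('trace.mget.sorted.json')` could be passed in place of `sample`.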
Conclusions
None yet. Still need to dig a bit more. But this may point to some places we need to address for #2004 (closed).
Edited by Igor