blk-cgroup: Fix potential lockup in blkcg_rstat_flush()

Waiman Long requested to merge llong1/centos-stream-9:bz2077665_blkcg into main

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2077665
MR: !1861 (merged)

On a system with a large number of block devices, blkcg_rstat_flush() can take a long time to execute because it has to iterate over all the per-cpu blkg_iostat_set structures. In extreme cases this can lead to a hard lockup, since interrupts are disabled during the iteration.

This is fixed by keeping track of the set of updated blkg_iostat_set structures in a lockless list and iterating only over those in blkcg_rstat_flush().
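
The idea can be illustrated with a minimal user-space sketch. This is not the code from the patches: it uses C11 atomics rather than kernel primitives, and the names iostat_node, updated_list, stat_update() and stat_flush() are hypothetical stand-ins chosen for illustration. An updater queues a node only on its first update since the last flush, and the flusher detaches the whole list in one atomic step, so it visits only nodes that actually changed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one per-cpu blkg_iostat_set. */
struct iostat_node {
    struct iostat_node *next;   /* link in the lockless "updated" list */
    atomic_bool queued;         /* true while the node is on the list */
    atomic_ulong pending;       /* stats accumulated since the last flush */
};

/* Head of the lockless list of nodes updated since the last flush. */
struct updated_list {
    _Atomic(struct iostat_node *) head;
};

/* Record an update; push the node only if it is not already queued. */
static void stat_update(struct updated_list *l, struct iostat_node *n,
                        unsigned long delta)
{
    atomic_fetch_add(&n->pending, delta);
    if (atomic_exchange(&n->queued, true))
        return;                 /* already on the list, nothing more to do */

    /* Lock-free push: retry until the head is swapped to this node. */
    struct iostat_node *old = atomic_load(&l->head);
    do {
        n->next = old;
    } while (!atomic_compare_exchange_weak(&l->head, &old, n));
}

/*
 * Detach the whole list at once and fold in only the updated nodes.
 * Returns the number of nodes visited; the cost is proportional to the
 * number of updated nodes, not to the total number of nodes.
 */
static size_t stat_flush(struct updated_list *l, unsigned long *total)
{
    struct iostat_node *n = atomic_exchange(&l->head, NULL);
    size_t visited = 0;

    while (n) {
        /* Read the link first: the node may be re-queued once unmarked. */
        struct iostat_node *next = n->next;

        /* Unmark before reading the stats so a racing update is not lost. */
        atomic_store(&n->queued, false);
        *total += atomic_exchange(&n->pending, 0);

        n = next;
        visited++;
    }
    return visited;
}
```

With 1000 nodes of which only two were updated, stat_flush() in this sketch visits two nodes instead of walking all 1000. In the kernel, the llist API in <linux/llist.h> (for example llist_add() and llist_del_all()) provides the equivalent push and detach-all operations.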

The last 2 patches in the series do that. There are also some minor follow-up fixes to them, but those have not yet been merged upstream and so are not included.

This MR also contains other miscellaneous cgroup fixes and performance enhancements that may help alleviate this particular problem.

Signed-off-by: Waiman Long <longman@redhat.com>

Edited by Waiman Long