CPU limits in Kubernetes

Hi Team!

As some of you are probably aware, there was a nasty bug in the Linux scheduler that caused throttling and performance problems for processes in CPU-limited cgroups, even when they had not reached their limits: https://medium.com/omio-engineering/cpu-limits-and-aggressive-throttling-in-kubernetes-c5b20bd8a718 This hurt k8s workloads that declare CPU limits, because those limits are implemented using cgroups.
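For anyone unfamiliar with how a limit ends up in a cgroup: a CPU limit is expressed to the kernel as a CFS quota per scheduling period. Here is a minimal sketch of that relationship, assuming the default 100 ms period. The function name is made up for illustration, and this is not the kubelet's actual code.

```python
# Sketch: how a CPU limit in millicores maps to a CFS quota.
# Assumption: the default CFS period of 100000 microseconds (100 ms).
# `cpu_limit_to_cfs_quota` is a hypothetical helper, not a Kubernetes API.

DEFAULT_CFS_PERIOD_US = 100_000


def cpu_limit_to_cfs_quota(millicores: int, period_us: int = DEFAULT_CFS_PERIOD_US) -> int:
    """Return the CPU time (in microseconds) the cgroup may use per period."""
    if millicores <= 0:
        raise ValueError("millicores must be positive")
    # 1000 millicores == one full CPU == one full period of runtime.
    return millicores * period_us // 1000


if __name__ == "__main__":
    for limit in (250, 500, 2000):
        print(f"{limit}m -> {cpu_limit_to_cfs_quota(limit)} us per period")
```

So a container with a `500m` limit gets 50 ms of CPU time in every 100 ms window, and is throttled for the rest of the window once that is used up.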

We investigated this bug today and came across a patch that was merged into kernel 4.18: https://github.com/torvalds/linux/commit/512ac999d2755d2b7109e996a76b6fb8b888631d

Both the gstg and gprd GKE clusters are running a CoreOS version that uses kernel 4.19. We ran the reproduction procedure from the snippet linked above, and it appears that our GKE clusters are free of this issue.
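If someone wants to double-check throttling on a running container, the cgroup's `cpu.stat` file exposes `nr_periods`, `nr_throttled` and `throttled_time` counters. Below is a rough sketch of turning those into a throttled-period ratio. The sample numbers are invented for illustration, and the file path mentioned in the comment assumes cgroup v1 and may differ on our nodes.

```python
# Sketch: compute the share of CFS periods in which a cgroup was throttled.
# Assumption: cgroup v1 style cpu.stat content ("key value" per line),
# typically found at a path like /sys/fs/cgroup/cpu/cpu.stat inside a container.


def parse_cpu_stat(text: str) -> dict:
    """Parse 'key value' lines from cpu.stat into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_ratio(stats: dict) -> float:
    """Fraction of elapsed periods in which the cgroup hit its quota."""
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0
    return stats.get("nr_throttled", 0) / periods


# Made-up sample values, not measurements from gstg or gprd.
SAMPLE = """nr_periods 1000
nr_throttled 250
throttled_time 123456789
"""

if __name__ == "__main__":
    print(f"throttled in {throttled_ratio(parse_cpu_stat(SAMPLE)):.0%} of periods")
```

A high ratio on a workload whose CPU usage is well below its limit would be the symptom described in the article.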

Speaking more broadly, do we have a set of practices for how to limit workloads in Kubernetes? Is there anything else preventing us from using limits? Or is it safe to use CPU limits now that the bug is fixed?