Make thresholds of CPU and Memory watchers configurable
In the current implementation, adaptive limiting kicks in when resource usage exceeds one of these hard-coded thresholds:
- Memory: 90% of the parent cgroup's memory (source).
- CPU: the cgroup is throttled for 50% of the observation time (source).
Although the current CPU throttled threshold is reasonable, it may not suit every deployment. A more powerful machine can tolerate a higher throttling rate, while a less powerful machine needs the limit lowered sooner. This commit adds the ability to customize the CPU throttled threshold.
In a recent incident, the limiter worked but was triggered too late. By the time memory usage reaches 90%, the remaining headroom may already be tight, and the inflight operations (usually expensive ones) can fill up the rest very quickly.
When memory usage reaches 100%, several failure modes can occur: high memory pressure leading to major page faults, failed memory allocations, high iowait (because of the page faults), OOM kills, and so on. Inflight requests may not be able to finish at this stage. So it makes sense to increase the headroom by lowering the memory threshold. This commit therefore makes the memory threshold configurable as well.