Draft: Introduce descheduler

Bogdan Antohe requested to merge bantohe-test into main

What does this MR do and why?

As the last part of issue #563, this MR introduces a new unit for the Kubernetes descheduler, whose aim is to spread load evenly across the cluster. In the current values, the application runs as a Deployment (it could also be deployed as a CronJob), and the strategy defined here is based on LowNodeUtilization: once a node is underutilized, pods are evicted from other nodes in the hope that their replacements will be scheduled on the underutilized nodes. (Whether a node is underutilized is determined by the configurable thresholds field; the targetThresholds field is used to determine the nodes from which pods can be evicted.) Keep in mind that there are some restrictions on pod eviction: for example, static pods, pods associated with DaemonSets, and pods with local storage are not evicted by default.
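To make the two thresholds concrete, here is a minimal Python sketch of how a node could be classified. It is not the descheduler's code: the function name `classify` is hypothetical, the threshold values are the ones reported in the logs in this description, and the rule (underutilized when all resources are below thresholds, overutilized when any resource is above targetThresholds) follows my reading of the LowNodeUtilization documentation. How exact boundary values are treated is an assumption.

```python
# Threshold values as reported in the descheduler logs of this MR (percent).
THRESHOLDS = {"cpu": 20, "memory": 20, "pods": 20}
TARGET_THRESHOLDS = {"cpu": 60, "memory": 60, "pods": 50}


def classify(usage_pct):
    """Classify a node from its usage percentages (hypothetical helper).

    ASSUMPTION: strict comparisons; the real plugin may treat exact
    boundary values differently.
    """
    # Underutilized only if every resource is below its threshold.
    if all(usage_pct[r] < THRESHOLDS[r] for r in THRESHOLDS):
        return "underutilized"
    # Overutilized if at least one resource exceeds its target threshold.
    if any(usage_pct[r] > TARGET_THRESHOLDS[r] for r in TARGET_THRESHOLDS):
        return "overutilized"
    return "appropriately utilized"


# Usage percentages copied from the logs in the test coverage section.
print(classify({"cpu": 8.75, "memory": 1.36, "pods": 5.45}))
print(classify({"cpu": 85.13, "memory": 27.68, "pods": 37.27}))
print(classify({"cpu": 19.75, "memory": 5.99, "pods": 24.55}))
```

The last node shows why a worker at 19.75% CPU is still reported as appropriately utilized: its pod count (24.55%) is above the 20% threshold, so not all resources are below thresholds.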

In this MR I have set only tentative default values for the descheduler; other suggestions are welcome.

Related reference(s)

Test coverage

Descheduler logs for an initial cluster with 3 control-plane (cp) nodes:


I0516 12:25:40.172957       1 nodeutilization.go:207] "Node is overutilized" node="test-mgmt-1-cp-05fc5f81f9-tfs8x" usage={"cpu":"3405m","memory":"4426Mi","pods":"41"} usagePercentage={"cpu":85.13,"memory":27.68,"pods":37.27}
I0516 12:25:40.172970       1 nodeutilization.go:207] "Node is overutilized" node="test-mgmt-1-cp-05fc5f81f9-ndf85" usage={"cpu":"2960m","memory":"3924Mi","pods":"44"} usagePercentage={"cpu":74,"memory":24.54,"pods":40}
I0516 12:25:40.172980       1 nodeutilization.go:207] "Node is overutilized" node="test-mgmt-1-cp-05fc5f81f9-spm7v" usage={"cpu":"2575m","memory":"4344Mi","pods":"42"} usagePercentage={"cpu":64.38,"memory":27.17,"pods":38.18}
I0516 12:25:40.172988       1 lownodeutilization.go:134] "Criteria for a node under utilization" CPU=20 Mem=20 Pods=20
I0516 12:25:40.172994       1 lownodeutilization.go:135] "Number of underutilized nodes" totalNumber=0
I0516 12:25:40.173057       1 lownodeutilization.go:148] "Criteria for a node above target utilization" CPU=60 Mem=60 Pods=50
I0516 12:25:40.173062       1 lownodeutilization.go:149] "Number of overutilized nodes" totalNumber=3

After adding 2 new worker nodes:

I0516 12:30:40.173955       1 profile.go:323] "Total number of pods evicted" extension point="Deschedule" evictedPods=0
I0516 12:30:40.174231       1 nodeutilization.go:207] "Node is overutilized" node="test-mgmt-1-cp-05fc5f81f9-spm7v" usage={"cpu":"2575m","memory":"4344Mi","pods":"42"} usagePercentage={"cpu":64.38,"memory":27.17,"pods":38.18}
I0516 12:30:40.174249       1 nodeutilization.go:207] "Node is overutilized" node="test-mgmt-1-cp-05fc5f81f9-tfs8x" usage={"cpu":"3405m","memory":"4426Mi","pods":"41"} usagePercentage={"cpu":85.13,"memory":27.68,"pods":37.27}
I0516 12:30:40.174296       1 nodeutilization.go:204] "Node is underutilized" node="test-mgmt-1-md-md0-b3eebd90a4-x4hbm" usage={"cpu":"350m","memory":"218Mi","pods":"6"} usagePercentage={"cpu":8.75,"memory":1.36,"pods":5.45}
I0516 12:30:40.174319       1 nodeutilization.go:204] "Node is underutilized" node="test-mgmt-1-md-md0-b3eebd90a4-xft5h" usage={"cpu":"350m","memory":"218Mi","pods":"5"} usagePercentage={"cpu":8.75,"memory":1.36,"pods":4.55}
I0516 12:30:40.174337       1 nodeutilization.go:207] "Node is overutilized" node="test-mgmt-1-cp-05fc5f81f9-ndf85" usage={"cpu":"2960m","memory":"3924Mi","pods":"44"} usagePercentage={"cpu":74,"memory":24.54,"pods":40}
I0516 12:30:40.174364       1 lownodeutilization.go:134] "Criteria for a node under utilization" CPU=20 Mem=20 Pods=20
I0516 12:30:40.174376       1 lownodeutilization.go:135] "Number of underutilized nodes" totalNumber=2
I0516 12:30:40.174391       1 lownodeutilization.go:148] "Criteria for a node above target utilization" CPU=60 Mem=60 Pods=50
I0516 12:30:40.174400       1 lownodeutilization.go:149] "Number of overutilized nodes" totalNumber=3
I0516 12:30:40.174438       1 nodeutilization.go:260] "Total capacity to be moved" CPU=4100 Mem=19661026918 Pods=99
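The "Total capacity to be moved" figures above can be reproduced, for CPU and pods, by summing the headroom of each underutilized node up to its target threshold. The sketch below is an illustration and not the descheduler's implementation: the node capacities (4000m CPU, 110 pods) are not in the logs and are inferred from the reported percentages, and the helper name `total_headroom` is hypothetical. Memory is left out because the exact node memory capacity cannot be read from the logs.

```python
# Target thresholds as reported in the logs (percent of node capacity).
TARGET_PCT = {"cpu": 60, "pods": 50}

# ASSUMPTION: capacities inferred from the log percentages
# (350m CPU is 8.75% -> 4000m; 6 pods is 5.45% -> 110 pods).
CAPACITY = {"cpu": 4000, "pods": 110}

# Usage of the two underutilized worker nodes, copied from the logs
# (CPU in millicores).
UNDERUTILIZED = {
    "test-mgmt-1-md-md0-b3eebd90a4-x4hbm": {"cpu": 350, "pods": 6},
    "test-mgmt-1-md-md0-b3eebd90a4-xft5h": {"cpu": 350, "pods": 5},
}


def total_headroom(nodes):
    """Sum, per resource, the room left below the target threshold."""
    total = {resource: 0 for resource in TARGET_PCT}
    for usage in nodes.values():
        for resource, pct in TARGET_PCT.items():
            target = CAPACITY[resource] * pct // 100
            total[resource] += target - usage[resource]
    return total


print(total_headroom(UNDERUTILIZED))  # {'cpu': 4100, 'pods': 99}
```

Both values match the CPU=4100 and Pods=99 reported in the log line, which suggests the two workers can absorb pods until they reach the target thresholds.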



I0516 12:35:40.174712       1 nodeutilization.go:210] "Node is appropriately utilized" node="test-mgmt-1-cp-05fc5f81f9-ndf85" usage={"cpu":"2350m","memory":"3248Mi","pods":"31"} usagePercentage={"cpu":58.75,"memory":20.31,"pods":28.18}
I0516 12:35:40.174731       1 nodeutilization.go:210] "Node is appropriately utilized" node="test-mgmt-1-cp-05fc5f81f9-spm7v" usage={"cpu":"2375m","memory":"4030Mi","pods":"32"} usagePercentage={"cpu":59.38,"memory":25.21,"pods":29.09}
I0516 12:35:40.174906       1 nodeutilization.go:207] "Node is overutilized" node="test-mgmt-1-cp-05fc5f81f9-tfs8x" usage={"cpu":"3175m","memory":"3978Mi","pods":"24"} usagePercentage={"cpu":79.38,"memory":24.88,"pods":21.82}
I0516 12:35:40.174924       1 nodeutilization.go:210] "Node is appropriately utilized" node="test-mgmt-1-md-md0-b3eebd90a4-x4hbm" usage={"cpu":"950m","memory":"916Mi","pods":"24"} usagePercentage={"cpu":23.75,"memory":5.73,"pods":21.82}
I0516 12:35:40.174934       1 nodeutilization.go:210] "Node is appropriately utilized" node="test-mgmt-1-md-md0-b3eebd90a4-xft5h" usage={"cpu":"790m","memory":"958Mi","pods":"27"} usagePercentage={"cpu":19.75,"memory":5.99,"pods":24.55}
Edited by Bogdan Antohe