mm/demotion: Memory tiers and demotion

Rafael Aquini requested to merge raquini/centos-stream-9:bz2186559 into main

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2186559

Automatic generation of node migration order lists, which allows the
least referenced memory to be demoted to "lower memory tiers", was
introduced upstream circa v5.15 and backported into RHEL-9.0 [1]. However,
that implementation was suboptimal: it limited the number of slots in the
node demotion list and conveyed the notion of memory tiers only implicitly,
even as systems composed of multiple kinds of memory (HBM, PMEM, DRAM, ...)
are becoming more common.

This merge brings into RHEL-9 the upstream v6.1 "mm/demotion: Memory tiers
and demotion" patch series in order to provide proper explicit memory tiers
and to address some of the arbitrary hard limits that were part of the
original implementation introduced with [1].
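
As a rough illustration of what "explicit memory tiers" means in practice, the sketch below shows how a node's abstract distance could map to a memory tier ID. The constant values are reproduced from memory of the upstream v6.1 include/linux/memory-tiers.h and are an assumption that should be checked against the backported tree; the helper names tier_adistance_start() and adistance_to_tier_id() are hypothetical and do not exist in the kernel.

```c
#include <assert.h>

/*
 * Assumed values, believed to match upstream v6.1
 * include/linux/memory-tiers.h. Verify against the actual tree.
 */
#define MEMTIER_CHUNK_BITS 7
#define MEMTIER_CHUNK_SIZE (1 << MEMTIER_CHUNK_BITS)
#define MEMTIER_ADISTANCE_DRAM \
	((4 * MEMTIER_CHUNK_SIZE) + (MEMTIER_CHUNK_SIZE >> 1))
#define MEMTIER_DEFAULT_DAX_ADISTANCE (MEMTIER_ADISTANCE_DRAM * 5)

/* A larger abstract distance means slower memory, so DAX must sort after DRAM. */
static_assert(MEMTIER_DEFAULT_DAX_ADISTANCE > MEMTIER_ADISTANCE_DRAM,
	      "DAX memory is expected to land in a lower tier than DRAM");

/*
 * Hypothetical helper: round an abstract distance down to the start of
 * the chunk it falls in. All nodes within one chunk share a tier.
 */
static int tier_adistance_start(int adistance)
{
	return adistance & ~(MEMTIER_CHUNK_SIZE - 1);
}

/*
 * Hypothetical helper: derive the tier ID (the N in memory_tierN)
 * from an abstract distance.
 */
static int adistance_to_tier_id(int adistance)
{
	return tier_adistance_start(adistance) >> MEMTIER_CHUNK_BITS;
}
```

Under these assumed constants, adistance_to_tier_id(MEMTIER_ADISTANCE_DRAM) evaluates to 4 and adistance_to_tier_id(MEMTIER_DEFAULT_DAX_ADISTANCE) to 22, so DRAM nodes and dax/kmem nodes end up in distinct tiers, with the larger ID denoting the slower tier.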

Testing will be performed jointly by MM and VIRT QE, and preliminary
verifications are already documented at [2].

[1] https://bugzilla.redhat.com/show_bug.cgi?id=2023396
[2] https://bugzilla.redhat.com/show_bug.cgi?id=2186559

Rafael Aquini (18):
  mm/demotion: add support for explicit memory tiers
  mm/demotion: move memory demotion related code
  mm/demotion: add hotplug callbacks to handle new numa node onlined
  mm/demotion/dax/kmem: set node's abstract distance to MEMTIER_DEFAULT_DAX_ADISTANCE
  mm/demotion: build demotion targets based on explicit memory tiers
  mm/demotion: add pg_data_t member to track node memory tier details
  mm/demotion: drop memtier from memtype
  mm/demotion: demote pages according to allocation fallback order
  mm/demotion: update node_is_toptier to work with memory tiers
  mm/demotion: expose memory tier details via sysfs
  mm/demotion: fix NULL vs IS_ERR checking in memory_tier_init
  memory tier, sysfs: rename attribute "nodes" to "nodelist"
  memory tier: release the new_memtier in find_create_memory_tier()
  lib/nodemask: optimize node_random for nodemask with single NUMA node
  lib/kstrtox.c: add "false"/"true" support to kstrtobool()
  arm64/mm: fold check for KFENCE into can_set_direct_map()
  arm64: fix rodata=full
  arm64: fix rodata=full again

Signed-off-by: Rafael Aquini <aquini@redhat.com>

