* [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
@ 2026-03-23 9:48 Donet Tom
2026-04-02 0:22 ` Andrew Morton
2026-04-02 3:27 ` Huang, Ying
0 siblings, 2 replies; 6+ messages in thread
From: Donet Tom @ 2026-03-23 9:48 UTC (permalink / raw)
To: David Hildenbrand, Andrew Morton, Ingo Molnar, Peter Zijlstra
Cc: Ritesh Harjani, linux-mm, linux-kernel, Baolin Wang, Ying Huang,
Juri Lelli, Mel Gorman, Donet Tom
In the current implementation, if NUMA_BALANCING_MEMORY_TIERING is
disabled and the pages are on the lower tier, the pages may still be
promoted.

This happens because task_numa_work() updates the last_cpupid field to
record the last access time only when NUMA_BALANCING_MEMORY_TIERING is
enabled and the folio is on the lower tier. If
NUMA_BALANCING_MEMORY_TIERING is disabled, the last_cpupid field
can retain a valid last CPU id.

In should_numa_migrate_memory(), the early-return check rejects the
migration only when NUMA_BALANCING_MEMORY_TIERING is disabled, the
folio is on the lower tier, and last_cpupid is invalid. However,
because last_cpupid can be valid when NUMA_BALANCING_MEMORY_TIERING is
disabled, the condition evaluates to false and migration is allowed.

This patch prevents promotion when NUMA_BALANCING_MEMORY_TIERING is
disabled and the folio is on the lower tier.
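
The guard described above can be sketched in standalone C. This is only
an illustration under stated assumptions: guard_rejects_before() and
guard_rejects_after() are hypothetical stand-ins for the early-return
check in should_numa_migrate_memory(), the boolean parameters replace
the kernel's node_is_toptier() and cpupid_valid() helpers, and the flag
values are assumed to match the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to match the kernel's sysctl_numa_balancing_mode flag values. */
#define NUMA_BALANCING_NORMAL          0x1
#define NUMA_BALANCING_MEMORY_TIERING  0x2

/*
 * Hypothetical model of the guard before the patch: the migration is
 * rejected only if tiering is off, the source node is not top tier,
 * AND the previous cpupid is invalid.
 */
bool guard_rejects_before(int mode, bool src_is_toptier, bool last_cpupid_valid)
{
	return !(mode & NUMA_BALANCING_MEMORY_TIERING) &&
	       !src_is_toptier && !last_cpupid_valid;
}

/* Hypothetical model of the guard after the patch: cpupid is ignored. */
bool guard_rejects_after(int mode, bool src_is_toptier)
{
	return !(mode & NUMA_BALANCING_MEMORY_TIERING) && !src_is_toptier;
}

/* Walks through the case the commit message describes. */
void check_guard_examples(void)
{
	int mode = NUMA_BALANCING_NORMAL;	/* tiering disabled */

	/* Lower-tier folio with a valid last_cpupid: old guard lets it pass. */
	assert(!guard_rejects_before(mode, false, true));
	/* The patched guard rejects the same folio. */
	assert(guard_rejects_after(mode, false));
}
```

In this model the two guards differ only for a lower-tier folio whose
last_cpupid is valid while tiering is disabled, which is the case the
patch targets.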

Behavior before this change:
============================
  - If NUMA_BALANCING_NORMAL is enabled, migration occurs between
    nodes within the same memory tier, and promotion from lower
    tier to higher tier may also happen.

  - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from
    lower tier to higher tier nodes is allowed.

Behavior after this change:
===========================
  - If NUMA_BALANCING_NORMAL is enabled, migration will occur only
    between nodes within the same memory tier.

  - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from lower
    tier to higher tier nodes will be allowed.

  - If both NUMA_BALANCING_MEMORY_TIERING and NUMA_BALANCING_NORMAL are
    enabled, both migration (same tier) and promotion (cross tier) are
    allowed.
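
The two lists above can be condensed into a small model. This is a
sketch of the behavior as the commit message describes it, not kernel
code: struct tier_policy, policy_before() and policy_after() are
hypothetical names, and tying same-tier migration to the
NUMA_BALANCING_NORMAL bit alone is an assumption, since the lists do
not say what happens to same-tier migration when only
NUMA_BALANCING_MEMORY_TIERING is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to match the kernel's sysctl_numa_balancing_mode flag values. */
#define NUMA_BALANCING_DISABLED        0x0
#define NUMA_BALANCING_NORMAL          0x1
#define NUMA_BALANCING_MEMORY_TIERING  0x2

/* Hypothetical summary type, not a kernel structure. */
struct tier_policy {
	bool same_tier_migration;	/* between nodes of one tier */
	bool promotion;			/* lower tier to higher tier */
};

/* Behavior before the change, as described in the commit message. */
struct tier_policy policy_before(int mode)
{
	struct tier_policy p = {
		.same_tier_migration = (mode & NUMA_BALANCING_NORMAL) != 0,
		/* Either bit is enough for promotion to happen. */
		.promotion = (mode & (NUMA_BALANCING_NORMAL |
				      NUMA_BALANCING_MEMORY_TIERING)) != 0,
	};
	return p;
}

/* Behavior after the change, as described in the commit message. */
struct tier_policy policy_after(int mode)
{
	struct tier_policy p = {
		.same_tier_migration = (mode & NUMA_BALANCING_NORMAL) != 0,
		/* Promotion now requires the tiering bit. */
		.promotion = (mode & NUMA_BALANCING_MEMORY_TIERING) != 0,
	};
	return p;
}

/* The only row that changes is NUMA_BALANCING_NORMAL on its own. */
void check_policy_examples(void)
{
	assert(policy_before(NUMA_BALANCING_NORMAL).promotion);
	assert(!policy_after(NUMA_BALANCING_NORMAL).promotion);
}
```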
Fixes: 33024536bafd ("memory tiering: hot page selection with hint page fault latency")
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
---
v1 -> v2
========
1. Dropped changes in task_numa_fault() since the original changes
already handle runtime disabling of NUMA_BALANCING_MEMORY_TIERING.
v1 -> https://lore.kernel.org/all/20260320092251.1290207-1-donettom@linux.ibm.com/
---
kernel/sched/fair.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bf948db905ed..4b43809a3fb1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2024,8 +2024,12 @@ bool should_numa_migrate_memory(struct task_struct *p, struct folio *folio,
 	this_cpupid = cpu_pid_to_cpupid(dst_cpu, current->pid);
 	last_cpupid = folio_xchg_last_cpupid(folio, this_cpupid);
 
+	/*
+	 * Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
+	 * and the pages are on the lower tier.
+	 */
 	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) &&
-	    !node_is_toptier(src_nid) && !cpupid_valid(last_cpupid))
+	    !node_is_toptier(src_nid))
 		return false;
 
 	/*
--
2.47.1
^ permalink raw reply related [flat|nested] 6+ messages in thread

* Re: [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
2026-03-23 9:48 [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled Donet Tom
@ 2026-04-02 0:22 ` Andrew Morton
2026-04-02 3:31 ` Huang, Ying
2026-04-02 3:27 ` Huang, Ying
1 sibling, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2026-04-02 0:22 UTC (permalink / raw)
To: Donet Tom
Cc: David Hildenbrand, Ingo Molnar, Peter Zijlstra, Ritesh Harjani,
linux-mm, linux-kernel, Baolin Wang, Ying Huang, Juri Lelli,
Mel Gorman, Rik van Riel, ying.huang, ying.huang

On Mon, 23 Mar 2026 04:48:49 -0500 Donet Tom <donettom@linux.ibm.com> wrote:

> In the current implementation, if NUMA_BALANCING_MEMORY_TIERING is
> disabled and the pages are on the lower tier, the pages may still be
> promoted.
>
> [...]
>
> This patch prevents promotion when NUMA_BALANCING_MEMORY_TIERING is
> disabled and the folio is on the lower tier.
>
> [...]

There was no feedback on this, nor on your v1.

> Fixes: 33024536bafd ("memory tiering: hot page selection with hint page fault latency")

Ying Huang seems to have moved around a bit - let me add a couple more
email addresses. Apologies if we have multiple Ying Huangs!

Rik, Mel? It's a bugfix.

Thanks.

From: Donet Tom <donettom@linux.ibm.com>
Subject: memory tiering: do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
Date: Mon, 23 Mar 2026 04:48:49 -0500

In the current implementation, if NUMA_BALANCING_MEMORY_TIERING is
disabled and the pages are on the lower tier, the pages may still be
promoted.

[...]

Link: https://lkml.kernel.org/r/20260323094849.3903-1-donettom@linux.ibm.com
Fixes: 33024536bafd ("memory tiering: hot page selection with hint page fault latency")
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Ben Segall <bsegall@google.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: "Huang, Ying" <huang.ying.caritas@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

kernel/sched/fair.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

--- a/kernel/sched/fair.c~memory-tiering-do-not-allow-promotion-if-numa_balancing_memory_tiering-is-disabled
+++ a/kernel/sched/fair.c
@@ -2024,8 +2024,12 @@ bool should_numa_migrate_memory(struct t
 	this_cpupid = cpu_pid_to_cpupid(dst_cpu, current->pid);
 	last_cpupid = folio_xchg_last_cpupid(folio, this_cpupid);
 
+	/*
+	 * Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
+	 * and the pages are on the lower tier.
+	 */
 	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) &&
-	    !node_is_toptier(src_nid) && !cpupid_valid(last_cpupid))
+	    !node_is_toptier(src_nid))
 		return false;
 
 	/*
_

* Re: [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
2026-04-02 0:22 ` Andrew Morton
@ 2026-04-02 3:31 ` Huang, Ying
0 siblings, 0 replies; 6+ messages in thread
From: Huang, Ying @ 2026-04-02 3:31 UTC (permalink / raw)
To: Andrew Morton
Cc: Donet Tom, David Hildenbrand, Ingo Molnar, Peter Zijlstra,
Ritesh Harjani, linux-mm, linux-kernel, Baolin Wang, Ying Huang,
Juri Lelli, Mel Gorman, Rik van Riel, ying.huang

Hi, Andrew,

Andrew Morton <akpm@linux-foundation.org> writes:

> On Mon, 23 Mar 2026 04:48:49 -0500 Donet Tom <donettom@linux.ibm.com> wrote:
>
> [...]
>
>> Fixes: 33024536bafd ("memory tiering: hot page selection with hint page fault latency")
>
> Ying Huang seems to have moved around a bit - let me add a couple more
> email addresses. Apologies if we have multiple Ying Huangs!

Thanks! I haven't found another Ying Huang in the mm community yet.

I now use the following email addresses:

"Huang, Ying" <ying.huang@linux.alibaba.com>
Ying Huang <huang.ying.caritas@gmail.com>

and have stopped using the following email address:

ying.huang@intel.com

> Rik, Mel? It's a bugfix.
>
> Thanks.
>
> [...]

---
Best Regards,
Huang, Ying

* Re: [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
2026-03-23 9:48 [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled Donet Tom
2026-04-02 0:22 ` Andrew Morton
@ 2026-04-02 3:27 ` Huang, Ying
2026-04-02 4:59 ` Donet Tom
1 sibling, 1 reply; 6+ messages in thread
From: Huang, Ying @ 2026-04-02 3:27 UTC (permalink / raw)
To: Donet Tom
Cc: David Hildenbrand, Andrew Morton, Ingo Molnar, Peter Zijlstra,
Ritesh Harjani, linux-mm, linux-kernel, Baolin Wang, Ying Huang,
Juri Lelli, Mel Gorman

Donet Tom <donettom@linux.ibm.com> writes:

> [...]
>
> This patch prevents promotion when NUMA_BALANCING_MEMORY_TIERING is
> disabled and the folio is on the lower tier.
>
> [...]
>
> @@ -2024,8 +2024,12 @@ bool should_numa_migrate_memory(struct task_struct *p, struct folio *folio,
>  	this_cpupid = cpu_pid_to_cpupid(dst_cpu, current->pid);
>  	last_cpupid = folio_xchg_last_cpupid(folio, this_cpupid);
>
> +	/*
> +	 * Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
> +	 * and the pages are on the lower tier.
> +	 */
>  	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) &&
> -	    !node_is_toptier(src_nid) && !cpupid_valid(last_cpupid))
> +	    !node_is_toptier(src_nid))
>  		return false;
>
>  	/*

No. Even if NUMA_BALANCING_MEMORY_TIERING is disabled, we should still
allow migrating pages from the lower tier to the higher tier via
NUMA_BALANCING_NORMAL. If we have precious DDR, why waste it? This
follows the semantics that NUMA_BALANCING_NORMAL had before
NUMA_BALANCING_MEMORY_TIERING was introduced.

---
Best Regards,
Huang, Ying

* Re: [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
2026-04-02 3:27 ` Huang, Ying
@ 2026-04-02 4:59 ` Donet Tom
2026-04-02 6:24 ` Huang, Ying
0 siblings, 1 reply; 6+ messages in thread
From: Donet Tom @ 2026-04-02 4:59 UTC (permalink / raw)
To: Huang, Ying
Cc: David Hildenbrand, Andrew Morton, Ingo Molnar, Peter Zijlstra,
Ritesh Harjani, linux-mm, linux-kernel, Baolin Wang, Ying Huang,
Juri Lelli, Mel Gorman

Hi

On 4/2/26 8:57 AM, Huang, Ying wrote:
> Donet Tom <donettom@linux.ibm.com> writes:
>
>> [...]
>
> No. Even if NUMA_BALANCING_MEMORY_TIERING is disabled, we should still
> allow migrating pages from the lower tier to the higher tier via
> NUMA_BALANCING_NORMAL. If we have precious DDR, why waste it? This
> follows the semantics that NUMA_BALANCING_NORMAL had before
> NUMA_BALANCING_MEMORY_TIERING was introduced.

Thank you for the review comments.

One thing I am trying to understand is that page promotion
appears to happen regardless of whether
NUMA_BALANCING_MEMORY_TIERING is enabled or disabled. In that
case, what is the specific role of
NUMA_BALANCING_MEMORY_TIERING? Do we get better performance
when it is enabled?

My initial understanding was that disabling
NUMA_BALANCING_MEMORY_TIERING could be used to turn off
promotion. However, it seems that currently we cannot control
promotion independently. If NUMA_BALANCING_NORMAL is disabled,
neither migration nor promotion happens, and if it is enabled,
both migration and promotion can occur.

I was under the impression that:
- NUMA_BALANCING_NORMAL would handle migration within the same tier,
- NUMA_BALANCING_MEMORY_TIERING would handle promotion across tiers,
- and enabling both would allow both migration and promotion.

This would provide more fine-grained control. Is my
understanding correct, or am I missing something here?

>
> ---
> Best Regards,
> Huang, Ying
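
For readers following the discussion of which mode bits are set, the
current mode can be inspected from user space. The sketch below is an
illustration only: parse_numa_balancing_mode() and
read_numa_balancing_mode() are hypothetical helper names, the flag
values are assumed to match the kernel's definitions, and the sysctl is
assumed to be exposed at /proc/sys/kernel/numa_balancing on the running
system.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed to match the kernel's sysctl_numa_balancing_mode flag values. */
#define NUMA_BALANCING_NORMAL          0x1
#define NUMA_BALANCING_MEMORY_TIERING  0x2

/*
 * Hypothetical helper: parse the text of the sysctl file into mode bits.
 * Returns -1 if the text does not start with a non-negative integer.
 */
int parse_numa_balancing_mode(const char *text)
{
	int mode;

	assert(text != NULL);
	if (sscanf(text, "%d", &mode) != 1 || mode < 0)
		return -1;
	return mode;
}

/* Hypothetical helper: read the sysctl file; returns -1 on any failure. */
int read_numa_balancing_mode(void)
{
	char buf[32];
	int mode = -1;
	FILE *f = fopen("/proc/sys/kernel/numa_balancing", "r");

	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		mode = parse_numa_balancing_mode(buf);
	fclose(f);
	return mode;
}
```

A caller would test the returned value against
NUMA_BALANCING_NORMAL and NUMA_BALANCING_MEMORY_TIERING separately,
since both bits may be set at once.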

* Re: [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
2026-04-02 4:59 ` Donet Tom
@ 2026-04-02 6:24 ` Huang, Ying
0 siblings, 0 replies; 6+ messages in thread
From: Huang, Ying @ 2026-04-02 6:24 UTC (permalink / raw)
To: Donet Tom
Cc: David Hildenbrand, Andrew Morton, Ingo Molnar, Peter Zijlstra,
Ritesh Harjani, linux-mm, linux-kernel, Baolin Wang, Ying Huang,
Juri Lelli, Mel Gorman

Donet Tom <donettom@linux.ibm.com> writes:

> Hi

Hi, Donet,

> On 4/2/26 8:57 AM, Huang, Ying wrote:
>> Donet Tom <donettom@linux.ibm.com> writes:
>>
>>> [...]
>>
>> No. Even if NUMA_BALANCING_MEMORY_TIERING is disabled, we should still
>> allow migrating pages from the lower tier to the higher tier via
>> NUMA_BALANCING_NORMAL. If we have precious DDR, why waste it? This
>> follows the semantics that NUMA_BALANCING_NORMAL had before
>> NUMA_BALANCING_MEMORY_TIERING was introduced.
>
> Thank you for the review comments.
>
> One thing I am trying to understand is that page promotion
> appears to happen regardless of whether
> NUMA_BALANCING_MEMORY_TIERING is enabled or disabled. In that
> case, what is the specific role of
> NUMA_BALANCING_MEMORY_TIERING? Do we get better performance
> when it is enabled?

You can search for NUMA_BALANCING_MEMORY_TIERING to find out what it
does. We can get better performance, as the original commit message
says.

When NUMA_BALANCING_MEMORY_TIERING was introduced, we didn't change the
original behavior of NUMA_BALANCING_NORMAL because we had no good
reason to do that. Your patch does change its behavior, so you should
provide some supporting data or a bug report to justify the change.

> My initial understanding was that disabling
> NUMA_BALANCING_MEMORY_TIERING could be used to turn off
> promotion. However, it seems that currently we cannot control
> promotion independently. If NUMA_BALANCING_NORMAL is disabled,
> neither migration nor promotion happens, and if it is enabled,
> both migration and promotion can occur.
>
> I was under the impression that:
> - NUMA_BALANCING_NORMAL would handle migration within the same tier,
> - NUMA_BALANCING_MEMORY_TIERING would handle promotion across tiers,
> - and enabling both would allow both migration and promotion.
>
> This would provide more fine-grained control. Is my
> understanding correct, or am I missing something here?

You can change this if you have some supporting data or a bug report.

---
Best Regards,
Huang, Ying