From: "Huang, Ying" <ying.huang@linux.alibaba.com>
To: Bharata B Rao <bharata@amd.com>
Cc: <linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>,
<Jonathan.Cameron@huawei.com>, <dave.hansen@intel.com>,
<gourry@gourry.net>, <hannes@cmpxchg.org>,
<mgorman@techsingularity.net>, <mingo@redhat.com>,
<peterz@infradead.org>, <raghavendra.kt@amd.com>,
<riel@surriel.com>, <rientjes@google.com>, <sj@kernel.org>,
<weixugc@google.com>, <willy@infradead.org>, <ziy@nvidia.com>,
<dave@stgolabs.net>, <nifan.cxl@gmail.com>,
<joshua.hahnjy@gmail.com>, <xuezhengchu@huawei.com>,
<yiannis@zptcorp.com>, <akpm@linux-foundation.org>,
<david@redhat.com>
Subject: Re: [RFC PATCH v0 0/2] Batch migration for NUMA balancing
Date: Mon, 26 May 2025 16:46:36 +0800
Message-ID: <87sekrbvyr.fsf@DESKTOP-5N7EMDA>
In-Reply-To: <20250521080238.209678-1-bharata@amd.com> (Bharata B. Rao's message of "Wed, 21 May 2025 13:32:36 +0530")

Hi, Bharata,

Bharata B Rao <bharata@amd.com> writes:

> Hi,
>
> This is an attempt to convert the NUMA balancing to do batched
> migration instead of migrating one folio at a time. The basic
> idea is to collect (from hint fault handler) the folios to be
> migrated in a list and batch-migrate them from task_work context.
> More details about the specifics are present in patch 2/2.
>
> During LSFMM[1] and subsequent discussions in MM alignment calls[2],
> it was suggested that separate migration threads to handle migration
> or promotion requests may be desirable. Existing NUMA balancing, hot
> page promotion and other future promotion techniques could off-load
> the migration part to these threads.

What is the expected benefit of this change?

For code reuse, the various promotion paths can use
migrate_misplaced_folio() or migrate_misplaced_folio_batch().

As for the impact on workload latency, my understanding is that PTE
scanning hurts much more than migration does. Why not start with that?

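To make the batching idea under discussion concrete, here is a minimal
userspace sketch in C. It is not kernel code and not the code from the
patches: struct pending_folio, struct migrate_batch, batch_add(),
batch_flush() and BATCH_THRESHOLD are hypothetical names used only for
illustration, and the flush step merely counts entries where a real
implementation would hand the list to a batched migration call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed batch size; the cover letter only says "fixed threshold". */
#define BATCH_THRESHOLD 4

/* Stand-in for a misplaced folio plus the node it should move to. */
struct pending_folio {
	unsigned long pfn;
	int target_nid;
};

/* Per-task collection of folios waiting to be migrated together. */
struct migrate_batch {
	struct pending_folio items[BATCH_THRESHOLD];
	size_t count;     /* entries currently queued */
	size_t flushes;   /* number of batched "migrations" issued */
	size_t migrated;  /* total entries handed over so far */
};

/*
 * Stand-in for the batched migration step. It only does accounting;
 * a real implementation would migrate the queued folios here.
 */
static void batch_flush(struct migrate_batch *b)
{
	if (b->count == 0)
		return;
	b->migrated += b->count;
	b->flushes++;
	b->count = 0;
}

/* Queue one entry (the hint-fault side) and flush once the batch is full. */
static void batch_add(struct migrate_batch *b, unsigned long pfn, int nid)
{
	assert(b->count < BATCH_THRESHOLD);
	b->items[b->count].pfn = pfn;
	b->items[b->count].target_nid = nid;
	b->count++;
	if (b->count == BATCH_THRESHOLD)
		batch_flush(b);
}
```

With a threshold of 4, queuing ten entries issues two full batches and
leaves two entries pending until an explicit batch_flush(), which is the
window during which the entries stay queued.
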
> Or if we manage to have a single
> source of hotness truth like kpromoted[3], then that too can hand
> over migration requests to the migration threads. I am envisaging
> that different hotness sources like kmmscand[4], MGLRU[5], IBS[6]
> and CXL HMU would push hot page info to kpromoted, which would
> then isolate and push the folios to be promoted to the migrator
> thread.
>
> As a first step, this is an attempt to batch and perform NUMAB
> migrations in an async manner. Separate migration threads aren't
> yet implemented but I am using Gregory's patch[7] that provides
> migrate_misplaced_folio_batch() API to do batch migration of
> misplaced folios.
>
> Some points for discussion
> --------------------------
> 1. To isolate the misplaced folios or not?
>
> To do batch migration, the misplaced folios need to be stored in
> some manner. I thought isolating them and using the folio->lru
> field to link them up would be the most straight-forward way. But
> then there were concerns expressed about folios remaining isolated
> for long until they get migrated.
>
> Or should we just maintain the PFNs instead of folios and
> isolate them only just prior to migrating them?
>
> 2. Managing target_nid for misplaced pages
>
> NUMAB provides the accurate target_nid for each folio that is
> detected as misplaced. However when we don't migrate the folio
> right away, but instead want to batch and do async migration later,
> then where do we keep track of target_nid for each folio?
>
> In this implementation, I am using last_cpupid field as it appeared
> that this field could be reused (with some challenges mentioned
> in 2/2) for isolated folios. This approach may be specific to NUMAB
> but then each sub-system that hands over pages to the migrator thread
> should also provide a target_nid and hence each sub-system should be
> free to maintain and track the target_nid of folios that it has
> isolated/batched for migration in its own specific manner.
>
> 3. How many folios to batch?
>
> Currently I have a fixed threshold for number of folios to batch.
> It could be a sysctl to allow a setting between a min and max. It
> could also be auto-tuned if required.
>
> The state of the patchset
> -------------------------
> * Still raw and very lightly tested
> * Just posted to serve as base for subsequent discussions
> here and in MM alignment calls.
>
> References
> ----------
> [1] LSFMM LWN summary - https://lwn.net/Articles/1016519/
> [2] MM alignment call summary - https://lore.kernel.org/linux-mm/263d7140-c343-e82e-b836-ec85c52b54eb@google.com/
> [3] kpromoted patchset - https://lore.kernel.org/linux-mm/20250306054532.221138-1-bharata@amd.com/
> [4] Kmmscand: PTE A bit scanning - https://lore.kernel.org/linux-mm/20250319193028.29514-1-raghavendra.kt@amd.com/
> [5] MGLRU scanning for page promotion - https://lore.kernel.org/lkml/20250324220301.1273038-1-kinseyho@google.com/
> [6] IBS-based hot page promotion - https://lore.kernel.org/linux-mm/20250306054532.221138-4-bharata@amd.com/
> [7] Unmapped page cache folio promotion patchset - https://lore.kernel.org/linux-mm/20250411221111.493193-1-gourry@gourry.net/
>
> Bharata B Rao (1):
>   mm: sched: Batch-migrate misplaced pages
>
> Gregory Price (1):
>   migrate: implement migrate_misplaced_folio_batch
>
>  include/linux/migrate.h |  6 ++++
>  include/linux/sched.h   |  4 +++
>  init/init_task.c        |  2 ++
>  kernel/sched/fair.c     | 64 +++++++++++++++++++++++++++++++++++++++++
>  mm/memory.c             | 44 ++++++++++++++--------------
>  mm/migrate.c            | 31 ++++++++++++++++++++
>  6 files changed, 130 insertions(+), 21 deletions(-)
---
Best Regards,
Huang, Ying