From: Gregory Price <gourry@gourry.net>
To: Ritesh Harjani <ritesh.list@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Amir Goldstein <amir73il@gmail.com>,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
lsf-pc <lsf-pc@lists.linux-foundation.org>,
Bharata B Rao <bharata@amd.com>,
Donet Tom <donettom@linux.ibm.com>,
Aboorva Devarajan <aboorvad@linux.ibm.com>,
linux-mm@kvack.org, Ojaswin Mujoo <ojaswin@linux.ibm.com>
Subject: Re: [LSF/MM/BPF BoF Session] Numa-Aware Placement for Page Cache Pages
Date: Mon, 4 May 2026 00:48:39 +0100
Message-ID: <affe1wM7A8hWxgUW@gourry-fedora-PF4VCD3F>
In-Reply-To: <8qa0sc06.ritesh.list@gmail.com>
On Sun, May 03, 2026 at 09:48:01PM +0530, Ritesh Harjani wrote:
> Gregory Price <gourry@gourry.net> writes:
>
> MADV_POPULATE_READ_NOIO should ensure that only the cached folios
> belonging to that file are mapped into the process address space, without
> doing any extra disk I/O. The subsequent mbind call with MPOL_MF_MOVE
> will then ensure that all the existing mapped folios are migrated to the
> chosen NUMA node, and that any new pages which get faulted in are
> allocated on the chosen NUMA node because of the MPOL_BIND policy.
>
This all gets rather racy with buffered I/O; I'm not sure we can make
this work the way either one of us wants. I need to chew on this.
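For concreteness, here is a rough userspace sketch of the populate-then-bind
sequence described in the quoted text. It rests on several assumptions:
MADV_POPULATE_READ_NOIO is only a proposal and does not exist, so the sketch
uses the existing MADV_POPULATE_READ, which may do disk I/O. The sketch_*
names are hypothetical. The policy constants are copied from the uapi headers
so that libnuma's <numaif.h> is not needed, and mbind is invoked through
syscall(2). It does nothing about the races with buffered I/O.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef MADV_POPULATE_READ
#define MADV_POPULATE_READ 22          /* uapi value, Linux 5.14+ */
#endif
#define SKETCH_MPOL_BIND    2          /* uapi MPOL_BIND */
#define SKETCH_MPOL_MF_MOVE (1 << 1)   /* uapi MPOL_MF_MOVE */

/*
 * Build a one-word node mask with only `node` set.
 * Returns 0 on success, -1 if `node` does not fit in one word.
 */
int sketch_node_mask(int node, unsigned long *mask)
{
	if (node < 0 || (size_t)node >= 8 * sizeof(*mask))
		return -1;
	*mask = 1UL << node;
	return 0;
}

/*
 * Populate an existing file mapping [addr, addr + len), then bind it to
 * `node` and ask the kernel to move the pages already mapped there.
 * Returns 0 on success, -1 with errno set on failure.
 */
int sketch_populate_and_bind(void *addr, size_t len, int node)
{
	unsigned long mask;

	if (sketch_node_mask(node, &mask) != 0) {
		errno = EINVAL;
		return -1;
	}

	/* Stand-in for the proposed _NOIO variant: this one may read from disk. */
	if (madvise(addr, len, MADV_POPULATE_READ) != 0)
		return -1;

	/*
	 * The kernel treats maxnode as one more than the number of mask bits
	 * it reads, so pass the bit count of the word plus one.
	 */
	if (syscall(SYS_mbind, (unsigned long)addr, (unsigned long)len,
		    (unsigned long)SKETCH_MPOL_BIND, &mask,
		    (unsigned long)(8 * sizeof(mask) + 1),
		    (unsigned long)SKETCH_MPOL_MF_MOVE) != 0)
		return -1;

	return 0;
}
```

A caller would mmap() the file first and pass that range in. Folios that are
in the page cache but not mapped by the caller are not touched by the mbind
step.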
> I believe there might be existing applications which might be facing
> this problem today. This can happen, for instance, when there is a
> workload which can run multiple times and may run across different NUMA
> nodes. Our internal test team once reported a similar performance
> regression with llama-bench on subsequent runs when running it across
> different NUMA nodes. The reason this happened was that the existing
> page cache folios of the model weight file (from the previous run on a
> separate NUMA node) were not getting migrated, because the application
> was not calling MADV_POPULATE_READ, since that can cause the whole of a
> large model weight file to be read into the page cache at once.
>
There's a long-standing issue of unmapped page cache files getting
trapped on lower-tier memory; I think that's orthogonal to the
discussion here.
IIRC DAMON can migrate them, and vmscan.c will happily demote
these folios. So at a minimum we know they're "migratable".
> With that in mind, do we think having something like
> MADV_POPULATE_READ_NOIO makes sense to address such problems? Are there
> any other use cases for this?
> Or do we see any problems with this, due to which it never existed?
>
> (Note that I haven't yet given any thought to how it should behave for
> anon memory).
>
I'm not sure why this would apply to anon? Unless the issue is
specifically anon MAP_SHARED.
~Gregory