From: Bing Jiao <bingjiao@google.com>
To: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Donet Tom <donettom@linux.ibm.com>,
linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
David Hildenbrand <david@kernel.org>,
Michal Hocko <mhocko@kernel.org>,
Qi Zheng <zhengqi.arch@bytedance.com>,
Shakeel Butt <shakeel.butt@linux.dev>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Axel Rasmussen <axelrasmussen@google.com>,
Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1 1/2] mm/vmscan: balance demotion allocation in alloc_demote_folio()
Date: Mon, 12 Jan 2026 19:23:36 +0000
Message-ID: <aWVKJta4vuZEOIZV@google.com>
In-Reply-To: <20260110005229.1348817-1-joshua.hahnjy@gmail.com>

On Fri, Jan 09, 2026 at 04:52:28PM -0800, Joshua Hahn wrote:
> On Fri, 9 Jan 2026 23:45:57 +0000 Bing Jiao <bingjiao@google.com> wrote:
>
> > On Thu, Jan 08, 2026 at 06:14:02PM +0530, Donet Tom wrote:
> > >
> > > On 1/7/26 12:58 PM, Bing Jiao wrote:
> > > > + /* Randomly select a node from fallback nodes for balanced allocation */
> > > > + if (allowed_mask) {
> > > > + mtc->nid = node_random(allowed_mask);
> > >
> > >
> > > This random selection can cause allocations to fall back to distant memory
> > > even when the nearer demotion target has sufficient free memory, correct?
> > > Could this also lead to increased promotion latency?
> >
> > Hi Donet,
> >
> > Thanks for your questions.
> >
> > Yes, the random selection could select a distant node and lead to
> > increased promotion latency.
> >
> > I just realized that the fallback allocation should not be weighted
> > by a single metric, such as node distance, capacity, or free space.
>
> Hello Bing, I hope you are doing well!
>
> Yes -- this is also what I believe, and I think this idea of "how should we
> select demotion / allocation targets" is something that is a difficult problem
> (and one that may not have a single solution that "just works").
>
> It's also a question that I have been thinking about, and what was discussed
> in part at LSFMMBPF last year. At the time, I made some auto-tuning weights [1]
> for weighted interleave based on bandwidth capacity, since the main benefit of
> weighted interleave is to distribute memory accesses across multiple nodes
> to maximize how much bandwidth the system can use at once. A follow-up was to
> think about how these weights could change over time, and what heuristics
> should be used to determine how the weights are selected.
>
> Ultimately, we agreed that the heuristics should probably be delegated to
> userspace, since there are just so many scenarios that could change what
> metrics should take priority. (Jonathan Corbet wrote a great summary of the
> discussion in an LWN article [2])
>
> Coming back to this patchset, I think that all of the ideas above apply
> nicely here as well. What nodes should be selected for demotion and how they
> should be weighted is a difficult question, and one that is probably best
> answered by userspace and what workload they expect to use on their specific
> system.
>
> What I do believe though, is that an unweighted random selection / round-robin
> approach to selecting demotion targets might lead to some unexpected
> performance implications.
>
> > We need a thorough study before changing alloc_demote_folio().
>
> So I think this is the way to go : -)
> Although, I'm not actively exploring this at the moment ;)
>
> Please let me know what you think, I hope you have a great day!
> Joshua
>
> [1] https://lore.kernel.org/all/20250109185048.28587-1-joshua.hahnjy@gmail.com/
> [2] https://lwn.net/Articles/1016842/

Hi Joshua, I hope you had a great weekend!

I appreciate you sharing that information. I really enjoyed reading these
articles and discussions.

It makes sense to assume that users understand their requirements, but I
think the kernel still needs internal heuristics for weight adjustment.
Users often lack the comprehensive and timely information needed to update
their configuration promptly, unless the system has an omniscient
administrator who can oversee and (pre)allocate resources for all tasks
running on it. Therefore, I think some kernel-side weight adjustment is
still necessary.
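
To make "not weighted by a single metric" a bit more concrete, here is a
rough userspace sketch of a weighted pick over fallback nodes. It is only
an illustration and not kernel code: struct demo_node, demo_weight() and
demo_pick_node() are hypothetical names, and the weight formula (free pages
divided by NUMA distance) is an assumption chosen for simplicity, not a
proposal for alloc_demote_folio().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-node snapshot; none of these names exist in the kernel. */
struct demo_node {
	int nid;               /* node id */
	unsigned int distance; /* NUMA distance from the source node, e.g. 20 */
	uint64_t free_pages;   /* free pages currently available on the node */
};

/*
 * Assumed heuristic: more free memory raises the weight, a larger
 * distance lowers it. Any other combination of metrics could go here.
 */
static uint64_t demo_weight(const struct demo_node *n)
{
	assert(n->distance > 0);
	return n->free_pages / n->distance;
}

/*
 * Pick a node with probability proportional to demo_weight().
 * 'r' is a caller-supplied random number (modulo bias is ignored in
 * this sketch). Returns -1 if every candidate has zero weight.
 */
static int demo_pick_node(const struct demo_node *nodes, size_t count,
			  uint64_t r)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < count; i++)
		total += demo_weight(&nodes[i]);
	if (total == 0)
		return -1;

	r %= total;
	for (i = 0; i < count; i++) {
		uint64_t w = demo_weight(&nodes[i]);

		if (r < w)
			return nodes[i].nid;
		r -= w;
	}
	return -1; /* not reached when total > 0 */
}
```

With two candidates that have equal free memory but distances 20 and 40,
the nearer node would be chosen about twice as often, while the farther one
still takes part of the load. Whether such weights should come from the
kernel, from userspace, or from both is exactly the open question.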

I will think more about this and explore it further from userspace,
kernel space, or a hybrid approach.

Thank you again for sharing!

Best,
Bing