From: "Huang, Ying" <ying.huang@linux.alibaba.com>
To: Gregory Price <gourry@gourry.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>, Zi Yan <ziy@nvidia.com>,
Matthew Brost <matthew.brost@intel.com>,
Joshua Hahn <joshua.hahnjy@gmail.com>,
Rakie Kim <rakie.kim@sk.com>, Byungchul Park <byungchul@sk.com>,
Alistair Popple <apopple@nvidia.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Neha Gholkar <nehagholkar@gmail.com>
Subject: Re: [PATCH] mm: mempolicy: fix automatic numa balancing for shmem
Date: Wed, 01 Jul 2026 19:03:32 +0800
Message-ID: <877bnfynsr.fsf@DESKTOP-5N7EMDA>
In-Reply-To: <akPg4ANRetbjSP_b@gourry-fedora-PF4VCD3F> (Gregory Price's message of "Tue, 30 Jun 2026 11:29:36 -0400")

Gregory Price <gourry@gourry.net> writes:
> On Tue, Jun 30, 2026 at 07:20:50PM +0800, Huang, Ying wrote:
>> Gregory Price <gourry@gourry.net> writes:
>>
>> [snip]
>>
>> > Demotions don't care about mempolicy, so opting shmem out of NUMA
>> > balancing and mbind'ing on a tiered system is just full sadness.
>> >
>> > This is all just more evidence that demotion needs to be completely
>> > redone, it's creating a mess of undefined behavior for memory placement.
>>
>> It's hard to respect mempolicy during demotion in the current
>> implementation. Do you have any ideas on how to improve this?
>>
>
> I think it's feasible we could respect per-vma mempolicies, but not
> per-task. That would at least make this particular interaction less
> painful and mbind() would do what you'd expect. It is a bit racy,
> but with MPOL_MF_MOVE_ALL the user can get what they actually want.

Yes.  Per-vma mempolicy support is possible.

> I think task-wide mempolicy is problematic and generally a bad idea
> on tiered systems, maybe it's ok if we simply document task policies
> are not respected on tiered systems?

Anyway, it's convenient to use numactl to manage mempolicy.

Is it possible to enable NUMA_BALANCING_MEMORY_TIERING for non-default
VMAs?  If we don't enable NUMA_BALANCING_NORMAL, the overhead should be
OK because the page table entries are changed to PROT_NONE only for
pages on the slow tier.

Additionally, we may need to consider cpusets.

---
Best Regards,
Huang, Ying
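
For readers following the thread: the two modes named in the message above are bits of the kernel.numa_balancing sysctl (0 = disabled, 1 = NUMA_BALANCING_NORMAL, 2 = NUMA_BALANCING_MEMORY_TIERING, per the kernel's sysctl documentation). The following is a minimal sketch, not part of the patch or the discussion, that decodes the sysctl value and checks for the "tiering on, normal off" configuration the message describes. The helper names (describe_mode, tiering_only, read_mode) are hypothetical and chosen only for illustration.

```python
from pathlib import Path

# Bit values of the kernel.numa_balancing sysctl, as listed in
# Documentation/admin-guide/sysctl/kernel.rst.
NUMA_BALANCING_DISABLED = 0x0
NUMA_BALANCING_NORMAL = 0x1
NUMA_BALANCING_MEMORY_TIERING = 0x2

SYSCTL_PATH = Path("/proc/sys/kernel/numa_balancing")


def describe_mode(value):
    """Return the names of the balancing modes set in a sysctl value."""
    if value == NUMA_BALANCING_DISABLED:
        return ["NUMA_BALANCING_DISABLED"]
    modes = []
    if value & NUMA_BALANCING_NORMAL:
        modes.append("NUMA_BALANCING_NORMAL")
    if value & NUMA_BALANCING_MEMORY_TIERING:
        modes.append("NUMA_BALANCING_MEMORY_TIERING")
    return modes


def tiering_only(value):
    """True when memory tiering is enabled and normal balancing is not."""
    tiering = bool(value & NUMA_BALANCING_MEMORY_TIERING)
    normal = bool(value & NUMA_BALANCING_NORMAL)
    return tiering and not normal


def read_mode(path=SYSCTL_PATH):
    """Read the current sysctl value, or return None if it is unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    current = read_mode()
    if current is None:
        print("kernel.numa_balancing is not available on this system")
    else:
        print(current, describe_mode(current), "tiering only:", tiering_only(current))
```

This only inspects the global sysctl. Whether tiering-mode scanning can be applied to VMAs with a non-default mempolicy is the open question raised in the message, and nothing in this sketch answers it.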
Thread overview: 13+ messages
2026-06-29 16:33 [PATCH] mm: mempolicy: fix automatic numa balancing for shmem Johannes Weiner
2026-06-29 17:59 ` Gregory Price
2026-06-29 18:22 ` Johannes Weiner
2026-06-30 11:20 ` Huang, Ying
2026-06-30 15:29 ` Gregory Price
2026-07-01 11:03 ` Huang, Ying [this message]
2026-07-01 15:33 ` Gregory Price
2026-07-01 15:49 ` Johannes Weiner
2026-07-01 16:22 ` Gregory Price
2026-06-29 18:33 ` David Hildenbrand (Arm)
2026-06-29 18:47 ` Johannes Weiner
2026-06-30 11:26 ` David Hildenbrand (Arm)
2026-06-30 23:40 ` Balbir Singh