All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Huang, Ying" <ying.huang@linux.alibaba.com>
To: Jinjiang Tu <tujinjiang@huawei.com>
Cc: <akpm@linuxfoundation.org>,  <david@kernel.org>,
	 <ziy@nvidia.com>, <matthew.brost@intel.com>,
	 <joshua.hahnjy@gmail.com>, <rakie.kim@sk.com>,
	 <byungchul@sk.com>,  <gourry@gourry.net>, <apopple@nvidia.com>,
	 <mgorman@suse.de>,  <linux-mm@kvack.org>,
	<wangkefeng.wang@huawei.com>
Subject: Re: [PATCH v2] mm/mempolicy: fix mpol_rebind_nodemask() for MPOL_F_NUMA_BALANCING
Date: Tue, 23 Dec 2025 08:50:36 +0800	[thread overview]
Message-ID: <877bue9g2r.fsf@DESKTOP-5N7EMDA> (raw)
In-Reply-To: <4cf67d5a-af50-44d4-8a2a-c7fc76b304ee@huawei.com> (Jinjiang Tu's message of "Mon, 22 Dec 2025 22:25:44 +0800")

Jinjiang Tu <tujinjiang@huawei.com> writes:

> 在 2025/12/22 17:51, Huang, Ying 写道:
>> Hi, Jinjiang,
>>
>> Sorry, I found the patch description is still confusing for me.
>>
>> Jinjiang Tu <tujinjiang@huawei.com> writes:
>>
>>> commit bda420b98505 ("numa balancing: migrate on fault among multiple
>>> bound nodes") adds new flag MPOL_F_NUMA_BALANCING to enable NUMA balancing
>>> for MPOL_BIND memory policy.
>> Is the following description better?  At least, I think we should
>> emphasize that MPOL_F_NUMA_BALANCING is set while both
>> MPOL_F_STATIC_NODES and MPOL_F_RELATIVE_NODES are cleared in the mode
>> parameter.
>
> Thanks, I will update it to make it clearer. How about the following
> description?
>
>
> commit bda420b98505 ("numa balancing: migrate on fault among multiple
> bound nodes") adds new flag MPOL_F_NUMA_BALANCING to enable NUMA balancing
> for MPOL_BIND memory policy.
>
> When the cpuset of tasks changes, the mempolicy of the task is rebound
> by mpol_rebind_nodemask(). When MPOL_F_STATIC_NODES and MPOL_F_RELATIVE_NODES
> are both not set, the behaviour is same whenever MPOL_F_NUMA_BALANCING

s/is/should be/

> is set or not. So, when an application calls set_mempolicy() with MPOL_F_NUMA_BALANCING
> set but both MPOL_F_STATIC_NODES and MPOL_F_RELATIVE_NODES cleared,
> mempolicy.w.cpuset_mems_allowed should be set to cpuset_current_mems_allowed nodemask.
> However, in current implementation, mpol_store_user_nodemask() wrongly returns true,
> causing mempolicy->w.user_nodemask to be incorrectly set to the user-specified nodemask.
> Later, when the cpuset of the application changes, mpol_rebind_nodemask() ends up rebinding
> based on the user-specified nodemask rather than the cpuset_mems_allowed
> nodemask as intended.
>
> To fix this, only set mempolicy->w.user_nodemask to the user-specified nodemask
> if MPOL_F_STATIC_NODES or MPOL_F_RELATIVE_NODES is present.

This looks good to me.  Thanks!  Feel free to add my

Reviewed-by: Huang Ying <ying.huang@linux.alibaba.com>

in the future versions.

>>
>> When an application calls set_mempolicy() with MPOL_F_NUMA_BALANCING set
>> but both MPOL_F_STATIC_NODES and MPOL_F_RELATIVE_NODES cleared,
>> mempolicy.w.cpuset_mems_allowed should be set to
>> cpuset_current_mems_allowed nodemask.  However, due to a bug in its
>> current implementation, mpol_store_user_nodemask() wrongly returns true,
>> causing mempolicy->w.user_nodemask to be incorrectly set to the
>> user-specified nodemask (or an empty nodemask).  Later, when the cpuset
>> of the application changes, mpol_rebind_nodemask() ends up rebinding
>> based on the user-specified nodemask rather than the cpuset_mems_allowed
>> nodemask as intended.
>>
>>> when the cpuset of tasks changes, the mempolicy of the task is rebound
>>> by mpol_rebind_nodemask(). The intended rebinding behavior of
>>> MPOL_F_NUMA_BALANCING was the same as when neither MPOL_F_STATIC_NODES nor
>>> MPOL_F_RELATIVE_NODES flags are set. However, this commit breaks it.
>>>
>>> struct mempolicy has a union member as bellow:
>>>
>>>     union {
>>>         nodemask_t cpuset_mems_allowed; /* relative to these nodes */
>>>         nodemask_t user_nodemask;       /* nodemask passed by user */
>>>     } w;
>>>
>>> w.cpuset_mems_allowed and w.user_nodemask are both nodemask type and their
>>> difference is only what type of nodemask is stored. mpol_set_nodemask()
>>> initializes the union like below:
>>>
>>>     static int mpol_set_nodemask(...)
>>>     {
>>>          if (mpol_store_user_nodemask(pol))
>>>                  pol->w.user_nodemask = *nodes;
>>>          else
>>>                  pol->w.cpuset_mems_allowed = cpuset_current_mems_allowed;
>>>     }
>>>
>>> mpol_store_user_nodemask() returns true for MPOL_F_NUMA_BALANCING
>>> incorrectly and the union stores user-passed nodemask. Consequently,
>>> mpol_rebind_nodemask() ends up rebinding based on the user-passed nodemask
>>> rather than the cpuset_mems_allowed nodemask as intended.
>>>
>>> To fix this, only store the user nodemask if MPOL_F_STATIC_NODES or
>>> MPOL_F_RELATIVE_NODES is present.
>>>
>>> Fixes: bda420b98505 ("numa balancing: migrate on fault among multiple bound nodes")
>>> Reviewed-by: Gregory Price <gourry@gourry.net>
>>> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
>> [snip]
>>

---
Best Regards,
Huang, Ying


      reply	other threads:[~2025-12-23  0:50 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-22  3:04 [PATCH v2] mm/mempolicy: fix mpol_rebind_nodemask() for MPOL_F_NUMA_BALANCING Jinjiang Tu
2025-12-22  9:51 ` Huang, Ying
2025-12-22 14:25   ` Jinjiang Tu
2025-12-23  0:50     ` Huang, Ying [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877bue9g2r.fsf@DESKTOP-5N7EMDA \
    --to=ying.huang@linux.alibaba.com \
    --cc=akpm@linuxfoundation.org \
    --cc=apopple@nvidia.com \
    --cc=byungchul@sk.com \
    --cc=david@kernel.org \
    --cc=gourry@gourry.net \
    --cc=joshua.hahnjy@gmail.com \
    --cc=linux-mm@kvack.org \
    --cc=matthew.brost@intel.com \
    --cc=mgorman@suse.de \
    --cc=rakie.kim@sk.com \
    --cc=tujinjiang@huawei.com \
    --cc=wangkefeng.wang@huawei.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.