public inbox for linux-mm@kvack.org
 help / color / mirror / Atom feed
From: Honggyu Kim <honggyu.kim@sk.com>
To: Joshua Hahn <joshua.hahnjy@gmail.com>,
	gourry@gourry.net, harry.yoo@oracle.com,
	ying.huang@linux.alibaba.com
Cc: kernel_team@skhynix.com, gregkh@linuxfoundation.org,
	rakie.kim@sk.com, akpm@linux-foundation.org, rafael@kernel.org,
	lenb@kernel.org, dan.j.williams@intel.com,
	Jonathan.Cameron@huawei.com, dave.jiang@intel.com,
	horen.chuang@linux.dev, hannes@cmpxchg.org,
	linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
	linux-mm@kvack.org, kernel-team@meta.com, yunjeong.mun@sk.com
Subject: Re: [PATCH 2/2 v6] mm/mempolicy: Don't create weight sysfs for memoryless nodes
Date: Thu, 27 Feb 2025 12:20:03 +0900	[thread overview]
Message-ID: <c43c64f4-e697-468e-80af-87bd02a95ba2@sk.com> (raw)
In-Reply-To: <b8ac8654-92bd-4c08-a3fc-e28a7be5e0e6@sk.com>



On 2/27/2025 11:32 AM, Honggyu Kim wrote:
> Hi Joshua,
> 
> On 2/27/2025 6:35 AM, Joshua Hahn wrote:
>> We should never try to allocate memory from a memoryless node. Creating a
>> sysfs knob to control its weighted interleave weight does not make sense,
>> and can be unsafe.
>>
>> Only create weighted interleave weight knobs for nodes with memory.
>>
>> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
>> ---
>>   mm/mempolicy.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>> index 4cc04ff8f12c..50cbb7c047fa 100644
>> --- a/mm/mempolicy.c
>> +++ b/mm/mempolicy.c
>> @@ -3721,7 +3721,7 @@ static int add_weighted_interleave_group(struct 
>> kobject *root_kobj)
>>           return err;
>>       }
>> -    for_each_node_state(nid, N_POSSIBLE) {
> 
> Actually, we're aware of this issue and currently trying to fix this.
> In our system, we've attached 4ch of CXL memory for each socket as
> follows.
> 
>          node0             node1
>        +-------+   UPI   +-------+
>        | CPU 0 |-+-----+-| CPU 1 |
>        +-------+         +-------+
>        | DRAM0 |         | DRAM1 |
>        +---+---+         +---+---+
>            |                 |
>        +---+---+         +---+---+
>        | CXL 0 |         | CXL 4 |
>        +---+---+         +---+---+
>        | CXL 1 |         | CXL 5 |
>        +---+---+         +---+---+
>        | CXL 2 |         | CXL 6 |
>        +---+---+         +---+---+
>        | CXL 3 |         | CXL 7 |
>        +---+---+         +---+---+
>          node2             node3
> 
> The 4ch of CXL memory are detected as a single NUMA node in each socket,
> but it shows as follows with the current N_POSSIBLE loop.
> 
> $ ls /sys/kernel/mm/mempolicy/weighted_interleave/
> node0 node1 node2 node3 node4 node5
> node6 node7 node8 node9 node10 node11
> 
>> +    for_each_node_state(nid, N_MEMORY) {

Thinking it again, we can leave it as a separate patch but add our patch 
on top of it.

The only concern I have is having only N_MEMORY patch hides weight
setting knobs for CXL memory and it makes there is no way to set weight 
values to CXL memory in my system.

IMHO, this and our patch is better to be submitted together.

Thanks,
Honggyu

> 
> But using N_MEMORY doesn't fix this problem and it hides the entire CXL
> memory nodes in our system because the CXL memory isn't detected at this
> point of creating node*.  Maybe there is some difference when multiple
> CXL memory is detected as a single node.
> 
> We have to create more nodes when CXL memory is detected later.  In 
> addition, this part can be changed to "for_each_online_node(nid)"
> although N_MEMORY is also fine here.
> 
> We've internally fixed it using a memory hotpluging callback so we can
> upload another working version later.
> 
> Do you mind if we continue fixing this work?
> 
> Thanks,
> Honggyu
> 
>>           err = add_weight_node(nid, wi_kobj);
>>           if (err) {
>>               pr_err("failed to add sysfs [node%d]\n", nid);
> 



  reply	other threads:[~2025-02-27  3:20 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20250228001631.1102-1-yunjeong.mun@sk.com>
2025-02-26 21:35 ` [PATCH 1/2 v6] mm/mempolicy: Weighted Interleave Auto-tuning Joshua Hahn
2025-02-26 21:35   ` [PATCH 2/2 v6] mm/mempolicy: Don't create weight sysfs for memoryless nodes Joshua Hahn
2025-02-27  2:32     ` Honggyu Kim
2025-02-27  3:20       ` Honggyu Kim [this message]
2025-03-03 21:56         ` Joshua Hahn
2025-03-04 12:53           ` Honggyu Kim
2025-03-03 16:19       ` Gregory Price
2025-03-04 13:03         ` Honggyu Kim
2025-03-04 16:16           ` Gregory Price
2025-03-04 16:29       ` Gregory Price
2025-03-06 12:39         ` Honggyu Kim
2025-03-06 17:32           ` Gregory Price
2025-03-07 11:46             ` Honggyu Kim
2025-03-07 17:51               ` Gregory Price
2025-03-10 12:26                 ` Honggyu Kim
2025-03-10 14:22                   ` Gregory Price
2025-03-11  2:07                     ` Yunjeong Mun
2025-03-11  2:42                       ` Gregory Price
2025-03-11  4:02                         ` Yunjeong Mun
2025-03-11  4:42                           ` Gregory Price
2025-03-11  9:51                             ` Yunjeong Mun
2025-03-11 15:52                               ` Gregory Price
2025-03-18  8:02                             ` Yunjeong Mun
2025-03-18 11:02                               ` Honggyu Kim
2025-03-18 15:13                                 ` Gregory Price
2025-03-19  9:56                                   ` Yunjeong Mun
2025-03-19 14:54                                     ` Gregory Price
2025-02-28  0:16   ` [PATCH 1/2 v6] mm/mempolicy: Weighted Interleave Auto-tuning yunjeong.mun
2025-02-28  6:39   ` Yunjeong Mun
2025-02-28 16:24     ` Joshua Hahn
2025-03-04 21:56     ` Joshua Hahn
2025-03-04 22:22       ` Joshua Hahn
2025-03-05  9:49         ` Yunjeong Mun
2025-03-05 16:28           ` Joshua Hahn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c43c64f4-e697-468e-80af-87bd02a95ba2@sk.com \
    --to=honggyu.kim@sk.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=gourry@gourry.net \
    --cc=gregkh@linuxfoundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=harry.yoo@oracle.com \
    --cc=horen.chuang@linux.dev \
    --cc=joshua.hahnjy@gmail.com \
    --cc=kernel-team@meta.com \
    --cc=kernel_team@skhynix.com \
    --cc=lenb@kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=rafael@kernel.org \
    --cc=rakie.kim@sk.com \
    --cc=ying.huang@linux.alibaba.com \
    --cc=yunjeong.mun@sk.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox