From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: Gregory Price <gourry@gourry.net>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Cui Chao <cuichao1753@phytium.com.cn>, <dan.j.williams@intel.com>,
	Mike Rapoport <rppt@kernel.org>,
	Wang Yinfeng <wangyinfeng@phytium.com.cn>,
	<linux-cxl@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<linux-mm@kvack.org>, <qemu-devel@nongnu.org>,
	"David Hildenbrand (Arm)" <david@kernel.org>
Subject: Re: [PATCH v2 1/1] mm: numa_memblks: Identify the accurate NUMA ID of CFMW
Date: Fri, 6 Feb 2026 16:26:44 +0000
Message-ID: <20260206162644.000050fe@huawei.com>
In-Reply-To: <aYYOZ1TK5dpX_h_Q@gourry-fedora-PF4VCD3F>

On Fri, 6 Feb 2026 10:53:11 -0500
Gregory Price <gourry@gourry.net> wrote:

> On Fri, Feb 06, 2026 at 03:09:41PM +0000, Jonathan Cameron wrote:
> > On Fri, 6 Feb 2026 08:31:09 -0500
> > Gregory Price <gourry@gourry.net> wrote:
> > 
> > Now a fun corner is that a node isn't created unless there is something
> > in it - the whole SRAT is the source of truth for what nodes exist
> > - so we need 'something' in it - a cpu will do, or a GI, probably a GP.
> > Otherwise memory ends up in node0.  However, fallback lists etc happen
> > as normal when first mem in a node is added.
> >   
> ...
> > For now I 'suspect' we could hack things to provide lots of waiting numa nodes
> > and merrily assign HPA into them as we like whatever SRAT provides
> > in the way of 'hints' :) 
> >   
> 
> look at ACPI MSCT - "Maximum Proximity Domain Information Structure" ;]
> 
> I don't remember reading anything in the ACPI spec that says something
> has to be ON any of these PXMs for it to be accounted for in the MSCT.
> 
> Platforms can just say "Reserve that many Nodes".
> 
> (Linux does not read this value, and on my existing systems, this number
> always reflects the number of actually present PXMs)
> 
> ---
> 
> We probably want to ignore that and just add this:
> 
> CONFIG_ACPI_NUMA_NODES_PER_CFMWS
>     int
>     range 1 4
>     help
>         This option determines the number of NUMA nodes that will be
>         added for each CEDT CFMWS entry.
> 
>         By default ACPI reserves 1 per unique PXM entry in the SRAT,
>         or 1 for a CXL Fixed Memory Window without SRAT mappings.
> 
>         This will reserve up to N nodes per CEDT entry, even if that
>         CEDT has one or more SRAT entries.
> 
> then in the acpi/numa/srat.c code that parses srat/cedt, just track
> the number of nodes over a CEDT range.
> 
> for each srat:
>    account_unique_pxm(pxm, srat_range)
> 
> for each cedt:
>    nr_nodes = unique_pxms(cedt_range)
>    while (nr_nodes < CONFIG_ACPI_NUMA_NODES_PER_CFMWS) {
>       node = acpi_map_pxm_to_node(*fake_pxm++);
>       if (node == NUMA_NO_NODE) {
>          err("Unable to reserve additional nodes for CXL windows");
>          break;
>       }
>       node_set(node, numa_nodes_parsed);
>       nr_nodes++;
>    }
> 
> This should fall out cleanly.
> 
> The additional nodes won't be associated with anything, but could be
> used for hotplug - I imagine.
> 

That aligns with what I was thinking of as a first step towards making this
more dynamic. We can get clever later if this doesn't prove sufficient.

Jonathan

> ~Gregory
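
Editorial sketch: the per-CFMWS node reservation loop quoted above can be
tried out in user space roughly as follows. This is a minimal illustration
under stated assumptions, not the kernel implementation. Everything here is
a hypothetical stand-in: toy_map_pxm_to_node() replaces
acpi_map_pxm_to_node(), the numa_nodes_parsed array replaces the kernel
nodemask, NODES_PER_CFMWS replaces the proposed
CONFIG_ACPI_NUMA_NODES_PER_CFMWS, and struct range / struct srat_mem are
simplified versions of the SRAT and CEDT data. The starting value of
fake_pxm (the first PXM not used by the SRAT) is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NUMA_NO_NODE    (-1)
#define MAX_NODES       8   /* hypothetical stand-in for MAX_NUMNODES */
#define MAX_PXM         16  /* hypothetical upper bound on PXM values */
#define NODES_PER_CFMWS 2   /* stand-in for CONFIG_ACPI_NUMA_NODES_PER_CFMWS */

struct range { unsigned long long start, end; };  /* [start, end) */
struct srat_mem { struct range r; int pxm; };     /* one SRAT memory entry */

static int pxm_node_map[MAX_PXM];
static int nr_nodes_allocated;
static bool numa_nodes_parsed[MAX_NODES];

/* Clear all toy state; must be called before anything else. */
void toy_reset(void)
{
	for (int i = 0; i < MAX_PXM; i++)
		pxm_node_map[i] = NUMA_NO_NODE;
	for (int i = 0; i < MAX_NODES; i++)
		numa_nodes_parsed[i] = false;
	nr_nodes_allocated = 0;
}

/* Toy PXM-to-node mapper: hands out node ids sequentially. */
int toy_map_pxm_to_node(int pxm)
{
	if (pxm < 0 || pxm >= MAX_PXM)
		return NUMA_NO_NODE;
	if (pxm_node_map[pxm] != NUMA_NO_NODE)
		return pxm_node_map[pxm];
	if (nr_nodes_allocated >= MAX_NODES)
		return NUMA_NO_NODE;
	pxm_node_map[pxm] = nr_nodes_allocated++;
	return pxm_node_map[pxm];
}

/* "for each srat": map every SRAT PXM to a node and mark it parsed. */
void toy_parse_srat(const struct srat_mem *srat, int n)
{
	for (int i = 0; i < n; i++) {
		int node = toy_map_pxm_to_node(srat[i].pxm);

		if (node != NUMA_NO_NODE)
			numa_nodes_parsed[node] = true;
	}
}

/* Count distinct PXMs among SRAT entries overlapping window w. */
int unique_pxms(const struct srat_mem *srat, int n, struct range w)
{
	bool seen[MAX_PXM] = { false };
	int count = 0;

	for (int i = 0; i < n; i++) {
		int pxm = srat[i].pxm;
		bool overlaps = srat[i].r.start < w.end && w.start < srat[i].r.end;

		if (pxm < 0 || pxm >= MAX_PXM)
			continue;
		if (overlaps && !seen[pxm]) {
			seen[pxm] = true;
			count++;
		}
	}
	return count;
}

/* "for each cedt": top the window up to NODES_PER_CFMWS nodes. */
int reserve_cfmws_nodes(const struct srat_mem *srat, int n,
			struct range w, int *fake_pxm)
{
	int nr_nodes = unique_pxms(srat, n, w);

	while (nr_nodes < NODES_PER_CFMWS) {
		int node = toy_map_pxm_to_node((*fake_pxm)++);

		if (node == NUMA_NO_NODE) {
			fprintf(stderr, "Unable to reserve additional nodes for CXL windows\n");
			break;
		}
		numa_nodes_parsed[node] = true;
		nr_nodes++;
	}
	return nr_nodes;
}
```

In this sketch a window that already overlaps one SRAT PXM only gains one
extra node, while a window with no SRAT coverage gains NODES_PER_CFMWS
nodes. The function returns the number of nodes accounted to the window, so
a caller can detect when the node space ran out before the target was
reached.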


Thread overview: 30+ messages
2026-01-06  3:10 [PATCH v2 0/1] Identify the accurate NUMA ID of CFMW Cui Chao
2026-01-06  3:10 ` [PATCH v2 1/1] mm: numa_memblks: " Cui Chao
2026-01-08 16:19   ` Jonathan Cameron
2026-01-08 17:48   ` Andrew Morton
2026-01-15  9:43     ` Cui Chao
2026-01-15 18:18       ` Andrew Morton
2026-01-15 19:50         ` dan.j.williams
2026-01-22  8:03           ` Cui Chao
2026-01-22 21:28             ` Andrew Morton
2026-01-23  8:59               ` Cui Chao
2026-01-23 16:46             ` Gregory Price
2026-01-26  9:06               ` Cui Chao
2026-02-05 22:58                 ` Andrew Morton
2026-02-05 23:10                   ` Gregory Price
2026-02-06 11:03                     ` Jonathan Cameron
2026-02-06 11:03                       ` Jonathan Cameron via qemu development
2026-02-06 13:31                       ` Gregory Price
2026-02-06 15:09                         ` Jonathan Cameron
2026-02-06 15:09                           ` Jonathan Cameron via qemu development
2026-02-06 15:53                           ` Gregory Price
2026-02-06 16:26                             ` Jonathan Cameron [this message]
2026-02-06 16:26                               ` Jonathan Cameron via qemu development
2026-02-06 16:32                               ` Gregory Price
2026-02-19 14:19                                 ` Jonathan Cameron
2026-02-19 14:19                                   ` Jonathan Cameron via qemu development
2026-02-06 15:57                           ` Andrew Morton
2026-02-06 16:23                             ` Jonathan Cameron
2026-02-06 16:23                               ` Jonathan Cameron via qemu development
2026-01-09  9:35   ` Pratyush Brahma
2026-01-15 10:06     ` Cui Chao
