public inbox for linux-acpi@vger.kernel.org
* [PATCH v2] ACPI: NUMA: Only parse CFMWS at boot when CXL_ACPI is on
From: Kai Huang @ 2026-03-08 22:23 UTC
  To: rafael, lenb, dan.j.williams, alison.schofield
  Cc: akpm, nunodasneves, xueshuai, thorsten.blum, gourry,
	wangyuquan1236, linux-acpi, linux-cxl, linux-kernel, Kai Huang

On CXL platforms, the System Resource Affinity Table (SRAT) may not
cover memory affinity information for all the CXL memory regions.  Since
each CXL memory region is enumerated via a CXL Fixed Memory Window
Structure (CFMWS), during early boot the kernel parses the CFMWS tables
to find all CXL memory regions and sets a NUMA node for each of them.
This memory affinity information of CXL memory regions is later used by
the CXL ACPI driver.

The CFMWS table doesn't provide the memory affinity information either.
Currently the kernel assigns a 'faked' NUMA node to each CXL memory
region, starting from the node after the highest node that is
enumerated via the SRAT.  This can significantly increase the number of
possible NUMA node IDs on the platform ('nr_node_ids').  E.g., on a GNR
platform with 4 NUMA nodes and 18 CFMWS tables, this bumps
'nr_node_ids' to 22.
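
As a rough illustration of the arithmetic above, here is a minimal
userspace model.  It is not the kernel implementation:
model_nr_node_ids() is a hypothetical helper, and it assumes that none
of the CFMWS ranges is covered by the SRAT, so every one of them takes a
fresh node ID.

```c
#include <assert.h>

/*
 * Hypothetical model, not kernel code: the SRAT enumerates nodes
 * 0..srat_nodes-1, then each CFMWS that the SRAT does not cover takes
 * the next free node ID.  Returns the resulting 'nr_node_ids', i.e.
 * the highest possible node ID plus one.
 */
static int model_nr_node_ids(int srat_nodes, int uncovered_cfmws)
{
	int next_node = srat_nodes;	/* first ID after the SRAT nodes */
	int i;

	assert(srat_nodes >= 0 && uncovered_cfmws >= 0);

	for (i = 0; i < uncovered_cfmws; i++)
		next_node++;		/* one faked node per CFMWS */

	return next_node;
}
```

With the numbers from the GNR example, model_nr_node_ids(4, 18) gives
22; with no CFMWS parsing it stays at 4.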

Increasing 'nr_node_ids' has side effects.  For instance, it is widely
used by the kernel for "highest possible NUMA node" based memory
allocations.  It also impacts userspace ABIs, e.g., some NUMA memory
related system calls such as 'get_mempolicy' require 'maxnode' to be no
smaller than 'nr_node_ids'.
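
To make the userspace impact concrete, the sketch below models the
'maxnode' requirement described above.  It is an assumption-laden
illustration rather than the kernel's code: model_get_mempolicy_check()
is a hypothetical name, and it only mirrors the rule that a non-NULL
node mask with 'maxnode' below 'nr_node_ids' is rejected with EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical model of the size check described in the text: a caller
 * that passes a node mask must size it for at least 'nr_node_ids' bits.
 * Returns 0 when the request would pass the check, -EINVAL otherwise.
 */
static int model_get_mempolicy_check(const unsigned long *nmask,
				     unsigned long maxnode,
				     unsigned int nr_node_ids)
{
	if (nmask != NULL && maxnode < nr_node_ids)
		return -EINVAL;

	return 0;
}
```

Under this model, a program that hard-codes a 16-bit node mask passes on
a platform with 'nr_node_ids' == 4 but fails once the faked CXL nodes
raise it to 22.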

Currently, parsing the CFMWS tables and assigning faked NUMA nodes at
boot is done unconditionally.  However, if the CXL ACPI driver is not
enabled, nothing uses the memory affinity information of the CXL memory
regions.

Change to only parse the CFMWS tables at boot when CXL_ACPI is enabled
in Kconfig, to avoid the unnecessary cost of bumping up 'nr_node_ids'.
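
For readers unfamiliar with how a Kconfig option can gate code with a
plain 'if', the following is a simplified userspace sketch of the
preprocessor technique.  All DEMO_* names are hypothetical stand-ins,
and the macros are a reduced imitation of the kernel's IS_ENABLED()
helper, not a copy of it.

```c
#include <assert.h>

/*
 * Userspace sketch of the preprocessor trick behind IS_ENABLED().
 * An option defined to 1 turns into "0," and shifts the literal 1 into
 * the second argument slot; an undefined option leaves junk in the
 * first slot, so the second slot holds 0.
 */
#define DEMO_PLACEHOLDER_1 0,
#define DEMO_SECOND(ignored, val, ...) val
#define DEMO_IS_DEFINED(x) DEMO_IS_DEFINED_(x)
#define DEMO_IS_DEFINED_(val) DEMO_IS_DEFINED__(DEMO_PLACEHOLDER_##val)
#define DEMO_IS_DEFINED__(arg1_or_junk) DEMO_SECOND(arg1_or_junk 1, 0, 0)

/* True for built-in (=y) or modular (=m, "<option>_MODULE") options. */
#define DEMO_IS_ENABLED(option) \
	(DEMO_IS_DEFINED(option) || DEMO_IS_DEFINED(option##_MODULE))

#define DEMO_CONFIG_BUILTIN 1		/* models an option set to y */
#define DEMO_CONFIG_MODULAR_MODULE 1	/* models an option set to m */
/* DEMO_CONFIG_OFF is deliberately left undefined (option set to n). */

/* Returns the next free fake node ID after optionally adding CFMWS nodes. */
static int demo_numa_init(int first_fake_node, int nr_cfmws)
{
	int fake_node = first_fake_node;

	/* The condition is a compile-time constant 0 for a disabled option. */
	if (DEMO_IS_ENABLED(DEMO_CONFIG_OFF))
		fake_node += nr_cfmws;

	return fake_node;
}
```

Unlike an #ifdef, the guarded statement is still parsed and
type-checked in every configuration, and the compiler can typically
drop the dead branch.  Because the check also covers the modular case,
a tristate option set to 'm' counts as enabled.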

E.g., on the aforementioned GNR platform, the "Slab" in /proc/meminfo is
reduced with this change (when CXL_ACPI is off):

	w/ this change		w/o

Slab	900488 kB		923660 kB

Signed-off-by: Kai Huang <kai.huang@intel.com>
---

v1 -> v2:

 - Use Dan's suggestion to simplify the diff:

 https://lore.kernel.org/linux-cxl/69a8dc7ca72c2_2f4a10026@dwillia2-mobl4.notmuch/

Hi Alison, Gregory,

I didn't add your RB since the code is now different from what you
reviewed.  I'd appreciate it if you could take another look and provide
the tag if the patch looks good to you.

---
 drivers/acpi/numa/srat.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/numa/srat.c b/drivers/acpi/numa/srat.c
index aa87ee1583a4..62d4a8df0b8c 100644
--- a/drivers/acpi/numa/srat.c
+++ b/drivers/acpi/numa/srat.c
@@ -654,8 +654,11 @@ int __init acpi_numa_init(void)
 	}
 	last_real_pxm = fake_pxm;
 	fake_pxm++;
-	acpi_table_parse_cedt(ACPI_CEDT_TYPE_CFMWS, acpi_parse_cfmws,
-			      &fake_pxm);
+
+	/* No need to expand numa nodes if CXL is disabled */
+	if (IS_ENABLED(CONFIG_CXL_ACPI))
+		acpi_table_parse_cedt(ACPI_CEDT_TYPE_CFMWS, acpi_parse_cfmws,
+				      &fake_pxm);
 
 	if (cnt < 0)
 		return cnt;

base-commit: 084f843093bee5563b179fd4b630122ba820e0c7
-- 
2.53.0



* Re: [PATCH v2] ACPI: NUMA: Only parse CFMWS at boot when CXL_ACPI is on
From: Gregory Price @ 2026-03-09  0:06 UTC
  To: Kai Huang
  Cc: rafael, lenb, dan.j.williams, alison.schofield, akpm,
	nunodasneves, xueshuai, thorsten.blum, wangyuquan1236, linux-acpi,
	linux-cxl, linux-kernel

On Mon, Mar 09, 2026 at 11:23:13AM +1300, Kai Huang wrote:
> On CXL platforms, the Static Resource Affinity Table (SRAT) may not
> cover memory affinity information for all the CXL memory regions.  Since
> each CXL memory region is enumerated via a CXL Fixed Memory Window
> Structure (CFMWS), during early boot the kernel parses the CFMWS tables
> to find all CXL memory regions and sets a NUMA node for each of them.
> This memory affinity information of CXL memory regions is later used by
> the CXL ACPI driver.
> 
... snip ...
> 
> Signed-off-by: Kai Huang <kai.huang@intel.com>

lgtm

Reviewed-by: Gregory Price <gourry@gourry.net>



* Re: [PATCH v2] ACPI: NUMA: Only parse CFMWS at boot when CXL_ACPI is on
From: Jonathan Cameron @ 2026-03-09 11:52 UTC
  To: Kai Huang
  Cc: rafael, lenb, dan.j.williams, alison.schofield, akpm,
	nunodasneves, xueshuai, thorsten.blum, gourry, wangyuquan1236,
	linux-acpi, linux-cxl, linux-kernel

On Mon,  9 Mar 2026 11:23:13 +1300
Kai Huang <kai.huang@intel.com> wrote:

> On CXL platforms, the Static Resource Affinity Table (SRAT) may not
> cover memory affinity information for all the CXL memory regions.  Since
> each CXL memory region is enumerated via a CXL Fixed Memory Window
> Structure (CFMWS), during early boot the kernel parses the CFMWS tables
> to find all CXL memory regions and sets a NUMA node for each of them.
> This memory affinity information of CXL memory regions is later used by
> the CXL ACPI driver.
> 
> The CFMWS table doesn't provide the memory affinity information either.
> Currently the kernel assigns a 'faked' NUMA node for each CXL memory
> region, starting from the next node of the highest node that is
> enumerated via the SRAT.  This can potentially increase the maximum NUMA
> node ID of the platform ('nr_node_ids') a lot.  E.g., on a GNR platform
> with 4 NUMA nodes and 18 CFMWS tables, this bumps the 'nr_node_ids' to
> 22.
> 
> Increasing the 'nr_node_ids' has side effects.  For instance, it is
> widely used by the kernel for "highest possible NUMA node" based memory
> allocations.  It also impacts userspace ABIs, e.g., some NUMA memory
> related system calls such as 'get_mempolicy' which requires 'maxnode'
> not being smaller than the 'nr_node_ids'.
> 
> Currently parsing CFMWS tables and assigning faked NUMA node at boot is
> done unconditionally.  However, if the CXL ACPI driver is not enabled,
> there will be no user of such memory affinity information of CXL memory
> regions.
> 
> Change to only parsing the CFMWS tables at boot when CXL_ACPI is enabled
> in Kconfig to avoid the unnecessary cost of bumping up 'nr_node_ids'.
> 
> E.g., on the aforementioned GNR platform, the "Slab" in /proc/meminfo is
> reduced with this change (when CXL_ACPI is off):
> 
> 	w/ this change		w/o
> 
> Slab	900488 kB		923660 kB
> 
> Signed-off-by: Kai Huang <kai.huang@intel.com>
Even without all the reasoning above, I'm keen to remove state that we
know is pointless!

Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>


* Re: [PATCH v2] ACPI: NUMA: Only parse CFMWS at boot when CXL_ACPI is on
From: Alison Schofield @ 2026-03-09 17:13 UTC
  To: Kai Huang
  Cc: rafael, lenb, dan.j.williams, akpm, nunodasneves, xueshuai,
	thorsten.blum, gourry, wangyuquan1236, linux-acpi, linux-cxl,
	linux-kernel

On Mon, Mar 09, 2026 at 11:23:13AM +1300, Kai Huang wrote:
> On CXL platforms, the Static Resource Affinity Table (SRAT) may not
> cover memory affinity information for all the CXL memory regions.  Since
> each CXL memory region is enumerated via a CXL Fixed Memory Window
> Structure (CFMWS), during early boot the kernel parses the CFMWS tables
> to find all CXL memory regions and sets a NUMA node for each of them.
> This memory affinity information of CXL memory regions is later used by
> the CXL ACPI driver.
> 
> The CFMWS table doesn't provide the memory affinity information either.
> Currently the kernel assigns a 'faked' NUMA node for each CXL memory
> region, starting from the next node of the highest node that is
> enumerated via the SRAT.  This can potentially increase the maximum NUMA
> node ID of the platform ('nr_node_ids') a lot.  E.g., on a GNR platform
> with 4 NUMA nodes and 18 CFMWS tables, this bumps the 'nr_node_ids' to
> 22.
> 
> Increasing the 'nr_node_ids' has side effects.  For instance, it is
> widely used by the kernel for "highest possible NUMA node" based memory
> allocations.  It also impacts userspace ABIs, e.g., some NUMA memory
> related system calls such as 'get_mempolicy' which requires 'maxnode'
> not being smaller than the 'nr_node_ids'.
> 
> Currently parsing CFMWS tables and assigning faked NUMA node at boot is
> done unconditionally.  However, if the CXL ACPI driver is not enabled,
> there will be no user of such memory affinity information of CXL memory
> regions.
> 
> Change to only parsing the CFMWS tables at boot when CXL_ACPI is enabled
> in Kconfig to avoid the unnecessary cost of bumping up 'nr_node_ids'.
> 
> E.g., on the aforementioned GNR platform, the "Slab" in /proc/meminfo is
> reduced with this change (when CXL_ACPI is off):
> 
> 	w/ this change		w/o
> 
> Slab	900488 kB		923660 kB
> 
> Signed-off-by: Kai Huang <kai.huang@intel.com>
> ---
> 
> v1 -> v2:
> 
>  - Use Dan's suggestion to simplify the diff:
> 
>  https://lore.kernel.org/linux-cxl/69a8dc7ca72c2_2f4a10026@dwillia2-mobl4.notmuch/
> 
> Hi Alison, Gregory,
> 
> I didn't add your RB since the code now is different from that you
> reviewed.  Appreciate if you can take a look again and provide the tag
> if the patch looks good to you.

Reviewed-by: Alison Schofield <alison.schofield@intel.com>



* Re: [PATCH v2] ACPI: NUMA: Only parse CFMWS at boot when CXL_ACPI is on
From: Dave Jiang @ 2026-03-16 18:11 UTC
  To: Kai Huang, rafael, lenb, dan.j.williams, alison.schofield
  Cc: akpm, nunodasneves, xueshuai, thorsten.blum, gourry,
	wangyuquan1236, linux-acpi, linux-cxl, linux-kernel



On 3/8/26 3:23 PM, Kai Huang wrote:
> On CXL platforms, the Static Resource Affinity Table (SRAT) may not
> cover memory affinity information for all the CXL memory regions.  Since
> each CXL memory region is enumerated via a CXL Fixed Memory Window
> Structure (CFMWS), during early boot the kernel parses the CFMWS tables
> to find all CXL memory regions and sets a NUMA node for each of them.
> This memory affinity information of CXL memory regions is later used by
> the CXL ACPI driver.
> 
> The CFMWS table doesn't provide the memory affinity information either.
> Currently the kernel assigns a 'faked' NUMA node for each CXL memory
> region, starting from the next node of the highest node that is
> enumerated via the SRAT.  This can potentially increase the maximum NUMA
> node ID of the platform ('nr_node_ids') a lot.  E.g., on a GNR platform
> with 4 NUMA nodes and 18 CFMWS tables, this bumps the 'nr_node_ids' to
> 22.
> 
> Increasing the 'nr_node_ids' has side effects.  For instance, it is
> widely used by the kernel for "highest possible NUMA node" based memory
> allocations.  It also impacts userspace ABIs, e.g., some NUMA memory
> related system calls such as 'get_mempolicy' which requires 'maxnode'
> not being smaller than the 'nr_node_ids'.
> 
> Currently parsing CFMWS tables and assigning faked NUMA node at boot is
> done unconditionally.  However, if the CXL ACPI driver is not enabled,
> there will be no user of such memory affinity information of CXL memory
> regions.
> 
> Change to only parsing the CFMWS tables at boot when CXL_ACPI is enabled
> in Kconfig to avoid the unnecessary cost of bumping up 'nr_node_ids'.
> 
> E.g., on the aforementioned GNR platform, the "Slab" in /proc/meminfo is
> reduced with this change (when CXL_ACPI is off):
> 
> 	w/ this change		w/o
> 
> Slab	900488 kB		923660 kB
> 
> Signed-off-by: Kai Huang <kai.huang@intel.com>

Applied to cxl/next
1e1cd49ded59 ("ACPI: NUMA: Only parse CFMWS at boot when CXL_ACPI is on")

> ---
> 
> v1 -> v2:
> 
>  - Use Dan's suggestion to simplify the diff:
> 
>  https://lore.kernel.org/linux-cxl/69a8dc7ca72c2_2f4a10026@dwillia2-mobl4.notmuch/
> 
> Hi Alison, Gregory,
> 
> I didn't add your RB since the code now is different from that you
> reviewed.  Appreciate if you can take a look again and provide the tag
> if the patch looks good to you.
> 
> ---
>  drivers/acpi/numa/srat.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/acpi/numa/srat.c b/drivers/acpi/numa/srat.c
> index aa87ee1583a4..62d4a8df0b8c 100644
> --- a/drivers/acpi/numa/srat.c
> +++ b/drivers/acpi/numa/srat.c
> @@ -654,8 +654,11 @@ int __init acpi_numa_init(void)
>  	}
>  	last_real_pxm = fake_pxm;
>  	fake_pxm++;
> -	acpi_table_parse_cedt(ACPI_CEDT_TYPE_CFMWS, acpi_parse_cfmws,
> -			      &fake_pxm);
> +
> +	/* No need to expand numa nodes if CXL is disabled */
> +	if (IS_ENABLED(CONFIG_CXL_ACPI))
> +		acpi_table_parse_cedt(ACPI_CEDT_TYPE_CFMWS, acpi_parse_cfmws,
> +				      &fake_pxm);
>  
>  	if (cnt < 0)
>  		return cnt;
> 
> base-commit: 084f843093bee5563b179fd4b630122ba820e0c7


