Re: [PATCH v3] ACPI: NUMA: Add a node and memblk for each CFMWS not in SRAT


From: Alison Schofield <alison.schofield@intel.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
	Len Brown <lenb@kernel.org>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Ira Weiny <ira.weiny@intel.com>,
	Ben Widawsky <ben.widawsky@intel.com>,
	linux-cxl@vger.kernel.org,
	Linux ACPI <linux-acpi@vger.kernel.org>
Subject: Re: [PATCH v3] ACPI: NUMA: Add a node and memblk for each CFMWS not in SRAT
Date: Fri, 29 Oct 2021 15:49:14 -0700
Message-ID: <20211029224914.GA500689@alison-desk>
In-Reply-To: <CAPcyv4hdC4Uj8YdePMZGFkxgP10VSkX1tiY+ApPctyjfURPSOg@mail.gmail.com>

On Mon, Oct 25, 2021 at 07:47:32PM -0700, Dan Williams wrote:
> On Mon, Oct 18, 2021 at 10:01 PM <alison.schofield@intel.com> wrote:
> >
> > From: Alison Schofield <alison.schofield@intel.com>
> >
> > During NUMA init, CXL memory defined in the SRAT Memory Affinity
> > subtable may be assigned to a NUMA node. Since there is no
> > requirement that the SRAT be comprehensive for CXL memory, another
> > mechanism is needed to assign NUMA nodes to CXL memory not identified
> > in the SRAT.
> >
> > Use the CXL Fixed Memory Window Structure (CFMWS) of the ACPI CXL
> > Early Discovery Table (CEDT) to find all CXL memory ranges.
> > Create a NUMA node for each CFMWS that is not already assigned to
> > a NUMA node. Add a memblk attaching its host physical address
> > range to the node.
> >
> > Note that these ranges may not actually map any memory at boot time.
> > They may describe persistent capacity or may be present to enable
> > hot-plug.
> >
> > Consumers can use phys_to_target_node() to discover the NUMA node.
> >
> > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> > ---

The next version of this patch is now included in this patchset that
adds helpers for parsing the CEDT subtables:
https://lore.kernel.org/linux-cxl/163553711933.2509508.2203471175679990.stgit@dwillia2-desk3.amr.corp.intel.com/T/#mf40d84e1ad4c01f69f794d591b07774255993185

It addresses Dan's comments below.

>> snip

> > +{
> > +       struct acpi_cedt_cfmws *cfmws;
> > +       acpi_size len, cur = 0;
> > +       int i, node, pxm = 0;
> 
> Shouldn't this be -1, on the idea that the first numa node to assign
> if none are set is zero?
> 
> I don't think the way you have it is a problem in practice because
> SRAT should always be there in a NUMA system. However, the first CFMWS
> pxm should start after the last SRAT entry, or 0 if no SRAT entries.
> 
> > +       void *cedt_subtable;
> > +       u64 start, end;
> > +
> > +       /* Find the max PXM defined in the SRAT */
> > +       for (i = 0; i < MAX_NUMNODES - 1; i++) {
> 
> How about:
> 
>     for (i = 0, pxm = -1; i < MAX_NUMNODES - 1; i++)
> 
> ...just to keep the initialization close to the use, but that's just a
> personal style preference.

Done.
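
For anyone following along, here is a minimal userspace sketch of why
starting the scan at -1 matters. It is not the kernel code: the function
name first_free_pxm() and the plain int array standing in for
node_to_pxm_map[] are made up for illustration, it assumes unset entries
hold -1, and it scans every entry rather than stopping at
MAX_NUMNODES - 1.

```c
#include <stddef.h>

/*
 * Return the first proximity domain (PXM) value that is free to hand
 * out, given a node-to-PXM map in which unset entries are assumed to
 * hold -1.
 *
 * Starting the running maximum at -1 means an empty map yields 0, and
 * a populated map yields "highest PXM seen, plus one".
 */
static int first_free_pxm(const int *node_to_pxm, size_t nr_nodes)
{
	int pxm = -1;	/* not 0: 0 is itself a valid PXM */
	size_t i;

	for (i = 0; i < nr_nodes; i++) {
		if (node_to_pxm[i] > pxm)
			pxm = node_to_pxm[i];
	}

	/* Fake PXM values start right after the highest one in use. */
	return pxm + 1;
}
```

With the running maximum initialized to 0 instead, an empty map would
hand out PXM 1 first and skip 0, which is the case Dan raised.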

> 
> > +               if (node_to_pxm_map[i] > pxm)
> > +                       pxm = node_to_pxm_map[i];
> > +       }
> > +       /* Start assigning fake PXM values after the SRAT max PXM */
> > +       pxm++;
> > +
> > +       len = acpi_cedt->length - sizeof(*acpi_cedt);
> > +       cedt_subtable = acpi_cedt + 1;
> > +
> > +       while (cur < len) {
> 
> Similarly to above I wonder if this would be cleaner as a for loop
> then you could use typical "continue" statements rather than goto. I
> took a stab at creating a for_each_cedt() helper which ended up a
> decent cleanup for drivers/cxl/
> 
>  drivers/cxl/acpi.c |   48 +++++++++++++++---------------------------------
>  1 file changed, 15 insertions(+), 33 deletions(-)
> 
> ...however, I just realized this NUMA code is running at init time, so
> it can just use the acpi_table_parse_entries_array() helper to walk
> the CFMWS like the other subtable walkers in acpi_numa_init(). You
> would need to update the subtable helpers (acpi_get_subtable_type() et
> al) to recognize the CEDT case.
> 
> [ Side note for the implications of acpi_table_parse_entries_array()
> for drivers/cxl/acpi.c ]
> 
> Rafael, both the NFIT driver and now the CXL ACPI driver have open
> coded subtable parsing. Any philosophical reason to keep the subtable
> parsing code as __init? It can still be __init and thrown away if
> those drivers are not build-time enabled.
> 

The updated patch (in the greater patchset) now uses the new helpers.
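
To illustrate only the for-loop-with-continue shape Dan suggested above
(not the helpers in the patchset), here is a rough userspace sketch. The
struct sub_hdr layout, the SUB_TYPE_WINDOW value, and count_windows()
are all hypothetical names chosen for this example; the length field is
read in host byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical subtable header used only for this sketch. */
struct sub_hdr {
	uint8_t type;
	uint8_t reserved;
	uint16_t length;	/* whole subtable, header included */
};

#define SUB_TYPE_WINDOW 1	/* hypothetical type code of interest */

/* Count the subtables of the wanted type found in buf[0..len). */
static int count_windows(const uint8_t *buf, size_t len)
{
	struct sub_hdr hdr;
	size_t cur;
	int count = 0;

	for (cur = 0; cur + sizeof(hdr) <= len; cur += hdr.length) {
		/* Copy out the header to avoid unaligned access. */
		memcpy(&hdr, buf + cur, sizeof(hdr));

		/* Stop on a length that would stall or overrun the walk. */
		if (hdr.length < sizeof(hdr) || hdr.length > len - cur)
			break;

		if (hdr.type != SUB_TYPE_WINDOW)
			continue;	/* loop increment still advances cur */

		count++;
	}

	return count;
}
```

Because the advance lives in the for statement, skipping an entry is a
plain continue and no goto label is needed to reach the increment.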


> snip
> > +               node = acpi_map_pxm_to_node(pxm);
> > +               if (node == NUMA_NO_NODE) {
> > +                       pr_err("ACPI NUMA: Too many proximity domains.\n");
> 
> I would add "while processing CFMWS" to make it clear that the BIOS
> technically did not declare too many PXMs; it was the Linux heuristic
> for opportunistically emulating more PXMs.
> 

Done.

> > snip
> >         }
> > diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> > index 974d497a897d..f837fd715440 100644
> > --- a/include/linux/acpi.h
> > +++ b/include/linux/acpi.h
> > @@ -426,6 +426,7 @@ extern bool acpi_osi_is_win8(void);
> >  #ifdef CONFIG_ACPI_NUMA
> >  int acpi_map_pxm_to_node(int pxm);
> >  int acpi_get_node(acpi_handle handle);
> > +int __init numa_add_memblk(int nodeid, u64 start, u64 end);
> 
> This doesn't belong here.
> 
> There is already a declaration for this in
> arch/x86/include/asm/numa.h. I think what you are missing is that your
> new code needs to be within the same ifdef guards as the other helpers
> in srat.c that call numa_add_memblk(). See the line that has:
> 
> #if defined(CONFIG_X86) || defined(CONFIG_ARM64) || defined(CONFIG_LOONGARCH)
> 
> ...above acpi_numa_slit_init()

Done.
