Linux CXL
From: Gregory Price <gregory.price@memverge.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-cxl@vger.kernel.org, vishal.l.verma@intel.com,
	ira.weiny@intel.com, dave.jiang@intel.com,
	alison.schofield@intel.com, Jonathan.Cameron@huawei.com
Subject: Re: [PATCH] cxl/port: Fix find_cxl_root() for RCDs and simplify it
Date: Wed, 29 Mar 2023 01:22:46 -0400	[thread overview]
Message-ID: <ZCPLJme6t+Vm9wXY@memverge.com> (raw)
In-Reply-To: <168002857715.50647.344876437247313909.stgit@dwillia2-xfh.jf.intel.com>

On Tue, Mar 28, 2023 at 11:36:17AM -0700, Dan Williams wrote:
> The find_cxl_root() helper is used to lookup root decoders and other CXL
> platform topology information for a given endpoint. It turns out that
> for RCDs it has never worked. The result of find_cxl_root(&cxlmd->dev)
> is always NULL for the RCH topology case because it expects to find a
> cxl_port at the host-bridge. RCH topologies only have the root cxl_port
> object with the host-bridge as a dport. While there are no reports of
> this being a problem to date, by inspection region enumeration should
> crash as a result of this problem, and it does in a local unit test for
> this scenario.
> 
> However, observe that ever since:
> 
> commit f17b558d6663 ("cxl/pmem: Refactor nvdimm device registration, delete the workqueue")
> 
> ...all callers of find_cxl_root() occur after the memdev connection to
> the port topology has been established. That means that find_cxl_root()
> can be simplified to a walk of the endpoint port topology to the root.
> Switch to that arrangement which also fixes the RCD bug.
> 
> Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery")
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/pmem.c   |    6 +++---
>  drivers/cxl/core/port.c   |   38 +++++++-------------------------------
>  drivers/cxl/core/region.c |    2 +-
>  drivers/cxl/cxl.h         |    4 ++--
>  drivers/cxl/port.c        |    2 +-
>  5 files changed, 14 insertions(+), 38 deletions(-)
> 

Testing this, and I want to make sure I'm seeing the correct results.
As of right now I'm not seeing any regressions; however, RCD/RCH
combinations don't appear to work as expected when the memory is marked
EFI_MEMORY_SP.

When not marked EFI_MEMORY_SP, we see the topology below, which I
believe is the correct result (mildly trimmed for brevity).

The memory is correctly onlined at boot (single-socket system; the
memory is on node 1):

[user@host0 cxl]# numactl --hardware
available: 2 nodes (0-1)
node 1 cpus:
node 1 size: 128934 MB
node 1 free: 137 MB


When the CXL region is marked EFI_MEMORY_SP in the BIOS, the memory is
not onlined - which is expected - and the topology is the same as
before. However, attempts to create a region to online the memory now
fail:

[root@amd0 ~]# numactl --hardware
available: 1 nodes (0)
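
For context on why the node disappears: EFI_MEMORY_SP ranges are handed
to Linux as "Soft Reserved" rather than "System RAM", so (with
CONFIG_EFI_SOFT_RESERVE) they are left for the CXL/dax drivers to claim
instead of being onlined at boot. A quick check is to grep /proc/iomem;
the excerpt below is a hypothetical sample standing in for the real
file on such a box:

```shell
# Hypothetical /proc/iomem excerpt for a system with an EFI_MEMORY_SP
# CXL range; on a live system run the grep against /proc/iomem itself.
cat <<'EOF' > /tmp/iomem.sample
1050000000-304fffffff : Soft Reserved
  1050000000-304fffffff : CXL Window 0
EOF
# One matching line => the range was soft-reserved, not System RAM
grep -ci "soft reserved" /tmp/iomem.sample    # prints 1
```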

[root@amd0 ~]# /data/ndctl/build/cxl/cxl create-region -t ram -d decoder0.0 -w 1 -g 4096 -m mem0
cxl region: collect_memdevs: no active memdevs found: decoder: decoder0.0 filter: mem0
cxl region: cmd_create_region: created 0 regions

As you can see in the topology below, the memory device is not attached
to a region, which seems to be correct, as there's no intermediate step
between the root complex and the device?

However, if we attempt to disable the memdev, we get a failure:

[root@amd0 ~]# /data/ndctl/build/cxl/cxl disable-memdev mem0
cxl memdev: action_disable: mem0 is part of an active region
cxl memdev: cmd_disable_memdev: disabled 0 mem

So the device becomes unusable in this configuration.
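
The contradiction is also visible from sysfs: the tool presumably
decides "active region" from what the kernel exposes under
/sys/bus/cxl/devices (that's an assumption on my part about ndctl's
check), so when `cxl list` shows no region the two views disagree. A
minimal sketch of that kind of check, using a mock directory in place
of the real sysfs tree (all names hypothetical):

```shell
# Mock of /sys/bus/cxl/devices; on a real system point CXL_SYS at that
# path instead of the mktemp directory.
CXL_SYS=$(mktemp -d)
mkdir -p "$CXL_SYS/mem0" "$CXL_SYS/decoder0.0"   # no regionN objects
# Count regionN entries - zero should mean "nothing active to block on"
n=$(ls "$CXL_SYS" | grep -c '^region')
echo "regions: $n"    # prints "regions: 0", yet disable-memdev still fails
```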



Is it expected that RCDs will fail when set to EFI_MEMORY_SP? If
that's the case, then this (and the other patch) look safe and do not
produce regressions.

Just want to capture this behavior, as it appears there may be other
issues related to RCH/RCD combinations.

~Gregory




CXL topology produced by the ndctl cxl tool (latest version, 76.x):

[user@host0 cxl]# ./cxl list -vvvv
[
  {
    "bus":"root0",
    "provider":"ACPI.CXL",
    "nr_dports":1,
    "dports":[
      {
        "dport":"pci0000:3f",
        "alias":"ACPI0016:00",
        "id":4
      }
    ],
    "endpoints:root0":[
      {
        "endpoint":"endpoint1",
        "host":"mem0",
        "depth":1,
        "memdev":{
          "memdev":"mem0",
          "ram_size":137438953472,
          "numa_node":1,             <------ absent in MEMORY_SP
          "host":"0000:3f:00.0",
          "partition_info":{ ... snip ... }
        },
        "decoders:endpoint1":[
          {
            "decoder":"decoder1.0",
            "resource":0,
            "size":137438953472,
            "interleave_ways":1
          }
        ]
      }
    ],
    "decoders:root0":[
      {
        "decoder":"decoder0.0",
        "resource":70061654016,
        "size":137438953472,
        "interleave_ways":1,
        "max_available_extent":137438953472,
        "volatile_capable":true,
        "nr_targets":1,
        "targets":[
          {
            "target":"pci0000:3f",
            "alias":"ACPI0016:00",
            "position":0,
            "id":4
          }
        ]
      }
    ]
  }
]


Thread overview: 8+ messages
2023-03-28 18:36 [PATCH] cxl/port: Fix find_cxl_root() for RCDs and simplify it Dan Williams
2023-03-29  5:22 ` Gregory Price [this message]
2023-03-29 21:39   ` Dan Williams
2023-03-29 10:27     ` Gregory Price
2023-03-29 22:21       ` Dan Williams
2023-03-29 10:38         ` Gregory Price
2023-03-29 17:36 ` Dave Jiang
2023-03-30 17:35 ` Jonathan Cameron
