From: Dan Williams <dan.j.williams@intel.com>
To: Dan Williams <dan.j.williams@intel.com>,
Robert Richter <rrichter@amd.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>,
Jonathan Cameron <jonathan.cameron@huawei.com>,
Dave Jiang <dave.jiang@intel.com>,
"Alison Schofield" <alison.schofield@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>,
Ira Weiny <ira.weiny@intel.com>,
Ben Widawsky <bwidawsk@kernel.org>, <linux-cxl@vger.kernel.org>,
<linux-kernel@vger.kernel.org>,
Bjorn Helgaas <bhelgaas@google.com>,
"Terry Bowman" <terry.bowman@amd.com>
Subject: Re: [PATCH v12 01/20] cxl/port: Fix release of RCD endpoints
Date: Fri, 27 Oct 2023 18:39:28 -0700
Message-ID: <653c66507bf8d_244c8f294ea@dwillia2-xfh.jf.intel.com.notmuch>
In-Reply-To: <653c5691a2372_780ef2949b@dwillia2-xfh.jf.intel.com.notmuch>
Dan Williams wrote:
> Robert Richter wrote:
> > Dan,
> [..]
> >
> > delete_endpoint() is called here, but the uport etc. is not unbound.
> > Which means this is not true:
> >
> > if (parent->driver && !endpoint->dead) {
> > ...
> >
> > I don't remember this with my patch. The parent is different there, so
> > that could be the reason.
> >
> > I have not yet been able to look into this in more detail but wanted
> > to let you know. Will continue.
>
> Apologies, I didn't have that regression going, I think I see the issue.
> Thanks for the heads up.
Here is the incremental fix on top of the lifetime fix:
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 6230ddfc0be8..0fe915ec2cc2 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1217,30 +1217,39 @@ static struct device *grandparent(struct device *dev)
 	return NULL;
 }

+static struct device *endpoint_host(struct cxl_port *endpoint)
+{
+	struct cxl_port *port = to_cxl_port(endpoint->dev.parent);
+
+	if (is_cxl_root(port))
+		return port->uport_dev;
+	return &port->dev;
+}
+
 static void delete_endpoint(void *data)
 {
 	struct cxl_memdev *cxlmd = data;
 	struct cxl_port *endpoint = cxlmd->endpoint;
-	struct device *parent = endpoint->dev.parent;
+	struct device *host = endpoint_host(endpoint);

-	device_lock(parent);
-	if (parent->driver && !endpoint->dead) {
-		devm_release_action(parent, cxl_unlink_parent_dport, endpoint);
-		devm_release_action(parent, cxl_unlink_uport, endpoint);
-		devm_release_action(parent, unregister_port, endpoint);
+	device_lock(host);
+	if (host->driver && !endpoint->dead) {
+		devm_release_action(host, cxl_unlink_parent_dport, endpoint);
+		devm_release_action(host, cxl_unlink_uport, endpoint);
+		devm_release_action(host, unregister_port, endpoint);
 	}
 	cxlmd->endpoint = NULL;
-	device_unlock(parent);
+	device_unlock(host);
 	put_device(&endpoint->dev);
-	put_device(parent);
+	put_device(host);
 }

 int cxl_endpoint_autoremove(struct cxl_memdev *cxlmd, struct cxl_port *endpoint)
 {
-	struct device *parent = endpoint->dev.parent;
+	struct device *host = endpoint_host(endpoint);
 	struct device *dev = &cxlmd->dev;

-	get_device(parent);
+	get_device(host);
 	get_device(&endpoint->dev);
 	cxlmd->endpoint = endpoint;
 	cxlmd->depth = endpoint->depth;
---
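For anyone following the host selection in endpoint_host() above, here is
a minimal user-space sketch of the same decision. It is only an
illustration: the struct layouts, the container_of() macro and the
is_root flag are simplified stand-ins assumed for this example, not the
kernel's definitions, and the real is_cxl_root() is not implemented with
a flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumptions for illustration). */
struct device {
	struct device *parent;
};

struct cxl_port {
	struct device dev;		/* embedded device, as in the kernel */
	struct device *uport_dev;	/* device that instantiated this port */
	bool is_root;			/* hypothetical flag replacing is_cxl_root() logic */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct cxl_port *to_cxl_port(struct device *dev)
{
	return container_of(dev, struct cxl_port, dev);
}

static bool is_cxl_root(const struct cxl_port *port)
{
	return port->is_root;
}

/*
 * Same shape as the helper in the diff: when the endpoint hangs directly
 * off the root port, use the root's uport device as the host; otherwise
 * use the parent port's own device.
 */
static struct device *endpoint_host(struct cxl_port *endpoint)
{
	struct cxl_port *port = to_cxl_port(endpoint->dev.parent);

	if (is_cxl_root(port))
		return port->uport_dev;
	return &port->dev;
}
```

With an endpoint whose parent is the root port, endpoint_host() returns
the root's uport device; with an endpoint below a switch port, it returns
the switch port's device. The value of the helper is that
delete_endpoint() and cxl_endpoint_autoremove() now agree on a single
host device for the lock, the driver check and the devm actions.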
...and here is the new regression test so I don't mess up and miss this
again:
diff --git a/cxl/memdev.c b/cxl/memdev.c
index d76a4d86a40a..81dfd4c25b25 100644
--- a/cxl/memdev.c
+++ b/cxl/memdev.c
@@ -752,6 +752,8 @@ static int memdev_action(int argc, const char **argv, struct cxl_ctx *ctx,
 			if (end[0] == 0)
 				continue;
 		} else {
+			unsigned long domain, bus, dev, func;
+
 			if (strcmp(argv[i], "all") == 0) {
 				argc = 1;
 				break;
@@ -760,6 +762,12 @@ static int memdev_action(int argc, const char **argv, struct cxl_ctx *ctx,
 				continue;
 			if (sscanf(argv[i], "%lu", &id) == 1)
 				continue;
+			if (sscanf(argv[i], "%lx:%lx:%lx.%lx", &domain, &bus, &dev, &func))
+				continue;
+			if (sscanf(argv[i], "cxl_mem.%lu", &id))
+				continue;
+			if (sscanf(argv[i], "cxl_rcd.%lu", &id))
+				continue;
 		}

 		log_err(&ml, "'%s' is not a valid memdev %s\n", argv[i],
diff --git a/test/cxl-topology.sh b/test/cxl-topology.sh
index 89d01a89ccb1..0320887a953b 100644
--- a/test/cxl-topology.sh
+++ b/test/cxl-topology.sh
@@ -120,6 +120,13 @@ count=$(jq "map(select(.pmem_size == $pmem_size)) | length" <<< $json)
((bridges == 2 && count == 8 || bridges == 3 && count == 10 ||
bridges == 4 && count == 11)) || err "$LINENO"
+# check rcd endpoints disappear when disabling the memdev
+m=$($CXL list -M -b cxl_test | jq -r ".[].host" | grep rcd)
+ep=$($CXL list -E -m $m | jq -r ".[].endpoint")
+$CXL disable-memdev $m --force
+check=$($CXL list -E -e $ep | jq -r ".[].endpoint")
+[ -z "$check" ] || err "$LINENO"
+$CXL enable-memdev $m
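
A note on the new sscanf() checks in the memdev.c hunk above: sscanf()
returns the number of successful conversions, or EOF if input runs out
before the first conversion, so a bare truth test also accepts partial
matches and EOF. The sketch below shows a stricter variant that compares
against the expected conversion count, in the same way the existing
"%lu" check does. The helper names is_pci_bdf() and is_named_memdev()
are hypothetical and not part of the cxl tool.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: true only if all four domain:bus:dev.func fields parse. */
static bool is_pci_bdf(const char *arg)
{
	unsigned long domain, bus, dev, func;

	return sscanf(arg, "%lx:%lx:%lx.%lx", &domain, &bus, &dev, &func) == 4;
}

/* Hypothetical helper: true for "cxl_mem.<N>" or "cxl_rcd.<N>" host names. */
static bool is_named_memdev(const char *arg)
{
	unsigned long id;

	return sscanf(arg, "cxl_mem.%lu", &id) == 1 ||
	       sscanf(arg, "cxl_rcd.%lu", &id) == 1;
}
```

For example, the string "face" converts one hex field and then stops at
the missing ':', so sscanf() returns 1; a truth test would accept it,
while the == 4 comparison rejects it. An empty argument yields EOF, which
is also nonzero.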