* [PATCH v3 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembing
@ 2024-06-12 6:44 Li Ming
2024-06-18 2:49 ` Alison Schofield
0 siblings, 1 reply; 2+ messages in thread
From: Li Ming @ 2024-06-12 6:44 UTC (permalink / raw)
To: linux-cxl; +Cc: Li Ming, Dan Williams, Alison Schofield, Jonathan Cameron
When CXL subsystem is auto-assembling a pmem region during cxl
endpoint port probing, always hit below calltrace.
BUG: kernel NULL pointer dereference, address: 0000000000000078
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
RIP: 0010:cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem]
Call Trace:
<TASK>
? __die+0x24/0x70
? page_fault_oops+0x82/0x160
? do_user_addr_fault+0x65/0x6b0
? exc_page_fault+0x7d/0x170
? asm_exc_page_fault+0x26/0x30
? cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem]
? cxl_pmem_region_probe+0x1ac/0x360 [cxl_pmem]
cxl_bus_probe+0x1b/0x60 [cxl_core]
really_probe+0x173/0x410
? __pfx___device_attach_driver+0x10/0x10
__driver_probe_device+0x80/0x170
driver_probe_device+0x1e/0x90
__device_attach_driver+0x90/0x120
bus_for_each_drv+0x84/0xe0
__device_attach+0xbc/0x1f0
bus_probe_device+0x90/0xa0
device_add+0x51c/0x710
devm_cxl_add_pmem_region+0x1b5/0x380 [cxl_core]
cxl_bus_probe+0x1b/0x60 [cxl_core]
The cxl_nvd of the memdev needs to be available during pmem region
probe. Currently the cxl_nvd is registered after the end port probe. The
end point probe, in the case of autoassembly of regions, can cause a
pmem region probe requiring the not yet available cxl_nvd. Adjust the
sequence so this dependency is met.
This requires adding a port parameter to cxl_find_nvdimm_bridge() that
can be used to query the ancestor root port. The endpoint port is not
yet available, but will share a common ancestor with it's parent, so
start the query from there instead.
Fixes: f17b558d6663 ("cxl/pmem: Refactor nvdimm device registration, delete the workqueue")
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Li Ming <ming4.li@intel.com>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
v3:
- rename the first parameter of devm_cxl_add_nvdimm() from port to
parent_port.(Jonathan)
- adjust changelog to be clearer.(Jonathan)
- add missing "Signed-off-by" and "Co-developed-by" for Dan.
v2:
- fix wrong function name in the description of
cxl_find_nvdimm_bridge().(lkp)
drivers/cxl/core/pmem.c | 16 +++++++++++-----
drivers/cxl/core/region.c | 2 +-
drivers/cxl/cxl.h | 4 ++--
drivers/cxl/mem.c | 17 +++++++++--------
4 files changed, 23 insertions(+), 16 deletions(-)
diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
index e69625a8d6a1..c00f3a933164 100644
--- a/drivers/cxl/core/pmem.c
+++ b/drivers/cxl/core/pmem.c
@@ -62,10 +62,14 @@ static int match_nvdimm_bridge(struct device *dev, void *data)
return is_cxl_nvdimm_bridge(dev);
}
-struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_memdev *cxlmd)
+/**
+ * cxl_find_nvdimm_bridge() - find a bridge device relative to a port
+ * @port: any descendant port of an nvdimm-bridge associated
+ * root-cxl-port
+ */
+struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_port *port)
{
- struct cxl_root *cxl_root __free(put_cxl_root) =
- find_cxl_root(cxlmd->endpoint);
+ struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
struct device *dev;
if (!cxl_root)
@@ -242,18 +246,20 @@ static void cxlmd_release_nvdimm(void *_cxlmd)
/**
* devm_cxl_add_nvdimm() - add a bridge between a cxl_memdev and an nvdimm
+ * @parent_port: parent port for the (to be added) @cxlmd endpoint port
* @cxlmd: cxl_memdev instance that will perform LIBNVDIMM operations
*
* Return: 0 on success negative error code on failure.
*/
-int devm_cxl_add_nvdimm(struct cxl_memdev *cxlmd)
+int devm_cxl_add_nvdimm(struct cxl_port *parent_port,
+ struct cxl_memdev *cxlmd)
{
struct cxl_nvdimm_bridge *cxl_nvb;
struct cxl_nvdimm *cxl_nvd;
struct device *dev;
int rc;
- cxl_nvb = cxl_find_nvdimm_bridge(cxlmd);
+ cxl_nvb = cxl_find_nvdimm_bridge(parent_port);
if (!cxl_nvb)
return -ENODEV;
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 3c2b6144be23..f0cafc7ffb45 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2847,7 +2847,7 @@ static int cxl_pmem_region_alloc(struct cxl_region *cxlr)
* bridge for one device is the same for all.
*/
if (i == 0) {
- cxl_nvb = cxl_find_nvdimm_bridge(cxlmd);
+ cxl_nvb = cxl_find_nvdimm_bridge(cxlmd->endpoint);
if (!cxl_nvb)
return -ENODEV;
cxlr->cxl_nvb = cxl_nvb;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 603c0120cff8..42928926e0b2 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -855,8 +855,8 @@ struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev);
bool is_cxl_nvdimm(struct device *dev);
bool is_cxl_nvdimm_bridge(struct device *dev);
-int devm_cxl_add_nvdimm(struct cxl_memdev *cxlmd);
-struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_memdev *cxlmd);
+int devm_cxl_add_nvdimm(struct cxl_port *parent_port, struct cxl_memdev *cxlmd);
+struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_port *port);
#ifdef CONFIG_CXL_REGION
bool is_cxl_pmem_region(struct device *dev);
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index 0c79d9ce877c..2f1b49bfe162 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -152,6 +152,15 @@ static int cxl_mem_probe(struct device *dev)
return -ENXIO;
}
+ if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) {
+ rc = devm_cxl_add_nvdimm(parent_port, cxlmd);
+ if (rc) {
+ if (rc == -ENODEV)
+ dev_info(dev, "PMEM disabled by platform\n");
+ return rc;
+ }
+ }
+
if (dport->rch)
endpoint_parent = parent_port->uport_dev;
else
@@ -174,14 +183,6 @@ static int cxl_mem_probe(struct device *dev)
if (rc)
return rc;
- if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) {
- rc = devm_cxl_add_nvdimm(cxlmd);
- if (rc == -ENODEV)
- dev_info(dev, "PMEM disabled by platform\n");
- else
- return rc;
- }
-
/*
* The kernel may be operating out of CXL memory on this device,
* there is no spec defined way to determine whether this device
--
base-commit: 49ba7b515c4c0719b866d16f068e62d16a8a3dd1
2.34.1
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [PATCH v3 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembing
2024-06-12 6:44 [PATCH v3 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembing Li Ming
@ 2024-06-18 2:49 ` Alison Schofield
0 siblings, 0 replies; 2+ messages in thread
From: Alison Schofield @ 2024-06-18 2:49 UTC (permalink / raw)
To: Li Ming; +Cc: linux-cxl, Dan Williams, Jonathan Cameron
On Wed, Jun 12, 2024 at 02:44:23PM +0800, Li Ming wrote:
> When CXL subsystem is auto-assembling a pmem region during cxl
> endpoint port probing, always hit below calltrace.
>
> BUG: kernel NULL pointer dereference, address: 0000000000000078
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> RIP: 0010:cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem]
> Call Trace:
> <TASK>
> ? __die+0x24/0x70
> ? page_fault_oops+0x82/0x160
> ? do_user_addr_fault+0x65/0x6b0
> ? exc_page_fault+0x7d/0x170
> ? asm_exc_page_fault+0x26/0x30
> ? cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem]
> ? cxl_pmem_region_probe+0x1ac/0x360 [cxl_pmem]
> cxl_bus_probe+0x1b/0x60 [cxl_core]
> really_probe+0x173/0x410
> ? __pfx___device_attach_driver+0x10/0x10
> __driver_probe_device+0x80/0x170
> driver_probe_device+0x1e/0x90
> __device_attach_driver+0x90/0x120
> bus_for_each_drv+0x84/0xe0
> __device_attach+0xbc/0x1f0
> bus_probe_device+0x90/0xa0
> device_add+0x51c/0x710
> devm_cxl_add_pmem_region+0x1b5/0x380 [cxl_core]
> cxl_bus_probe+0x1b/0x60 [cxl_core]
>
> The cxl_nvd of the memdev needs to be available during pmem region
> probe. Currently the cxl_nvd is registered after the end port probe. The
> end point probe, in the case of autoassembly of regions, can cause a
> pmem region probe requiring the not yet available cxl_nvd. Adjust the
> sequence so this dependency is met.
> This requires adding a port parameter to cxl_find_nvdimm_bridge() that
> can be used to query the ancestor root port. The endpoint port is not
> yet available, but will share a common ancestor with it's parent, so
> start the query from there instead.
>
> Fixes: f17b558d6663 ("cxl/pmem: Refactor nvdimm device registration, delete the workqueue")
> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Li Ming <ming4.li@intel.com>
> Tested-by: Alison Schofield <alison.schofield@intel.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
DaveJ please clean-up commit msg misspelling when applying.
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2024-06-18 2:49 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-12 6:44 [PATCH v3 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembing Li Ming
2024-06-18 2:49 ` Alison Schofield
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox