* [PATCH 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembing
@ 2024-05-31 7:02 Li Ming
2024-05-31 13:00 ` kernel test robot
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Li Ming @ 2024-05-31 7:02 UTC (permalink / raw)
To: linux-cxl, dan.j.williams; +Cc: Li Ming
When CXL subsystem is auto-assembling a pmem region during cxl
endpoint port probing, always output below calltrace.
BUG: kernel NULL pointer dereference, address: 0000000000000078
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
RIP: 0010:cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem]
Call Trace:
<TASK>
? __die+0x24/0x70
? page_fault_oops+0x82/0x160
? do_user_addr_fault+0x65/0x6b0
? exc_page_fault+0x7d/0x170
? asm_exc_page_fault+0x26/0x30
? cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem]
? cxl_pmem_region_probe+0x1ac/0x360 [cxl_pmem]
cxl_bus_probe+0x1b/0x60 [cxl_core]
really_probe+0x173/0x410
? __pfx___device_attach_driver+0x10/0x10
__driver_probe_device+0x80/0x170
driver_probe_device+0x1e/0x90
__device_attach_driver+0x90/0x120
bus_for_each_drv+0x84/0xe0
__device_attach+0xbc/0x1f0
bus_probe_device+0x90/0xa0
device_add+0x51c/0x710
devm_cxl_add_pmem_region+0x1b5/0x380 [cxl_core]
cxl_bus_probe+0x1b/0x60 [cxl_core]
Because the cxl_nvd of the memdev is necessary during pmem region
probing, but the cxl_nvd can be registered only after endpoint port
probing done, that is a collision dependency, so adjust the sequence
between cxl_nvd registration and endpoint port registration to guarantee
there is a cxl_nvd in memdev during the pmem region auto-assembling.
Fixes: f17b558d6663 ("cxl/pmem: Refactor nvdimm device registration, delete the workqueue")
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Li Ming <ming4.li@intel.com>
---
drivers/cxl/core/pmem.c | 15 ++++++++++-----
drivers/cxl/core/region.c | 2 +-
drivers/cxl/cxl.h | 4 ++--
drivers/cxl/mem.c | 17 +++++++++--------
4 files changed, 22 insertions(+), 16 deletions(-)
diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
index e69625a8d6a1..31b398c13be9 100644
--- a/drivers/cxl/core/pmem.c
+++ b/drivers/cxl/core/pmem.c
@@ -62,10 +62,14 @@ static int match_nvdimm_bridge(struct device *dev, void *data)
return is_cxl_nvdimm_bridge(dev);
}
-struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_memdev *cxlmd)
+/**
+ * cxl_nvdimm_bridge() - find a bridge device relative to a port
+ * @port: any descendant port of an nvdimm-bridge associated
+ * root-cxl-port
+ */
+struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_port *port)
{
- struct cxl_root *cxl_root __free(put_cxl_root) =
- find_cxl_root(cxlmd->endpoint);
+ struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
struct device *dev;
if (!cxl_root)
@@ -242,18 +246,19 @@ static void cxlmd_release_nvdimm(void *_cxlmd)
/**
* devm_cxl_add_nvdimm() - add a bridge between a cxl_memdev and an nvdimm
+ * @port: parent port for the (to be added) @cxlmd endpoint port
* @cxlmd: cxl_memdev instance that will perform LIBNVDIMM operations
*
* Return: 0 on success negative error code on failure.
*/
-int devm_cxl_add_nvdimm(struct cxl_memdev *cxlmd)
+int devm_cxl_add_nvdimm(struct cxl_port *port, struct cxl_memdev *cxlmd)
{
struct cxl_nvdimm_bridge *cxl_nvb;
struct cxl_nvdimm *cxl_nvd;
struct device *dev;
int rc;
- cxl_nvb = cxl_find_nvdimm_bridge(cxlmd);
+ cxl_nvb = cxl_find_nvdimm_bridge(port);
if (!cxl_nvb)
return -ENODEV;
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 3c2b6144be23..f0cafc7ffb45 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2847,7 +2847,7 @@ static int cxl_pmem_region_alloc(struct cxl_region *cxlr)
* bridge for one device is the same for all.
*/
if (i == 0) {
- cxl_nvb = cxl_find_nvdimm_bridge(cxlmd);
+ cxl_nvb = cxl_find_nvdimm_bridge(cxlmd->endpoint);
if (!cxl_nvb)
return -ENODEV;
cxlr->cxl_nvb = cxl_nvb;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 603c0120cff8..e8fca6c6952b 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -855,8 +855,8 @@ struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev);
bool is_cxl_nvdimm(struct device *dev);
bool is_cxl_nvdimm_bridge(struct device *dev);
-int devm_cxl_add_nvdimm(struct cxl_memdev *cxlmd);
-struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_memdev *cxlmd);
+int devm_cxl_add_nvdimm(struct cxl_port *port, struct cxl_memdev *cxlmd);
+struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_port *port);
#ifdef CONFIG_CXL_REGION
bool is_cxl_pmem_region(struct device *dev);
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index 0c79d9ce877c..2f1b49bfe162 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -152,6 +152,15 @@ static int cxl_mem_probe(struct device *dev)
return -ENXIO;
}
+ if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) {
+ rc = devm_cxl_add_nvdimm(parent_port, cxlmd);
+ if (rc) {
+ if (rc == -ENODEV)
+ dev_info(dev, "PMEM disabled by platform\n");
+ return rc;
+ }
+ }
+
if (dport->rch)
endpoint_parent = parent_port->uport_dev;
else
@@ -174,14 +183,6 @@ static int cxl_mem_probe(struct device *dev)
if (rc)
return rc;
- if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) {
- rc = devm_cxl_add_nvdimm(cxlmd);
- if (rc == -ENODEV)
- dev_info(dev, "PMEM disabled by platform\n");
- else
- return rc;
- }
-
/*
* The kernel may be operating out of CXL memory on this device,
* there is no spec defined way to determine whether this device
--
2.34.1
^ permalink raw reply related [flat|nested] 8+ messages in thread* Re: [PATCH 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembing 2024-05-31 7:02 [PATCH 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembing Li Ming @ 2024-05-31 13:00 ` kernel test robot 2024-06-05 11:57 ` Jonathan Cameron 2024-06-05 15:42 ` Alison Schofield 2 siblings, 0 replies; 8+ messages in thread From: kernel test robot @ 2024-05-31 13:00 UTC (permalink / raw) To: Li Ming, linux-cxl, dan.j.williams; +Cc: llvm, oe-kbuild-all, Li Ming Hi Li, kernel test robot noticed the following build warnings: [auto build test WARNING on cxl/next] [also build test WARNING on linus/master v6.10-rc1 next-20240531] [cannot apply to cxl/pending] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Li-Ming/cxl-mem-Fix-no-cxl_nvd-during-pmem-region-auto-assembing/20240531-150446 base: https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git next patch link: https://lore.kernel.org/r/20240531070229.1596811-1-ming4.li%40intel.com patch subject: [PATCH 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembing config: i386-buildonly-randconfig-001-20240531 (https://download.01.org/0day-ci/archive/20240531/202405312008.SFZRRSYU-lkp@intel.com/config) compiler: clang version 18.1.5 (https://github.com/llvm/llvm-project 617a15a9eac96088ae5e9134248d8236e34b91b1) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240531/202405312008.SFZRRSYU-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202405312008.SFZRRSYU-lkp@intel.com/ All warnings (new ones prefixed by >>): >> drivers/cxl/core/pmem.c:71: warning: expecting prototype for cxl_nvdimm_bridge(). Prototype was for cxl_find_nvdimm_bridge() instead vim +71 drivers/cxl/core/pmem.c 2e52b6256b9af2 Dan Williams 2021-09-14 64 0417bd9707cb07 Li Ming 2024-05-31 65 /** 0417bd9707cb07 Li Ming 2024-05-31 66 * cxl_nvdimm_bridge() - find a bridge device relative to a port 0417bd9707cb07 Li Ming 2024-05-31 67 * @port: any descendant port of an nvdimm-bridge associated 0417bd9707cb07 Li Ming 2024-05-31 68 * root-cxl-port 0417bd9707cb07 Li Ming 2024-05-31 69 */ 0417bd9707cb07 Li Ming 2024-05-31 70 struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_port *port) 2e52b6256b9af2 Dan Williams 2021-09-14 @71 { 0417bd9707cb07 Li Ming 2024-05-31 72 struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port); 2e52b6256b9af2 Dan Williams 2021-09-14 73 struct device *dev; 2e52b6256b9af2 Dan Williams 2021-09-14 74 44cd71ef7bacdf Dave Jiang 2024-01-05 75 if (!cxl_root) a46cfc0f011ce7 Dan Williams 2022-01-31 76 return NULL; a46cfc0f011ce7 Dan Williams 2022-01-31 77 66f11890d35a60 Dave Jiang 2024-01-05 78 dev = device_find_child(&cxl_root->port.dev, NULL, match_nvdimm_bridge); a46cfc0f011ce7 Dan Williams 2022-01-31 79 2e52b6256b9af2 Dan Williams 2021-09-14 80 if (!dev) 2e52b6256b9af2 Dan Williams 2021-09-14 81 return NULL; a46cfc0f011ce7 Dan Williams 2022-01-31 82 2e52b6256b9af2 Dan Williams 2021-09-14 83 return to_cxl_nvdimm_bridge(dev); 2e52b6256b9af2 Dan Williams 2021-09-14 84 } affec782742e08 Dan Williams 2021-11-12 85 EXPORT_SYMBOL_NS_GPL(cxl_find_nvdimm_bridge, CXL); 2e52b6256b9af2 Dan Williams 2021-09-14 86 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembing 2024-05-31 7:02 [PATCH 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembing Li Ming 2024-05-31 13:00 ` kernel test robot @ 2024-06-05 11:57 ` Jonathan Cameron 2024-06-06 2:30 ` Li, Ming 2024-06-05 15:42 ` Alison Schofield 2 siblings, 1 reply; 8+ messages in thread From: Jonathan Cameron @ 2024-06-05 11:57 UTC (permalink / raw) To: Li Ming; +Cc: linux-cxl, dan.j.williams On Fri, 31 May 2024 15:02:29 +0800 Li Ming <ming4.li@intel.com> wrote: > When CXL subsystem is auto-assembling a pmem region during cxl > endpoint port probing, always output below calltrace. > > BUG: kernel NULL pointer dereference, address: 0000000000000078 > #PF: supervisor read access in kernel mode > #PF: error_code(0x0000) - not-present page > RIP: 0010:cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem] > Call Trace: > <TASK> > ? __die+0x24/0x70 > ? page_fault_oops+0x82/0x160 > ? do_user_addr_fault+0x65/0x6b0 > ? exc_page_fault+0x7d/0x170 > ? asm_exc_page_fault+0x26/0x30 > ? cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem] > ? cxl_pmem_region_probe+0x1ac/0x360 [cxl_pmem] > cxl_bus_probe+0x1b/0x60 [cxl_core] > really_probe+0x173/0x410 > ? __pfx___device_attach_driver+0x10/0x10 > __driver_probe_device+0x80/0x170 > driver_probe_device+0x1e/0x90 > __device_attach_driver+0x90/0x120 > bus_for_each_drv+0x84/0xe0 > __device_attach+0xbc/0x1f0 > bus_probe_device+0x90/0xa0 > device_add+0x51c/0x710 > devm_cxl_add_pmem_region+0x1b5/0x380 [cxl_core] > cxl_bus_probe+0x1b/0x60 [cxl_core] > > Because the cxl_nvd of the memdev is necessary during pmem region > probing, but the cxl_nvd can be registered only after endpoint port > probing done, that is a collision dependency, so adjust the sequence > between cxl_nvd registration and endpoint port registration to guarantee > there is a cxl_nvd in memdev during the pmem region auto-assembling. Perhaps call out that the root above a parent port is the same as the root above the endpoint seeing as I think you are starting the search from a different location after this change. Other than that looks correct to me. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > Fixes: f17b558d6663 ("cxl/pmem: Refactor nvdimm device registration, delete the workqueue") > Suggested-by: Dan Williams <dan.j.williams@intel.com> > Signed-off-by: Li Ming <ming4.li@intel.com> > --- > drivers/cxl/core/pmem.c | 15 ++++++++++----- > drivers/cxl/core/region.c | 2 +- > drivers/cxl/cxl.h | 4 ++-- > drivers/cxl/mem.c | 17 +++++++++-------- > 4 files changed, 22 insertions(+), 16 deletions(-) > > diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c > index e69625a8d6a1..31b398c13be9 100644 > --- a/drivers/cxl/core/pmem.c > +++ b/drivers/cxl/core/pmem.c > @@ -62,10 +62,14 @@ static int match_nvdimm_bridge(struct device *dev, void *data) > return is_cxl_nvdimm_bridge(dev); > } > > -struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_memdev *cxlmd) > +/** > + * cxl_nvdimm_bridge() - find a bridge device relative to a port > + * @port: any descendant port of an nvdimm-bridge associated > + * root-cxl-port > + */ > +struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_port *port) > { > - struct cxl_root *cxl_root __free(put_cxl_root) = > - find_cxl_root(cxlmd->endpoint); > + struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port); This is a different port in the now earlier query (not the other path you update). As you say any descendant is fine though. I'd mention this subtle change in the patch description though. (noted above) > struct device *dev; > > if (!cxl_root) > @@ -242,18 +246,19 @@ static void cxlmd_release_nvdimm(void *_cxlmd) > > /** > * devm_cxl_add_nvdimm() - add a bridge between a cxl_memdev and an nvdimm > + * @port: parent port for the (to be added) @cxlmd endpoint port Would calling it parent_port make more sense? > * @cxlmd: cxl_memdev instance that will perform LIBNVDIMM operations > * > * Return: 0 on success negative error code on failure. > */ > -int devm_cxl_add_nvdimm(struct cxl_memdev *cxlmd) > +int devm_cxl_add_nvdimm(struct cxl_port *port, struct cxl_memdev *cxlmd) > { > struct cxl_nvdimm_bridge *cxl_nvb; > struct cxl_nvdimm *cxl_nvd; > struct device *dev; > int rc; > > - cxl_nvb = cxl_find_nvdimm_bridge(cxlmd); > + cxl_nvb = cxl_find_nvdimm_bridge(port); > if (!cxl_nvb) > return -ENODEV; > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > index 3c2b6144be23..f0cafc7ffb45 100644 > --- a/drivers/cxl/core/region.c > +++ b/drivers/cxl/core/region.c > @@ -2847,7 +2847,7 @@ static int cxl_pmem_region_alloc(struct cxl_region *cxlr) > * bridge for one device is the same for all. > */ > if (i == 0) { > - cxl_nvb = cxl_find_nvdimm_bridge(cxlmd); > + cxl_nvb = cxl_find_nvdimm_bridge(cxlmd->endpoint); > if (!cxl_nvb) > return -ENODEV; > cxlr->cxl_nvb = cxl_nvb; > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h > index 603c0120cff8..e8fca6c6952b 100644 > --- a/drivers/cxl/cxl.h > +++ b/drivers/cxl/cxl.h > @@ -855,8 +855,8 @@ struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host, > struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev); > bool is_cxl_nvdimm(struct device *dev); > bool is_cxl_nvdimm_bridge(struct device *dev); > -int devm_cxl_add_nvdimm(struct cxl_memdev *cxlmd); > -struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_memdev *cxlmd); > +int devm_cxl_add_nvdimm(struct cxl_port *port, struct cxl_memdev *cxlmd); > +struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_port *port); > > #ifdef CONFIG_CXL_REGION > bool is_cxl_pmem_region(struct device *dev); > diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c > index 0c79d9ce877c..2f1b49bfe162 100644 > --- a/drivers/cxl/mem.c > +++ b/drivers/cxl/mem.c > @@ -152,6 +152,15 @@ static int cxl_mem_probe(struct device *dev) > return -ENXIO; > } > > + if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) { > + rc = devm_cxl_add_nvdimm(parent_port, cxlmd); > + if (rc) { > + if (rc == -ENODEV) > + dev_info(dev, "PMEM disabled by platform\n"); > + return rc; > + } > + } > + > if (dport->rch) > endpoint_parent = parent_port->uport_dev; > else > @@ -174,14 +183,6 @@ static int cxl_mem_probe(struct device *dev) > if (rc) > return rc; > > - if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) { > - rc = devm_cxl_add_nvdimm(cxlmd); > - if (rc == -ENODEV) > - dev_info(dev, "PMEM disabled by platform\n"); > - else > - return rc; > - } > - > /* > * The kernel may be operating out of CXL memory on this device, > * there is no spec defined way to determine whether this device ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembing 2024-06-05 11:57 ` Jonathan Cameron @ 2024-06-06 2:30 ` Li, Ming 2024-06-06 14:09 ` Jonathan Cameron 0 siblings, 1 reply; 8+ messages in thread From: Li, Ming @ 2024-06-06 2:30 UTC (permalink / raw) To: Jonathan Cameron; +Cc: linux-cxl, dan.j.williams On 6/5/2024 7:57 PM, Jonathan Cameron wrote: > On Fri, 31 May 2024 15:02:29 +0800 > Li Ming <ming4.li@intel.com> wrote: > >> When CXL subsystem is auto-assembling a pmem region during cxl >> endpoint port probing, always output below calltrace. >> >> BUG: kernel NULL pointer dereference, address: 0000000000000078 >> #PF: supervisor read access in kernel mode >> #PF: error_code(0x0000) - not-present page >> RIP: 0010:cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem] >> Call Trace: >> <TASK> >> ? __die+0x24/0x70 >> ? page_fault_oops+0x82/0x160 >> ? do_user_addr_fault+0x65/0x6b0 >> ? exc_page_fault+0x7d/0x170 >> ? asm_exc_page_fault+0x26/0x30 >> ? cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem] >> ? cxl_pmem_region_probe+0x1ac/0x360 [cxl_pmem] >> cxl_bus_probe+0x1b/0x60 [cxl_core] >> really_probe+0x173/0x410 >> ? __pfx___device_attach_driver+0x10/0x10 >> __driver_probe_device+0x80/0x170 >> driver_probe_device+0x1e/0x90 >> __device_attach_driver+0x90/0x120 >> bus_for_each_drv+0x84/0xe0 >> __device_attach+0xbc/0x1f0 >> bus_probe_device+0x90/0xa0 >> device_add+0x51c/0x710 >> devm_cxl_add_pmem_region+0x1b5/0x380 [cxl_core] >> cxl_bus_probe+0x1b/0x60 [cxl_core] >> >> Because the cxl_nvd of the memdev is necessary during pmem region >> probing, but the cxl_nvd can be registered only after endpoint port >> probing done, that is a collision dependency, so adjust the sequence >> between cxl_nvd registration and endpoint port registration to guarantee >> there is a cxl_nvd in memdev during the pmem region auto-assembling. > > Perhaps call out that the root above a parent port is the same as the root > above the endpoint seeing as I think you are starting the search from > a different location after this change. Hi Jonathan, Thanks for your review. what do you think if I change the description as below: "Because the cxl_nvd of the memdev is necessary during pmem region probing, but the cxl_nvd can be registered only after endpoint port probing done, that is a collision dependency, so adjust the sequence between cxl_nvd registration and endpoint port registration to guarantee there is a cxl_nvd in memdev during the pmem region auto-assembling. For that, change cxl_find_nvdimm_bridge() to use a port to query the ancestor root port, it helps to find the root port of an endpoint by using an endpoint's parent port so that cxl_nvd registration can be finished before the endpoint attached to the CXL topology" > > Other than that looks correct to me. > > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > >> >> Fixes: f17b558d6663 ("cxl/pmem: Refactor nvdimm device registration, delete the workqueue") >> Suggested-by: Dan Williams <dan.j.williams@intel.com> >> Signed-off-by: Li Ming <ming4.li@intel.com> >> --- >> drivers/cxl/core/pmem.c | 15 ++++++++++----- >> drivers/cxl/core/region.c | 2 +- >> drivers/cxl/cxl.h | 4 ++-- >> drivers/cxl/mem.c | 17 +++++++++-------- >> 4 files changed, 22 insertions(+), 16 deletions(-) >> >> diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c >> index e69625a8d6a1..31b398c13be9 100644 >> --- a/drivers/cxl/core/pmem.c >> +++ b/drivers/cxl/core/pmem.c >> @@ -62,10 +62,14 @@ static int match_nvdimm_bridge(struct device *dev, void *data) >> return is_cxl_nvdimm_bridge(dev); >> } >> >> -struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_memdev *cxlmd) >> +/** >> + * cxl_nvdimm_bridge() - find a bridge device relative to a port >> + * @port: any descendant port of an nvdimm-bridge associated >> + * root-cxl-port >> + */ >> +struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_port *port) >> { >> - struct cxl_root *cxl_root __free(put_cxl_root) = >> - find_cxl_root(cxlmd->endpoint); >> + struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port); > > This is a different port in the now earlier query (not the other path you update). > As you say any descendant is fine though. > I'd mention this subtle change in the patch description though. > (noted above) > >> struct device *dev; >> >> if (!cxl_root) >> @@ -242,18 +246,19 @@ static void cxlmd_release_nvdimm(void *_cxlmd) >> >> /** >> * devm_cxl_add_nvdimm() - add a bridge between a cxl_memdev and an nvdimm >> + * @port: parent port for the (to be added) @cxlmd endpoint port > > Would calling it parent_port make more sense? Yes, will change it, thanks. > >> * @cxlmd: cxl_memdev instance that will perform LIBNVDIMM operations >> * >> * Return: 0 on success negative error code on failure. >> */ >> -int devm_cxl_add_nvdimm(struct cxl_memdev *cxlmd) >> +int devm_cxl_add_nvdimm(struct cxl_port *port, struct cxl_memdev *cxlmd) >> { >> struct cxl_nvdimm_bridge *cxl_nvb; >> struct cxl_nvdimm *cxl_nvd; >> struct device *dev; >> int rc; >> >> - cxl_nvb = cxl_find_nvdimm_bridge(cxlmd); >> + cxl_nvb = cxl_find_nvdimm_bridge(port); >> if (!cxl_nvb) >> return -ENODEV; >> >> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c >> index 3c2b6144be23..f0cafc7ffb45 100644 >> --- a/drivers/cxl/core/region.c >> +++ b/drivers/cxl/core/region.c >> @@ -2847,7 +2847,7 @@ static int cxl_pmem_region_alloc(struct cxl_region *cxlr) >> * bridge for one device is the same for all. >> */ >> if (i == 0) { >> - cxl_nvb = cxl_find_nvdimm_bridge(cxlmd); >> + cxl_nvb = cxl_find_nvdimm_bridge(cxlmd->endpoint); >> if (!cxl_nvb) >> return -ENODEV; >> cxlr->cxl_nvb = cxl_nvb; >> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h >> index 603c0120cff8..e8fca6c6952b 100644 >> --- a/drivers/cxl/cxl.h >> +++ b/drivers/cxl/cxl.h >> @@ -855,8 +855,8 @@ struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host, >> struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev); >> bool is_cxl_nvdimm(struct device *dev); >> bool is_cxl_nvdimm_bridge(struct device *dev); >> -int devm_cxl_add_nvdimm(struct cxl_memdev *cxlmd); >> -struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_memdev *cxlmd); >> +int devm_cxl_add_nvdimm(struct cxl_port *port, struct cxl_memdev *cxlmd); >> +struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_port *port); >> >> #ifdef CONFIG_CXL_REGION >> bool is_cxl_pmem_region(struct device *dev); >> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c >> index 0c79d9ce877c..2f1b49bfe162 100644 >> --- a/drivers/cxl/mem.c >> +++ b/drivers/cxl/mem.c >> @@ -152,6 +152,15 @@ static int cxl_mem_probe(struct device *dev) >> return -ENXIO; >> } >> >> + if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) { >> + rc = devm_cxl_add_nvdimm(parent_port, cxlmd); >> + if (rc) { >> + if (rc == -ENODEV) >> + dev_info(dev, "PMEM disabled by platform\n"); >> + return rc; >> + } >> + } >> + >> if (dport->rch) >> endpoint_parent = parent_port->uport_dev; >> else >> @@ -174,14 +183,6 @@ static int cxl_mem_probe(struct device *dev) >> if (rc) >> return rc; >> >> - if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) { >> - rc = devm_cxl_add_nvdimm(cxlmd); >> - if (rc == -ENODEV) >> - dev_info(dev, "PMEM disabled by platform\n"); >> - else >> - return rc; >> - } >> - >> /* >> * The kernel may be operating out of CXL memory on this device, >> * there is no spec defined way to determine whether this device > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembing 2024-06-06 2:30 ` Li, Ming @ 2024-06-06 14:09 ` Jonathan Cameron 2024-06-07 1:29 ` Li, Ming 0 siblings, 1 reply; 8+ messages in thread From: Jonathan Cameron @ 2024-06-06 14:09 UTC (permalink / raw) To: Li, Ming; +Cc: linux-cxl, dan.j.williams On Thu, 6 Jun 2024 10:30:01 +0800 "Li, Ming" <ming4.li@intel.com> wrote: > On 6/5/2024 7:57 PM, Jonathan Cameron wrote: > > On Fri, 31 May 2024 15:02:29 +0800 > > Li Ming <ming4.li@intel.com> wrote: > > > >> When CXL subsystem is auto-assembling a pmem region during cxl > >> endpoint port probing, always output below calltrace. > >> > >> BUG: kernel NULL pointer dereference, address: 0000000000000078 > >> #PF: supervisor read access in kernel mode > >> #PF: error_code(0x0000) - not-present page > >> RIP: 0010:cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem] > >> Call Trace: > >> <TASK> > >> ? __die+0x24/0x70 > >> ? page_fault_oops+0x82/0x160 > >> ? do_user_addr_fault+0x65/0x6b0 > >> ? exc_page_fault+0x7d/0x170 > >> ? asm_exc_page_fault+0x26/0x30 > >> ? cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem] > >> ? cxl_pmem_region_probe+0x1ac/0x360 [cxl_pmem] > >> cxl_bus_probe+0x1b/0x60 [cxl_core] > >> really_probe+0x173/0x410 > >> ? __pfx___device_attach_driver+0x10/0x10 > >> __driver_probe_device+0x80/0x170 > >> driver_probe_device+0x1e/0x90 > >> __device_attach_driver+0x90/0x120 > >> bus_for_each_drv+0x84/0xe0 > >> __device_attach+0xbc/0x1f0 > >> bus_probe_device+0x90/0xa0 > >> device_add+0x51c/0x710 > >> devm_cxl_add_pmem_region+0x1b5/0x380 [cxl_core] > >> cxl_bus_probe+0x1b/0x60 [cxl_core] > >> > >> Because the cxl_nvd of the memdev is necessary during pmem region > >> probing, but the cxl_nvd can be registered only after endpoint port > >> probing done, that is a collision dependency, so adjust the sequence > >> between cxl_nvd registration and endpoint port registration to guarantee > >> there is a cxl_nvd in memdev during the pmem region auto-assembling. > > > > Perhaps call out that the root above a parent port is the same as the root > > above the endpoint seeing as I think you are starting the search from > > a different location after this change. > > Hi Jonathan, > > Thanks for your review. > what do you think if I change the description as below: > > "Because the cxl_nvd of the memdev is necessary during pmem region > probing, but the cxl_nvd can be registered only after endpoint port > probing done, that is a collision dependency, so adjust the sequence > between cxl_nvd registration and endpoint port registration to > guarantee there is a cxl_nvd in memdev during the pmem region > auto-assembling. For that, change cxl_find_nvdimm_bridge() to use a > port to query the ancestor root port, it helps to find the root port > of an endpoint by using an endpoint's parent port so that cxl_nvd > registration can be finished before the endpoint attached to the CXL > topology" Perhaps this wording is clearer? "The cxl_nvd of the memdev needs to be available during pmem region probe. Currently the cxl_nvd is registered after the end port probe. The end point prove, in the case of autoassembly of regions, can cause a pmem region probe requiring the not yet available cxl_nvd. Adjust the sequence so this dependency is met. This requires adding a port parameter to cxl_find_nvdimm_bridge() that can be used to query the ancestor root port. The endpoint port is not yet available, but will share a common ancestor with it's parent, so start the query from there instead." > > > > > > Other than that looks correct to me. > > > > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > > > > >> > >> Fixes: f17b558d6663 ("cxl/pmem: Refactor nvdimm device > >> registration, delete the workqueue") Suggested-by: Dan Williams > >> <dan.j.williams@intel.com> Signed-off-by: Li Ming > >> <ming4.li@intel.com> --- > >> drivers/cxl/core/pmem.c | 15 ++++++++++----- > >> drivers/cxl/core/region.c | 2 +- > >> drivers/cxl/cxl.h | 4 ++-- > >> drivers/cxl/mem.c | 17 +++++++++-------- > >> 4 files changed, 22 insertions(+), 16 deletions(-) > >> > >> diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c > >> index e69625a8d6a1..31b398c13be9 100644 > >> --- a/drivers/cxl/core/pmem.c > >> +++ b/drivers/cxl/core/pmem.c > >> @@ -62,10 +62,14 @@ static int match_nvdimm_bridge(struct device > >> *dev, void *data) return is_cxl_nvdimm_bridge(dev); > >> } > >> > >> -struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct > >> cxl_memdev *cxlmd) +/** > >> + * cxl_nvdimm_bridge() - find a bridge device relative to a port > >> + * @port: any descendant port of an nvdimm-bridge associated > >> + * root-cxl-port > >> + */ > >> +struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_port > >> *port) { > >> - struct cxl_root *cxl_root __free(put_cxl_root) = > >> - find_cxl_root(cxlmd->endpoint); > >> + struct cxl_root *cxl_root __free(put_cxl_root) = > >> find_cxl_root(port); > > > > This is a different port in the now earlier query (not the other > > path you update). As you say any descendant is fine though. > > I'd mention this subtle change in the patch description though. > > (noted above) > > > >> struct device *dev; > >> > >> if (!cxl_root) > >> @@ -242,18 +246,19 @@ static void cxlmd_release_nvdimm(void > >> *_cxlmd) > >> /** > >> * devm_cxl_add_nvdimm() - add a bridge between a cxl_memdev and > >> an nvdimm > >> + * @port: parent port for the (to be added) @cxlmd endpoint port > > > > Would calling it parent_port make more sense? > > Yes, will change it, thanks. > > > > > >> * @cxlmd: cxl_memdev instance that will perform LIBNVDIMM > >> operations * > >> * Return: 0 on success negative error code on failure. > >> */ > >> -int devm_cxl_add_nvdimm(struct cxl_memdev *cxlmd) > >> +int devm_cxl_add_nvdimm(struct cxl_port *port, struct cxl_memdev > >> *cxlmd) { > >> struct cxl_nvdimm_bridge *cxl_nvb; > >> struct cxl_nvdimm *cxl_nvd; > >> struct device *dev; > >> int rc; > >> > >> - cxl_nvb = cxl_find_nvdimm_bridge(cxlmd); > >> + cxl_nvb = cxl_find_nvdimm_bridge(port); > >> if (!cxl_nvb) > >> return -ENODEV; > >> > >> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > >> index 3c2b6144be23..f0cafc7ffb45 100644 > >> --- a/drivers/cxl/core/region.c > >> +++ b/drivers/cxl/core/region.c > >> @@ -2847,7 +2847,7 @@ static int cxl_pmem_region_alloc(struct > >> cxl_region *cxlr) > >> * bridge for one device is the same for all. > >> */ > >> if (i == 0) { > >> - cxl_nvb = cxl_find_nvdimm_bridge(cxlmd); > >> + cxl_nvb = > >> cxl_find_nvdimm_bridge(cxlmd->endpoint); if (!cxl_nvb) > >> return -ENODEV; > >> cxlr->cxl_nvb = cxl_nvb; > >> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h > >> index 603c0120cff8..e8fca6c6952b 100644 > >> --- a/drivers/cxl/cxl.h > >> +++ b/drivers/cxl/cxl.h > >> @@ -855,8 +855,8 @@ struct cxl_nvdimm_bridge > >> *devm_cxl_add_nvdimm_bridge(struct device *host, struct cxl_nvdimm > >> *to_cxl_nvdimm(struct device *dev); bool is_cxl_nvdimm(struct > >> device *dev); bool is_cxl_nvdimm_bridge(struct device *dev); > >> -int devm_cxl_add_nvdimm(struct cxl_memdev *cxlmd); > >> -struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct > >> cxl_memdev *cxlmd); +int devm_cxl_add_nvdimm(struct cxl_port > >> *port, struct cxl_memdev *cxlmd); +struct cxl_nvdimm_bridge > >> *cxl_find_nvdimm_bridge(struct cxl_port *port); > >> #ifdef CONFIG_CXL_REGION > >> bool is_cxl_pmem_region(struct device *dev); > >> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c > >> index 0c79d9ce877c..2f1b49bfe162 100644 > >> --- a/drivers/cxl/mem.c > >> +++ b/drivers/cxl/mem.c > >> @@ -152,6 +152,15 @@ static int cxl_mem_probe(struct device *dev) > >> return -ENXIO; > >> } > >> > >> + if (resource_size(&cxlds->pmem_res) && > >> IS_ENABLED(CONFIG_CXL_PMEM)) { > >> + rc = devm_cxl_add_nvdimm(parent_port, cxlmd); > >> + if (rc) { > >> + if (rc == -ENODEV) > >> + dev_info(dev, "PMEM disabled by > >> platform\n"); > >> + return rc; > >> + } > >> + } > >> + > >> if (dport->rch) > >> endpoint_parent = parent_port->uport_dev; > >> else > >> @@ -174,14 +183,6 @@ static int cxl_mem_probe(struct device *dev) > >> if (rc) > >> return rc; > >> > >> - if (resource_size(&cxlds->pmem_res) && > >> IS_ENABLED(CONFIG_CXL_PMEM)) { > >> - rc = devm_cxl_add_nvdimm(cxlmd); > >> - if (rc == -ENODEV) > >> - dev_info(dev, "PMEM disabled by > >> platform\n"); > >> - else > >> - return rc; > >> - } > >> - > >> /* > >> * The kernel may be operating out of CXL memory on this > >> device, > >> * there is no spec defined way to determine whether this > >> device > > > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembing 2024-06-06 14:09 ` Jonathan Cameron @ 2024-06-07 1:29 ` Li, Ming 0 siblings, 0 replies; 8+ messages in thread From: Li, Ming @ 2024-06-07 1:29 UTC (permalink / raw) To: Jonathan Cameron; +Cc: linux-cxl, dan.j.williams On 6/6/2024 10:09 PM, Jonathan Cameron wrote: > On Thu, 6 Jun 2024 10:30:01 +0800 > "Li, Ming" <ming4.li@intel.com> wrote: > >> On 6/5/2024 7:57 PM, Jonathan Cameron wrote: >>> On Fri, 31 May 2024 15:02:29 +0800 >>> Li Ming <ming4.li@intel.com> wrote: >>> >>>> When CXL subsystem is auto-assembling a pmem region during cxl >>>> endpoint port probing, always output below calltrace. >>>> >>>> BUG: kernel NULL pointer dereference, address: 0000000000000078 >>>> #PF: supervisor read access in kernel mode >>>> #PF: error_code(0x0000) - not-present page >>>> RIP: 0010:cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem] >>>> Call Trace: >>>> <TASK> >>>> ? __die+0x24/0x70 >>>> ? page_fault_oops+0x82/0x160 >>>> ? do_user_addr_fault+0x65/0x6b0 >>>> ? exc_page_fault+0x7d/0x170 >>>> ? asm_exc_page_fault+0x26/0x30 >>>> ? cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem] >>>> ? cxl_pmem_region_probe+0x1ac/0x360 [cxl_pmem] >>>> cxl_bus_probe+0x1b/0x60 [cxl_core] >>>> really_probe+0x173/0x410 >>>> ? __pfx___device_attach_driver+0x10/0x10 >>>> __driver_probe_device+0x80/0x170 >>>> driver_probe_device+0x1e/0x90 >>>> __device_attach_driver+0x90/0x120 >>>> bus_for_each_drv+0x84/0xe0 >>>> __device_attach+0xbc/0x1f0 >>>> bus_probe_device+0x90/0xa0 >>>> device_add+0x51c/0x710 >>>> devm_cxl_add_pmem_region+0x1b5/0x380 [cxl_core] >>>> cxl_bus_probe+0x1b/0x60 [cxl_core] >>>> >>>> Because the cxl_nvd of the memdev is necessary during pmem region >>>> probing, but the cxl_nvd can be registered only after endpoint port >>>> probing done, that is a collision dependency, so adjust the sequence >>>> between cxl_nvd registration and endpoint port registration to guarantee >>>> there is a cxl_nvd in memdev during the pmem region auto-assembling. >>> >>> Perhaps call out that the root above a parent port is the same as the root >>> above the endpoint seeing as I think you are starting the search from >>> a different location after this change. >> >> Hi Jonathan, >> >> Thanks for your review. >> what do you think if I change the description as below: >> >> "Because the cxl_nvd of the memdev is necessary during pmem region >> probing, but the cxl_nvd can be registered only after endpoint port >> probing done, that is a collision dependency, so adjust the sequence >> between cxl_nvd registration and endpoint port registration to >> guarantee there is a cxl_nvd in memdev during the pmem region >> auto-assembling. For that, change cxl_find_nvdimm_bridge() to use a >> port to query the ancestor root port, it helps to find the root port >> of an endpoint by using an endpoint's parent port so that cxl_nvd >> registration can be finished before the endpoint attached to the CXL >> topology" > > Perhaps this wording is clearer? > > "The cxl_nvd of the memdev needs to be available during pmem region > probe. Currently the cxl_nvd is registered after the end port > probe. The end point prove, in the case of autoassembly of regions, > can cause a pmem region probe requiring the not yet available > cxl_nvd. Adjust the sequence so this dependency is met. > This requires adding a port parameter to cxl_find_nvdimm_bridge() > that can be used to query the ancestor root port. The endpoint > port is not yet available, but will share a common ancestor with > it's parent, so start the query from there instead." > Sure, will change it in next version, thanks. >> >> >>> >>> Other than that looks correct to me. >>> >>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> >>> >>> >>>> >>>> Fixes: f17b558d6663 ("cxl/pmem: Refactor nvdimm device >>>> registration, delete the workqueue") Suggested-by: Dan Williams >>>> <dan.j.williams@intel.com> Signed-off-by: Li Ming >>>> <ming4.li@intel.com> --- >>>> drivers/cxl/core/pmem.c | 15 ++++++++++----- >>>> drivers/cxl/core/region.c | 2 +- >>>> drivers/cxl/cxl.h | 4 ++-- >>>> drivers/cxl/mem.c | 17 +++++++++-------- >>>> 4 files changed, 22 insertions(+), 16 deletions(-) >>>> >>>> diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c >>>> index e69625a8d6a1..31b398c13be9 100644 >>>> --- a/drivers/cxl/core/pmem.c >>>> +++ b/drivers/cxl/core/pmem.c >>>> @@ -62,10 +62,14 @@ static int match_nvdimm_bridge(struct device >>>> *dev, void *data) return is_cxl_nvdimm_bridge(dev); >>>> } >>>> >>>> -struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct >>>> cxl_memdev *cxlmd) +/** >>>> + * cxl_nvdimm_bridge() - find a bridge device relative to a port >>>> + * @port: any descendant port of an nvdimm-bridge associated >>>> + * root-cxl-port >>>> + */ >>>> +struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_port >>>> *port) { >>>> - struct cxl_root *cxl_root __free(put_cxl_root) = >>>> - find_cxl_root(cxlmd->endpoint); >>>> + struct cxl_root *cxl_root __free(put_cxl_root) = >>>> find_cxl_root(port); >>> >>> This is a different port in the now earlier query (not the other >>> path you update). As you say any descendant is fine though. >>> I'd mention this subtle change in the patch description though. >>> (noted above) >>> >>>> struct device *dev; >>>> >>>> if (!cxl_root) >>>> @@ -242,18 +246,19 @@ static void cxlmd_release_nvdimm(void >>>> *_cxlmd) >>>> /** >>>> * devm_cxl_add_nvdimm() - add a bridge between a cxl_memdev and >>>> an nvdimm >>>> + * @port: parent port for the (to be added) @cxlmd endpoint port >>> >>> Would calling it parent_port make more sense? >> >> Yes, will change it, thanks. >> >> >>> >>>> * @cxlmd: cxl_memdev instance that will perform LIBNVDIMM >>>> operations * >>>> * Return: 0 on success negative error code on failure. >>>> */ >>>> -int devm_cxl_add_nvdimm(struct cxl_memdev *cxlmd) >>>> +int devm_cxl_add_nvdimm(struct cxl_port *port, struct cxl_memdev >>>> *cxlmd) { >>>> struct cxl_nvdimm_bridge *cxl_nvb; >>>> struct cxl_nvdimm *cxl_nvd; >>>> struct device *dev; >>>> int rc; >>>> >>>> - cxl_nvb = cxl_find_nvdimm_bridge(cxlmd); >>>> + cxl_nvb = cxl_find_nvdimm_bridge(port); >>>> if (!cxl_nvb) >>>> return -ENODEV; >>>> >>>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c >>>> index 3c2b6144be23..f0cafc7ffb45 100644 >>>> --- a/drivers/cxl/core/region.c >>>> +++ b/drivers/cxl/core/region.c >>>> @@ -2847,7 +2847,7 @@ static int cxl_pmem_region_alloc(struct >>>> cxl_region *cxlr) >>>> * bridge for one device is the same for all. >>>> */ >>>> if (i == 0) { >>>> - cxl_nvb = cxl_find_nvdimm_bridge(cxlmd); >>>> + cxl_nvb = >>>> cxl_find_nvdimm_bridge(cxlmd->endpoint); if (!cxl_nvb) >>>> return -ENODEV; >>>> cxlr->cxl_nvb = cxl_nvb; >>>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h >>>> index 603c0120cff8..e8fca6c6952b 100644 >>>> --- a/drivers/cxl/cxl.h >>>> +++ b/drivers/cxl/cxl.h >>>> @@ -855,8 +855,8 @@ struct cxl_nvdimm_bridge >>>> *devm_cxl_add_nvdimm_bridge(struct device *host, struct cxl_nvdimm >>>> *to_cxl_nvdimm(struct device *dev); bool is_cxl_nvdimm(struct >>>> device *dev); bool is_cxl_nvdimm_bridge(struct device *dev); >>>> -int devm_cxl_add_nvdimm(struct cxl_memdev *cxlmd); >>>> -struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct >>>> cxl_memdev *cxlmd); +int devm_cxl_add_nvdimm(struct cxl_port >>>> *port, struct cxl_memdev *cxlmd); +struct cxl_nvdimm_bridge >>>> *cxl_find_nvdimm_bridge(struct cxl_port *port); >>>> #ifdef CONFIG_CXL_REGION >>>> bool is_cxl_pmem_region(struct device *dev); >>>> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c >>>> index 0c79d9ce877c..2f1b49bfe162 100644 >>>> --- a/drivers/cxl/mem.c >>>> +++ b/drivers/cxl/mem.c >>>> @@ -152,6 +152,15 @@ static int cxl_mem_probe(struct device *dev) >>>> return -ENXIO; >>>> } >>>> >>>> + if (resource_size(&cxlds->pmem_res) && >>>> IS_ENABLED(CONFIG_CXL_PMEM)) { >>>> + rc = devm_cxl_add_nvdimm(parent_port, cxlmd); >>>> + if (rc) { >>>> + if (rc == -ENODEV) >>>> + dev_info(dev, "PMEM disabled by >>>> platform\n"); >>>> + return rc; >>>> + } >>>> + } >>>> + >>>> if (dport->rch) >>>> endpoint_parent = parent_port->uport_dev; >>>> else >>>> @@ -174,14 +183,6 @@ static int cxl_mem_probe(struct device *dev) >>>> if (rc) >>>> return rc; >>>> >>>> - if (resource_size(&cxlds->pmem_res) && >>>> IS_ENABLED(CONFIG_CXL_PMEM)) { >>>> - rc = devm_cxl_add_nvdimm(cxlmd); >>>> - if (rc == -ENODEV) >>>> - dev_info(dev, "PMEM disabled by >>>> platform\n"); >>>> - else >>>> - return rc; >>>> - } >>>> - >>>> /* >>>> * The kernel may be operating out of CXL memory on this >>>> device, >>>> * there is no spec defined way to determine whether this >>>> device >>> >> >> > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembing 2024-05-31 7:02 [PATCH 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembing Li Ming 2024-05-31 13:00 ` kernel test robot 2024-06-05 11:57 ` Jonathan Cameron @ 2024-06-05 15:42 ` Alison Schofield 2024-06-06 2:35 ` Li, Ming 2 siblings, 1 reply; 8+ messages in thread From: Alison Schofield @ 2024-06-05 15:42 UTC (permalink / raw) To: Li Ming; +Cc: linux-cxl, dan.j.williams On Fri, May 31, 2024 at 03:02:29PM +0800, Li Ming wrote: > When CXL subsystem is auto-assembling a pmem region during cxl > endpoint port probing, always output below calltrace. > > BUG: kernel NULL pointer dereference, address: 0000000000000078 > #PF: supervisor read access in kernel mode > #PF: error_code(0x0000) - not-present page > RIP: 0010:cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem] > Call Trace: > <TASK> > ? __die+0x24/0x70 > ? page_fault_oops+0x82/0x160 > ? do_user_addr_fault+0x65/0x6b0 > ? exc_page_fault+0x7d/0x170 > ? asm_exc_page_fault+0x26/0x30 > ? cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem] > ? cxl_pmem_region_probe+0x1ac/0x360 [cxl_pmem] > cxl_bus_probe+0x1b/0x60 [cxl_core] > really_probe+0x173/0x410 > ? __pfx___device_attach_driver+0x10/0x10 > __driver_probe_device+0x80/0x170 > driver_probe_device+0x1e/0x90 > __device_attach_driver+0x90/0x120 > bus_for_each_drv+0x84/0xe0 > __device_attach+0xbc/0x1f0 > bus_probe_device+0x90/0xa0 > device_add+0x51c/0x710 > devm_cxl_add_pmem_region+0x1b5/0x380 [cxl_core] > cxl_bus_probe+0x1b/0x60 [cxl_core] > > Because the cxl_nvd of the memdev is necessary during pmem region > probing, but the cxl_nvd can be registered only after endpoint port > probing done, that is a collision dependency, so adjust the sequence > between cxl_nvd registration and endpoint port registration to guarantee > there is a cxl_nvd in memdev during the pmem region auto-assembling. > > Fixes: f17b558d6663 ("cxl/pmem: Refactor nvdimm device registration, delete the workqueue") > Suggested-by: Dan Williams <dan.j.williams@intel.com> > Signed-off-by: Li Ming <ming4.li@intel.com> I have a WIP cxl unit test for this case - auto assembly of pmem regions. It's far enough along that I'll offer the tested by tag, but not far enough along to post for upstream review. Tested-by: Alison Schofield <alison.schofield@intel.com> > --- > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembing 2024-06-05 15:42 ` Alison Schofield @ 2024-06-06 2:35 ` Li, Ming 0 siblings, 0 replies; 8+ messages in thread From: Li, Ming @ 2024-06-06 2:35 UTC (permalink / raw) To: Alison Schofield; +Cc: linux-cxl, dan.j.williams On 6/5/2024 11:42 PM, Alison Schofield wrote: > On Fri, May 31, 2024 at 03:02:29PM +0800, Li Ming wrote: >> When CXL subsystem is auto-assembling a pmem region during cxl >> endpoint port probing, always output below calltrace. >> >> BUG: kernel NULL pointer dereference, address: 0000000000000078 >> #PF: supervisor read access in kernel mode >> #PF: error_code(0x0000) - not-present page >> RIP: 0010:cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem] >> Call Trace: >> <TASK> >> ? __die+0x24/0x70 >> ? page_fault_oops+0x82/0x160 >> ? do_user_addr_fault+0x65/0x6b0 >> ? exc_page_fault+0x7d/0x170 >> ? asm_exc_page_fault+0x26/0x30 >> ? cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem] >> ? cxl_pmem_region_probe+0x1ac/0x360 [cxl_pmem] >> cxl_bus_probe+0x1b/0x60 [cxl_core] >> really_probe+0x173/0x410 >> ? __pfx___device_attach_driver+0x10/0x10 >> __driver_probe_device+0x80/0x170 >> driver_probe_device+0x1e/0x90 >> __device_attach_driver+0x90/0x120 >> bus_for_each_drv+0x84/0xe0 >> __device_attach+0xbc/0x1f0 >> bus_probe_device+0x90/0xa0 >> device_add+0x51c/0x710 >> devm_cxl_add_pmem_region+0x1b5/0x380 [cxl_core] >> cxl_bus_probe+0x1b/0x60 [cxl_core] >> >> Because the cxl_nvd of the memdev is necessary during pmem region >> probing, but the cxl_nvd can be registered only after endpoint port >> probing done, that is a collision dependency, so adjust the sequence >> between cxl_nvd registration and endpoint port registration to guarantee >> there is a cxl_nvd in memdev during the pmem region auto-assembling. >> >> Fixes: f17b558d6663 ("cxl/pmem: Refactor nvdimm device registration, delete the workqueue") >> Suggested-by: Dan Williams <dan.j.williams@intel.com> >> Signed-off-by: Li Ming <ming4.li@intel.com> > > > I have a WIP cxl unit test for this case - auto assembly of pmem > regions. It's far enough along that I'll offer the tested by tag, > but not far enough along to post for upstream review. > > Tested-by: Alison Schofield <alison.schofield@intel.com> Hi Alison, Thanks for your test. > >> --- >> >> ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2024-06-07 1:29 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-05-31 7:02 [PATCH 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembing Li Ming 2024-05-31 13:00 ` kernel test robot 2024-06-05 11:57 ` Jonathan Cameron 2024-06-06 2:30 ` Li, Ming 2024-06-06 14:09 ` Jonathan Cameron 2024-06-07 1:29 ` Li, Ming 2024-06-05 15:42 ` Alison Schofield 2024-06-06 2:35 ` Li, Ming
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox