Linux CXL
From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: "Li, Ming" <ming4.li@intel.com>
Cc: <linux-cxl@vger.kernel.org>, <dan.j.williams@intel.com>
Subject: Re: [PATCH 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembing
Date: Thu, 6 Jun 2024 15:09:03 +0100	[thread overview]
Message-ID: <20240606150903.00005730@Huawei.com> (raw)
In-Reply-To: <2e4b2584-f1f3-4e4b-a604-b2db4a28b5de@intel.com>

On Thu, 6 Jun 2024 10:30:01 +0800
"Li, Ming" <ming4.li@intel.com> wrote:

> On 6/5/2024 7:57 PM, Jonathan Cameron wrote:
> > On Fri, 31 May 2024 15:02:29 +0800
> > Li Ming <ming4.li@intel.com> wrote:
> >   
> >> When the CXL subsystem is auto-assembling a pmem region during cxl
> >> endpoint port probing, the calltrace below is always output.
> >>
> >>  BUG: kernel NULL pointer dereference, address: 0000000000000078
> >>  #PF: supervisor read access in kernel mode
> >>  #PF: error_code(0x0000) - not-present page
> >>  RIP: 0010:cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem]
> >>  Call Trace:
> >>   <TASK>
> >>   ? __die+0x24/0x70
> >>   ? page_fault_oops+0x82/0x160
> >>   ? do_user_addr_fault+0x65/0x6b0
> >>   ? exc_page_fault+0x7d/0x170
> >>   ? asm_exc_page_fault+0x26/0x30
> >>   ? cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem]
> >>   ? cxl_pmem_region_probe+0x1ac/0x360 [cxl_pmem]
> >>   cxl_bus_probe+0x1b/0x60 [cxl_core]
> >>   really_probe+0x173/0x410
> >>   ? __pfx___device_attach_driver+0x10/0x10
> >>   __driver_probe_device+0x80/0x170
> >>   driver_probe_device+0x1e/0x90
> >>   __device_attach_driver+0x90/0x120
> >>   bus_for_each_drv+0x84/0xe0
> >>   __device_attach+0xbc/0x1f0
> >>   bus_probe_device+0x90/0xa0
> >>   device_add+0x51c/0x710
> >>   devm_cxl_add_pmem_region+0x1b5/0x380 [cxl_core]
> >>   cxl_bus_probe+0x1b/0x60 [cxl_core]
> >>
> >> Because the cxl_nvd of the memdev is necessary during pmem region
> >> probing, but the cxl_nvd can be registered only after endpoint port
> >> probing done, that is a collision dependency, so adjust the sequence
> >> between cxl_nvd registration and endpoint port registration to guarantee
> >> there is a cxl_nvd in memdev during the pmem region auto-assembling.  
> > 
> > Perhaps call out that the root above a parent port is the same as the root
> > above the endpoint seeing as I think you are starting the search from
> > a different location after this change.  
> 
> Hi Jonathan,
> 
> Thanks for your review.
> What do you think if I change the description as below:
> 
> "Because the cxl_nvd of the memdev is necessary during pmem region
> probing, but the cxl_nvd can be registered only after endpoint port
> probing done, that is a collision dependency, so adjust the sequence
> between cxl_nvd registration and endpoint port registration to
> guarantee there is a cxl_nvd in memdev during the pmem region
> auto-assembling. For that, change cxl_find_nvdimm_bridge() to use a
> port to query the ancestor root port, it helps to find the root port
> of an endpoint by using an endpoint's parent port so that cxl_nvd
> registration can be finished before the endpoint attached to the CXL
> topology"

Perhaps this wording is clearer?

"The cxl_nvd of the memdev needs to be available during pmem region
 probe. Currently the cxl_nvd is registered after the endpoint port
 probe. The endpoint probe, in the case of auto-assembly of regions,
 can cause a pmem region probe requiring the not yet available
 cxl_nvd. Adjust the sequence so this dependency is met.
 This requires adding a port parameter to cxl_find_nvdimm_bridge()
 that can be used to query the ancestor root port. The endpoint
 port is not yet available, but will share a common ancestor with
 its parent, so start the query from there instead."

> 
> 
> > 
> > Other than that looks correct to me.
> > 
> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > 
> >   
> >>
> >> Fixes: f17b558d6663 ("cxl/pmem: Refactor nvdimm device registration, delete the workqueue")
> >> Suggested-by: Dan Williams <dan.j.williams@intel.com>
> >> Signed-off-by: Li Ming <ming4.li@intel.com>
> >> ---
> >>  drivers/cxl/core/pmem.c   | 15 ++++++++++-----
> >>  drivers/cxl/core/region.c |  2 +-
> >>  drivers/cxl/cxl.h         |  4 ++--
> >>  drivers/cxl/mem.c         | 17 +++++++++--------
> >>  4 files changed, 22 insertions(+), 16 deletions(-)
> >>
> >> diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
> >> index e69625a8d6a1..31b398c13be9 100644
> >> --- a/drivers/cxl/core/pmem.c
> >> +++ b/drivers/cxl/core/pmem.c
> >> @@ -62,10 +62,14 @@ static int match_nvdimm_bridge(struct device *dev, void *data)
> >>  	return is_cxl_nvdimm_bridge(dev);
> >>  }
> >>  
> >> -struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_memdev *cxlmd)
> >> +/**
> >> + * cxl_nvdimm_bridge() - find a bridge device relative to a port
> >> + * @port: any descendant port of an nvdimm-bridge associated
> >> + *        root-cxl-port
> >> + */
> >> +struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_port *port)
> >>  {
> >> -	struct cxl_root *cxl_root __free(put_cxl_root) =
> >> -		find_cxl_root(cxlmd->endpoint);
> >> +	struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);  
> > 
> > This is a different port in the now earlier query (not the other
> > path you update). As you say any descendant is fine though.
> > I'd mention this subtle change in the patch description though.
> > (noted above)
> >   
> >>  	struct device *dev;
> >>  
> >>  	if (!cxl_root)
> >> @@ -242,18 +246,19 @@ static void cxlmd_release_nvdimm(void *_cxlmd)
> >>  
> >>  /**
> >>   * devm_cxl_add_nvdimm() - add a bridge between a cxl_memdev and an nvdimm
> >> + * @port: parent port for the (to be added) @cxlmd endpoint port  
> > 
> > Would calling it parent_port make more sense?  
> 
> Yes, will change it, thanks.
> 
> 
> >   
> >>   * @cxlmd: cxl_memdev instance that will perform LIBNVDIMM operations
> >>   *
> >>   * Return: 0 on success negative error code on failure.
> >>   */
> >> -int devm_cxl_add_nvdimm(struct cxl_memdev *cxlmd)
> >> +int devm_cxl_add_nvdimm(struct cxl_port *port, struct cxl_memdev *cxlmd)
> >>  {
> >>  	struct cxl_nvdimm_bridge *cxl_nvb;
> >>  	struct cxl_nvdimm *cxl_nvd;
> >>  	struct device *dev;
> >>  	int rc;
> >>  
> >> -	cxl_nvb = cxl_find_nvdimm_bridge(cxlmd);
> >> +	cxl_nvb = cxl_find_nvdimm_bridge(port);
> >>  	if (!cxl_nvb)
> >>  		return -ENODEV;
> >>  
> >> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> >> index 3c2b6144be23..f0cafc7ffb45 100644
> >> --- a/drivers/cxl/core/region.c
> >> +++ b/drivers/cxl/core/region.c
> >> @@ -2847,7 +2847,7 @@ static int cxl_pmem_region_alloc(struct cxl_region *cxlr)
> >>  		 * bridge for one device is the same for all.
> >>  		 */
> >>  		if (i == 0) {
> >> -			cxl_nvb = cxl_find_nvdimm_bridge(cxlmd);
> >> +			cxl_nvb = cxl_find_nvdimm_bridge(cxlmd->endpoint);
> >>  			if (!cxl_nvb)
> >>  				return -ENODEV;
> >>  			cxlr->cxl_nvb = cxl_nvb;
> >> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> >> index 603c0120cff8..e8fca6c6952b 100644
> >> --- a/drivers/cxl/cxl.h
> >> +++ b/drivers/cxl/cxl.h
> >> @@ -855,8 +855,8 @@ struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
> >>  struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev);
> >>  bool is_cxl_nvdimm(struct device *dev);
> >>  bool is_cxl_nvdimm_bridge(struct device *dev);
> >> -int devm_cxl_add_nvdimm(struct cxl_memdev *cxlmd);
> >> -struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_memdev *cxlmd);
> >> +int devm_cxl_add_nvdimm(struct cxl_port *port, struct cxl_memdev *cxlmd);
> >> +struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_port *port);
> >>  
> >>  #ifdef CONFIG_CXL_REGION
> >>  bool is_cxl_pmem_region(struct device *dev);
> >> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> >> index 0c79d9ce877c..2f1b49bfe162 100644
> >> --- a/drivers/cxl/mem.c
> >> +++ b/drivers/cxl/mem.c
> >> @@ -152,6 +152,15 @@ static int cxl_mem_probe(struct device *dev)
> >>  		return -ENXIO;
> >>  	}
> >>  
> >> +	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) {
> >> +		rc = devm_cxl_add_nvdimm(parent_port, cxlmd);
> >> +		if (rc) {
> >> +			if (rc == -ENODEV)
> >> +				dev_info(dev, "PMEM disabled by platform\n");
> >> +			return rc;
> >> +		}
> >> +	}
> >> +
> >>  	if (dport->rch)
> >>  		endpoint_parent = parent_port->uport_dev;
> >>  	else
> >> @@ -174,14 +183,6 @@ static int cxl_mem_probe(struct device *dev)
> >>  	if (rc)
> >>  		return rc;
> >>  
> >> -	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) {
> >> -		rc = devm_cxl_add_nvdimm(cxlmd);
> >> -		if (rc == -ENODEV)
> >> -			dev_info(dev, "PMEM disabled by platform\n");
> >> -		else
> >> -			return rc;
> >> -	}
> >> -
> >>  	/*
> >>  	 * The kernel may be operating out of CXL memory on this device,
> >>  	 * there is no spec defined way to determine whether this device  
> >   
> 
> 


Thread overview: 8+ messages
2024-05-31  7:02 [PATCH 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembing Li Ming
2024-05-31 13:00 ` kernel test robot
2024-06-05 11:57 ` Jonathan Cameron
2024-06-06  2:30   ` Li, Ming
2024-06-06 14:09     ` Jonathan Cameron [this message]
2024-06-07  1:29       ` Li, Ming
2024-06-05 15:42 ` Alison Schofield
2024-06-06  2:35   ` Li, Ming
