From: "Li, Ming" <ming4.li@intel.com>
To: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Cc: <linux-cxl@vger.kernel.org>, <dan.j.williams@intel.com>
Subject: Re: [PATCH 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembling
Date: Fri, 7 Jun 2024 09:29:11 +0800 [thread overview]
Message-ID: <56dd0615-b845-4aec-aaaf-05a17807188f@intel.com> (raw)
In-Reply-To: <20240606150903.00005730@Huawei.com>
On 6/6/2024 10:09 PM, Jonathan Cameron wrote:
> On Thu, 6 Jun 2024 10:30:01 +0800
> "Li, Ming" <ming4.li@intel.com> wrote:
>
>> On 6/5/2024 7:57 PM, Jonathan Cameron wrote:
>>> On Fri, 31 May 2024 15:02:29 +0800
>>> Li Ming <ming4.li@intel.com> wrote:
>>>
>>>> When the CXL subsystem is auto-assembling a pmem region during cxl
>>>> endpoint port probing, the call trace below is always output.
>>>>
>>>> BUG: kernel NULL pointer dereference, address: 0000000000000078
>>>> #PF: supervisor read access in kernel mode
>>>> #PF: error_code(0x0000) - not-present page
>>>> RIP: 0010:cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem]
>>>> Call Trace:
>>>> <TASK>
>>>> ? __die+0x24/0x70
>>>> ? page_fault_oops+0x82/0x160
>>>> ? do_user_addr_fault+0x65/0x6b0
>>>> ? exc_page_fault+0x7d/0x170
>>>> ? asm_exc_page_fault+0x26/0x30
>>>> ? cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem]
>>>> ? cxl_pmem_region_probe+0x1ac/0x360 [cxl_pmem]
>>>> cxl_bus_probe+0x1b/0x60 [cxl_core]
>>>> really_probe+0x173/0x410
>>>> ? __pfx___device_attach_driver+0x10/0x10
>>>> __driver_probe_device+0x80/0x170
>>>> driver_probe_device+0x1e/0x90
>>>> __device_attach_driver+0x90/0x120
>>>> bus_for_each_drv+0x84/0xe0
>>>> __device_attach+0xbc/0x1f0
>>>> bus_probe_device+0x90/0xa0
>>>> device_add+0x51c/0x710
>>>> devm_cxl_add_pmem_region+0x1b5/0x380 [cxl_core]
>>>> cxl_bus_probe+0x1b/0x60 [cxl_core]
>>>>
>>>> Because the cxl_nvd of the memdev is necessary during pmem region
>>>> probing, but the cxl_nvd can be registered only after endpoint port
>>>> probing is done, there is a circular dependency. Adjust the sequence
>>>> of cxl_nvd registration and endpoint port registration to guarantee
>>>> there is a cxl_nvd in the memdev during pmem region auto-assembly.
>>>
>>> Perhaps call out that the root above a parent port is the same as the root
>>> above the endpoint seeing as I think you are starting the search from
>>> a different location after this change.
>>
>> Hi Jonathan,
>>
>> Thanks for your review.
>> What do you think about changing the description as below:
>>
>> "Because the cxl_nvd of the memdev is necessary during pmem region
>> probing, but the cxl_nvd can be registered only after endpoint port
>> probing done, that is a collision dependency, so adjust the sequence
>> between cxl_nvd registration and endpoint port registration to
>> guarantee there is a cxl_nvd in memdev during the pmem region
>> auto-assembling. For that, change cxl_find_nvdimm_bridge() to use a
>> port to query the ancestor root port, it helps to find the root port
>> of an endpoint by using an endpoint's parent port so that cxl_nvd
>> registration can be finished before the endpoint attached to the CXL
>> topology"
>
> Perhaps this wording is clearer?
>
> "The cxl_nvd of the memdev needs to be available during pmem region
> probe. Currently the cxl_nvd is registered after the endpoint port
> probe. The endpoint probe, in the case of auto-assembly of regions,
> can cause a pmem region probe requiring the not yet available
> cxl_nvd. Adjust the sequence so this dependency is met.
> This requires adding a port parameter to cxl_find_nvdimm_bridge()
> that can be used to query the ancestor root port. The endpoint
> port is not yet available, but will share a common ancestor with
> its parent, so start the query from there instead."
>
Sure, will change it in next version, thanks.
>>
>>
>>>
>>> Other than that looks correct to me.
>>>
>>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>>>
>>>
>>>>
>>>> Fixes: f17b558d6663 ("cxl/pmem: Refactor nvdimm device registration, delete the workqueue")
>>>> Suggested-by: Dan Williams <dan.j.williams@intel.com>
>>>> Signed-off-by: Li Ming <ming4.li@intel.com>
>>>> ---
>>>>  drivers/cxl/core/pmem.c   | 15 ++++++++++-----
>>>>  drivers/cxl/core/region.c |  2 +-
>>>>  drivers/cxl/cxl.h         |  4 ++--
>>>>  drivers/cxl/mem.c         | 17 +++++++++--------
>>>>  4 files changed, 22 insertions(+), 16 deletions(-)
>>>>
>>>> diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
>>>> index e69625a8d6a1..31b398c13be9 100644
>>>> --- a/drivers/cxl/core/pmem.c
>>>> +++ b/drivers/cxl/core/pmem.c
>>>> @@ -62,10 +62,14 @@ static int match_nvdimm_bridge(struct device *dev, void *data)
>>>>  	return is_cxl_nvdimm_bridge(dev);
>>>>  }
>>>>
>>>> -struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_memdev *cxlmd)
>>>> +/**
>>>> + * cxl_nvdimm_bridge() - find a bridge device relative to a port
>>>> + * @port: any descendant port of an nvdimm-bridge associated root-cxl-port
>>>> + */
>>>> +struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_port *port)
>>>>  {
>>>> -	struct cxl_root *cxl_root __free(put_cxl_root) =
>>>> -		find_cxl_root(cxlmd->endpoint);
>>>> +	struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
>>>
>>> This is a different port in the now earlier query (not the other
>>> path you update). As you say any descendant is fine though.
>>> I'd mention this subtle change in the patch description though.
>>> (noted above)
>>>
>>>>  	struct device *dev;
>>>>
>>>>  	if (!cxl_root)
>>>> @@ -242,18 +246,19 @@ static void cxlmd_release_nvdimm(void *_cxlmd)
>>>>
>>>>  /**
>>>>   * devm_cxl_add_nvdimm() - add a bridge between a cxl_memdev and an nvdimm
>>>> + * @port: parent port for the (to be added) @cxlmd endpoint port
>>>
>>> Would calling it parent_port make more sense?
>>
>> Yes, will change it, thanks.
>>
>>
>>>
>>>>   * @cxlmd: cxl_memdev instance that will perform LIBNVDIMM operations
>>>>   *
>>>>   * Return: 0 on success negative error code on failure.
>>>>   */
>>>> -int devm_cxl_add_nvdimm(struct cxl_memdev *cxlmd)
>>>> +int devm_cxl_add_nvdimm(struct cxl_port *port, struct cxl_memdev *cxlmd)
>>>>  {
>>>>  	struct cxl_nvdimm_bridge *cxl_nvb;
>>>>  	struct cxl_nvdimm *cxl_nvd;
>>>>  	struct device *dev;
>>>>  	int rc;
>>>>
>>>> -	cxl_nvb = cxl_find_nvdimm_bridge(cxlmd);
>>>> +	cxl_nvb = cxl_find_nvdimm_bridge(port);
>>>> if (!cxl_nvb)
>>>> return -ENODEV;
>>>>
>>>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>>>> index 3c2b6144be23..f0cafc7ffb45 100644
>>>> --- a/drivers/cxl/core/region.c
>>>> +++ b/drivers/cxl/core/region.c
>>>> @@ -2847,7 +2847,7 @@ static int cxl_pmem_region_alloc(struct cxl_region *cxlr)
>>>>  		 * bridge for one device is the same for all.
>>>>  		 */
>>>>  		if (i == 0) {
>>>> -			cxl_nvb = cxl_find_nvdimm_bridge(cxlmd);
>>>> +			cxl_nvb = cxl_find_nvdimm_bridge(cxlmd->endpoint);
>>>>  			if (!cxl_nvb)
>>>>  				return -ENODEV;
>>>>  			cxlr->cxl_nvb = cxl_nvb;
>>>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
>>>> index 603c0120cff8..e8fca6c6952b 100644
>>>> --- a/drivers/cxl/cxl.h
>>>> +++ b/drivers/cxl/cxl.h
>>>> @@ -855,8 +855,8 @@ struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
>>>>  struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev);
>>>>  bool is_cxl_nvdimm(struct device *dev);
>>>>  bool is_cxl_nvdimm_bridge(struct device *dev);
>>>> -int devm_cxl_add_nvdimm(struct cxl_memdev *cxlmd);
>>>> -struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_memdev *cxlmd);
>>>> +int devm_cxl_add_nvdimm(struct cxl_port *port, struct cxl_memdev *cxlmd);
>>>> +struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_port *port);
>>>> #ifdef CONFIG_CXL_REGION
>>>> bool is_cxl_pmem_region(struct device *dev);
>>>> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
>>>> index 0c79d9ce877c..2f1b49bfe162 100644
>>>> --- a/drivers/cxl/mem.c
>>>> +++ b/drivers/cxl/mem.c
>>>> @@ -152,6 +152,15 @@ static int cxl_mem_probe(struct device *dev)
>>>> return -ENXIO;
>>>> }
>>>>
>>>> +	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) {
>>>> +		rc = devm_cxl_add_nvdimm(parent_port, cxlmd);
>>>> +		if (rc) {
>>>> +			if (rc == -ENODEV)
>>>> +				dev_info(dev, "PMEM disabled by platform\n");
>>>> +			return rc;
>>>> +		}
>>>> +	}
>>>> +
>>>>  	if (dport->rch)
>>>>  		endpoint_parent = parent_port->uport_dev;
>>>>  	else
>>>> @@ -174,14 +183,6 @@ static int cxl_mem_probe(struct device *dev)
>>>> if (rc)
>>>> return rc;
>>>>
>>>> -	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) {
>>>> -		rc = devm_cxl_add_nvdimm(cxlmd);
>>>> -		if (rc == -ENODEV)
>>>> -			dev_info(dev, "PMEM disabled by platform\n");
>>>> -		else
>>>> -			return rc;
>>>> -	}
>>>> -
>>>>  	/*
>>>>  	 * The kernel may be operating out of CXL memory on this device,
>>>>  	 * there is no spec defined way to determine whether this device
>>>
>>
>>
>
Thread overview: 8+ messages
2024-05-31 7:02 [PATCH 1/1] cxl/mem: Fix no cxl_nvd during pmem region auto-assembling Li Ming
2024-05-31 13:00 ` kernel test robot
2024-06-05 11:57 ` Jonathan Cameron
2024-06-06 2:30 ` Li, Ming
2024-06-06 14:09 ` Jonathan Cameron
2024-06-07 1:29 ` Li, Ming [this message]
2024-06-05 15:42 ` Alison Schofield
2024-06-06 2:35 ` Li, Ming