Linux CXL
 help / color / mirror / Atom feed
From: Neeraj Kumar <s.neeraj@samsung.com>
To: Dave Jiang <dave.jiang@intel.com>
Cc: dan.j.williams@intel.com, dave@stgolabs.net,
	jonathan.cameron@huawei.com, alison.schofield@intel.com,
	vishal.l.verma@intel.com, ira.weiny@intel.com,
	a.manzanares@samsung.com, nifan.cxl@gmail.com,
	anisa.su@samsung.com, vishak.g@samsung.com,
	krish.reddy@samsung.com, arun.george@samsung.com,
	alok.rathore@samsung.com, neeraj.kernel@gmail.com,
	linux-kernel@vger.kernel.org, linux-cxl@vger.kernel.org,
	nvdimm@lists.linux.dev, gost.dev@samsung.com, cpgs@samsung.com
Subject: Re: [RFC PATCH 13/20] cxl/mem: Refactor cxl pmem region auto-assembling
Date: Fri, 18 Jul 2025 18:00:50 +0530	[thread overview]
Message-ID: <1931444790.41752909482841.JavaMail.epsvc@epcpadp2new> (raw)
In-Reply-To: <3670eb1d-eaf5-4b8b-b3fe-1190724ee7d7@intel.com>

[-- Attachment #1: Type: text/plain, Size: 5401 bytes --]

On 09/07/25 05:38PM, Dave Jiang wrote:
>
>
>On 6/17/25 5:39 AM, Neeraj Kumar wrote:
>> In 84ec985944ef3, For cxl pmem region auto-assembly after endpoint port
>> probing, cxl_nvd presence was required. And for cxl region persistency,
>> region creation happens during nvdimm_probe which need the completion
>> of endpoint probe.
>>
>> It is therefore refactored cxl pmem region auto-assembly after endpoint
>> probing to cxl mem probing
>
>The region auto-assembly is moved after the endpoint device is added. However, it's not guaranteed that the endpoint probe has completed and completed successfully. You are just getting lucky timing wise that endpoint probe is completed when the new region discovery is starting.

Hi Dave,

For region auto-assembly two things are required
  1. endpoint(s) decoder should be enumerated successfully
  2. As per fix in 84ec985944ef3, The cxl_nvd of the memdev needs to be available during the pmem region probe

Prior to current patch, region auto-assembly was happening from cxl_endpoint_port_probe() after endpoint decoder enumeration.
In 84ec985944ef3, sequence of devm_cxl_add_nvdimm() was moved before devm_cxl_add_endpoint() so as to meet region auto assembly dependency on cxl_nvd of the memdev.

Here, In this patch I have moved region auto-assembly after
  1. devm_cxl_add_endpoint(): This function makes sure cxl_endpoint_port_probe() has completed successfully, as per below check in devm_cxl_add_endpoint()
       if (!endpoint->dev.driver) {
            dev_err(&cxlmd->dev, "%s failed probe\n", dev_name(&endpoint->dev));
            return -ENXIO;
       }
       And successfull completion of cxl_endpoint_port_probe(), it must have enumerated endpoint(s) decoder successfully
  2. devm_cxl_add_nvdimm(): As you rightly said, this allocates "cxl_nvd" nvdimm device and triggers the async probe of nvdimm driver

Actually in this patch, from async probe function (cxl_nvdimm_probe()), I am creating "struct nvdimm" using __nvdimm_create()
This __nvdimm_create() ultimately scans LSA. If LSA finds region label into it then it saves region information into struct nvdimm
and then using create_pmem_region(nvdimm), I am re-creating cxl region for region persistency.

As for cxl region persistency (based on LSA 2.1 scanning - this patch)
following sequence should be required
  1. devm_cxl_add_endpoint(): endpoint probe completion - which is getting done by devm_cxl_add_endpoint()
  2. devm_cxl_add_nvdimm(): Here after nvdimm device creation, cxl region is being created

It is therefore re-sequencing of region-auto assembly is required to move from cxl_endpoint_port_probe() to after
devm_cxl_add_endpoint() and devm_cxl_add_nvdimm()

>Return of devm_cxl_add_nvdimm() only adds the nvdimm device and triggers the async probe of nvdimm driver. You have to take the endpoint port lock

I think we may not require endpoint port lock as auto-assembly and region persistency code sequence is always after successful completion of endpoint probe.

>and check if the driver is attached to be certain that endpoint probe is done and successful. Therefore moving the region discovery location probably does not do what you think it does.
>Maybe take a deeper look at the region discovery code and see how it does retry if things are not present and approach it from that angle?
>

Please let me know if my understanding is correct or am I missing something?

>>
>> Signed-off-by: Neeraj Kumar <s.neeraj@samsung.com>
>> ---
>>  drivers/cxl/core/port.c | 38 ++++++++++++++++++++++++++++++++++++++
>>  drivers/cxl/cxl.h       |  1 +
>>  drivers/cxl/mem.c       | 27 ++++++++++++++++++---------
>>  drivers/cxl/port.c      | 38 --------------------------------------
>>  4 files changed, 57 insertions(+), 47 deletions(-)
>>
>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>> index 78a5c2c25982..bca668193c49 100644
>> --- a/drivers/cxl/core/port.c
>> +++ b/drivers/cxl/core/port.c
>> @@ -1038,6 +1038,44 @@ void put_cxl_root(struct cxl_root *cxl_root)
>>  }
>>  EXPORT_SYMBOL_NS_GPL(put_cxl_root, "CXL");
>>
>> +static int discover_region(struct device *dev, void *root)
>> +{
>> +	struct cxl_endpoint_decoder *cxled;
>> +	int rc;
>> +
>> +	if (!is_endpoint_decoder(dev))
>> +		return 0;
>> +
>> +	cxled = to_cxl_endpoint_decoder(dev);
>> +	if ((cxled->cxld.flags & CXL_DECODER_F_ENABLE) == 0)
>> +		return 0;
>> +
>> +	if (cxled->state != CXL_DECODER_STATE_AUTO)
>> +		return 0;
>> +
>> +	/*
>> +	 * Region enumeration is opportunistic, if this add-event fails,
>> +	 * continue to the next endpoint decoder.
>> +	 */
>> +	rc = cxl_add_to_region(root, cxled);
>> +	if (rc)
>> +		dev_dbg(dev, "failed to add to region: %#llx-%#llx\n",
>> +			cxled->cxld.hpa_range.start, cxled->cxld.hpa_range.end);
>> +
>> +	return 0;
>> +}
>> +
>> +void cxl_region_discovery(struct cxl_port *port)
>> +{
>> +	struct cxl_port *root;
>> +	struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
>> +
>> +	root = &cxl_root->port;
>> +
>> +	device_for_each_child(&port->dev, root, discover_region);
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_region_discovery, "CXL");
>> +
>
>I have concerns about adding region related code in core/port.c while the rest of the region code is walled behind CONFIG_CXL_REGION. I think this change needs to go to core/region.c.
>
>DJ
>

Sure Dave, I will move it into core/region.c in next patch-set.

Regards,
Neeraj

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



  reply	other threads:[~2025-07-19  7:18 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20250617124008epcas5p2e702f786645d44ceb1cdd980a914ce8e@epcas5p2.samsung.com>
     [not found] ` <20250617123944.78345-1-s.neeraj@samsung.com>
2025-06-17 12:39   ` [RFC PATCH 01/20] nvdimm/label: Introduce NDD_CXL_LABEL flag to set cxl label format Neeraj Kumar
2025-06-20 16:40     ` Jonathan Cameron
2025-06-26  9:48       ` Neeraj Kumar
2025-07-02 18:02     ` Ira Weiny
2025-07-03  9:58       ` Neeraj Kumar
2025-07-09 22:57     ` Dave Jiang
2025-07-18 12:13       ` Neeraj Kumar
2025-06-17 12:39   ` [RFC PATCH 02/20] nvdimm/label: Prep patch to accommodate cxl lsa 2.1 support Neeraj Kumar
2025-06-23 10:53     ` Jonathan Cameron
2025-06-26  9:51       ` Neeraj Kumar
2025-07-02 17:55     ` Ira Weiny
2025-07-03 10:04       ` Neeraj Kumar
2025-06-17 12:39   ` [RFC PATCH 03/20] nvdimm/namespace_label: Add namespace label changes as per CXL LSA v2.1 Neeraj Kumar
2025-06-17 12:39   ` [RFC PATCH 04/20] nvdimm/label: CXL labels skip the need for 'interleave-set cookie' Neeraj Kumar
2025-06-17 12:39   ` [RFC PATCH 05/20] nvdimm/region_label: Add region label updation routine Neeraj Kumar
2025-06-23  9:05     ` Jonathan Cameron
2025-06-26  9:54       ` Neeraj Kumar
2025-07-17 22:53     ` Fabio M. De Francesco
2025-07-18 13:00       ` Neeraj Kumar
2025-06-17 12:39   ` [RFC PATCH 06/20] nvdimm/region_label: Add region label deletion routine Neeraj Kumar
2025-06-23  9:09     ` Jonathan Cameron
2025-06-26  9:55       ` Neeraj Kumar
2025-06-17 12:39   ` [RFC PATCH 07/20] nvdimm/namespace_label: Update namespace init_labels and its region_uuid Neeraj Kumar
2025-06-23  9:11     ` Jonathan Cameron
2025-06-26  9:58       ` Neeraj Kumar
2025-06-17 12:39   ` [RFC PATCH 08/20] nvdimm/label: Include region label in slot validation Neeraj Kumar
2025-06-23  9:13     ` Jonathan Cameron
2025-06-26 10:00       ` Neeraj Kumar
2025-06-17 12:39   ` [RFC PATCH 09/20] nvdimm/namespace_label: Skip region label during ns label DPA reservation Neeraj Kumar
2025-06-17 12:39   ` [RFC PATCH 10/20] nvdimm/region_label: Preserve cxl region information from region label Neeraj Kumar
2025-06-17 12:39   ` [RFC PATCH 11/20] nvdimm/region_label: Export routine to fetch region information Neeraj Kumar
2025-06-17 12:39   ` [RFC PATCH 12/20] nvdimm/namespace_label: Skip region label during namespace creation Neeraj Kumar
2025-06-23  9:17     ` Jonathan Cameron
2025-06-26 10:02       ` Neeraj Kumar
2025-06-17 12:39   ` [RFC PATCH 13/20] cxl/mem: Refactor cxl pmem region auto-assembling Neeraj Kumar
2025-06-23  9:20     ` Jonathan Cameron
2025-06-26 10:03       ` Neeraj Kumar
2025-07-10  0:38     ` Dave Jiang
2025-07-18 12:30       ` Neeraj Kumar [this message]
2025-07-21 18:11         ` Dave Jiang
2025-06-17 12:39   ` [RFC PATCH 14/20] cxl/region: Add cxl pmem region creation routine for region persistency Neeraj Kumar
2025-06-23  9:43     ` Jonathan Cameron
2025-06-26 10:23       ` Neeraj Kumar
2025-07-10 15:59     ` Dave Jiang
2025-07-18 12:45       ` Neeraj Kumar
2025-06-17 12:39   ` [RFC PATCH 15/20] cxl: Add a routine to find cxl root decoder on cxl bus Neeraj Kumar
2025-06-23  9:44     ` Jonathan Cameron
2025-06-26 10:38       ` Neeraj Kumar
2025-06-26 19:19     ` Alison Schofield
2025-06-27  9:03       ` Neeraj Kumar
2025-07-10 16:23     ` Dave Jiang
2025-07-18 12:48       ` Neeraj Kumar
2025-06-17 12:39   ` [RFC PATCH 16/20] cxl/mem: Preserve cxl root decoder during mem probe Neeraj Kumar
2025-06-17 12:39   ` [RFC PATCH 17/20] cxl/pmem: Preserve region information into nd_set Neeraj Kumar
2025-06-17 12:39   ` [RFC PATCH 18/20] cxl/pmem: Add support of cxl lsa 2.1 support in cxl pmem Neeraj Kumar
2025-06-23  9:48     ` Jonathan Cameron
2025-06-26 10:41       ` Neeraj Kumar
2025-07-10 17:18     ` Dave Jiang
2025-07-18 12:51       ` Neeraj Kumar
2025-07-21 17:44         ` Dave Jiang
2025-06-17 12:39   ` [RFC PATCH 19/20] cxl/pmem_region: Prep patch to accommodate pmem_region attributes Neeraj Kumar
2025-06-23  9:53     ` Jonathan Cameron
2025-06-26 10:45       ` Neeraj Kumar
2025-06-17 12:39   ` [RFC PATCH 20/20] cxl/pmem_region: Add cxl region label updation and deletion device attributes Neeraj Kumar
2025-06-23  9:56     ` Jonathan Cameron
2025-06-26 10:48       ` Neeraj Kumar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1931444790.41752909482841.JavaMail.epsvc@epcpadp2new \
    --to=s.neeraj@samsung.com \
    --cc=a.manzanares@samsung.com \
    --cc=alison.schofield@intel.com \
    --cc=alok.rathore@samsung.com \
    --cc=anisa.su@samsung.com \
    --cc=arun.george@samsung.com \
    --cc=cpgs@samsung.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=dave@stgolabs.net \
    --cc=gost.dev@samsung.com \
    --cc=ira.weiny@intel.com \
    --cc=jonathan.cameron@huawei.com \
    --cc=krish.reddy@samsung.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neeraj.kernel@gmail.com \
    --cc=nifan.cxl@gmail.com \
    --cc=nvdimm@lists.linux.dev \
    --cc=vishak.g@samsung.com \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox