public inbox for linux-cxl@vger.kernel.org
 help / color / mirror / Atom feed
From: Alejandro Lucero Palau <alucerop@amd.com>
To: Jonathan Cameron <Jonathan.Cameron@Huawei.com>,
	alejandro.lucero-palau@amd.com
Cc: linux-cxl@vger.kernel.org, netdev@vger.kernel.org,
	dan.j.williams@intel.com, martin.habets@xilinx.com,
	edward.cree@amd.com, davem@davemloft.net, kuba@kernel.org,
	pabeni@redhat.com, edumazet@google.com, richard.hughes@amd.com
Subject: Re: [PATCH v2 08/15] cxl: indicate probe deferral
Date: Mon, 19 Aug 2024 14:54:00 +0100	[thread overview]
Message-ID: <00272299-90d0-6e0c-8a35-dae8fa3ef03a@amd.com> (raw)
In-Reply-To: <20240804184135.00001666@Huawei.com>


On 8/4/24 18:41, Jonathan Cameron wrote:
> On Mon, 15 Jul 2024 18:28:28 +0100
> <alejandro.lucero-palau@amd.com> wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> The first stop for a CXL accelerator driver that wants to establish new
>> CXL.mem regions is to register a 'struct cxl_memdev. That kicks off
>> cxl_mem_probe() to enumerate all 'struct cxl_port' instances in the
>> topology up to the root.
>>
>> If the root driver has not attached yet the expectation is that the
>> driver waits until that link is established. The common cxl_pci_driver
>> has reason to keep the 'struct cxl_memdev' device attached to the bus
>> until the root driver attaches. An accelerator may want to instead defer
>> probing until CXL resources can be acquired.
>>
>> Use the @endpoint attribute of a 'struct cxl_memdev' to convey when
>> accelerator driver probing should be defferred vs failed. Provide that
>> indication via a new cxl_acquire_endpoint() API that can retrieve the
>> probe status of the memdev.
>>
>> The first consumer of this API is a test driver that excercises the CXL
> Spell check.
> exercises


I'll fix it along with step instead of stop in the first line.


>> Type-2 flow.
>>
>> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m18497367d2ae38f88e94c06369eaa83fa23e92b2
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
>> ---
>>   drivers/cxl/core/memdev.c          | 41 ++++++++++++++++++++++++++++++
>>   drivers/cxl/core/port.c            |  2 +-
>>   drivers/cxl/mem.c                  |  7 +++--
>>   drivers/net/ethernet/sfc/efx_cxl.c | 10 +++++++-
>>   include/linux/cxl_accel_mem.h      |  3 +++
>>   5 files changed, 59 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>> index b902948b121f..d51c8bfb32e3 100644
>> --- a/drivers/cxl/core/memdev.c
>> +++ b/drivers/cxl/core/memdev.c
>> @@ -1137,6 +1137,47 @@ struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>>   }
>>   EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, CXL);
>>   
>> +/*
>> + * Try to get a locked reference on a memdev's CXL port topology
>> + * connection. Be careful to observe when cxl_mem_probe() has deposited
>> + * a probe deferral awaiting the arrival of the CXL root driver
> It might have deposited an error that isn't deferral I think.
> I would be careful to make that clear in this comment.


Yes. The situation this patch is dealing with is not easy to handle. I 
realize the accel driver needs to be aware of it what the sfc code does 
not handle.

I need to work on this starting with emulating the situation and maybe 
adding the work as a test ... where we need some emulated Type2 device. 
Dan was asking about some work done before my initial RFC where Type2 
support in qemu was the target, maybe something we can talk about in the 
LPC.


>> +*/
>> +struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd)
>> +{
>> +	struct cxl_port *endpoint;
>> +	int rc = -ENXIO;
>> +
>> +	device_lock(&cxlmd->dev);
> I'd not really expect an 'acquire endpoint' to exit
> in the good path with the cxlmd->dev device lock held.
> Perhaps that needs a bit more shouting in the naming of
> the function?


Uhmm, not clear to me at this point if that is needed. This is basically 
the original patch by Dan so as said above, I need to work on this a bit 
further.

I'll try to get this sorted out for v3.

Thanks


>> +	endpoint = cxlmd->endpoint;
>> +	if (!endpoint)
>> +		goto err;
>> +
>> +	if (IS_ERR(endpoint)) {
>> +		rc = PTR_ERR(endpoint);
>> +		goto err;
>> +	}
>> +
>> +	device_lock(&endpoint->dev);
>> +	if (!endpoint->dev.driver)
>> +		goto err_endpoint;
>> +
>> +	return endpoint;
>> +
>> +err_endpoint:
>> +	device_unlock(&endpoint->dev);
>> +err:
>> +	device_unlock(&cxlmd->dev);
>> +	return ERR_PTR(rc);
>> +}
>> +EXPORT_SYMBOL_NS(cxl_acquire_endpoint, CXL);
>> +
>> +void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint)
>> +{
>> +	device_unlock(&endpoint->dev);
>> +	device_unlock(&cxlmd->dev);
>> +}
>> +EXPORT_SYMBOL_NS(cxl_release_endpoint, CXL);
>> +
>>   static void sanitize_teardown_notifier(void *data)
>>   {
>>   	struct cxl_memdev_state *mds = data;
>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>> index d66c6349ed2d..3c6b896c5f65 100644
>> --- a/drivers/cxl/core/port.c
>> +++ b/drivers/cxl/core/port.c
>> @@ -1553,7 +1553,7 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
>>   		 */
>>   		dev_dbg(&cxlmd->dev, "%s is a root dport\n",
>>   			dev_name(dport_dev));
>> -		return -ENXIO;
>> +		return -EPROBE_DEFER;
>>   	}
>>   
>>   	parent_port = find_cxl_port(dparent, &parent_dport);
>> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
>> index f76af75a87b7..383a6f4829d3 100644
>> --- a/drivers/cxl/mem.c
>> +++ b/drivers/cxl/mem.c
>> @@ -145,13 +145,16 @@ static int cxl_mem_probe(struct device *dev)
>>   		return rc;
>>   
>>   	rc = devm_cxl_enumerate_ports(cxlmd);
>> -	if (rc)
>> +	if (rc) {
>> +		cxlmd->endpoint = ERR_PTR(rc);
>>   		return rc;
>> +	}
>>   
>>   	parent_port = cxl_mem_find_port(cxlmd, &dport);
>>   	if (!parent_port) {
>>   		dev_err(dev, "CXL port topology not found\n");
> Hmm. This seems excessive error print for a deferred path.
>
>> -		return -ENXIO;
>> +		cxlmd->endpoint = ERR_PTR(-EPROBE_DEFER);
>> +		return -EPROBE_DEFER;
>>   	}
>>   
>>   	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) {

  reply	other threads:[~2024-08-19 13:54 UTC|newest]

Thread overview: 114+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-15 17:28 [PATCH v2 00/15] cxl: add Type2 device support alejandro.lucero-palau
2024-07-15 17:28 ` [PATCH v2 01/15] cxl: add type2 device basic support alejandro.lucero-palau
2024-07-15 18:48   ` Andrew Lunn
2024-07-16  8:50     ` Alejandro Lucero Palau
2024-07-16  1:57   ` kernel test robot
2024-07-18 23:12   ` Dave Jiang
2024-07-19  6:03     ` Alejandro Lucero Palau
2024-08-04 16:44       ` Jonathan Cameron
2024-08-09  7:26         ` Alejandro Lucero Palau
2024-08-04 17:10   ` Jonathan Cameron
2024-08-12 11:16     ` Alejandro Lucero Palau
2024-08-13  8:30       ` Alejandro Lucero Palau
2024-08-15 16:38         ` Jonathan Cameron
2024-08-19 11:12           ` Alejandro Lucero Palau
2024-08-20 10:44             ` Alejandro Lucero Palau
2024-08-15 16:35       ` Jonathan Cameron
2024-08-19 11:10         ` Alejandro Lucero Palau
2024-08-27 15:06           ` Jonathan Cameron
2024-08-09  8:34   ` Zhi Wang
2024-08-12 11:34     ` Alejandro Lucero Palau
2024-08-17 20:32       ` Zhi Wang
2024-08-19 11:13         ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 02/15] cxl: add function for type2 cxl regs setup alejandro.lucero-palau
2024-07-16  6:26   ` Li, Ming4
2024-08-14  7:46     ` Alejandro Lucero Palau
2024-07-18 23:27   ` Dave Jiang
2024-08-14  7:49     ` Alejandro Lucero Palau
2024-08-04 17:15   ` Jonathan Cameron
2024-08-14  7:56     ` Alejandro Lucero Palau
2024-08-15 16:40       ` Jonathan Cameron
2024-08-18  8:07         ` Zhi Wang
2024-08-19 11:28           ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 03/15] cxl: add function for type2 resource request alejandro.lucero-palau
2024-07-18 23:36   ` Dave Jiang
2024-08-04 17:16     ` Jonathan Cameron
2024-08-14  8:08       ` Alejandro Lucero Palau
2024-08-14  8:00     ` Alejandro Lucero Palau
2024-08-09  9:01   ` Zhi Wang
2024-08-22 13:07   ` Zhi Wang
2024-08-23  9:30     ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 04/15] cxl: add capabilities field to cxl_dev_state alejandro.lucero-palau
2024-07-19 19:01   ` Dave Jiang
2024-07-23 13:43     ` Alejandro Lucero Palau
2024-08-09 10:25       ` Zhi Wang
2024-08-15 15:37         ` Alejandro Lucero Palau
2024-08-18  6:55           ` Zhi Wang
2024-08-19 13:14             ` Alejandro Lucero Palau
2024-08-04 17:22   ` Jonathan Cameron
2024-08-15 15:43     ` Alejandro Lucero Palau
2024-08-09  9:10   ` Zhi Wang
2024-08-15 15:20     ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 05/15] cxl: fix use of resource_contains alejandro.lucero-palau
2024-07-24 21:25   ` fan
2024-08-16 14:43     ` Alejandro Lucero Palau
2024-08-04 17:25   ` Jonathan Cameron
2024-08-16 14:37     ` Alejandro Lucero Palau
2024-08-27 15:12       ` Jonathan Cameron
2024-08-09  9:14   ` Zhi Wang
2024-08-16 14:42     ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 06/15] cxl: add function for setting media ready by an accelerator alejandro.lucero-palau
2024-08-04 17:26   ` Jonathan Cameron
2024-08-16 14:54     ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 07/15] cxl: support type2 memdev creation alejandro.lucero-palau
2024-07-24 21:32   ` fan
2024-08-16 14:57     ` Alejandro Lucero Palau
2024-08-04 17:31   ` Jonathan Cameron
2024-08-16 15:00     ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 08/15] cxl: indicate probe deferral alejandro.lucero-palau
2024-07-16  5:52   ` Li, Ming4
2024-07-16  8:10     ` Alejandro Lucero Palau
2024-07-30 16:43   ` Fan Ni
2024-08-04 17:41   ` Jonathan Cameron
2024-08-19 13:54     ` Alejandro Lucero Palau [this message]
2024-08-09 14:40   ` Zhi Wang
2024-08-26 17:42   ` Zhi Wang
2024-08-28 13:43     ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration alejandro.lucero-palau
2024-07-16  0:53   ` kernel test robot
2024-07-16  6:06   ` Li, Ming4
2024-07-24  8:24     ` Alejandro Lucero Palau
2024-07-25  5:51       ` Li, Ming4
2024-07-25 11:59         ` Alejandro Lucero Palau
2024-08-04 17:57   ` Jonathan Cameron
2024-08-19 14:47     ` Alejandro Lucero Palau
2024-08-27 15:18       ` Jonathan Cameron
2024-08-28 10:18     ` Alejandro Lucero Palau
2024-08-28 11:19       ` Jonathan Cameron
2024-08-28 10:41     ` Alejandro Lucero Palau
2024-08-28 11:26       ` Jonathan Cameron
2024-08-28 13:08         ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 10/15] cxl: define a driver interface for DPA allocation alejandro.lucero-palau
2024-07-16  3:32   ` kernel test robot
2024-08-04 18:07   ` Jonathan Cameron
2024-08-19 15:52     ` Alejandro Lucero Palau
2024-08-06 17:33   ` Fan Ni
2024-08-19 15:57     ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 11/15] cxl: make region type based on endpoint type alejandro.lucero-palau
2024-07-16  7:14   ` Li, Ming4
2024-07-16  8:13     ` Alejandro Lucero Palau
2024-08-28 16:06       ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 12/15] cxl: allow region creation by type2 drivers alejandro.lucero-palau
2024-08-04 18:29   ` Jonathan Cameron
2024-08-19 16:11     ` Alejandro Lucero Palau
2024-08-22 13:12   ` Zhi Wang
2024-08-23  9:31     ` Alejandro Lucero Palau
2024-08-27 15:20       ` Jonathan Cameron
2024-07-15 17:28 ` [PATCH v2 13/15] cxl: preclude device memory to be used for dax alejandro.lucero-palau
2024-07-15 17:28 ` [PATCH v2 14/15] cxl: add function for obtaining params from a region alejandro.lucero-palau
2024-08-09 15:24   ` Zhi Wang
2024-08-19 16:14     ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 15/15] efx: support pio mapping based on cxl alejandro.lucero-palau
2024-08-04 18:13   ` Jonathan Cameron
2024-08-19 16:28     ` Alejandro Lucero Palau
2024-08-27 15:23       ` Jonathan Cameron

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=00272299-90d0-6e0c-8a35-dae8fa3ef03a@amd.com \
    --to=alucerop@amd.com \
    --cc=Jonathan.Cameron@Huawei.com \
    --cc=alejandro.lucero-palau@amd.com \
    --cc=dan.j.williams@intel.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=edward.cree@amd.com \
    --cc=kuba@kernel.org \
    --cc=linux-cxl@vger.kernel.org \
    --cc=martin.habets@xilinx.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=richard.hughes@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox