Linux CXL
From: Alejandro Lucero Palau <alucerop@amd.com>
To: "Dan Williams (nvidia)" <djbw@kernel.org>,
	alejandro.lucero-palau@amd.com, linux-cxl@vger.kernel.org,
	edward.cree@amd.com, davem@davemloft.net, kuba@kernel.org,
	pabeni@redhat.com, edumazet@google.com, dave.jiang@intel.com
Subject: Re: [PATCH v26 6/8] cxl: attach region to an accelerator/type2 memdev
Date: Tue, 5 May 2026 21:51:26 +0100
Message-ID: <9aaf07a6-ffc9-4474-835c-bbc6ddf84472@amd.com>
In-Reply-To: <69f54967edd72_3291a9100c3@djbw-dev.notmuch>


On 5/2/26 01:46, Dan Williams (nvidia) wrote:
> Alejandro Lucero Palau wrote:
> [..]
>>>> +	}
>>> A couple problems here.
>>>
>>> 1/ Nothing stops a CXL class device from implementing a decoder with
>>>      CXL_DECODER_DEVMEM (HDM-DB).
>> Uhmmm...
> Consider a helpful expander that does not require the host OS to use
> cpu_cache_invalidate_memregion() whenever DPA space is changed. I
> imagine that would be useful for DCD where there could be a higher
> frequency of extent changes.


What do you want here then? We had a parameter specifically for
requesting a dax device for the region, but it was considered
unnecessary, and I think it was you who said so. Should I put it back?


>>>> +	/* hold endpoint lock to setup autoremove of the region */
>>>> +	guard(device)(&endpoint->dev);
>>> This does not handle the case when ->endpoint is an ERR_PTR() because
>>> the memdev never attached in the first instance.
>> Not sure about this but, is it not the success of devm_cxl_add_memdev()
>> ensuring this can not happen?
> That is only ensured by using the "attach" mechanism.
> devm_cxl_add_memdev(..., NULL) is only for the generic memory expander
> case. Where the entire usage model is governed by memdev ABIs.
>

Is your concern that the memdev probe might not happen synchronously, so
that a later call like the one implemented here could fail because the
endpoint is not there yet?


If so, I do not think that is possible with the cxl_mem driver using
PROBE_FORCE_SYNCHRONOUS. It can happen when the device is connected to a
cxl switch port which has not been initialised yet, as you described to
me at LPC when I asked for a case that needs the memdev attach call.


>> Note I decouple region attach from memdev creation. No memdev, no call
>> to this function.
> Which leads to this problem.


If I am right in my previous assertion, which problem are you referring
to here?


>
>>>> +/* Called at driver exit or when user space triggers cxl region removal. */
>>>> +static void efx_cxl_unmap_region(void *data) {
>>>> +	struct efx_probe_data *probe_data = data;
>>>> +
>>>> +	probe_data->cxl_pio_initialised = false;
>>>> +	iounmap(probe_data->cxl->ctpio_cxl);
>>>> +}
>>> I do not see how an async event can safely zap that ctpio_cxl space with
>>> zero coordination with the driver, and I do not think you want to burden
>>> the fast path with new locks to coordinate this.
>> You should look patch 8 where your concern is hopefully tackled. I need
>> to test this further, but there is no need of additional locks.
> Please do not structure patches with bugs in the middle of the series.
> It burns reviewer resources.


Could you be more specific about the bug?


I'm introducing a detach callback here which does what is necessary for
the driver functionality up to this point/patch. In patch 8 the detach
code is extended to unwind the use of the ctpio buffers, which are not
used until that same patch.


>>> Can we please stick with the violent but simple "unload driver" approach
>>> for now? Someone removing cxl_acpi, disabling port drivers, or disabling
>>> the cxl_mem driver gets to keep all the pieces. Just like force
>>> unloading your storage driver underneath your root filesystem. Do not do
>>> it unless you want to see the fireworks or test various hotplug flows.
>>
>> I have to disagree. The use of CXL is only part of the datapath. The
>> driver can keep going without CXL. The related buffers can be used until
>> the sync between the efx_ef10_disable_piobufs() added in patch 8 and the
>> CXL CTPIO datapath.
>>
>>
>> Although the option of unloading the driver is possible, I do not think
>> CXL should decide what to do here when there exists another option.
> The concern is creeping complexity. There is no such concept in existing
> Linux drivers where an MMIO mapping disappears out from underneath a
> running driver. It may start returning all 0xff and fire an error
> interrupt, but the driver does not need to worry about responding to
> async unmap.
>
> All drivers must already be prepared to be unloaded. So we start with
> the simple semantic first to get this functionality landed and then
> think about adding sophistication like live fallback to PCI operation.


Ok. Let's try this one. You want to trigger device_release_driver() or
something similar on the pci_dev->dev linked to the memdev, right?

Do we have this support now? If we do not, have you evaluated the
complexity required to ensure there are no deadlocks if this is
triggered while the sfc driver is still probing?

Maybe I'm overthinking this option, so if you have a clear idea about
how to do this, please tell me, assuming it is a matter of "unbinding"
such a pci_dev from its current driver.


