From: Alejandro Lucero Palau <alucerop@amd.com>
To: Jonathan Cameron <jonathan.cameron@huawei.com>,
alejandro.lucero-palau@amd.com
Cc: linux-cxl@vger.kernel.org, netdev@vger.kernel.org,
dan.j.williams@intel.com, edward.cree@amd.com,
davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
edumazet@google.com, dave.jiang@intel.com
Subject: Re: [PATCH v20 01/22] cxl/mem: Arrange for always-synchronous memdev attach
Date: Fri, 14 Nov 2025 11:10:50 +0000 [thread overview]
Message-ID: <374f8a2c-df06-4df9-8816-d91d3236cd58@amd.com> (raw)
In-Reply-To: <20251112145341.00005b4e@huawei.com>
On 11/12/25 14:53, Jonathan Cameron wrote:
> On Mon, 10 Nov 2025 15:36:36 +0000
> alejandro.lucero-palau@amd.com wrote:
>
>> From: Dan Williams <dan.j.williams@intel.com>
>>
>> In preparation for CXL accelerator drivers that have a hard dependency on
>> CXL capability initialization, arrange for the endpoint probe result to be
>> conveyed to the caller of devm_cxl_add_memdev().
>>
>> As it stands cxl_pci does not care about the attach state of the cxl_memdev
>> because all generic memory expansion functionality can be handled by the
>> cxl_core. For accelerators, that driver needs to know perform driver
>> specific initialization if CXL is available, or exectute a fallback to PCIe
>> only operation.
>>
>> By moving devm_cxl_add_memdev() to cxl_mem.ko it removes async module
>> loading as one reason that a memdev may not be attached upon return from
>> devm_cxl_add_memdev().
>>
>> The diff is busy as this moves cxl_memdev_alloc() down below the definition
>> of cxl_memdev_fops and introduces devm_cxl_memdev_add_or_reset() to
>> preclude needing to export more symbols from the cxl_core.
>>
>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> Alejandro, read submitting patches again. Whilst the first sign off should
> indeed by Dan's this also needs one from you as a 'handler' of the patch.
>
> Be very careful checking these tag chains. If they are wrong no one can
> merge the set and it just acts as a silly blocker.
Hi Jonathan,
I did the amend but it is true I did some work on it. Would it be enough
to add my signed-off-by along with Dan's one?
> I would have split this up and made the changes to cxl_memdev_alloc in
> a precursor patch (use of __free is obvious one) then could have stated
> that that was simply moved in this patch.
OK. I think I was fixing a bug in original Dan's patch regarding cxlmd
release in case of error inside devm_cxl_add_memdev, but I think the bug
is in the current code of that function as it is not properly released
if error after a successful allocation. So splitting the patch could
allow to make this clearer and adding the Fixes tag as well.
> There are other changes in there that are really hard to spot though
> and I think there are some bugs lurking in error paths.
I did spot one after your comment, checking cxlmd pointer is not an
error pointer inside __cxlmd_free. If you spotted something else, please
tell me :-)
Thank you!
> Jonathan
>
>> ---
>> drivers/cxl/Kconfig | 2 +-
>> drivers/cxl/core/memdev.c | 101 ++++++++++++++------------------------
>> drivers/cxl/mem.c | 41 ++++++++++++++++
>> drivers/cxl/private.h | 10 ++++
>> 4 files changed, 90 insertions(+), 64 deletions(-)
>> create mode 100644 drivers/cxl/private.h
>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>> index e370d733e440..14b4601faf66 100644
>> --- a/drivers/cxl/core/memdev.c
>> +++ b/drivers/cxl/core/memdev.c
>> @@ -8,6 +8,7 @@
>> #include <linux/idr.h>
>> #include <linux/pci.h>
>> #include <cxlmem.h>
>> +#include "private.h"
>> #include "trace.h"
>> #include "core.h"
>>
>> @@ -648,42 +649,25 @@ static void detach_memdev(struct work_struct *work)
>>
>> static struct lock_class_key cxl_memdev_key;
>>
>> -static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
>> - const struct file_operations *fops)
>> +int devm_cxl_memdev_add_or_reset(struct device *host, struct cxl_memdev *cxlmd)
>> {
>> - struct cxl_memdev *cxlmd;
>> - struct device *dev;
>> - struct cdev *cdev;
>> + struct device *dev = &cxlmd->dev;
>> + struct cdev *cdev = &cxlmd->cdev;
>> int rc;
>>
>> - cxlmd = kzalloc(sizeof(*cxlmd), GFP_KERNEL);
>> - if (!cxlmd)
>> - return ERR_PTR(-ENOMEM);
>> -
>> - rc = ida_alloc_max(&cxl_memdev_ida, CXL_MEM_MAX_DEVS - 1, GFP_KERNEL);
>> - if (rc < 0)
>> - goto err;
>> - cxlmd->id = rc;
>> - cxlmd->depth = -1;
>> -
>> - dev = &cxlmd->dev;
>> - device_initialize(dev);
>> - lockdep_set_class(&dev->mutex, &cxl_memdev_key);
>> - dev->parent = cxlds->dev;
>> - dev->bus = &cxl_bus_type;
>> - dev->devt = MKDEV(cxl_mem_major, cxlmd->id);
>> - dev->type = &cxl_memdev_type;
>> - device_set_pm_not_required(dev);
>> - INIT_WORK(&cxlmd->detach_work, detach_memdev);
>> -
>> - cdev = &cxlmd->cdev;
>> - cdev_init(cdev, fops);
>> - return cxlmd;
>> + rc = cdev_device_add(cdev, dev);
>> + if (rc) {
>> + /*
>> + * The cdev was briefly live, shutdown any ioctl operations that
>> + * saw that state.
>> + */
>> + cxl_memdev_shutdown(dev);
>> + return rc;
>> + }
>>
>> -err:
>> - kfree(cxlmd);
>> - return ERR_PTR(rc);
>> + return devm_add_action_or_reset(host, cxl_memdev_unregister, cxlmd);
>> }
>> +EXPORT_SYMBOL_NS_GPL(devm_cxl_memdev_add_or_reset, "CXL");
>>
>> static long __cxl_memdev_ioctl(struct cxl_memdev *cxlmd, unsigned int cmd,
>> unsigned long arg)
>> @@ -1051,50 +1035,41 @@ static const struct file_operations cxl_memdev_fops = {
>> .llseek = noop_llseek,
>> };
>>
>> -struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>> - struct cxl_dev_state *cxlds)
>> +struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds)
>> {
>> - struct cxl_memdev *cxlmd;
>> + struct cxl_memdev *cxlmd __free(kfree) =
>> + kzalloc(sizeof(*cxlmd), GFP_KERNEL);
> Trivial and perhaps not worth the hassle.
> I'd pull this out of the declarations block to have
>
> struct device *dev;
> struct cdev *cdev;
> int rc;
>
> struct cxl_memdev *cxlmd __free(kfree) =
> kzalloc(sizeof(*cxlmd), GFP_KERNEL);
> if (!cxlmd)
> return ERR_PTR(-ENOMEM);
>
> That is treat the __free() related statement as an inline declaration of
> the type we only really allow for these.
>
>
>> struct device *dev;
>> struct cdev *cdev;
>> int rc;
>>
>> - cxlmd = cxl_memdev_alloc(cxlds, &cxl_memdev_fops);
>> - if (IS_ERR(cxlmd))
>> - return cxlmd;
>>
>> - dev = &cxlmd->dev;
>> - rc = dev_set_name(dev, "mem%d", cxlmd->id);
>> - if (rc)
>> - goto err;
>> + if (!cxlmd)
>> + return ERR_PTR(-ENOMEM);
>>
>> - /*
>> - * Activate ioctl operations, no cxl_memdev_rwsem manipulation
>> - * needed as this is ordered with cdev_add() publishing the device.
>> - */
>> + rc = ida_alloc_max(&cxl_memdev_ida, CXL_MEM_MAX_DEVS - 1, GFP_KERNEL);
>> + if (rc < 0)
>> + return ERR_PTR(rc);
>> + cxlmd->id = rc;
>> + cxlmd->depth = -1;
>> cxlmd->cxlds = cxlds;
>> cxlds->cxlmd = cxlmd;
> These two lines weren't previously in cxl_memdev_alloc()
> I'd like a statement in the commit message of why they are now. It seems
> harmless because they are still ordered before the add and are
> ultimately freed
>
> I'm not immediately spotting why they now are. This whole code shift
> and complex diff is enough of a pain I'd be tempted to do the move first
> so that we can then see what is actually changed much more easily.
>
>
>>
>> - cdev = &cxlmd->cdev;
>> - rc = cdev_device_add(cdev, dev);
>> - if (rc)
>> - goto err;
>> -
>> - rc = devm_add_action_or_reset(host, cxl_memdev_unregister, cxlmd);
>> - if (rc)
>> - return ERR_PTR(rc);
>> - return cxlmd;
>> + dev = &cxlmd->dev;
>> + device_initialize(dev);
>> + lockdep_set_class(&dev->mutex, &cxl_memdev_key);
>> + dev->parent = cxlds->dev;
>> + dev->bus = &cxl_bus_type;
>> + dev->devt = MKDEV(cxl_mem_major, cxlmd->id);
>> + dev->type = &cxl_memdev_type;
>> + device_set_pm_not_required(dev);
>> + INIT_WORK(&cxlmd->detach_work, detach_memdev);
>>
>> -err:
>> - /*
>> - * The cdev was briefly live, shutdown any ioctl operations that
>> - * saw that state.
>> - */
>> - cxl_memdev_shutdown(dev);
>> - put_device(dev);
>> - return ERR_PTR(rc);
>> + cdev = &cxlmd->cdev;
>> + cdev_init(cdev, &cxl_memdev_fops);
>> + return_ptr(cxlmd);
>> }
>> -EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, "CXL");
>> +EXPORT_SYMBOL_NS_GPL(cxl_memdev_alloc, "CXL");
>>
>> static void sanitize_teardown_notifier(void *data)
>> {
>> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
>> index d2155f45240d..fa5d901ee817 100644
>> --- a/drivers/cxl/mem.c
>> +++ b/drivers/cxl/mem.c
>> @@ -7,6 +7,7 @@
>>
>> #include "cxlmem.h"
>> #include "cxlpci.h"
>> +#include "private.h"
>>
>> /**
>> * DOC: cxl mem
>> @@ -202,6 +203,45 @@ static int cxl_mem_probe(struct device *dev)
>> return devm_add_action_or_reset(dev, enable_suspend, NULL);
>> }
>>
>> +static void __cxlmd_free(struct cxl_memdev *cxlmd)
>> +{
>> + cxlmd->cxlds->cxlmd = NULL;
>> + put_device(&cxlmd->dev);
>> + kfree(cxlmd);
>> +}
>> +
>> +DEFINE_FREE(cxlmd_free, struct cxl_memdev *, __cxlmd_free(_T))
>> +
>> +/**
>> + * devm_cxl_add_memdev - Add a CXL memory device
>> + * @host: devres alloc/release context and parent for the memdev
>> + * @cxlds: CXL device state to associate with the memdev
>> + *
>> + * Upon return the device will have had a chance to attach to the
>> + * cxl_mem driver, but may fail if the CXL topology is not ready
>> + * (hardware CXL link down, or software platform CXL root not attached)
>> + */
>> +struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>> + struct cxl_dev_state *cxlds)
>> +{
>> + struct cxl_memdev *cxlmd __free(cxlmd_free) = cxl_memdev_alloc(cxlds);
>> + int rc;
>> +
>> + if (IS_ERR(cxlmd))
>> + return cxlmd;
>> +
>> + rc = dev_set_name(&cxlmd->dev, "mem%d", cxlmd->id);
>> + if (rc)
>> + return ERR_PTR(rc);
>> +
>> + rc = devm_cxl_memdev_add_or_reset(host, cxlmd);
>> + if (rc)
>> + return ERR_PTR(rc);
> Is the reference tracking right here? If the above call fails
> then it is possible cxl_memdev_unregister() has been called
> or just cxl_memdev_shutdown().
>
> If nothing else (and I suspect there is worse but haven't
> counted references) that will set
> cxlmd->cxlds = NULL;
> s part of cxl_memdev_shutdown()
> The __cxlmd_free() then dereferences that and boom.
>
>
>> +
>> + return no_free_ptr(cxlmd);
>> +}
>> +EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, "CXL");
next prev parent reply other threads:[~2025-11-14 11:10 UTC|newest]
Thread overview: 54+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-10 15:36 [PATCH v20 00/22] Type2 device basic support alejandro.lucero-palau
2025-11-10 15:36 ` [PATCH v20 01/22] cxl/mem: Arrange for always-synchronous memdev attach alejandro.lucero-palau
2025-11-12 14:53 ` Jonathan Cameron
2025-11-14 11:10 ` Alejandro Lucero Palau [this message]
2025-11-14 15:24 ` Dave Jiang
2025-11-10 15:36 ` [PATCH v20 02/22] cxl/port: Arrange for always synchronous endpoint attach alejandro.lucero-palau
2025-11-12 14:57 ` Jonathan Cameron
2025-11-13 23:01 ` Dave Jiang
2025-11-10 15:36 ` [PATCH v20 03/22] cxl/mem: Introduce a memdev creation ->probe() operation alejandro.lucero-palau
2025-11-12 15:00 ` Jonathan Cameron
2025-11-13 23:02 ` Dave Jiang
2025-11-10 15:36 ` [PATCH v20 04/22] cxl: Add type2 device basic support alejandro.lucero-palau
2025-11-12 15:33 ` Jonathan Cameron
2025-11-15 8:11 ` Alejandro Lucero Palau
2025-11-10 15:36 ` [PATCH v20 05/22] sfc: add cxl support alejandro.lucero-palau
2025-11-10 15:36 ` [PATCH v20 06/22] cxl: Move pci generic code alejandro.lucero-palau
2025-11-12 15:41 ` Jonathan Cameron
2025-11-15 8:12 ` Alejandro Lucero Palau
2025-11-17 15:00 ` Dave Jiang
2025-11-18 14:52 ` Alejandro Lucero Palau
2025-11-14 0:25 ` Alison Schofield
2025-11-14 16:15 ` Dave Jiang
2025-11-15 8:16 ` Alejandro Lucero Palau
2025-11-16 2:07 ` Alison Schofield
2025-11-18 14:55 ` Alejandro Lucero Palau
2025-11-10 15:36 ` [PATCH v20 07/22] cxl/sfc: Map cxl component regs alejandro.lucero-palau
2025-11-12 15:45 ` Jonathan Cameron
2025-11-12 15:52 ` Jonathan Cameron
2025-11-10 15:36 ` [PATCH v20 08/22] cxl/sfc: Initialize dpa without a mailbox alejandro.lucero-palau
2025-11-12 15:52 ` Jonathan Cameron
2025-11-10 15:36 ` [PATCH v20 09/22] cxl: Prepare memdev creation for type2 alejandro.lucero-palau
2025-11-10 15:36 ` [PATCH v20 10/22] sfc: create type2 cxl memdev alejandro.lucero-palau
2025-11-10 15:36 ` [PATCH v20 11/22] cxl: Define a driver interface for HPA free space enumeration alejandro.lucero-palau
2025-11-12 16:10 ` Jonathan Cameron
2025-11-19 17:16 ` Alejandro Lucero Palau
2025-11-10 15:36 ` [PATCH v20 12/22] sfc: get root decoder alejandro.lucero-palau
2025-11-10 15:36 ` [PATCH v20 13/22] cxl: Define a driver interface for DPA allocation alejandro.lucero-palau
2025-11-10 15:36 ` [PATCH v20 14/22] sfc: get endpoint decoder alejandro.lucero-palau
2025-11-13 23:52 ` Dave Jiang
2025-11-10 15:36 ` [PATCH v20 15/22] cxl: Make region type based on endpoint type alejandro.lucero-palau
2025-11-10 15:36 ` [PATCH v20 16/22] cxl/region: Factor out interleave ways setup alejandro.lucero-palau
2025-11-10 15:36 ` [PATCH v20 17/22] cxl/region: Factor out interleave granularity setup alejandro.lucero-palau
2025-11-10 15:36 ` [PATCH v20 18/22] cxl: Allow region creation by type2 drivers alejandro.lucero-palau
2025-11-12 16:19 ` Jonathan Cameron
2025-11-19 18:31 ` Alejandro Lucero Palau
2025-11-14 0:00 ` Dave Jiang
2025-11-10 15:36 ` [PATCH v20 19/22] cxl: Avoid dax creation for accelerators alejandro.lucero-palau
2025-11-10 15:36 ` [PATCH v20 20/22] sfc: create cxl region alejandro.lucero-palau
2025-11-12 16:21 ` Jonathan Cameron
2025-11-14 0:02 ` Dave Jiang
2025-11-10 15:36 ` [PATCH v20 21/22] cxl: Add function for obtaining region range alejandro.lucero-palau
2025-11-10 15:36 ` [PATCH v20 22/22] sfc: support pio mapping based on cxl alejandro.lucero-palau
2025-11-12 16:24 ` Jonathan Cameron
2025-11-14 0:03 ` Dave Jiang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=374f8a2c-df06-4df9-8816-d91d3236cd58@amd.com \
--to=alucerop@amd.com \
--cc=alejandro.lucero-palau@amd.com \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=edward.cree@amd.com \
--cc=jonathan.cameron@huawei.com \
--cc=kuba@kernel.org \
--cc=linux-cxl@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).