public inbox for linux-cxl@vger.kernel.org
 help / color / mirror / Atom feed
From: Dave Jiang <dave.jiang@intel.com>
To: Gregory Price <gourry@gourry.net>,
	Dan Williams <dan.j.williams@intel.com>
Cc: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org,
	Smita.KoralahalliChannabasappa@amd.com,
	alison.schofield@intel.com, terry.bowman@amd.com,
	alejandro.lucero-palau@amd.com, linux-pci@vger.kernel.org,
	Jonathan.Cameron@huawei.com,
	Ben Cheatham <benjamin.cheatham@amd.com>,
	Alejandro Lucero <alucerop@amd.com>
Subject: Re: [PATCH v2 6/6] cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation
Date: Tue, 10 Feb 2026 08:05:21 -0700	[thread overview]
Message-ID: <a6b2ba96-ce22-426b-8277-8121ed77fa55@intel.com> (raw)
In-Reply-To: <aYq_zpNAgNgJY_P8@gourry-fedora-PF4VCD3F>



On 2/9/26 10:19 PM, Gregory Price wrote:
> On Mon, Dec 15, 2025 at 04:56:16PM -0800, Dan Williams wrote:
>> @@ -1081,6 +1093,18 @@ static struct cxl_memdev *cxl_memdev_autoremove(struct cxl_memdev *cxlmd)
>>  {
>>  	int rc;
>>  
>> +	/*
>> +	 * If @attach is provided fail if the driver is not attached upon
>> +	 * return. Note that failure here could be the result of a race to
>> +	 * teardown the CXL port topology. I.e. cxl_mem_probe() could have
>> +	 * succeeded and then cxl_mem unbound before the lock is acquired.
>> +	 */
>> +	guard(device)(&cxlmd->dev);
>> +	if (cxlmd->attach && !cxlmd->dev.driver) {
>> +		cxl_memdev_unregister(cxlmd);
>> +		return ERR_PTR(-ENXIO);
>> +	}
>> +
>>  	rc = devm_add_action_or_reset(cxlmd->cxlds->dev, cxl_memdev_unregister,
>>  				      cxlmd);
>>  	if (rc)
> 
> 
> This caused a deadlock when trying to unbind cxl_pci and bind a custom
> driver. 
> 
> The followwing analysis was produced by Chris Mason's kreview scripts,
> but it did resolve my deadlock, and the analysis seems pretty straight
> forward. 

The suggested resolution looks reasonable. Can you post a fix patch? Will need to apply it against 7.0-rc. It's too late to get it into the 7.0 merge PR.

DJ

> 
> Although it was likely caused by a qemu quirk I was dealing with at the
> same time (see the note about topology failing to enumerate).
> 
> ~Gregory
> 
> ---
> 
>     cxl_memdev_autoremove() takes device_lock(&cxlmd->dev) via guard(device)
>     and then calls cxl_memdev_unregister() when the attach callback was
>     provided but cxl_mem_probe() failed to bind.
> 
>     cxl_memdev_unregister() calls cdev_device_del() → device_del() →
>     bus_remove_device() → device_release_driver() which also takes
>     device_lock(), deadlocking the calling thread.
> 
>     This path is reached when a driver uses the @attach parameter to
>     devm_cxl_add_memdev() and the CXL topology fails to enumerate (e.g.
>     DVSEC range registers decode outside platform-defined CXL ranges,
>     causing the endpoint port probe to fail).
> 
>     Fix by using scoped_guard() and breaking out of the guard scope before
>     calling cxl_memdev_unregister(), so device_lock() is released first.
> 
> It suggested:
> ---
> 
> static struct cxl_memdev *cxl_memdev_autoremove(struct cxl_memdev *cxlmd)
> {
>         int rc;
> 
>         /*
>          * If @attach is provided fail if the driver is not attached upon
>          * return. Note that failure here could be the result of a race to
>          * teardown the CXL port topology. I.e. cxl_mem_probe() could have
>          * succeeded and then cxl_mem unbound before the lock is acquired.
>          *
>          * Check under device_lock but unregister outside of it:
>          * cxl_memdev_unregister() calls cdev_device_del() → device_del()
>          * → device_release_driver() which also takes device_lock().
>          */
>         scoped_guard(device, &cxlmd->dev) {
>                 if (cxlmd->attach && !cxlmd->dev.driver) {
>                         /* Drop lock before unregister to avoid deadlock */
>                         break;
>                 }
> 
>                 rc = devm_add_action_or_reset(cxlmd->cxlds->dev,
>                                               cxl_memdev_unregister, cxlmd);
>                 if (rc)
>                         return ERR_PTR(rc);
> 
>                 return cxlmd;
>         }
> 
>         /* Reached only when attach failed — lock is released */
>         cxl_memdev_unregister(cxlmd);
>         return ERR_PTR(-ENXIO);
> }
> 


      reply	other threads:[~2026-02-10 15:05 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-16  0:56 [PATCH v2 0/6] cxl: Initialization reworks to support Soft Reserve Recovery and Accelerator Memory Dan Williams
2025-12-16  0:56 ` [PATCH v2 1/6] cxl/mem: Fix devm_cxl_memdev_edac_release() confusion Dan Williams
2025-12-16  0:56 ` [PATCH v2 2/6] cxl/mem: Arrange for always-synchronous memdev attach Dan Williams
2025-12-16  0:56 ` [PATCH v2 3/6] cxl/port: Arrange for always synchronous endpoint attach Dan Williams
2025-12-16  0:56 ` [PATCH v2 4/6] cxl/mem: Convert devm_cxl_add_memdev() to scope-based-cleanup Dan Williams
2025-12-17 15:48   ` Jonathan Cameron
2025-12-16  0:56 ` [PATCH v2 5/6] cxl/mem: Drop @host argument to devm_cxl_add_memdev() Dan Williams
2025-12-16  0:56 ` [PATCH v2 6/6] cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation Dan Williams
2025-12-17 15:57   ` Jonathan Cameron
2025-12-17 16:27     ` dan.j.williams
2025-12-18 14:46       ` Jonathan Cameron
2025-12-18 19:50         ` dan.j.williams
2025-12-19 10:26           ` Jonathan Cameron
2025-12-20  7:31   ` Alejandro Lucero Palau
2026-02-10  5:19   ` Gregory Price
2026-02-10 15:05     ` Dave Jiang [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a6b2ba96-ce22-426b-8277-8121ed77fa55@intel.com \
    --to=dave.jiang@intel.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=Smita.KoralahalliChannabasappa@amd.com \
    --cc=alejandro.lucero-palau@amd.com \
    --cc=alison.schofield@intel.com \
    --cc=alucerop@amd.com \
    --cc=benjamin.cheatham@amd.com \
    --cc=dan.j.williams@intel.com \
    --cc=gourry@gourry.net \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=terry.bowman@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox