Linux PCI subsystem development
From: sashiko-bot@kernel.org
To: "Srirangan Madhavan" <smadhavan@nvidia.com>
Cc: linux-pci@vger.kernel.org, linux-cxl@vger.kernel.org
Subject: Re: [PATCH v7 08/11] cxl: Coordinate sibling functions for CXL reset
Date: Tue, 23 Jun 2026 03:42:37 +0000
Message-ID: <20260623034237.C8F3F1F000E9@smtp.kernel.org>
In-Reply-To: <20260623032453.3404772-9-smadhavan@nvidia.com>

Thank you for your contribution! Sashiko AI review found 2 potential issues to consider:
- [High] AB-BA deadlock between `device_lock` and `pci_bus_sem` during CXL reset
- [Low] Mixed usage of goto-based error handling and scope-based cleanup helpers (`scoped_guard`)
--

commit e204e091df9c0890781be70b3733c1aeae326d91
Author: Srirangan Madhavan <smadhavan@nvidia.com>

cxl: Coordinate sibling functions for CXL reset

CXL Device Reset affects all CXL.cache and CXL.mem functions in the reset
scope. Lock same-scope siblings with pci_dev_trylock(), save/disable them,
drain pending transactions, and hold IOMMU reset blocks until recovery.

> diff --git a/drivers/cxl/core/reset.c b/drivers/cxl/core/reset.c
> index 1ae714a3595cb..69bcfab898589 100644
> --- a/drivers/cxl/core/reset.c
> +++ b/drivers/cxl/core/reset.c

[ ... ]

> @@ -944,11 +1045,24 @@ int cxl_reset_function(struct pci_dev *pdev, bool probe)

[Severity: High]
Could this lead to an AB-BA deadlock between the device lock and pci_bus_sem?

When the PCI core invokes a device-specific reset method, it holds the
target device's lock (device_lock(&pdev->dev)). If cxl_reset_function()
then calls cxl_reset_collect_siblings(), which uses pci_walk_bus(), it
will try to take pci_bus_sem for read while still holding device_lock.

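The standard PCI lock acquisition order is pci_bus_sem followed by
device_lock (for example, pcie_do_recovery() walks the bus under
pci_bus_sem and takes each device's lock from the walk callback). With
the order inverted here, a concurrent bus walker can hold pci_bus_sem
for read while waiting for this device's lock. If another thread (like
hotplug) then requests a write lock on pci_bus_sem, the read lock
requested by the reset path queues behind that writer, and none of the
involved threads can make progress.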
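The ordering concern can be illustrated with a small userspace model. This is only a sketch, loosely in the spirit of a lock-order checker: the lock classes, struct names, and helper functions below are hypothetical stand-ins and are not kernel APIs or anything taken from the patch. It records which lock class was taken while another was held and reports when a later path uses the reverse order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes standing in for the two kernel locks. */
enum lock_class { LOCK_PCI_BUS_SEM, LOCK_DEVICE, LOCK_CLASS_COUNT };

struct order_tracker {
	/* seen[a][b] is true once some path took b while holding a. */
	bool seen[LOCK_CLASS_COUNT][LOCK_CLASS_COUNT];
};

struct lock_path {
	bool held[LOCK_CLASS_COUNT];	/* locks held by this modeled path */
};

/*
 * Record that the path acquires 'next'. Returns false if some earlier
 * path already took a currently held lock while holding 'next', that
 * is, if the two paths disagree on ordering.
 */
static bool track_acquire(struct order_tracker *t, struct lock_path *p,
			  enum lock_class next)
{
	bool ok = true;

	assert(next < LOCK_CLASS_COUNT);
	for (int held = 0; held < LOCK_CLASS_COUNT; held++) {
		if (!p->held[held] || held == (int)next)
			continue;
		if (t->seen[next][held])
			ok = false;	/* reverse order seen before */
		t->seen[held][next] = true;
	}
	p->held[next] = true;
	return ok;
}

static void track_release(struct lock_path *p, enum lock_class c)
{
	p->held[c] = false;
}

/* Models a bus walk: pci_bus_sem first, then the device lock. */
static bool model_bus_walk_path(struct order_tracker *t)
{
	struct lock_path p = { { false } };
	bool ok = track_acquire(t, &p, LOCK_PCI_BUS_SEM);

	ok = track_acquire(t, &p, LOCK_DEVICE) && ok;
	track_release(&p, LOCK_DEVICE);
	track_release(&p, LOCK_PCI_BUS_SEM);
	return ok;
}

/* Models the reset path in question: device lock, then pci_bus_sem. */
static bool model_reset_path(struct order_tracker *t)
{
	struct lock_path p = { { false } };
	bool ok = track_acquire(t, &p, LOCK_DEVICE);

	ok = track_acquire(t, &p, LOCK_PCI_BUS_SEM) && ok;
	track_release(&p, LOCK_PCI_BUS_SEM);
	track_release(&p, LOCK_DEVICE);
	return ok;
}
```

In this model, either path is accepted on its own. The inversion is reported only once both orders have been recorded against the same tracker, which matches the AB-BA pattern described above.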

>  	if (rc)
>  		goto out;
>  
> +	rc = cxl_pci_functions_lock(&ctx);
> +	if (rc)
> +		goto out_unlock;
> +
> +	rc = cxl_pci_functions_reset_prepare(&ctx);
> +	if (rc)
> +		goto out_functions_done;
> +
>  	scoped_guard(rwsem_write, &cxl_rwsem.region) {
> -		rc = cxl_hdm_ranges_prepare(&range_ctx, pdev);
> +		rc = cxl_hdm_ranges_prepare(&range_ctx, &ctx);
>  		if (!rc)
>  			rc = cxl_reset_execute(pdev, dvsec);
>  	}
> +
> +out_functions_done:

[Severity: Low]
Does this mix goto-based error handling with scope-based cleanup helpers?

According to the cleanup guidelines in include/linux/cleanup.h, functions
should not mix goto-based cleanup labels (like out_functions_done) with
scope-based helpers like scoped_guard(). Either all resources needing
cleanup should be converted to scope-based cleanup, or none of them
should use it.
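For readers less familiar with the scope-based style, here is a minimal userspace sketch of a function where every resource is released on scope exit, so early returns need no labels. It uses the GCC/Clang cleanup variable attribute, which the kernel's cleanup helpers are built on. The demo_lock type, the DEMO_GUARD macro, and demo_reset() are hypothetical names for illustration only and do not correspond to functions in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a lock; counters exist only for the demo. */
struct demo_lock {
	int depth;	/* current hold count */
	int releases;	/* total number of releases so far */
};

static void demo_lock_acquire(struct demo_lock *l)
{
	l->depth++;
}

static void demo_lock_release(struct demo_lock *l)
{
	assert(l->depth > 0);
	l->depth--;
	l->releases++;
}

/* Cleanup handler: the compiler passes a pointer to the guard variable. */
static void demo_guard_exit(struct demo_lock **lp)
{
	if (*lp)
		demo_lock_release(*lp);
}

/* Acquire 'lock' now and release it when 'name' goes out of scope. */
#define DEMO_GUARD(name, lock)						\
	struct demo_lock *name						\
		__attribute__((cleanup(demo_guard_exit))) = (lock);	\
	demo_lock_acquire(name)

/*
 * All resources are scope-managed, so each early return releases
 * exactly what was acquired so far, in reverse order of acquisition.
 * 'fail_step' simulates a failure after the first or second lock.
 */
static int demo_reset(struct demo_lock *functions, struct demo_lock *region,
		      int fail_step)
{
	DEMO_GUARD(fn_guard, functions);
	if (fail_step == 1)
		return -1;

	DEMO_GUARD(region_guard, region);
	if (fail_step == 2)
		return -2;

	return 0;
}
```

Whether the reset path in this patch is better served by converting everything to guards or by keeping plain goto labels throughout is a judgment call for the author. The sketch only shows what the fully scope-based form looks like.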

> +	cxl_pci_functions_reset_done(&ctx);
> +out_unlock:
> +	cxl_pci_functions_unlock(&ctx);
>  out:
>  	cxl_hdm_range_context_destroy(&range_ctx);
>  	cxl_reset_context_destroy(&ctx);

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260623032453.3404772-1-smadhavan@nvidia.com?part=8

Thread overview: 23+ messages
2026-06-23  3:24 [PATCH v7 00/11] PCI/CXL: Add CXL reset support for Type 2 devices Srirangan Madhavan
2026-06-23  3:24 ` [PATCH v7 01/11] cxl: Split decoder programming into a reusable helper Srirangan Madhavan
2026-06-23  3:42   ` sashiko-bot
2026-06-23  3:24 ` [PATCH v7 02/11] cxl: Cache decoder settings on PCI devices Srirangan Madhavan
2026-06-23  3:42   ` sashiko-bot
2026-06-23  3:24 ` [PATCH v7 03/11] cxl: Cache endpoint decoder settings during PCI enumeration Srirangan Madhavan
2026-06-23  3:45   ` sashiko-bot
2026-06-23  3:24 ` [PATCH v7 04/11] PCI: Export pci_dev_save_and_disable() and pci_dev_restore() Srirangan Madhavan
2026-06-23  3:34   ` sashiko-bot
2026-06-23  3:24 ` [PATCH v7 05/11] cxl: Add CXL Device Reset helper Srirangan Madhavan
2026-06-23  3:36   ` sashiko-bot
2026-06-23  3:24 ` [PATCH v7 06/11] cxl: Validate HDM ranges before CXL reset Srirangan Madhavan
2026-06-23  3:33   ` sashiko-bot
2026-06-23  3:24 ` [PATCH v7 07/11] PCI/cxl: Discover the CXL reset scope Srirangan Madhavan
2026-06-23  3:34   ` sashiko-bot
2026-06-23  3:24 ` [PATCH v7 08/11] cxl: Coordinate sibling functions for CXL reset Srirangan Madhavan
2026-06-23  3:42   ` sashiko-bot [this message]
2026-06-23  3:24 ` [PATCH v7 09/11] cxl: Restore CXL HDM state after PCI reset Srirangan Madhavan
2026-06-23  3:39   ` sashiko-bot
2026-06-23  3:24 ` [PATCH v7 10/11] PCI/cxl: Expose CXL Reset as a PCI reset method Srirangan Madhavan
2026-06-23  3:47   ` sashiko-bot
2026-06-23  3:24 ` [PATCH v7 11/11] Documentation/ABI: Document CXL Reset " Srirangan Madhavan
2026-06-23  3:35   ` sashiko-bot