From: Benjamin Block <lkml@mageta.org>
To: Guenter Roeck <linux@roeck-us.net>,
	Niklas Schnelle <schnelle@linux.ibm.com>,
	Ionut Nechita <ionut.nechita@windriver.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>,
	Lukas Wunner <lukas@wunner.de>, Keith Busch <kbusch@kernel.org>,
	Gerd Bayer <gbayer@linux.ibm.com>,
	Matthew Rosato <mjrosato@linux.ibm.com>,
	Benjamin Block <bblock@linux.ibm.com>,
	Halil Pasic <pasic@linux.ibm.com>,
	Farhan Ali <alifm@linux.ibm.com>,
	Julian Ruess <julianr@linux.ibm.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 2/2] PCI/IOV: Fix race between SR-IOV enable/disable and hotplug
Date: Tue, 17 Mar 2026 10:01:49 +0100
Message-ID: <20260317090149.GA3835708@chlorum.ategam.org>
In-Reply-To: <0ca9e675-478c-411d-be32-e2d81439288f@roeck-us.net>

On Mon, Mar 16, 2026 at 06:57:53PM -0700, Guenter Roeck wrote:
> > diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> > index 7de5b18647beb69127ba11234fb9f1dec9b50540..4a659c34935e116dd6d0b4ce42ed12a1ba9418d1 100644
> > --- a/drivers/pci/iov.c
> > +++ b/drivers/pci/iov.c
> > @@ -495,7 +495,9 @@ static ssize_t sriov_numvfs_store(struct device *dev,
> >  
> >  	if (num_vfs == 0) {
> >  		/* disable VFs */
> > +		pci_lock_rescan_remove();
> >  		ret = pdev->driver->sriov_configure(pdev, 0);
> > +		pci_unlock_rescan_remove();
> >  		goto exit;
> >  	}
> >  
> > @@ -507,7 +509,9 @@ static ssize_t sriov_numvfs_store(struct device *dev,
> >  		goto exit;
> >  	}
> >  
> > +	pci_lock_rescan_remove();
> >  	ret = pdev->driver->sriov_configure(pdev, num_vfs);
> > +	pci_unlock_rescan_remove();
> >  	if (ret < 0)
> >  		goto exit;
> >  
>
> Google's experimental AI review agent provided the following feedback
> on this patch.
> 
>   Could this introduce an AB-BA deadlock between the device lock and the
>   rescan/remove lock?
> 
>   Earlier in sriov_numvfs_store(), device_lock(&pdev->dev) is acquired. The
>   patch then attempts to acquire pci_lock_rescan_remove() while holding the
>   device lock.
> 
>   However, during a hotplug removal of the PF (for example, via sysfs),
>   remove_store() first acquires pci_lock_rescan_remove() and subsequently
>   calls pci_stop_and_remove_bus_device_locked(). That path eventually calls
>   device_release_driver(), which attempts to acquire device_lock(&pdev->dev).
> 
>   If sriov_numvfs_store() and a concurrent removal of the PF race, it appears
>   they could deadlock waiting on each other's locks.
> 
> The actual call sequence (at least in v6.12.y, where this patch was
> backported to) is as follows.
>   remove_store()
>   -> pci_stop_and_remove_bus_device_locked()
>      -> pci_lock_rescan_remove()
>         -> pci_stop_and_remove_bus_device()
>            -> pci_stop_bus_device()
>            -> pci_remove_bus_device()
>               -> pci_remove_bus()
>                  -> device_unregister()
>                     -> device_del()
>                        -> device_lock()
> 
> I don't claim to fully understand the code, but the AI does seem to have a
> point. Please let me know if the AI analysis is correct or if it misses
> something.

Ugh. Well. That sucks. This lock is a sheer endless well of joy.
No, well, I think the AI is correct.

We've since discussed moving away from that patch again, or rather,
improving it further by applying this on top:
    https://lore.kernel.org/linux-pci/20260310074303.17480-2-ionut.nechita@windriver.com/

That change improves some scenarios, such as driver core unbinds.
But looking at it from this angle, it suffers from the same AB-BA
deadlock.

  remove_store()
    |
    +- pci_stop_and_remove_bus_device_locked()
        |
        +- takes: pci_rescan_remove_lock                    # XXX
        |
        +- pci_stop_and_remove_bus_device()
            |
            +- pci_stop_bus_device()
                |
                +- pci_stop_dev()
                    |
                    +- device_release_driver()
                        |
                        +- device_release_driver_internal()
                            |
                            +- __device_driver_lock()
                                |
                                +- device_lock() - takes: pdev->dev

  unbind_store()
    |
    +- device_driver_detach()
        |
        +- device_release_driver_internal()
            |
            +- __device_driver_lock() - takes: pdev->dev    # XXX
            |
            +- __device_release_driver()
                |
                +- device_remove()
                    |
                    +- pci_device_remove()
                        |
                        +- vfio_pci_remove()
                            |
                            +- vfio_pci_core_sriov_configure()
                                |
                                +- pci_disable_sriov()
                                    |
                                    +- sriov_disable()
                                        |
                                        +- sriov_del_vfs()
                                            |
                                            +- takes: pci_rescan_remove_lock
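
The two chains above take the same two locks in opposite order. As a rough
userspace illustration of that inversion (not kernel code, and far simpler
than lockdep), the sketch below records which lock was taken while which
other lock was held, and flags a path that reverses an order seen earlier.
The names `lock_id`, `order_graph` and `record_path` are made up for this
example; it assumes strictly nested locking and only catches direct two-lock
inversions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two locks discussed in this thread. */
enum lock_id { RESCAN_REMOVE_LOCK, DEVICE_LOCK, NR_LOCKS };

/* order[a][b] is true once lock b has been taken while lock a was held. */
struct order_graph {
	bool order[NR_LOCKS][NR_LOCKS];
};

/*
 * Record the acquisition sequence of one code path. Every earlier lock in
 * seq is assumed to still be held when a later one is taken. Returns false
 * if the path takes two locks in the reverse of an order recorded before.
 */
static bool record_path(struct order_graph *g, const enum lock_id *seq,
			size_t n)
{
	bool ok = true;

	for (size_t i = 0; i < n; i++) {
		assert(seq[i] < NR_LOCKS);
		for (size_t j = i + 1; j < n; j++) {
			assert(seq[j] < NR_LOCKS);
			if (g->order[seq[j]][seq[i]])
				ok = false; /* reverse order seen earlier */
			g->order[seq[i]][seq[j]] = true;
		}
	}
	return ok;
}
```

Starting from a zeroed `struct order_graph`, recording
{ RESCAN_REMOVE_LOCK, DEVICE_LOCK } for the remove_store() chain succeeds,
and then recording { DEVICE_LOCK, RESCAN_REMOVE_LOCK } for the unbind_store()
chain returns false.
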

And I can't see any way to reverse the lock order in the unbind_store()
case, since everything above pci_device_remove() is owned by the driver
core itself. I don't see a way for us to put a hook in there to take
`pci_rescan_remove_lock`.

It's similar to what I'm trying to fix in:
    https://lore.kernel.org/linux-pci/354b9e4a54ced67f3c89df198041df19434fe4c8.1773235561.git.bblock@linux.ibm.com/
Taking `pci_rescan_remove_lock` inside the release functions is fraught
with traps, especially with SR-IOV in the mix.

One quick idea: can we somehow unbind the device from any device driver
in remove_store() before calling
pci_stop_and_remove_bus_device_locked()?  That way we would not have any
SR-IOV functions attached anymore at the point where we remove the PF,
since the device drivers are expected to clean them up.

-- 
Best Regards und Beste Grüße, Benjamin Block
               PGP KeyID: 9610 2BB8 2E17 6F65 2362  6DF2 46E0 4E05 67A3 2E9E