* [PATCH v10 0/2] PCI/IOV: Fix SR-IOV locking races and AB-BA deadlock
@ 2026-03-18 21:03 Ionut Nechita (Wind River)
2026-03-18 21:03 ` [PATCH v10 1/2] PCI/IOV: Make pci_lock_rescan_remove() reentrant and protect sriov_add_vfs/sriov_del_vfs Ionut Nechita (Wind River)
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Ionut Nechita (Wind River) @ 2026-03-18 21:03 UTC (permalink / raw)
To: linux-pci, bhelgaas
Cc: helgaas, sebott, schnelle, bblock, lkml, alifm, julianr, dtatulea,
mani, lukas, kbusch, linux, ionut_n2001, sunlightlinux,
linux-kernel, stable, intel-xe, matthew.brost, michal.wajdeczko,
piotr.piorkowski, Ionut Nechita
From: Ionut Nechita <ionut.nechita@windriver.com>
Hi Bjorn,
This is v10 of the fix for the SR-IOV race between driver .remove()
and concurrent hotplug events. v10 adds a second patch to fix the
AB-BA deadlock between device_lock and pci_rescan_remove_lock that
was reported by Guenter Roeck (via Google's AI review agent) and
confirmed by Benjamin Block.
The AB-BA deadlock:
CPU0 (remove_store) CPU1 (unbind_store)
-------------------- --------------------
pci_lock_rescan_remove()
device_lock()
driver .remove()
sriov_del_vfs()
pci_lock_rescan_remove() <-- WAITS
pci_stop_bus_device()
device_release_driver()
device_lock() <-- WAITS
Patch 2/2 fixes this by calling device_release_driver() in
remove_store() before pci_stop_and_remove_bus_device_locked(), so
that the driver is already unbound when pci_rescan_remove_lock is
acquired. Both paths then take locks in the same order: device_lock
first, then pci_rescan_remove_lock.
Note: the concurrent unbind_store + hotplug-event case (where the
hotplug handler takes pci_rescan_remove_lock before device_lock)
remains a known limitation. This is a pre-existing issue that
Benjamin Block is addressing separately in:
https://lore.kernel.org/linux-pci/354b9e4a54ced67f3c89df198041df19434fe4c8.1773235561.git.bblock@linux.ibm.com/
Changes since v9 (Mar 10):
- NEW patch 2/2: fix AB-BA deadlock in remove_store() by calling
device_release_driver() before pci_stop_and_remove_bus_device_locked(),
as suggested by Benjamin Block (addresses Guenter Roeck's report)
- Patch 1/2 unchanged from v9
Changes since v8 (Mar 9):
- Added Reviewed-by from Niklas Schnelle (IBM) and Tested-by (s390)
- Added Fixes tags for the three related commits
- Removed rescan/remove locking from sriov_numvfs_store() since
locking is now handled in sriov_add_vfs() and sriov_del_vfs()
- Rebased on linux-next (20260309)
This race has been independently observed by multiple organizations:
- IBM (s390 platform-generated hot-unplug events racing with
sriov_del_vfs during PF driver unload)
- NVIDIA (tested by Dragos Tatulea in earlier versions)
- Intel (xe driver hitting lockdep warnings and deadlocks when
calling pci_disable_sriov from .remove)
- Wind River (original reporter and patch author)
Test environment:
- Tested on s390 by Benjamin Block and Niklas Schnelle (IBM)
- Tested on x86_64 with Intel and NVIDIA SR-IOV devices (earlier
versions)
Based on linux-next (next-20260318).
Link: https://lore.kernel.org/linux-pci/20260214193235.262219-3-ionut.nechita@windriver.com/ [v1]
Link: https://lore.kernel.org/linux-pci/20260219212648.82606-1-ionut.nechita@windriver.com/ [v2]
Link: https://lore.kernel.org/lkml/20260225202434.18737-1-ionut.nechita@windriver.com/ [v3]
Link: https://lore.kernel.org/linux-pci/20260228120138.51197-2-ionut.nechita@windriver.com/ [v4]
Link: https://lore.kernel.org/linux-pci/20260303080903.28693-1-ionut.nechita@windriver.com/ [v5]
Link: https://lore.kernel.org/linux-pci/20260306082108.17322-1-ionut.nechita@windriver.com/ [v6]
Link: https://lore.kernel.org/linux-pci/20260308135352.80346-1-ionut.nechita@windriver.com/ [v7]
Link: https://lore.kernel.org/linux-pci/20260309194920.16459-1-ionut.nechita@windriver.com/ [v8]
Link: https://lore.kernel.org/linux-pci/20260310074303.17480-1-ionut.nechita@windriver.com/ [v9]
Ionut Nechita (2):
PCI/IOV: Make pci_lock_rescan_remove() reentrant and protect
sriov_add_vfs/sriov_del_vfs
PCI: Fix AB-BA deadlock between device_lock and
pci_rescan_remove_lock in remove_store
drivers/pci/iov.c | 9 +++++----
drivers/pci/pci-sysfs.c | 20 +++++++++++++++++++-
drivers/pci/probe.c | 11 +++++++++--
3 files changed, 33 insertions(+), 7 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v10 1/2] PCI/IOV: Make pci_lock_rescan_remove() reentrant and protect sriov_add_vfs/sriov_del_vfs
2026-03-18 21:03 [PATCH v10 0/2] PCI/IOV: Fix SR-IOV locking races and AB-BA deadlock Ionut Nechita (Wind River)
@ 2026-03-18 21:03 ` Ionut Nechita (Wind River)
2026-03-18 21:03 ` [PATCH v10 2/2] PCI: Fix AB-BA deadlock between device_lock and pci_rescan_remove_lock in remove_store Ionut Nechita (Wind River)
2026-03-19 12:31 ` [PATCH v10 0/2] PCI/IOV: Fix SR-IOV locking races and AB-BA deadlock Niklas Schnelle
2 siblings, 0 replies; 7+ messages in thread
From: Ionut Nechita (Wind River) @ 2026-03-18 21:03 UTC (permalink / raw)
To: linux-pci, bhelgaas
Cc: helgaas, sebott, schnelle, bblock, lkml, alifm, julianr, dtatulea,
mani, lukas, kbusch, linux, ionut_n2001, sunlightlinux,
linux-kernel, stable, intel-xe, matthew.brost, michal.wajdeczko,
piotr.piorkowski, Ionut Nechita (Wind River)
After reverting commit 05703271c3cd ("PCI/IOV: Add PCI rescan-remove
locking when enabling/disabling SR-IOV") and moving the lock to
sriov_numvfs_store(), the path through driver .remove() (e.g. rmmod,
or manual unbind) that calls pci_disable_sriov() directly remains
unprotected against concurrent hotplug events. This affects any SR-IOV
capable driver that calls pci_disable_sriov() from its .remove()
callback (i40e, ice, mlx5, bnxt, etc.).
On s390, platform-generated hot-unplug events for VFs can race with
sriov_del_vfs() when a PF driver is being unloaded. The platform event
handler takes pci_rescan_remove_lock, but sriov_del_vfs() does not,
leading to double removal and list corruption.
We cannot use a plain mutex_lock() here because sriov_del_vfs() may also
be called from paths that already hold pci_rescan_remove_lock (e.g.
remove_store -> pci_stop_and_remove_bus_device_locked, or
sriov_numvfs_store with the lock taken by the previous patch). Using
mutex_lock() in those cases would deadlock.
Make pci_lock_rescan_remove() itself reentrant using mutex_get_owner()
and a reentrant depth counter, as suggested by Lukas Wunner and
Benjamin Block, since these recursive locking scenarios exist elsewhere
in the PCI subsystem:
- If the lock is already held by the current task (checked via
mutex_get_owner()): increments the reentrant counter and returns
without re-acquiring, avoiding deadlock.
- If the lock is held by another task: blocks until the lock is
released, providing complete serialization.
- If the lock is not held: acquires the mutex normally.
pci_unlock_rescan_remove() decrements the reentrant counter if it is
non-zero, otherwise releases the mutex.
This approach keeps the API unchanged: callers simply pair lock/unlock
calls without needing to track any return value or use separate
reentrant variants.
Add pci_lock_rescan_remove()/pci_unlock_rescan_remove() calls to
sriov_add_vfs() and sriov_del_vfs() to protect VF addition and
removal against concurrent hotplug events.
Remove the rescan/remove locking from sriov_numvfs_store() that was
introduced by commit a5338e365c45 ("PCI/IOV: Fix race between SR-IOV
enable/disable and hotplug"), since the locking is now handled directly
in sriov_add_vfs() and sriov_del_vfs() where it is actually needed,
reducing the lock scope.
Fixes: 18f9e9d150fc ("PCI/IOV: Factor out sriov_add_vfs()")
Fixes: 05703271c3cd ("PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV")
Fixes: a5338e365c45 ("PCI/IOV: Fix race between SR-IOV enable/disable and hotplug")
Cc: stable@vger.kernel.org
Suggested-by: Lukas Wunner <lukas@wunner.de>
Suggested-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Tested-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com> # s390
Signed-off-by: Ionut Nechita <ionut.nechita@windriver.com>
---
drivers/pci/iov.c | 9 +++++----
drivers/pci/probe.c | 11 +++++++++--
2 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 91ac4e37ecb9..7ed902539051 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -495,9 +495,7 @@ static ssize_t sriov_numvfs_store(struct device *dev,
if (num_vfs == 0) {
/* disable VFs */
- pci_lock_rescan_remove();
ret = pdev->driver->sriov_configure(pdev, 0);
- pci_unlock_rescan_remove();
goto exit;
}
@@ -509,9 +507,7 @@ static ssize_t sriov_numvfs_store(struct device *dev,
goto exit;
}
- pci_lock_rescan_remove();
ret = pdev->driver->sriov_configure(pdev, num_vfs);
- pci_unlock_rescan_remove();
if (ret < 0)
goto exit;
@@ -633,15 +629,18 @@ static int sriov_add_vfs(struct pci_dev *dev, u16 num_vfs)
if (dev->no_vf_scan)
return 0;
+ pci_lock_rescan_remove();
for (i = 0; i < num_vfs; i++) {
rc = pci_iov_add_virtfn(dev, i);
if (rc)
goto failed;
}
+ pci_unlock_rescan_remove();
return 0;
failed:
while (i--)
pci_iov_remove_virtfn(dev, i);
+ pci_unlock_rescan_remove();
return rc;
}
@@ -766,8 +765,10 @@ static void sriov_del_vfs(struct pci_dev *dev)
struct pci_sriov *iov = dev->sriov;
int i;
+ pci_lock_rescan_remove();
for (i = 0; i < iov->num_VFs; i++)
pci_iov_remove_virtfn(dev, i);
+ pci_unlock_rescan_remove();
}
static void sriov_disable(struct pci_dev *dev)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index bccc7a4bdd79..ce4d351b5aa2 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -3509,16 +3509,23 @@ EXPORT_SYMBOL_GPL(pci_rescan_bus);
* routines should always be executed under this mutex.
*/
DEFINE_MUTEX(pci_rescan_remove_lock);
+static size_t pci_rescan_remove_reentrant_count;
void pci_lock_rescan_remove(void)
{
- mutex_lock(&pci_rescan_remove_lock);
+ if (mutex_get_owner(&pci_rescan_remove_lock) == (unsigned long)current)
+ pci_rescan_remove_reentrant_count++;
+ else
+ mutex_lock(&pci_rescan_remove_lock);
}
EXPORT_SYMBOL_GPL(pci_lock_rescan_remove);
void pci_unlock_rescan_remove(void)
{
- mutex_unlock(&pci_rescan_remove_lock);
+ if (pci_rescan_remove_reentrant_count > 0)
+ pci_rescan_remove_reentrant_count--;
+ else
+ mutex_unlock(&pci_rescan_remove_lock);
}
EXPORT_SYMBOL_GPL(pci_unlock_rescan_remove);
--
2.53.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v10 2/2] PCI: Fix AB-BA deadlock between device_lock and pci_rescan_remove_lock in remove_store
2026-03-18 21:03 [PATCH v10 0/2] PCI/IOV: Fix SR-IOV locking races and AB-BA deadlock Ionut Nechita (Wind River)
2026-03-18 21:03 ` [PATCH v10 1/2] PCI/IOV: Make pci_lock_rescan_remove() reentrant and protect sriov_add_vfs/sriov_del_vfs Ionut Nechita (Wind River)
@ 2026-03-18 21:03 ` Ionut Nechita (Wind River)
2026-03-19 12:31 ` [PATCH v10 0/2] PCI/IOV: Fix SR-IOV locking races and AB-BA deadlock Niklas Schnelle
2 siblings, 0 replies; 7+ messages in thread
From: Ionut Nechita (Wind River) @ 2026-03-18 21:03 UTC (permalink / raw)
To: linux-pci, bhelgaas
Cc: helgaas, sebott, schnelle, bblock, lkml, alifm, julianr, dtatulea,
mani, lukas, kbusch, linux, ionut_n2001, sunlightlinux,
linux-kernel, stable, intel-xe, matthew.brost, michal.wajdeczko,
piotr.piorkowski, Ionut Nechita (Wind River)
remove_store() calls pci_stop_and_remove_bus_device_locked() which
takes pci_rescan_remove_lock first, then device_lock during driver
release. Meanwhile, unbind_store() takes device_lock first (via
device_driver_detach), and the driver's .remove() callback may call
pci_disable_sriov() -> sriov_del_vfs() -> pci_lock_rescan_remove().
This creates an AB-BA deadlock:
CPU0 (remove_store) CPU1 (unbind_store)
-------------------- --------------------
pci_lock_rescan_remove()
device_lock()
driver .remove()
sriov_del_vfs()
pci_lock_rescan_remove() <-- WAITS
pci_stop_bus_device()
device_release_driver()
device_lock() <-- WAITS
Fix this by calling device_release_driver() in remove_store() before
pci_stop_and_remove_bus_device_locked(). This ensures the driver's
.remove() callback (including any SR-IOV VF cleanup) runs to completion
before pci_rescan_remove_lock is acquired, making both paths take locks
in the same order: device_lock first, then pci_rescan_remove_lock.
After device_release_driver() returns, the driver is already unbound,
so the subsequent device_release_driver() call inside
pci_stop_and_remove_bus_device_locked() becomes a no-op.
Fixes: a5338e365c45 ("PCI/IOV: Fix race between SR-IOV enable/disable and hotplug")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/linux-pci/0ca9e675-478c-411d-be32-e2d81439288f@roeck-us.net/
Reported-by: Benjamin Block <bblock@linux.ibm.com>
Closes: https://lore.kernel.org/linux-pci/20260317090149.GA3835708@chlorum.ategam.org/
Suggested-by: Benjamin Block <bblock@linux.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ionut Nechita <ionut.nechita@windriver.com>
---
drivers/pci/pci-sysfs.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index c7780adf564e..e94ea71a4eb8 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -521,8 +521,26 @@ static ssize_t remove_store(struct device *dev, struct device_attribute *attr,
if (kstrtoul(buf, 0, &val) < 0)
return -EINVAL;
- if (val && device_remove_file_self(dev, attr))
+ if (val && device_remove_file_self(dev, attr)) {
+ /*
+ * Unbind the driver before removing the device to avoid
+ * an AB-BA deadlock between device_lock and
+ * pci_rescan_remove_lock. Without this, remove_store
+ * takes pci_rescan_remove_lock first (via
+ * pci_stop_and_remove_bus_device_locked) and then
+ * device_lock during driver release, while a concurrent
+ * unbind_store (or sriov_numvfs_store) takes device_lock
+ * first and then pci_rescan_remove_lock (via
+ * sriov_del_vfs), creating a circular dependency.
+ *
+ * By unbinding first, the driver's .remove() callback
+ * (including any SR-IOV VF cleanup) completes before
+ * pci_rescan_remove_lock is acquired, ensuring both
+ * paths take locks in the same order.
+ */
+ device_release_driver(dev);
pci_stop_and_remove_bus_device_locked(to_pci_dev(dev));
+ }
return count;
}
static DEVICE_ATTR_IGNORE_LOCKDEP(remove, 0220, NULL,
--
2.53.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v10 0/2] PCI/IOV: Fix SR-IOV locking races and AB-BA deadlock
2026-03-18 21:03 [PATCH v10 0/2] PCI/IOV: Fix SR-IOV locking races and AB-BA deadlock Ionut Nechita (Wind River)
2026-03-18 21:03 ` [PATCH v10 1/2] PCI/IOV: Make pci_lock_rescan_remove() reentrant and protect sriov_add_vfs/sriov_del_vfs Ionut Nechita (Wind River)
2026-03-18 21:03 ` [PATCH v10 2/2] PCI: Fix AB-BA deadlock between device_lock and pci_rescan_remove_lock in remove_store Ionut Nechita (Wind River)
@ 2026-03-19 12:31 ` Niklas Schnelle
2026-03-19 19:25 ` Guenter Roeck
2026-03-19 20:27 ` Ionut Nechita (Wind River)
2 siblings, 2 replies; 7+ messages in thread
From: Niklas Schnelle @ 2026-03-19 12:31 UTC (permalink / raw)
To: Ionut Nechita (Wind River), linux-pci, bhelgaas
Cc: helgaas, sebott, bblock, lkml, alifm, julianr, dtatulea, mani,
lukas, kbusch, linux, ionut_n2001, sunlightlinux, linux-kernel,
stable, intel-xe, matthew.brost, michal.wajdeczko,
piotr.piorkowski
On Wed, 2026-03-18 at 23:03 +0200, Ionut Nechita (Wind River) wrote:
> From: Ionut Nechita <ionut.nechita@windriver.com>
>
> Hi Bjorn,
>
> This is v10 of the fix for the SR-IOV race between driver .remove()
> and concurrent hotplug events. v10 adds a second patch to fix the
> AB-BA deadlock between device_lock and pci_rescan_remove_lock that
> was reported by Guenter Roeck (via Google's AI review agent) and
> confirmed by Benjamin Block.
>
> The AB-BA deadlock:
>
> CPU0 (remove_store) CPU1 (unbind_store)
> -------------------- --------------------
> pci_lock_rescan_remove()
> device_lock()
> driver .remove()
> sriov_del_vfs()
> pci_lock_rescan_remove() <-- WAITS
> pci_stop_bus_device()
> device_release_driver()
> device_lock() <-- WAITS
>
> Patch 2/2 fixes this by calling device_release_driver() in
> remove_store() before pci_stop_and_remove_bus_device_locked(), so
> that the driver is already unbound when pci_rescan_remove_lock is
> acquired. Both paths then take locks in the same order: device_lock
> first, then pci_rescan_remove_lock.
>
> Note: the concurrent unbind_store + hotplug-event case (where the
> hotplug handler takes pci_rescan_remove_lock before device_lock)
> remains a known limitation. This is a pre-existing issue that
> Benjamin Block is addressing separately in:
> https://lore.kernel.org/linux-pci/354b9e4a54ced67f3c89df198041df19434fe4c8.1773235561.git.bblock@linux.ibm.com/
>
--- snip ---
>
> Ionut Nechita (2):
> PCI/IOV: Make pci_lock_rescan_remove() reentrant and protect
> sriov_add_vfs/sriov_del_vfs
> PCI: Fix AB-BA deadlock between device_lock and
> pci_rescan_remove_lock in remove_store
>
> drivers/pci/iov.c | 9 +++++----
> drivers/pci/pci-sysfs.c | 20 +++++++++++++++++++-
> drivers/pci/probe.c | 11 +++++++++--
> 3 files changed, 33 insertions(+), 7 deletions(-)
>
> --
> 2.43.0
Hi Ionut,
For your awareness, I saw that this series has some findings on
Google's new Sashiko AI reviewing tool[0]. At a quick glance the
findings seem like at least reasonable concerns to me. I'm still
looking at this independently also of course.
Thanks,
Niklas
[0]
https://sashiko.dev/#/patchset/20260318210316.61975-1-ionut.nechita%40windriver.com
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v10 0/2] PCI/IOV: Fix SR-IOV locking races and AB-BA deadlock
2026-03-19 12:31 ` [PATCH v10 0/2] PCI/IOV: Fix SR-IOV locking races and AB-BA deadlock Niklas Schnelle
@ 2026-03-19 19:25 ` Guenter Roeck
2026-03-19 20:27 ` Ionut Nechita (Wind River)
1 sibling, 0 replies; 7+ messages in thread
From: Guenter Roeck @ 2026-03-19 19:25 UTC (permalink / raw)
To: Niklas Schnelle, Ionut Nechita (Wind River), linux-pci, bhelgaas
Cc: helgaas, sebott, bblock, lkml, alifm, julianr, dtatulea, mani,
lukas, kbusch, ionut_n2001, sunlightlinux, linux-kernel, stable,
intel-xe, matthew.brost, michal.wajdeczko, piotr.piorkowski
On 3/19/26 05:31, Niklas Schnelle wrote:
> On Wed, 2026-03-18 at 23:03 +0200, Ionut Nechita (Wind River) wrote:
>> From: Ionut Nechita <ionut.nechita@windriver.com>
>>
>> Hi Bjorn,
>>
>> This is v10 of the fix for the SR-IOV race between driver .remove()
>> and concurrent hotplug events. v10 adds a second patch to fix the
>> AB-BA deadlock between device_lock and pci_rescan_remove_lock that
>> was reported by Guenter Roeck (via Google's AI review agent) and
>> confirmed by Benjamin Block.
>>
>> The AB-BA deadlock:
>>
>> CPU0 (remove_store) CPU1 (unbind_store)
>> -------------------- --------------------
>> pci_lock_rescan_remove()
>> device_lock()
>> driver .remove()
>> sriov_del_vfs()
>> pci_lock_rescan_remove() <-- WAITS
>> pci_stop_bus_device()
>> device_release_driver()
>> device_lock() <-- WAITS
>>
>> Patch 2/2 fixes this by calling device_release_driver() in
>> remove_store() before pci_stop_and_remove_bus_device_locked(), so
>> that the driver is already unbound when pci_rescan_remove_lock is
>> acquired. Both paths then take locks in the same order: device_lock
>> first, then pci_rescan_remove_lock.
>>
>> Note: the concurrent unbind_store + hotplug-event case (where the
>> hotplug handler takes pci_rescan_remove_lock before device_lock)
>> remains a known limitation. This is a pre-existing issue that
>> Benjamin Block is addressing separately in:
>> https://lore.kernel.org/linux-pci/354b9e4a54ced67f3c89df198041df19434fe4c8.1773235561.git.bblock@linux.ibm.com/
>>
> --- snip ---
>>
>> Ionut Nechita (2):
>> PCI/IOV: Make pci_lock_rescan_remove() reentrant and protect
>> sriov_add_vfs/sriov_del_vfs
>> PCI: Fix AB-BA deadlock between device_lock and
>> pci_rescan_remove_lock in remove_store
>>
>> drivers/pci/iov.c | 9 +++++----
>> drivers/pci/pci-sysfs.c | 20 +++++++++++++++++++-
>> drivers/pci/probe.c | 11 +++++++++--
>> 3 files changed, 33 insertions(+), 7 deletions(-)
>>
>> --
>> 2.43.0
>
> Hi Ionut,
>
> For your awareness, I saw that this series has some findings on
> Google's new Sashiko AI reviewing tool[0]. At a quick glance the
> findings seem like at least reasonable concerns to me. I'm still
> looking at this independently also of course.
>
It is almost scary to see how many problems Sashiko is able to find.
The AB-BA deadlock that the second patch in the series tries to fix
was reported by a prototype version of it when running it on an LTS
backport.
Guenter
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v10 0/2] PCI/IOV: Fix SR-IOV locking races and AB-BA deadlock
2026-03-19 12:31 ` [PATCH v10 0/2] PCI/IOV: Fix SR-IOV locking races and AB-BA deadlock Niklas Schnelle
2026-03-19 19:25 ` Guenter Roeck
@ 2026-03-19 20:27 ` Ionut Nechita (Wind River)
2026-03-24 17:07 ` Benjamin Block
1 sibling, 1 reply; 7+ messages in thread
From: Ionut Nechita (Wind River) @ 2026-03-19 20:27 UTC (permalink / raw)
To: schnelle
Cc: alifm, bblock, bhelgaas, dtatulea, helgaas, intel-xe,
ionut.nechita, ionut_n2001, julianr, kbusch, linux-kernel,
linux-pci, linux, lkml, lukas, mani, matthew.brost,
michal.wajdeczko, piotr.piorkowski, sebott, stable, sunlightlinux
On Thu, 19 Mar 2026 13:31:39 +0100, Niklas Schnelle wrote:
> For your awareness, I saw that this series has some findings on
> Google's new Sashiko AI reviewing tool[0]. At a quick glance the
> findings seem like at least reasonable concerns to me. I'm still
> looking at this independently also of course.
Hi Niklas,
Thanks for pointing this out. I went through each of the Sashiko
findings carefully(+ AI). Here is my analysis:
1) Incomplete Deadlock Fix (Hierarchical Topology)
— remove_store on a bridge still has the AB-BA between
pci_rescan_remove_lock and device_lock(child)
This is correct, but it is a pre-existing issue, not introduced
by this series. The v10 cover letter explicitly acknowledges this:
"Note: the concurrent unbind_store + hotplug-event case (where
the hotplug handler takes pci_rescan_remove_lock before
device_lock) remains a known limitation."
and points to Benjamin Block's separate series that addresses the
broader hierarchical locking problem:
https://lore.kernel.org/linux-pci/354b9e4a54ced67f3c89df198041df19434fe4c8.1773235561.git.bblock@linux.ibm.com/
Patch 2/2 fixes the specific deadlock reported by Guenter Roeck
(remove_store vs unbind_store on the same device). It does not
claim to solve all lock ordering issues in the PCI subsystem.
Trying to do so in a stable-targeted fix would be too invasive.
2) Driver Teardown Ordering Violation
— unbinding the bridge driver before its children causes PCIe
errors because pci_disable_device() clears Memory/IO Enable
This is a partially false positive. Sashiko assumes that
pci_disable_device() on a bridge disables forwarding of
transactions to child devices. It does not — pci_disable_device()
clears the bus master bit and decrements the enable count, but
does not touch the Memory Space Enable or I/O Space Enable bits
that control transaction forwarding on bridges. Child devices
can still access their MMIO regions through the bridge.
What does happen is that pcie_portdrv_remove() tears down port
services (AER, hotplug, PME, DPC) before the children are
unbound. This means error reporting is degraded during child
teardown, but it does not cause Unsupported Requests, Completer
Aborts, or system crashes as Sashiko claims.
Also worth noting: this scenario (remove_store on a bridge)
is not the path that was deadlocking. The reported deadlocks
were on the PF device itself, not on its parent bridge.
3) TOCTOU Race Condition / Lock Window Vulnerability
— a driver can rebind between device_release_driver() and
pci_stop_and_remove_bus_device_locked()
This is theoretically valid but practically impossible. The
window is a few instructions wide. For this race to trigger:
a) device_remove_file_self() has already removed the "remove"
sysfs attribute, signaling the device is being torn down
b) a bind_store or udev probe would need to fire in exactly
that window
c) the newly bound driver's probe() would need to call
pci_enable_sriov() and block on pci_rescan_remove_lock
This is the same pattern used elsewhere in the kernel (e.g.,
the existing remove_store already had no synchronization between
device_remove_file_self() and pci_stop_and_remove_bus_device_locked()
— the patch just adds one more call in between).
If this is a real concern, it would need to be addressed as a
separate improvement, not as a blocker for this fix.
4) Use-After-Free in remove_store
— device_remove_file_self() breaks active sysfs protection,
allowing concurrent device_del() to free dev
This is a false positive. kernfs_remove_self() does three things:
a) breaks active protection (decrements active ref)
b) calls __kernfs_remove(kn) to unlink the node
c) calls kernfs_unbreak_active_protection(kn) to restore the
active ref
After step (c), the active reference is restored. The sysfs
write handler (kernfs_fop_write_iter) still holds this active
reference for the duration of the store callback. A concurrent
device_del() calling kernfs_remove() on the parent directory
will call kernfs_drain() on all remaining children, which blocks
until active references drop to zero. Since remove_store still
holds the active ref (restored in step c), device_del() will
block until remove_store returns.
Additionally, pci_rescan_remove_lock serializes any concurrent
PCI device removal (hotplug events also take this lock), so
the scenario described by Sashiko cannot actually occur.
This is the same lifetime model that the existing remove_store
(without this patch) has always relied on.
In summary: finding 1 is a known pre-existing limitation documented
in the cover letter, findings 2 and 4 are false positives, and
finding 3 is a theoretical race that is not practically exploitable
and is not new to this patch.
I don't think a v11 is needed based on these findings. But I'm
happy to discuss further if you see something I missed in my
analysis.
Thanks,
Ionut
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v10 0/2] PCI/IOV: Fix SR-IOV locking races and AB-BA deadlock
2026-03-19 20:27 ` Ionut Nechita (Wind River)
@ 2026-03-24 17:07 ` Benjamin Block
0 siblings, 0 replies; 7+ messages in thread
From: Benjamin Block @ 2026-03-24 17:07 UTC (permalink / raw)
To: Ionut Nechita (Wind River)
Cc: schnelle, alifm, bhelgaas, dtatulea, helgaas, intel-xe,
ionut_n2001, julianr, kbusch, linux-kernel, linux-pci, linux,
lkml, lukas, mani, matthew.brost, michal.wajdeczko,
piotr.piorkowski, sebott, stable, sunlightlinux
On Thu, Mar 19, 2026 at 10:27:55PM +0200, Ionut Nechita (Wind River) wrote:
> On Thu, 19 Mar 2026 13:31:39 +0100, Niklas Schnelle wrote:
> > For your awareness, I saw that this series has some findings on
> > Google's new Sashiko AI reviewing tool[0]. At a quick glance the
> > findings seem like at least reasonable concerns to me. I'm still
> > looking at this independently also of course.
>
--8<--
> 3) TOCTOU Race Condition / Lock Window Vulnerability
> — a driver can rebind between device_release_driver() and
> pci_stop_and_remove_bus_device_locked()
>
> This is theoretically valid but practically impossible. The
> window is a few instructions wide. For this race to trigger:
>
> a) device_remove_file_self() has already removed the "remove"
> sysfs attribute, signaling the device is being torn down
> b) a bind_store or udev probe would need to fire in exactly
> that window
> c) the newly bound driver's probe() would need to call
> pci_enable_sriov() and block on pci_rescan_remove_lock
>
> This is the same pattern used elsewhere in the kernel (e.g.,
> the existing remove_store already had no synchronization between
> device_remove_file_self() and pci_stop_and_remove_bus_device_locked()
> — the patch just adds one more call in between).
>
> If this is a real concern, it would need to be addressed as a
> separate improvement, not as a blocker for this fix.
I haven't had time to fully review all this yet, but one quick comment: after
the first idea to unbind the device driver I also realized we could have a
race here between unbinding, and then possibly re-binding. We could probably
prevent that by marking the device as dead:
+ if (val && device_remove_file_self(dev, attr)) {
+ device_lock(dev);
+ kill_device(dev);
+ device_unlock(dev);
This doesn't modify the reference count or anything, but only sets the private
member of the `struct device` `dead` to true. This can't be undone using the
device core's public API, and once set, a device can not be bound to a new
device-driver.
This should prevent any such race AFAICS. The unbind is protected by the
device-mutex, so once the flag is set, and the unbind is done, this device
will stay unbound.
It's not really "pretty" though.
--
Best Regards, Benjamin Block / Linux on IBM Z Kernel Development
IBM Deutschland Research & Development GmbH / https://www.ibm.com/privacy
Vors. Aufs.-R.: Wolfgang Wendt / Geschäftsführung: David Faller
Sitz der Ges.: Ehningen / Registergericht: AmtsG Stuttgart, HRB 243294
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2026-03-24 17:07 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-18 21:03 [PATCH v10 0/2] PCI/IOV: Fix SR-IOV locking races and AB-BA deadlock Ionut Nechita (Wind River)
2026-03-18 21:03 ` [PATCH v10 1/2] PCI/IOV: Make pci_lock_rescan_remove() reentrant and protect sriov_add_vfs/sriov_del_vfs Ionut Nechita (Wind River)
2026-03-18 21:03 ` [PATCH v10 2/2] PCI: Fix AB-BA deadlock between device_lock and pci_rescan_remove_lock in remove_store Ionut Nechita (Wind River)
2026-03-19 12:31 ` [PATCH v10 0/2] PCI/IOV: Fix SR-IOV locking races and AB-BA deadlock Niklas Schnelle
2026-03-19 19:25 ` Guenter Roeck
2026-03-19 20:27 ` Ionut Nechita (Wind River)
2026-03-24 17:07 ` Benjamin Block
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox