* [PATCH v2 0/3] PCI: s390/pci: Fix deadlocks on s390 when releasing zPCI-bus or -device objects
@ 2026-03-11 13:27 Benjamin Block
2026-03-11 13:27 ` [PATCH v2 1/3] PCI: Move declaration of pci_rescan_remove_lock into public pci.h Benjamin Block
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Benjamin Block @ 2026-03-11 13:27 UTC (permalink / raw)
To: Alexander Gordeev, Gerd Bayer, Bjorn Helgaas, Niklas Schnelle,
Vasily Gorbik, Heiko Carstens
Cc: Gerald Schaefer, Christian Borntraeger, Andreas Krebbel,
linux-pci, linux-kernel, Ionut Nechita, Tobias Schumacher,
Sven Schnelle, Matthew Rosato, linux-s390, Julian Ruess,
Ionut Nechita, Farhan Ali, Benjamin Block
v1 -> v2:
* combine patch 02 and 04 - fix and use of guards [Ilpo, Niklas]
* rephrase description of patch 01 to point out that it is already possible
today to lock/unlock `pci_rescan_remove_lock` anywhere
* added Fixes: tags to patch 03 - the fix
Niklas already mentioned it in his recent comments on discussions about
`pci_rescan_remove_lock` here
https://lore.kernel.org/linux-pci/286d0488aa72b1741f93f900fd5db5c4334a6f50.camel@linux.ibm.com/
and here
https://lore.kernel.org/linux-pci/2b6a844619892ecaa11031705808667e0886d8b2.camel@linux.ibm.com/
; we recently found a couple of deadlocks in the s390 architecture PCI
implementation with hotplug events on our platform.
So far these have not been observed because on s390 it was not usual to have
both PF and attached VFs in the same Linux instance. So far PCI devices have
largely been either available as PF without SR-IOV, or as VF without the PF
being visible in the same instance. This left us with some blind spots w.r.t.
the locking issues here.
This is now changing, and with that we started running into these
deadlocks.
Please Note:
This patchset strictly depends on Ionut Nechita's patch that makes
`pci_lock_rescan_remove()` reentrant:
https://lore.kernel.org/linux-pci/20260310074303.17480-2-ionut.nechita@windriver.com/
Since the discussion so far sounded positive towards the change I decided
to base some of the changes in this patchset on the assumption that his
patch gets merged before mine. Otherwise there will be recursive deadlocks.
Patch 01 helps us insofar it enables us to use lockdep annotations in the
architecture code.
Patch 02 makes it possible to use lock guards for `pci_rescan_remove_lock`.
Patch 03 goes into detail what deadlocks exactly exist today, and fixes them.
I've run a /lot/ of tests with affected PCI adapters:
* enable/disable SR-IOV on the PF;
* run FLR reset on PF and VF;
* run Bus reset on PF and VF;
* run s390's recover SysFS attribute on PF and VF;
* disable/enable power with the hotplug SysFS attribute on PF and VF;
* run `zpcictl` with `--reset`/`--reset-fw` on PF and VF;
* run Configure Off and Configure On on both the PF and VF from a Service
Element.
There is no more deadlocks and no other lockdep warnings I've witnessed.
Benjamin Block (3):
PCI: Move declaration of pci_rescan_remove_lock into public pci.h
PCI: Provide lock guard for pci_rescan_remove_lock
s390/pci: Fix circular/recursive deadlocks in PCI-bus and -device
release
arch/s390/pci/pci.c | 11 ++++++++---
arch/s390/pci/pci_bus.c | 15 ++++++++-------
arch/s390/pci/pci_event.c | 28 +++++++++++++++++++---------
arch/s390/pci/pci_iov.c | 3 +--
arch/s390/pci/pci_sysfs.c | 9 +++------
drivers/pci/pci.h | 2 --
drivers/pci/probe.c | 1 +
include/linux/pci.h | 5 +++++
8 files changed, 45 insertions(+), 29 deletions(-)
base-commit: b29fb8829bff243512bb8c8908fd39406f9fd4c3
--
2.53.0
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v2 1/3] PCI: Move declaration of pci_rescan_remove_lock into public pci.h
2026-03-11 13:27 [PATCH v2 0/3] PCI: s390/pci: Fix deadlocks on s390 when releasing zPCI-bus or -device objects Benjamin Block
@ 2026-03-11 13:27 ` Benjamin Block
2026-03-12 19:41 ` Niklas Schnelle
2026-03-11 13:27 ` [PATCH v2 2/3] PCI: Provide lock guard for pci_rescan_remove_lock Benjamin Block
2026-03-11 13:27 ` [PATCH v2 3/3] s390/pci: Fix circular/recursive deadlocks in PCI-bus and -device release Benjamin Block
2 siblings, 1 reply; 10+ messages in thread
From: Benjamin Block @ 2026-03-11 13:27 UTC (permalink / raw)
To: Alexander Gordeev, Gerd Bayer, Bjorn Helgaas, Niklas Schnelle,
Vasily Gorbik, Heiko Carstens
Cc: Gerald Schaefer, Christian Borntraeger, Andreas Krebbel,
linux-pci, linux-kernel, Ionut Nechita, Tobias Schumacher,
Sven Schnelle, Matthew Rosato, linux-s390, Julian Ruess,
Ionut Nechita, Farhan Ali, Benjamin Block
So far it is possible to use and call the functions
pci_lock_rescan_remove() and pci_unlock_rescan_remove() from any PCI
code, including modules and architecture code; but the lock variable
`pci_rescan_remove_lock` itself is private to objects residing in
`drivers/pci/` via the header `drivers/pci/pci.h`.
This makes it possible to use the lock - lock it, unlock it - from
anywhere, but it is not possible to use lockdep annotations such as
lockdep_assert_held(), or sparse annotations such as __must_hold() in
modules or architecture code for PCI to make the usage more safe.
Since it is useful for `pci_rescan_remove_lock` to have such
annotations, move the variable declaration into `include/linux/pci.h`.
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
---
drivers/pci/pci.h | 2 --
drivers/pci/probe.c | 1 +
include/linux/pci.h | 2 ++
3 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 13d998fbacce..6d611523420f 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -110,8 +110,6 @@ struct pcie_tlp_log;
extern const unsigned char pcie_link_speed[];
extern bool pci_early_dump;
-extern struct mutex pci_rescan_remove_lock;
-
bool pcie_cap_has_lnkctl(const struct pci_dev *dev);
bool pcie_cap_has_lnkctl2(const struct pci_dev *dev);
bool pcie_cap_has_rtctl(const struct pci_dev *dev);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index bccc7a4bdd79..e5b12878e972 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -3509,6 +3509,7 @@ EXPORT_SYMBOL_GPL(pci_rescan_bus);
* routines should always be executed under this mutex.
*/
DEFINE_MUTEX(pci_rescan_remove_lock);
+EXPORT_SYMBOL_GPL(pci_rescan_remove_lock);
void pci_lock_rescan_remove(void)
{
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 1c270f1d5123..fd7a962a64ef 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -39,6 +39,7 @@
#include <linux/io.h>
#include <linux/resource_ext.h>
#include <linux/msi_api.h>
+#include <linux/mutex.h>
#include <uapi/linux/pci.h>
#include <linux/pci_ids.h>
@@ -1533,6 +1534,7 @@ void set_pcie_hotplug_bridge(struct pci_dev *pdev);
/* Functions for PCI Hotplug drivers to use */
unsigned int pci_rescan_bus(struct pci_bus *bus);
+extern struct mutex pci_rescan_remove_lock;
void pci_lock_rescan_remove(void);
void pci_unlock_rescan_remove(void);
--
2.53.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v2 2/3] PCI: Provide lock guard for pci_rescan_remove_lock
2026-03-11 13:27 [PATCH v2 0/3] PCI: s390/pci: Fix deadlocks on s390 when releasing zPCI-bus or -device objects Benjamin Block
2026-03-11 13:27 ` [PATCH v2 1/3] PCI: Move declaration of pci_rescan_remove_lock into public pci.h Benjamin Block
@ 2026-03-11 13:27 ` Benjamin Block
2026-03-12 19:44 ` Niklas Schnelle
2026-03-11 13:27 ` [PATCH v2 3/3] s390/pci: Fix circular/recursive deadlocks in PCI-bus and -device release Benjamin Block
2 siblings, 1 reply; 10+ messages in thread
From: Benjamin Block @ 2026-03-11 13:27 UTC (permalink / raw)
To: Alexander Gordeev, Gerd Bayer, Bjorn Helgaas, Niklas Schnelle,
Vasily Gorbik, Heiko Carstens
Cc: Gerald Schaefer, Christian Borntraeger, Andreas Krebbel,
linux-pci, linux-kernel, Ionut Nechita, Tobias Schumacher,
Sven Schnelle, Matthew Rosato, linux-s390, Julian Ruess,
Ionut Nechita, Farhan Ali, Benjamin Block
Make it possible to use guard() or scoped_guard() to lock, and
automatically unlock `pci_rescan_remove_lock`.
Since the actual mutex `pci_rescan_remove_lock` is always supposed to be
taken and released using the functions pci_lock_rescan_remove() and
pci_unlock_rescan_remove() it is not possible to simply use the already
existing guards for `struct mutex`. Instead define a new guard
`pci_rescan_remove` that will also call the functions in question, but
is usable via guard() or scoped_guard().
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
---
include/linux/pci.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index fd7a962a64ef..4c41b5a2c90a 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -41,6 +41,7 @@
#include <linux/msi_api.h>
#include <linux/mutex.h>
#include <uapi/linux/pci.h>
+#include <linux/cleanup.h>
#include <linux/pci_ids.h>
@@ -1537,6 +1538,8 @@ unsigned int pci_rescan_bus(struct pci_bus *bus);
extern struct mutex pci_rescan_remove_lock;
void pci_lock_rescan_remove(void);
void pci_unlock_rescan_remove(void);
+DEFINE_LOCK_GUARD_0(pci_rescan_remove, pci_lock_rescan_remove(),
+ pci_unlock_rescan_remove());
/* Vital Product Data routines */
ssize_t pci_read_vpd(struct pci_dev *dev, loff_t pos, size_t count, void *buf);
--
2.53.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v2 3/3] s390/pci: Fix circular/recursive deadlocks in PCI-bus and -device release
2026-03-11 13:27 [PATCH v2 0/3] PCI: s390/pci: Fix deadlocks on s390 when releasing zPCI-bus or -device objects Benjamin Block
2026-03-11 13:27 ` [PATCH v2 1/3] PCI: Move declaration of pci_rescan_remove_lock into public pci.h Benjamin Block
2026-03-11 13:27 ` [PATCH v2 2/3] PCI: Provide lock guard for pci_rescan_remove_lock Benjamin Block
@ 2026-03-11 13:27 ` Benjamin Block
2026-03-12 21:02 ` Niklas Schnelle
2 siblings, 1 reply; 10+ messages in thread
From: Benjamin Block @ 2026-03-11 13:27 UTC (permalink / raw)
To: Alexander Gordeev, Gerd Bayer, Bjorn Helgaas, Niklas Schnelle,
Vasily Gorbik, Heiko Carstens
Cc: Gerald Schaefer, Christian Borntraeger, Andreas Krebbel,
linux-pci, linux-kernel, Ionut Nechita, Tobias Schumacher,
Sven Schnelle, Matthew Rosato, linux-s390, Julian Ruess,
Ionut Nechita, Farhan Ali, Benjamin Block
When removing PCI device or PCI bus objects there are a couple of
call-chains where it is possible that the kernel runs into a circular or
recursive deadlock involving taking the central `pci_rescan_remove_lock`.
For example:
(A) Thread α receives a CRW event notifying the kernel that a PCI
virtual function has been moved into Reserved state, and so the PCI
subsystem will try to remove that PCI function. The call-chain for that
looks like this:
__zpci_event_availability()
-> zpci_zdev_put() # will lock(zpci_add_remove_lock),
# and lock(zpci_list_lock)
-> zpci_release_device() # will unlock(zpci_list_lock)
-> zpci_cleanup_bus_resources() # will lock(pci_rescan_remove_lock)
Thread β is triggered by userspace writing 0 into the SysFS attribute
`sriov_numvfs` of the parent PCI physical function of the same function
we just try to remove. This will also try to release the PCI virtual
function; but this time the call-chain looks like this:
sriov_numvfs_store() # will lock(pci_rescan_remove_lock)
-> ... (deep chain)
-> pci_release_dev()
-> pcibios_release_device()
-> zpci_zdev_put() # will lock(zpci_add_remove_lock)
If thread α and β coincide, this will result in a cyclic deadlock.
(B) Consider thread β from above; if the thread was to follow the same
outlined call-chain for thread α, and not fall into the cyclic deadlock,
it would recursive deadlock:
sriov_numvfs_store() # will lock(pci_rescan_remove_lock)
-> ... (deep chain)
-> pci_release_dev()
-> pcibios_release_device()
-> zpci_zdev_put() # will lock(zpci_add_remove_lock),
# and lock(zpci_list_lock)
-> zpci_release_device() # will unlock(zpci_list_lock)
-> zpci_cleanup_bus_resources() # will lock(pci_rescan_remove_lock)
(C) And even in case `pci_rescan_remove_lock` was removed from
zpci_cleanup_bus_resources(), the resulting call-chain would recursive
deadlock when it tries to release the associated PCI bus:
sriov_numvfs_store() # will lock(pci_rescan_remove_lock)
-> ... (deep chain)
-> pci_release_dev()
-> pcibios_release_device()
-> zpci_zdev_put() # will lock(zpci_add_remove_lock),
# and lock(zpci_list_lock)
-> zpci_release_device() # will unlock(zpci_list_lock)
-> zpci_bus_device_unregister()
-> zpci_bus_put() # will lock(zbus_list_lock)
-> zpci_bus_release() # will unlock(zbus_list_lock),
# will lock(pci_rescan_remove_lock)
It can also easily be seen that scenario (C) offers the same risk as (A)
for a cyclic deadlock in cases where `pci_rescan_remove_lock` is first
locked, and the PCI bus released under its protection.
`pci_rescan_remove_lock` has to be and is taken at a "high level" in
most call-chains since it is intended to protect/mutual exclude all
rescan and/or removal actions taken in the PCI subsystem. So to prevent
the outlined deadlock scenarios above remove it instead from the "low
level" release function for both the PCI device and PCI bus objects.
Instead, lock `pci_rescan_remove_lock` in all call-chains leading to
those release functions:
* initialization of the PCI subsystem;
* processing of availability events (CRWs) for PCI functions;
* processing of error events (CRWs) for PCI functions;
* architecture specific release PCI device implementation.
Additionally, remove `pci_rescan_remove_lock` from zpci_bus_scan_bus()
since it is now always already taken when called.
Lastly, document the new locking expectations after these changes. Add
sparse and lockdep annotations to functions that previously locked
`pci_rescan_remove_lock` explicitly, making sure the lock is now
already held when called. Additionally also add the annotations to
zpci_zdev_put() and zpci_bus_put() to make sure that every function that
potentially drops the last reference already holds the lock to prevent
surprises.
Fixes: 05bc1be6db4b2 ("s390/pci: create zPCI bus")
Fixes: ab909509850b2 ("PCI: s390: Fix use-after-free of PCI resources with per-function hotplug")
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
---
arch/s390/pci/pci.c | 11 ++++++++---
arch/s390/pci/pci_bus.c | 15 ++++++++-------
arch/s390/pci/pci_event.c | 28 +++++++++++++++++++---------
arch/s390/pci/pci_iov.c | 3 +--
arch/s390/pci/pci_sysfs.c | 9 +++------
5 files changed, 39 insertions(+), 27 deletions(-)
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index 2a430722cbe4..86ef1e516857 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -71,9 +71,11 @@ struct airq_iv *zpci_aif_sbv;
EXPORT_SYMBOL_GPL(zpci_aif_sbv);
void zpci_zdev_put(struct zpci_dev *zdev)
+ __must_hold(&pci_rescan_remove_lock)
{
if (!zdev)
return;
+ lockdep_assert_held(&pci_rescan_remove_lock);
mutex_lock(&zpci_add_remove_lock);
kref_put_lock(&zdev->kref, zpci_release_device, &zpci_list_lock);
mutex_unlock(&zpci_add_remove_lock);
@@ -582,11 +584,13 @@ int zpci_setup_bus_resources(struct zpci_dev *zdev)
}
static void zpci_cleanup_bus_resources(struct zpci_dev *zdev)
+ __must_hold(&pci_rescan_remove_lock)
{
struct resource *res;
int i;
- pci_lock_rescan_remove();
+ lockdep_assert_held(&pci_rescan_remove_lock);
+
for (i = 0; i < PCI_STD_NUM_BARS; i++) {
res = zdev->bars[i].res;
if (!res)
@@ -599,7 +603,6 @@ static void zpci_cleanup_bus_resources(struct zpci_dev *zdev)
kfree(res);
}
zdev->has_resources = 0;
- pci_unlock_rescan_remove();
}
int pcibios_device_add(struct pci_dev *pdev)
@@ -629,6 +632,7 @@ void pcibios_release_device(struct pci_dev *pdev)
{
struct zpci_dev *zdev = to_zpci(pdev);
+ guard(pci_rescan_remove)();
zpci_unmap_resources(pdev);
zpci_zdev_put(zdev);
}
@@ -1208,7 +1212,8 @@ static int __init pci_base_init(void)
if (rc)
goto out_irq;
- rc = zpci_scan_devices();
+ scoped_guard(pci_rescan_remove)
+ rc = zpci_scan_devices();
if (rc)
goto out_find;
diff --git a/arch/s390/pci/pci_bus.c b/arch/s390/pci/pci_bus.c
index 36a4807285fa..c1b48b572e86 100644
--- a/arch/s390/pci/pci_bus.c
+++ b/arch/s390/pci/pci_bus.c
@@ -82,9 +82,8 @@ int zpci_bus_scan_device(struct zpci_dev *zdev)
if (!pdev)
return -ENODEV;
- pci_lock_rescan_remove();
+ guard(pci_rescan_remove)();
pci_bus_add_device(pdev);
- pci_unlock_rescan_remove();
return 0;
}
@@ -132,10 +131,13 @@ void zpci_bus_remove_device(struct zpci_dev *zdev, bool set_error)
* Return: 0 on success, an error value otherwise
*/
int zpci_bus_scan_bus(struct zpci_bus *zbus)
+ __must_hold(&pci_rescan_remove_lock)
{
struct zpci_dev *zdev;
int devfn, rc, ret = 0;
+ lockdep_assert_held(&pci_rescan_remove_lock);
+
for (devfn = 0; devfn < ZPCI_FUNCTIONS_PER_BUS; devfn++) {
zdev = zbus->function[devfn];
if (zdev && zdev->state == ZPCI_FN_STATE_CONFIGURED) {
@@ -145,10 +147,8 @@ int zpci_bus_scan_bus(struct zpci_bus *zbus)
}
}
- pci_lock_rescan_remove();
pci_scan_child_bus(zbus->bus);
pci_bus_add_devices(zbus->bus);
- pci_unlock_rescan_remove();
return ret;
}
@@ -214,11 +214,12 @@ static int zpci_bus_create_pci_bus(struct zpci_bus *zbus, struct zpci_dev *fr, s
* run of the function.
*/
static inline void zpci_bus_release(struct kref *kref)
- __releases(&zbus_list_lock)
+ __releases(&zbus_list_lock) __must_hold(&pci_rescan_remove_lock)
{
struct zpci_bus *zbus = container_of(kref, struct zpci_bus, kref);
lockdep_assert_held(&zbus_list_lock);
+ lockdep_assert_held(&pci_rescan_remove_lock);
list_del(&zbus->bus_next);
mutex_unlock(&zbus_list_lock);
@@ -229,14 +230,12 @@ static inline void zpci_bus_release(struct kref *kref)
*/
if (zbus->bus) {
- pci_lock_rescan_remove();
pci_stop_root_bus(zbus->bus);
zpci_free_domain(zbus->domain_nr);
pci_free_resource_list(&zbus->resources);
pci_remove_root_bus(zbus->bus);
- pci_unlock_rescan_remove();
}
zpci_remove_parent_msi_domain(zbus);
@@ -250,7 +249,9 @@ static inline void __zpci_bus_get(struct zpci_bus *zbus)
}
static inline void zpci_bus_put(struct zpci_bus *zbus)
+ __must_hold(&pci_rescan_remove_lock)
{
+ lockdep_assert_held(&pci_rescan_remove_lock);
kref_put_mutex(&zbus->kref, zpci_bus_release, &zbus_list_lock);
}
diff --git a/arch/s390/pci/pci_event.c b/arch/s390/pci/pci_event.c
index 839bd91c056e..98253706b591 100644
--- a/arch/s390/pci/pci_event.c
+++ b/arch/s390/pci/pci_event.c
@@ -342,6 +342,7 @@ static void __zpci_event_error(struct zpci_ccdf_err *ccdf)
no_pdev:
if (zdev)
mutex_unlock(&zdev->state_lock);
+ guard(pci_rescan_remove)();
zpci_zdev_put(zdev);
}
@@ -400,9 +401,11 @@ static void __zpci_event_availability(struct zpci_ccdf_avail *ccdf)
zdev = zpci_create_device(ccdf->fid, ccdf->fh, ZPCI_FN_STATE_CONFIGURED);
if (IS_ERR(zdev))
break;
- if (zpci_add_device(zdev)) {
- kfree(zdev);
- break;
+ scoped_guard(pci_rescan_remove) {
+ if (zpci_add_device(zdev)) {
+ kfree(zdev);
+ break;
+ }
}
} else {
if (zdev->state == ZPCI_FN_STATE_RESERVED)
@@ -419,9 +422,11 @@ static void __zpci_event_availability(struct zpci_ccdf_avail *ccdf)
zdev = zpci_create_device(ccdf->fid, ccdf->fh, ZPCI_FN_STATE_STANDBY);
if (IS_ERR(zdev))
break;
- if (zpci_add_device(zdev)) {
- kfree(zdev);
- break;
+ scoped_guard(pci_rescan_remove) {
+ if (zpci_add_device(zdev)) {
+ kfree(zdev);
+ break;
+ }
}
} else {
if (zdev->state == ZPCI_FN_STATE_RESERVED)
@@ -450,24 +455,29 @@ static void __zpci_event_availability(struct zpci_ccdf_avail *ccdf)
/* The 0x0304 event may immediately reserve the device */
if (!clp_get_state(zdev->fid, &state) &&
state == ZPCI_FN_STATE_RESERVED) {
+ guard(pci_rescan_remove)();
zpci_device_reserved(zdev);
}
}
break;
case 0x0306: /* 0x308 or 0x302 for multiple devices */
- zpci_remove_reserved_devices();
- zpci_scan_devices();
+ scoped_guard(pci_rescan_remove) {
+ zpci_remove_reserved_devices();
+ zpci_scan_devices();
+ }
break;
case 0x0308: /* Standby -> Reserved */
if (!zdev)
break;
- zpci_device_reserved(zdev);
+ scoped_guard(pci_rescan_remove)
+ zpci_device_reserved(zdev);
break;
default:
break;
}
if (existing_zdev) {
mutex_unlock(&zdev->state_lock);
+ guard(pci_rescan_remove)();
zpci_zdev_put(zdev);
}
}
diff --git a/arch/s390/pci/pci_iov.c b/arch/s390/pci/pci_iov.c
index 13050ce5c3e9..1f7e4dd018e7 100644
--- a/arch/s390/pci/pci_iov.c
+++ b/arch/s390/pci/pci_iov.c
@@ -38,10 +38,9 @@ void zpci_iov_map_resources(struct pci_dev *pdev)
void zpci_iov_remove_virtfn(struct pci_dev *pdev, int vfn)
{
- pci_lock_rescan_remove();
+ guard(pci_rescan_remove)();
/* Linux' vfid's start at 0 vfn at 1 */
pci_iov_remove_virtfn(pdev->physfn, vfn - 1);
- pci_unlock_rescan_remove();
}
static int zpci_iov_link_virtfn(struct pci_dev *pdev, struct pci_dev *virtfn, int vfid)
diff --git a/arch/s390/pci/pci_sysfs.c b/arch/s390/pci/pci_sysfs.c
index c2444a23e26c..f5027aa95928 100644
--- a/arch/s390/pci/pci_sysfs.c
+++ b/arch/s390/pci/pci_sysfs.c
@@ -98,9 +98,9 @@ static ssize_t recover_store(struct device *dev, struct device_attribute *attr,
WARN_ON_ONCE(!kn);
/* Device needs to be configured and state must not change */
- mutex_lock(&zdev->state_lock);
+ guard(mutex)(&zdev->state_lock);
if (zdev->state != ZPCI_FN_STATE_CONFIGURED)
- goto out;
+ return count;
/* device_remove_file() serializes concurrent calls ignoring all but
* the first
@@ -112,15 +112,12 @@ static ssize_t recover_store(struct device *dev, struct device_attribute *attr,
* Once it unblocks from pci_lock_rescan_remove() the original pdev
* will already be removed.
*/
- pci_lock_rescan_remove();
+ guard(pci_rescan_remove)();
if (pci_dev_is_added(pdev)) {
ret = _do_recover(pdev, zdev);
}
pci_rescan_bus(zdev->zbus->bus);
- pci_unlock_rescan_remove();
-out:
- mutex_unlock(&zdev->state_lock);
if (kn)
sysfs_unbreak_active_protection(kn);
return ret ? ret : count;
--
2.53.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v2 1/3] PCI: Move declaration of pci_rescan_remove_lock into public pci.h
2026-03-11 13:27 ` [PATCH v2 1/3] PCI: Move declaration of pci_rescan_remove_lock into public pci.h Benjamin Block
@ 2026-03-12 19:41 ` Niklas Schnelle
2026-03-13 13:32 ` Benjamin Block
0 siblings, 1 reply; 10+ messages in thread
From: Niklas Schnelle @ 2026-03-12 19:41 UTC (permalink / raw)
To: Benjamin Block, Alexander Gordeev, Gerd Bayer, Bjorn Helgaas,
Vasily Gorbik, Heiko Carstens, Ionut Nechita (Wind River)
Cc: Gerald Schaefer, Christian Borntraeger, Andreas Krebbel,
linux-pci, linux-kernel, Ionut Nechita, Tobias Schumacher,
Sven Schnelle, Matthew Rosato, linux-s390, Julian Ruess,
Ionut Nechita, Farhan Ali
On Wed, 2026-03-11 at 14:27 +0100, Benjamin Block wrote:
> So far it is possible to use and call the functions
> pci_lock_rescan_remove() and pci_unlock_rescan_remove() from any PCI
> code, including modules and architecture code; but the lock variable
> `pci_rescan_remove_lock` itself is private to objects residing in
> `drivers/pci/` via the header `drivers/pci/pci.h`.
>
> This makes it possible to use the lock - lock it, unlock it - from
> anywhere, but it is not possible to use lockdep annotations such as
> lockdep_assert_held(), or sparse annotations such as __must_hold() in
> modules or architecture code for PCI to make the usage more safe.
>
> Since it is useful for `pci_rescan_remove_lock` to have such
> annotations, move the variable declaration into `include/linux/pci.h`.
>
> Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
> ---
> drivers/pci/pci.h | 2 --
> drivers/pci/probe.c | 1 +
> include/linux/pci.h | 2 ++
> 3 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 13d998fbacce..6d611523420f 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -110,8 +110,6 @@ struct pcie_tlp_log;
> extern const unsigned char pcie_link_speed[];
> extern bool pci_early_dump;
>
> -extern struct mutex pci_rescan_remove_lock;
> -
> bool pcie_cap_has_lnkctl(const struct pci_dev *dev);
> bool pcie_cap_has_lnkctl2(const struct pci_dev *dev);
> bool pcie_cap_has_rtctl(const struct pci_dev *dev);
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index bccc7a4bdd79..e5b12878e972 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -3509,6 +3509,7 @@ EXPORT_SYMBOL_GPL(pci_rescan_bus);
> * routines should always be executed under this mutex.
> */
> DEFINE_MUTEX(pci_rescan_remove_lock);
> +EXPORT_SYMBOL_GPL(pci_rescan_remove_lock);
This has a (rather trivial) merge conflict with Ionut's patch which at
the same time is a prerequisite for this series. Sadly since that isn't
in linux-next yet I'm not sure how to best handle this. Maybe it would
make sense to just include it in this series? @Ionut would that be ok
for you?
>
> void pci_lock_rescan_remove(void)
> {
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 1c270f1d5123..fd7a962a64ef 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -39,6 +39,7 @@
> #include <linux/io.h>
> #include <linux/resource_ext.h>
> #include <linux/msi_api.h>
> +#include <linux/mutex.h>
> #include <uapi/linux/pci.h>
>
> #include <linux/pci_ids.h>
> @@ -1533,6 +1534,7 @@ void set_pcie_hotplug_bridge(struct pci_dev *pdev);
>
> /* Functions for PCI Hotplug drivers to use */
> unsigned int pci_rescan_bus(struct pci_bus *bus);
> +extern struct mutex pci_rescan_remove_lock;
> void pci_lock_rescan_remove(void);
> void pci_unlock_rescan_remove(void);
>
I do see Keith's argument that proliferation of the rescan/remove lock
is to be minimized. That said, since user's of this header can already
lock/unlock I don't think this patch makes matters worse. In fact we
want this patch to be able to add better lockdep asserts so it will
help against misuse.
With that feel free to add:
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Thanks,
Niklas
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 2/3] PCI: Provide lock guard for pci_rescan_remove_lock
2026-03-11 13:27 ` [PATCH v2 2/3] PCI: Provide lock guard for pci_rescan_remove_lock Benjamin Block
@ 2026-03-12 19:44 ` Niklas Schnelle
2026-03-13 13:32 ` Benjamin Block
0 siblings, 1 reply; 10+ messages in thread
From: Niklas Schnelle @ 2026-03-12 19:44 UTC (permalink / raw)
To: Benjamin Block, Alexander Gordeev, Gerd Bayer, Bjorn Helgaas,
Vasily Gorbik, Heiko Carstens
Cc: Gerald Schaefer, Christian Borntraeger, Andreas Krebbel,
linux-pci, linux-kernel, Ionut Nechita, Tobias Schumacher,
Sven Schnelle, Matthew Rosato, linux-s390, Julian Ruess,
Ionut Nechita, Farhan Ali
On Wed, 2026-03-11 at 14:27 +0100, Benjamin Block wrote:
> Make it possible to use guard() or scoped_guard() to lock, and
> automatically unlock `pci_rescan_remove_lock`.
>
> Since the actual mutex `pci_rescan_remove_lock` is always supposed to be
> taken and released using the functions pci_lock_rescan_remove() and
> pci_unlock_rescan_remove() it is not possible to simply use the already
> existing guards for `struct mutex`. Instead define a new guard
> `pci_rescan_remove` that will also call the functions in question, but
> is usable via guard() or scoped_guard().
>
> Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
> ---
> include/linux/pci.h | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index fd7a962a64ef..4c41b5a2c90a 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -41,6 +41,7 @@
> #include <linux/msi_api.h>
> #include <linux/mutex.h>
> #include <uapi/linux/pci.h>
> +#include <linux/cleanup.h>
>
> #include <linux/pci_ids.h>
>
> @@ -1537,6 +1538,8 @@ unsigned int pci_rescan_bus(struct pci_bus *bus);
> extern struct mutex pci_rescan_remove_lock;
> void pci_lock_rescan_remove(void);
> void pci_unlock_rescan_remove(void);
> +DEFINE_LOCK_GUARD_0(pci_rescan_remove, pci_lock_rescan_remove(),
> + pci_unlock_rescan_remove());
>
> /* Vital Product Data routines */
> ssize_t pci_read_vpd(struct pci_dev *dev, loff_t pos, size_t count, void *buf);
Looks good to me. Thank you.
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 3/3] s390/pci: Fix circular/recursive deadlocks in PCI-bus and -device release
2026-03-11 13:27 ` [PATCH v2 3/3] s390/pci: Fix circular/recursive deadlocks in PCI-bus and -device release Benjamin Block
@ 2026-03-12 21:02 ` Niklas Schnelle
2026-03-13 13:59 ` Benjamin Block
0 siblings, 1 reply; 10+ messages in thread
From: Niklas Schnelle @ 2026-03-12 21:02 UTC (permalink / raw)
To: Benjamin Block, Alexander Gordeev, Gerd Bayer, Bjorn Helgaas,
Vasily Gorbik, Heiko Carstens
Cc: Gerald Schaefer, Christian Borntraeger, Andreas Krebbel,
linux-pci, linux-kernel, Ionut Nechita, Tobias Schumacher,
Sven Schnelle, Matthew Rosato, linux-s390, Julian Ruess,
Ionut Nechita, Farhan Ali
On Wed, 2026-03-11 at 14:27 +0100, Benjamin Block wrote:
> When removing PCI device or PCI bus objects there are a couple of
> call-chains where it is possible that the kernel runs into a circular or
> recursive deadlock involving taking the central `pci_rescan_remove_lock`.
Now that this depends on Ionut's patch at[0] and exploits the fact that
the rescan/remove lock can be taken recursively we may want to adjust
and thereby shorten this description to take this into account dropping
the recursive locking as deadlock reason.
>
> For example:
>
> (A) Thread α receives a CRW event notifying the kernel that a PCI
Nit: Other PCI related code calls these "PCI events" even though
technically they use the Channel Report Word (CRW) mechanism and thus
are CRW events.
> virtual function has been moved into Reserved state, and so the PCI
> subsystem will try to remove that PCI function. The call-chain for that
> looks like this:
> __zpci_event_availability()
> -> zpci_zdev_put() # will lock(zpci_add_remove_lock),
> # and lock(zpci_list_lock)
> -> zpci_release_device() # will unlock(zpci_list_lock)
> -> zpci_cleanup_bus_resources() # will lock(pci_rescan_remove_lock)
>
> Thread β is triggered by userspace writing 0 into the SysFS attribute
> `sriov_numvfs` of the parent PCI physical function of the same function
> we just try to remove. This will also try to release the PCI virtual
> function; but this time the call-chain looks like this:
> sriov_numvfs_store() # will lock(pci_rescan_remove_lock)
With Ionut's patch a prerequisite the above taking of
pci_rescan_remove_lock now happens in sriov_del_vfs() with a bit
shorter chain but same result.
> -> ... (deep chain)
> -> pci_release_dev()
> -> pcibios_release_device()
> -> zpci_zdev_put() # will lock(zpci_add_remove_lock)
>
> If thread α and β coincide, this will result in a cyclic deadlock.
>
> (B) Consider thread β from above; if the thread was to follow the same
> outlined call-chain for thread α, and not fall into the cyclic deadlock,
> it would recursive deadlock:
> sriov_numvfs_store() # will lock(pci_rescan_remove_lock)
> -> ... (deep chain)
> -> pci_release_dev()
> -> pcibios_release_device()
> -> zpci_zdev_put() # will lock(zpci_add_remove_lock),
> # and lock(zpci_list_lock)
> -> zpci_release_device() # will unlock(zpci_list_lock)
> -> zpci_cleanup_bus_resources() # will lock(pci_rescan_remove_lock)
Since Ionut's patch must come before this, this behavior itself would
be okay (see argument above). Maybve just drop this part?
>
> (C) And even in case `pci_rescan_remove_lock` was removed from
> zpci_cleanup_bus_resources(), the resulting call-chain would recursive
> deadlock when it tries to release the associated PCI bus:
> sriov_numvfs_store() # will lock(pci_rescan_remove_lock)
Same as above after Ionut's change the lock is taken in sriov_del_vfs()
> -> ... (deep chain)
> -> pci_release_dev()
> -> pcibios_release_device()
> -> zpci_zdev_put() # will lock(zpci_add_remove_lock),
> # and lock(zpci_list_lock)
> -> zpci_release_device() # will unlock(zpci_list_lock)
> -> zpci_bus_device_unregister()
> -> zpci_bus_put() # will lock(zbus_list_lock)
> -> zpci_bus_release() # will unlock(zbus_list_lock),
> # will lock(pci_rescan_remove_lock)
Same argument as above for dropping the recursive deadlock part but the
below cyclic deadlock potential remains.
>
> It can also easily be seen that scenario (C) offers the same risk as (A)
> for a cyclic deadlock in cases where `pci_rescan_remove_lock` is first
> locked, and the PCI bus released under its protection.
I think this part should spell out which two locks create the cycle
>
> `pci_rescan_remove_lock` has to be and is taken at a "high level" in
> most call-chains since it is intended to protect/mutual exclude all
> rescan and/or removal actions taken in the PCI subsystem. So to prevent
> the outlined deadlock scenarios above remove it instead from the "low
> level" release function for both the PCI device and PCI bus objects.
>
> Instead, lock `pci_rescan_remove_lock` in all call-chains leading to
> those release functions:
> * initialization of the PCI subsystem;
> * processing of availability events (CRWs) for PCI functions;
> * processing of error events (CRWs) for PCI functions;
> * architecture specific release PCI device implementation.
>
> Additionally, remove `pci_rescan_remove_lock` from zpci_bus_scan_bus()
> since it is now always already taken when called.
Nit: I'd add an explicit statement that the locking order between
pci_rescan_remove_lock and zpci_add_remove_lock is thus reversed with
pci_rescan_remove_lockk always taken first then zpci_add_remove_lock.
Still it is understandable without it too.
>
> Lastly, document the new locking expectations after these changes. Add
> sparse and lockdep annotations to functions that previously locked
> `pci_rescan_remove_lock` explicitly, making sure the lock is now
> already held when called. Additionally also add the annotations to
> zpci_zdev_put() and zpci_bus_put() to make sure that every function that
> potentially drops the last reference already holds the lock to prevent
> surprises.
>
> Fixes: 05bc1be6db4b2 ("s390/pci: create zPCI bus")
> Fixes: ab909509850b2 ("PCI: s390: Fix use-after-free of PCI resources with per-function hotplug")
> Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
> ---
> arch/s390/pci/pci.c | 11 ++++++++---
> arch/s390/pci/pci_bus.c | 15 ++++++++-------
> arch/s390/pci/pci_event.c | 28 +++++++++++++++++++---------
> arch/s390/pci/pci_iov.c | 3 +--
> arch/s390/pci/pci_sysfs.c | 9 +++------
> 5 files changed, 39 insertions(+), 27 deletions(-)
>
I had not previously looked at your commit message from the point of
view when it is ordered aferr Ionut's patch[0], which it has to for the
release case. So I ended up with a few comments now even though I
previously read it multiple times and was just very happy with the nice
and readable call chain and locking order explanations.
Either way the code itself looks good to me and I also did quite a bit
of testing with this and the previous versions. Trying hot(un)plug and
SR-IOV with various PCI devices and also just ran this and the previous
version on several test systems for a few days doing other testing and
development work.
I also put quite a bit of effort into trying to modify this and/or
other s390 PCI code to not require the reentrant behavior of the
pci_rescan_remove_lock but failed. The really tricky part to me is how
e.g. a driver can hold on to a struct pci_dev even after it has been
removed while we have to keep both the struct zpci_dev and struct
zpci_bus and associated struct pci_bus exactly until the last
pci_dev_put(). And then needing to remove the PCI bus via
zpci_bus_release() we need to have the pci_rescan_remove lock but now
this can happen, in principle, at any pci_dev_put() and that of course
sometimes happens when we already have the pci_rescan_remove_lock and
sometimes without it.
So feel free to add my:
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Thanks,
Niklas
[0]https://lore.kernel.org/lkml/20260310074303.17480-2-ionut.nechita@windriver.com/
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 1/3] PCI: Move declaration of pci_rescan_remove_lock into public pci.h
2026-03-12 19:41 ` Niklas Schnelle
@ 2026-03-13 13:32 ` Benjamin Block
0 siblings, 0 replies; 10+ messages in thread
From: Benjamin Block @ 2026-03-13 13:32 UTC (permalink / raw)
To: Niklas Schnelle
Cc: Alexander Gordeev, Gerd Bayer, Bjorn Helgaas, Vasily Gorbik,
Heiko Carstens, Ionut Nechita (Wind River), Gerald Schaefer,
Christian Borntraeger, Andreas Krebbel, linux-pci, linux-kernel,
Ionut Nechita, Tobias Schumacher, Sven Schnelle, Matthew Rosato,
linux-s390, Julian Ruess, Farhan Ali
On Thu, Mar 12, 2026 at 08:41:26PM +0100, Niklas Schnelle wrote:
> On Wed, 2026-03-11 at 14:27 +0100, Benjamin Block wrote:
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index bccc7a4bdd79..e5b12878e972 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -3509,6 +3509,7 @@ EXPORT_SYMBOL_GPL(pci_rescan_bus);
> > * routines should always be executed under this mutex.
> > */
> > DEFINE_MUTEX(pci_rescan_remove_lock);
> > +EXPORT_SYMBOL_GPL(pci_rescan_remove_lock);
>
> This has a (rather trivial) merge conflict with Ionut's patch which at
> the same time is a prerequisite for this series. Sadly since that isn't
> in linux-next yet I'm not sure how to best handle this. Maybe it would
> make sense to just include it in this series? @Ionut would that be ok
> for you?
Well, I don't mind either way tbf. I can also resend a new version, once other
other patch gets into next/master. That's shouldn't be an issue.
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index 1c270f1d5123..fd7a962a64ef 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -39,6 +39,7 @@
> > #include <linux/io.h>
> > #include <linux/resource_ext.h>
> > #include <linux/msi_api.h>
> > +#include <linux/mutex.h>
> > #include <uapi/linux/pci.h>
> >
> > #include <linux/pci_ids.h>
> > @@ -1533,6 +1534,7 @@ void set_pcie_hotplug_bridge(struct pci_dev *pdev);
> >
> > /* Functions for PCI Hotplug drivers to use */
> > unsigned int pci_rescan_bus(struct pci_bus *bus);
> > +extern struct mutex pci_rescan_remove_lock;
> > void pci_lock_rescan_remove(void);
> > void pci_unlock_rescan_remove(void);
> >
>
> I do see Keith's argument that proliferation of the rescan/remove lock
> is to be minimized. That said, since user's of this header can already
> lock/unlock I don't think this patch makes matters worse. In fact we
> want this patch to be able to add better lockdep asserts so it will
> help against misuse.
>
> With that feel free to add:
>
> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Thanks.
--
Best Regards, Benjamin Block / Linux on IBM Z Kernel Development
IBM Deutschland Research & Development GmbH / https://www.ibm.com/privacy
Vors. Aufs.-R.: Wolfgang Wendt / Geschäftsführung: David Faller
Sitz der Ges.: Ehningen / Registergericht: AmtsG Stuttgart, HRB 243294
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 2/3] PCI: Provide lock guard for pci_rescan_remove_lock
2026-03-12 19:44 ` Niklas Schnelle
@ 2026-03-13 13:32 ` Benjamin Block
0 siblings, 0 replies; 10+ messages in thread
From: Benjamin Block @ 2026-03-13 13:32 UTC (permalink / raw)
To: Niklas Schnelle
Cc: Alexander Gordeev, Gerd Bayer, Bjorn Helgaas, Vasily Gorbik,
Heiko Carstens, Gerald Schaefer, Christian Borntraeger,
Andreas Krebbel, linux-pci, linux-kernel, Ionut Nechita,
Tobias Schumacher, Sven Schnelle, Matthew Rosato, linux-s390,
Julian Ruess, Ionut Nechita, Farhan Ali
On Thu, Mar 12, 2026 at 08:44:02PM +0100, Niklas Schnelle wrote:
> On Wed, 2026-03-11 at 14:27 +0100, Benjamin Block wrote:
> > Make it possible to use guard() or scoped_guard() to lock, and
> > automatically unlock `pci_rescan_remove_lock`.
> >
> > Since the actual mutex `pci_rescan_remove_lock` is always supposed to be
> > taken and released using the functions pci_lock_rescan_remove() and
> > pci_unlock_rescan_remove() it is not possible to simply use the already
> > existing guards for `struct mutex`. Instead define a new guard
> > `pci_rescan_remove` that will also call the functions in question, but
> > is usable via guard() or scoped_guard().
> >
> > Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
> > ---
> > include/linux/pci.h | 3 +++
> > 1 file changed, 3 insertions(+)
> >
--8<--
>
> Looks good to me. Thank you.
>
> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Thanks.
--
Best Regards, Benjamin Block / Linux on IBM Z Kernel Development
IBM Deutschland Research & Development GmbH / https://www.ibm.com/privacy
Vors. Aufs.-R.: Wolfgang Wendt / Geschäftsführung: David Faller
Sitz der Ges.: Ehningen / Registergericht: AmtsG Stuttgart, HRB 243294
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 3/3] s390/pci: Fix circular/recursive deadlocks in PCI-bus and -device release
2026-03-12 21:02 ` Niklas Schnelle
@ 2026-03-13 13:59 ` Benjamin Block
0 siblings, 0 replies; 10+ messages in thread
From: Benjamin Block @ 2026-03-13 13:59 UTC (permalink / raw)
To: Niklas Schnelle
Cc: Alexander Gordeev, Gerd Bayer, Bjorn Helgaas, Vasily Gorbik,
Heiko Carstens, Gerald Schaefer, Christian Borntraeger,
Andreas Krebbel, linux-pci, linux-kernel, Ionut Nechita,
Tobias Schumacher, Sven Schnelle, Matthew Rosato, linux-s390,
Julian Ruess, Ionut Nechita, Farhan Ali
On Thu, Mar 12, 2026 at 10:02:59PM +0100, Niklas Schnelle wrote:
> On Wed, 2026-03-11 at 14:27 +0100, Benjamin Block wrote:
> > When removing PCI device or PCI bus objects there are a couple of
> > call-chains where it is possible that the kernel runs into a circular or
> > recursive deadlock involving taking the central `pci_rescan_remove_lock`.
>
> Now that this depends on Ionut's patch at[0] and exploits the fact that
> the rescan/remove lock can be taken recursively we may want to adjust
> and thereby shorten this description to take this into account dropping
> the recursive locking as deadlock reason.
Yeah, fair, with the dependency, case (B) and (C) don't exit anymore.
> >
> > For example:
> >
> > (A) Thread α receives a CRW event notifying the kernel that a PCI
>
> Nit: Other PCI related code calls these "PCI events" even though
> technically they use the Channel Report Word (CRW) mechanism and thus
> are CRW events.
Eh, sure, I can change that. I guess it heavily depends on where you are
coming from. For someone that mainly dealt with "classic CIO" devices talking
about CRW events that concern PCI devices immediately told me where I have to
look in the architecture and such to figure out what's going on. "PCI event"
not so much.
> > Fixes: 05bc1be6db4b2 ("s390/pci: create zPCI bus")
> > Fixes: ab909509850b2 ("PCI: s390: Fix use-after-free of PCI resources with per-function hotplug")
> > Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
> > ---
> > arch/s390/pci/pci.c | 11 ++++++++---
> > arch/s390/pci/pci_bus.c | 15 ++++++++-------
> > arch/s390/pci/pci_event.c | 28 +++++++++++++++++++---------
> > arch/s390/pci/pci_iov.c | 3 +--
> > arch/s390/pci/pci_sysfs.c | 9 +++------
> > 5 files changed, 39 insertions(+), 27 deletions(-)
> >
>
> I had not previously looked at your commit message from the point of
> view when it is ordered aferr Ionut's patch[0], which it has to for the
> release case. So I ended up with a few comments now even though I
> previously read it multiple times and was just very happy with the nice
> and readable call chain and locking order explanations.
>
> Either way the code itself looks good to me and I also did quite a bit
> of testing with this and the previous versions. Trying hot(un)plug and
> SR-IOV with various PCI devices and also just ran this and the previous
> version on several test systems for a few days doing other testing and
> development work.
>
> I also put quite a bit of effort into trying to modify this and/or
> other s390 PCI code to not require the reentrant behavior of the
> pci_rescan_remove_lock but failed. The really tricky part to me is how
> e.g. a driver can hold on to a struct pci_dev even after it has been
> removed while we have to keep both the struct zpci_dev and struct
> zpci_bus and associated struct pci_bus exactly until the last
> pci_dev_put(). And then needing to remove the PCI bus via
> zpci_bus_release() we need to have the pci_rescan_remove lock but now
> this can happen, in principle, at any pci_dev_put() and that of course
> sometimes happens when we already have the pci_rescan_remove_lock and
> sometimes without it.
>
> So feel free to add my:
>
> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
> Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Thanks.
I'll hold of at re-rolling an other version since we don't have code changes
so far. Hoping for more input. Once I do, I'll adapt the description of this
patch.
--
Best Regards, Benjamin Block / Linux on IBM Z Kernel Development
IBM Deutschland Research & Development GmbH / https://www.ibm.com/privacy
Vors. Aufs.-R.: Wolfgang Wendt / Geschäftsführung: David Faller
Sitz der Ges.: Ehningen / Registergericht: AmtsG Stuttgart, HRB 243294
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2026-03-13 13:59 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-11 13:27 [PATCH v2 0/3] PCI: s390/pci: Fix deadlocks on s390 when releasing zPCI-bus or -device objects Benjamin Block
2026-03-11 13:27 ` [PATCH v2 1/3] PCI: Move declaration of pci_rescan_remove_lock into public pci.h Benjamin Block
2026-03-12 19:41 ` Niklas Schnelle
2026-03-13 13:32 ` Benjamin Block
2026-03-11 13:27 ` [PATCH v2 2/3] PCI: Provide lock guard for pci_rescan_remove_lock Benjamin Block
2026-03-12 19:44 ` Niklas Schnelle
2026-03-13 13:32 ` Benjamin Block
2026-03-11 13:27 ` [PATCH v2 3/3] s390/pci: Fix circular/recursive deadlocks in PCI-bus and -device release Benjamin Block
2026-03-12 21:02 ` Niklas Schnelle
2026-03-13 13:59 ` Benjamin Block
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox