public inbox for linux-s390@vger.kernel.org
* [PATCH v4 0/3] PCI: s390/pci: Fix deadlocks on s390 when releasing zPCI-bus or -device objects
@ 2026-04-22 14:37 Benjamin Block
From: Benjamin Block @ 2026-04-22 14:37 UTC
  To: Bjorn Helgaas
  Cc: Niklas Schnelle, Tobias Schumacher, linux-s390, Heiko Carstens,
	Ionut Nechita, Sven Schnelle, Ionut Nechita, Farhan Ali,
	Alexander Gordeev, Julian Ruess, Andreas Krebbel, Gerd Bayer,
	Vasily Gorbik, linux-pci, linux-kernel, Christian Borntraeger,
	Matthew Rosato, Benjamin Block

v3 -> v4:
    * remove internal tracking ID from patch 03
v2 -> v3:
    * added Reviewed-by and Tested-by from Niklas
    * base series on current version of Ionut's patch series
      "PCI/IOV: Fix SR-IOV locking races and AB-BA deadlock"
      https://lore.kernel.org/linux-pci/cover.1776839248.git.ionut.nechita%40windriver.com/T/#
      to prevent small merge-conflict in patch 01
    * adapted description of patch 03 so it reflects the point that the series
      is now based on Ionut's patch series, and certain deadlocks can't happen
      anymore (recursive), but others still can (the AB-BA cyclic variants)
v1 -> v2:
    * combine patch 02 and 04 - fix and use of guards [Ilpo, Niklas]
    * rephrase description of patch 01 to point out that it is already possible
      today to lock/unlock `pci_rescan_remove_lock` anywhere
    * added Fixes: tags to patch 03 - the fix

We recently found a couple of deadlocks in the s390 architecture PCI
implementation involving hotplug events on our platform. Niklas already
mentioned them in his recent comments on discussions about
`pci_rescan_remove_lock`, here:
https://lore.kernel.org/linux-pci/286d0488aa72b1741f93f900fd5db5c4334a6f50.camel@linux.ibm.com/
and here:
https://lore.kernel.org/linux-pci/2b6a844619892ecaa11031705808667e0886d8b2.camel@linux.ibm.com/

Until now these deadlocks have not been observed, because on s390 it was
unusual to have both a PF and its attached VFs in the same Linux instance.
PCI devices have largely been available either as a PF without SR-IOV, or as
a VF without the PF being visible in the same instance. This left us with
some blind spots w.r.t. the locking issues here.

This is now changing, and with that we started running into these deadlocks.

Please Note:
    This patchset strictly depends on Ionut Nechita's patch that makes
    `pci_lock_rescan_remove()` reentrant:
    https://lore.kernel.org/linux-pci/cover.1776839248.git.ionut.nechita%40windriver.com/T/#

    Since the discussion so far sounded positive towards the change I decided
    to base some of the changes in this patchset on the assumption that his
    patch gets merged before mine. Otherwise there will be recursive deadlocks.

Patch 01 helps us insofar as it enables us to use lockdep annotations in the
         architecture code.
Patch 02 makes it possible to use lock guards for `pci_rescan_remove_lock`.
Patch 03 describes in detail which deadlocks exist today, and fixes them.
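
For readers unfamiliar with lock guards: the kernel's guard helpers in
<linux/cleanup.h> build on the compiler's `cleanup` attribute, so the lock is
dropped automatically on every exit path of a scope. The following userspace
sketch only illustrates that idea and is not code from patch 02. It assumes
GCC or Clang, uses a pthread mutex as a stand-in for `pci_rescan_remove_lock`,
and all names in it (`RESCAN_REMOVE_GUARD()`, `remove_device()`, ...) are
hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace stand-in for pci_rescan_remove_lock. */
static pthread_mutex_t rescan_remove_lock = PTHREAD_MUTEX_INITIALIZER;

/* Take the lock and hand back a pointer for the guard variable. */
static pthread_mutex_t *rescan_remove_acquire(void)
{
	int rc = pthread_mutex_lock(&rescan_remove_lock);

	assert(rc == 0);
	(void)rc;
	return &rescan_remove_lock;
}

/* Cleanup handler: runs when the guard variable leaves its scope. */
static void rescan_remove_release(pthread_mutex_t **lock)
{
	if (*lock)
		pthread_mutex_unlock(*lock);
}

/* Declare a scope-bound guard; the unlock happens automatically. */
#define RESCAN_REMOVE_GUARD()						\
	pthread_mutex_t *rescan_guard_					\
		__attribute__((cleanup(rescan_remove_release), unused)) = \
		rescan_remove_acquire()

/* Returns 0 on success, -1 on error; no explicit unlock on any path. */
static int remove_device(int id)
{
	RESCAN_REMOVE_GUARD();

	if (id < 0)
		return -1;	/* early return: the guard still unlocks */

	/* ... work that needs the lock would go here ... */
	return 0;
}

/* Returns 1 if nobody holds the lock right now, 0 otherwise. */
static int lock_is_free(void)
{
	if (pthread_mutex_trylock(&rescan_remove_lock) != 0)
		return 0;
	pthread_mutex_unlock(&rescan_remove_lock);
	return 1;
}
```

After either return path of `remove_device()`, `lock_is_free()` reports the
lock as released, which is the property that makes guards attractive for
error paths with several early exits.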

I've run a /lot/ of tests with affected PCI adapters:
    * enable/disable SR-IOV on the PF;
    * run FLR reset on PF and VF;
    * run Bus reset on PF and VF;
    * run s390's recover SysFS attribute on PF and VF;
    * remove/re-add PCI devices via the `remove` SysFS attribute;
    * unbind/re-bind PCI devices to the vfio-pci device driver;
    * disable/enable power with the hotplug SysFS attribute on PF and VF;
    * run `zpcictl` with `--reset`/`--reset-fw` on PF and VF;
    * remove/re-add vfio modules with bound PCI devices;
    * run Configure Off and Configure On on both the PF and VF from a Service
      Element.

There are no more deadlocks, and I have not witnessed any other lockdep
warnings.

Benjamin Block (3):
  PCI: Move declaration of pci_rescan_remove_lock into public pci.h
  PCI: Provide lock guard for pci_rescan_remove_lock
  s390/pci: Fix circular/recursive deadlocks in PCI-bus and -device
    release

 arch/s390/pci/pci.c       | 11 ++++++++---
 arch/s390/pci/pci_bus.c   | 15 ++++++++-------
 arch/s390/pci/pci_event.c | 28 +++++++++++++++++++---------
 arch/s390/pci/pci_iov.c   |  3 +--
 arch/s390/pci/pci_sysfs.c |  9 +++------
 drivers/pci/pci.h         |  2 --
 drivers/pci/probe.c       |  1 +
 include/linux/pci.h       |  5 +++++
 8 files changed, 45 insertions(+), 29 deletions(-)


base-commit: 028ef9c96e96197026887c0f092424679298aae8
prerequisite-patch-id: 04db39c9d883c6d06c9b2400fc445c62177f1c5b
prerequisite-patch-id: 68e07de292969a95b72a26153893281558c3eb0d
-- 
2.54.0

