* [PATCHv3 1/4] pci: rename __pci_bus_reset and __pci_slot_reset
2026-02-05 21:25 [PATCHv3 0/4] pci slot reset handling fixes Keith Busch
@ 2026-02-05 21:25 ` Keith Busch
2026-02-06 17:22 ` Keith Busch
2026-02-10 20:44 ` dan.j.williams
2026-02-05 21:25 ` [PATCHv3 2/4] pci: allow all bus devices to use the same slot Keith Busch
` (2 subsequent siblings)
3 siblings, 2 replies; 18+ messages in thread
From: Keith Busch @ 2026-02-05 21:25 UTC (permalink / raw)
To: linux-pci, helgaas
Cc: alex, lukas, dan.j.williams, guojinhui.liam, ilpo.jarvinen,
Keith Busch
From: Keith Busch <kbusch@kernel.org>
Make the code a little easier to navigate with more descriptive
function names. The two renamed functions here "try" to do to a reset,
so make that clear in the name to distinguish them from other similarly
named functions.
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
drivers/pci/pci.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 57a5b205175f1..e00af20ea7376 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5480,7 +5480,7 @@ int pci_probe_reset_slot(struct pci_slot *slot)
EXPORT_SYMBOL_GPL(pci_probe_reset_slot);
/**
- * __pci_reset_slot - Try to reset a PCI slot
+ * pci_try_reset_slot - Try to reset a PCI slot
* @slot: PCI slot to reset
*
* A PCI bus may host multiple slots, each slot may support a reset mechanism
@@ -5494,7 +5494,7 @@ EXPORT_SYMBOL_GPL(pci_probe_reset_slot);
*
* Same as above except return -EAGAIN if the slot cannot be locked
*/
-static int __pci_reset_slot(struct pci_slot *slot)
+static int pci_try_reset_slot(struct pci_slot *slot)
{
int rc;
@@ -5583,12 +5583,12 @@ int pci_probe_reset_bus(struct pci_bus *bus)
EXPORT_SYMBOL_GPL(pci_probe_reset_bus);
/**
- * __pci_reset_bus - Try to reset a PCI bus
+ * pci_try_reset_bus - Try to reset a PCI bus
* @bus: top level PCI bus to reset
*
* Same as above except return -EAGAIN if the bus cannot be locked
*/
-int __pci_reset_bus(struct pci_bus *bus)
+static int pci_try_reset_bus(struct pci_bus *bus)
{
int rc;
@@ -5617,7 +5617,7 @@ int __pci_reset_bus(struct pci_bus *bus)
int pci_reset_bus(struct pci_dev *pdev)
{
return (!pci_probe_reset_slot(pdev->slot)) ?
- __pci_reset_slot(pdev->slot) : __pci_reset_bus(pdev->bus);
+ pci_try_reset_slot(pdev->slot) : pci_try_reset_bus(pdev->bus);
}
EXPORT_SYMBOL_GPL(pci_reset_bus);
--
2.47.3
^ permalink raw reply related [flat|nested] 18+ messages in thread* Re: [PATCHv3 1/4] pci: rename __pci_bus_reset and __pci_slot_reset
2026-02-05 21:25 ` [PATCHv3 1/4] pci: rename __pci_bus_reset and __pci_slot_reset Keith Busch
@ 2026-02-06 17:22 ` Keith Busch
2026-02-10 20:44 ` dan.j.williams
1 sibling, 0 replies; 18+ messages in thread
From: Keith Busch @ 2026-02-06 17:22 UTC (permalink / raw)
To: Keith Busch
Cc: linux-pci, helgaas, alex, lukas, dan.j.williams, guojinhui.liam,
ilpo.jarvinen
On Thu, Feb 05, 2026 at 01:25:30PM -0800, Keith Busch wrote:
> -int __pci_reset_bus(struct pci_bus *bus)
> +static int pci_try_reset_bus(struct pci_bus *bus)
Just fyi, I'm aware this patch should have updated the usage for this
symbold in pci-sysfs.c and pci.h as well to avoid the linking error. I
missed the error because patch 4 gets everything fixed up again.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCHv3 1/4] pci: rename __pci_bus_reset and __pci_slot_reset
2026-02-05 21:25 ` [PATCHv3 1/4] pci: rename __pci_bus_reset and __pci_slot_reset Keith Busch
2026-02-06 17:22 ` Keith Busch
@ 2026-02-10 20:44 ` dan.j.williams
1 sibling, 0 replies; 18+ messages in thread
From: dan.j.williams @ 2026-02-10 20:44 UTC (permalink / raw)
To: Keith Busch, linux-pci, helgaas
Cc: alex, lukas, dan.j.williams, guojinhui.liam, ilpo.jarvinen,
Keith Busch
Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> Make the code a little easier to navigate with more descriptive
> function names. The two renamed functions here "try" to do to a reset,
> so make that clear in the name to distinguish them from other similarly
> named functions.
Looks good,
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCHv3 2/4] pci: allow all bus devices to use the same slot
2026-02-05 21:25 [PATCHv3 0/4] pci slot reset handling fixes Keith Busch
2026-02-05 21:25 ` [PATCHv3 1/4] pci: rename __pci_bus_reset and __pci_slot_reset Keith Busch
@ 2026-02-05 21:25 ` Keith Busch
2026-02-10 20:00 ` dan.j.williams
2026-02-05 21:25 ` [PATCHv3 3/4] pci: remove slot specific lock/unlock and save/restore Keith Busch
2026-02-05 21:25 ` [PATCHv3 4/4] pci: make reset_subordinate hotplug safe Keith Busch
3 siblings, 1 reply; 18+ messages in thread
From: Keith Busch @ 2026-02-05 21:25 UTC (permalink / raw)
To: linux-pci, helgaas
Cc: alex, lukas, dan.j.williams, guojinhui.liam, ilpo.jarvinen,
Keith Busch
From: Keith Busch <kbusch@kernel.org>
A pcie hot plug slot applies to the entire subordinate bus. Thus, pciehp
only allocates a single hotplug slot for the bridge to that bus. The pci
slot, though, would only match to functions on device 0, meaning any
other device beyond that are not matched to any slot even though they
all share the same slot.
Allow a slot to be created to claim all devices on a bus, not just a
matching device. This is done by introducing a sential value, named
PCI_SLOT_ALL_DEVICES, which then has the pci slot match to any device on
the bus. This fixes walking the slot's device list for ports that don't
have device specific slots.
Since 0xff already has special meaning, the chose sentinal value for
this new feature is 0xfe. This will not clash with any actual slot
number since they are limited to 5 bits.
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
drivers/pci/hotplug/pciehp_core.c | 3 ++-
drivers/pci/slot.c | 22 ++++++++++++++++++----
include/linux/pci.h | 8 +++++++-
3 files changed, 27 insertions(+), 6 deletions(-)
diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
index f59baa9129709..d80346d567049 100644
--- a/drivers/pci/hotplug/pciehp_core.c
+++ b/drivers/pci/hotplug/pciehp_core.c
@@ -79,7 +79,8 @@ static int init_slot(struct controller *ctrl)
snprintf(name, SLOT_NAME_SIZE, "%u", PSN(ctrl));
retval = pci_hp_initialize(&ctrl->hotplug_slot,
- ctrl->pcie->port->subordinate, 0, name);
+ ctrl->pcie->port->subordinate,
+ PCI_SLOT_ALL_DEVICES, name);
if (retval) {
ctrl_err(ctrl, "pci_hp_initialize failed: error %d\n", retval);
kfree(ops);
diff --git a/drivers/pci/slot.c b/drivers/pci/slot.c
index 50fb3eb595fe6..647f6d2cceef6 100644
--- a/drivers/pci/slot.c
+++ b/drivers/pci/slot.c
@@ -41,6 +41,10 @@ static ssize_t address_read_file(struct pci_slot *slot, char *buf)
return sysfs_emit(buf, "%04x:%02x\n",
pci_domain_nr(slot->bus),
slot->bus->number);
+ if (slot->number == PCI_SLOT_ALL_DEVICES)
+ return sysfs_emit(buf, "%04x:%02x:00\n",
+ pci_domain_nr(slot->bus),
+ slot->bus->number);
return sysfs_emit(buf, "%04x:%02x:%02x\n",
pci_domain_nr(slot->bus),
@@ -73,7 +77,8 @@ static void pci_slot_release(struct kobject *kobj)
down_read(&pci_bus_sem);
list_for_each_entry(dev, &slot->bus->devices, bus_list)
- if (PCI_SLOT(dev->devfn) == slot->number)
+ if (slot->number == PCI_SLOT_ALL_DEVICES ||
+ PCI_SLOT(dev->devfn) == slot->number)
dev->slot = NULL;
up_read(&pci_bus_sem);
@@ -166,7 +171,8 @@ void pci_dev_assign_slot(struct pci_dev *dev)
mutex_lock(&pci_slot_mutex);
list_for_each_entry(slot, &dev->bus->slots, list)
- if (PCI_SLOT(dev->devfn) == slot->number)
+ if (slot->number == PCI_SLOT_ALL_DEVICES ||
+ PCI_SLOT(dev->devfn) == slot->number)
dev->slot = slot;
mutex_unlock(&pci_slot_mutex);
}
@@ -188,7 +194,7 @@ static struct pci_slot *get_slot(struct pci_bus *parent, int slot_nr)
/**
* pci_create_slot - create or increment refcount for physical PCI slot
* @parent: struct pci_bus of parent bridge
- * @slot_nr: PCI_SLOT(pci_dev->devfn) or -1 for placeholder
+ * @slot_nr: PCI_SLOT(pci_dev->devfn), -1 for placeholder, or PCI_SLOT_ALL_DEVICES
* @name: user visible string presented in /sys/bus/pci/slots/<name>
* @hotplug: set if caller is hotplug driver, NULL otherwise
*
@@ -222,6 +228,13 @@ static struct pci_slot *get_slot(struct pci_bus *parent, int slot_nr)
* consist solely of a dddd:bb tuple, where dddd is the PCI domain of the
* %struct pci_bus and bb is the bus number. In other words, the devfn of
* the 'placeholder' slot will not be displayed.
+ *
+ * Bus-wide slots:
+ * For PCIe hotplug, the physical slot encompasses the entire subordinate
+ * bus, not just a single device number. Pass @slot_nr == PCI_SLOT_ALL_DEVICES
+ * to create a slot that matches all devices on the bus. Unlike placeholder
+ * slots, bus-wide slots go through normal slot lookup and reuse existing
+ * slots if present.
*/
struct pci_slot *pci_create_slot(struct pci_bus *parent, int slot_nr,
const char *name,
@@ -285,7 +298,8 @@ struct pci_slot *pci_create_slot(struct pci_bus *parent, int slot_nr,
down_read(&pci_bus_sem);
list_for_each_entry(dev, &parent->devices, bus_list)
- if (PCI_SLOT(dev->devfn) == slot_nr)
+ if (slot_nr == PCI_SLOT_ALL_DEVICES ||
+ PCI_SLOT(dev->devfn) == slot_nr)
dev->slot = slot;
up_read(&pci_bus_sem);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 864775651c6fa..d9f85025d97f5 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -72,12 +72,18 @@
/* return bus from PCI devid = ((u16)bus_number) << 8) | devfn */
#define PCI_BUS_NUM(x) (((x) >> 8) & 0xff)
+/*
+ * PCI_SLOT_ALL_DEVICES indicates a slot that covers all devices on the bus.
+ * Used for PCIe hotplug where the physical slot is the entire subordinate bus.
+ */
+#define PCI_SLOT_ALL_DEVICES 0xfe
+
/* pci_slot represents a physical slot */
struct pci_slot {
struct pci_bus *bus; /* Bus this slot is on */
struct list_head list; /* Node in list of slots */
struct hotplug_slot *hotplug; /* Hotplug info (move here) */
- unsigned char number; /* PCI_SLOT(pci_dev->devfn) */
+ unsigned char number; /* Device nr, or PCI_SLOT_ALL_DEVICES */
struct kobject kobj;
};
--
2.47.3
^ permalink raw reply related [flat|nested] 18+ messages in thread* Re: [PATCHv3 2/4] pci: allow all bus devices to use the same slot
2026-02-05 21:25 ` [PATCHv3 2/4] pci: allow all bus devices to use the same slot Keith Busch
@ 2026-02-10 20:00 ` dan.j.williams
2026-02-10 20:28 ` Keith Busch
0 siblings, 1 reply; 18+ messages in thread
From: dan.j.williams @ 2026-02-10 20:00 UTC (permalink / raw)
To: Keith Busch, linux-pci, helgaas
Cc: alex, lukas, dan.j.williams, guojinhui.liam, ilpo.jarvinen,
Keith Busch
Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> A pcie hot plug slot applies to the entire subordinate bus. Thus, pciehp
> only allocates a single hotplug slot for the bridge to that bus. The pci
> slot, though, would only match to functions on device 0, meaning any
> other device beyond that are not matched to any slot even though they
> all share the same slot.
>
> Allow a slot to be created to claim all devices on a bus, not just a
> matching device. This is done by introducing a sential value, named
s/sential/sentinel/
> PCI_SLOT_ALL_DEVICES, which then has the pci slot match to any device on
> the bus. This fixes walking the slot's device list for ports that don't
> have device specific slots.
>
> Since 0xff already has special meaning, the chose sentinal value for
s/sentinal/sentinel/
> this new feature is 0xfe. This will not clash with any actual slot
> number since they are limited to 5 bits.
Can you say a bit more about why other hotplug controller drivers can
continue to get away with a single device number as a hotplug slot
target? For example, would you expect that the ACPI hotplug driver is
broken if an ARI device is added?
It looks like the goal of this sentinel is to turn all
pci_try_reset_slot() into pci_try_reset_bus() only for the native PCIe
hotplug controller, or did I miss a detail?
>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
> drivers/pci/hotplug/pciehp_core.c | 3 ++-
> drivers/pci/slot.c | 22 ++++++++++++++++++----
> include/linux/pci.h | 8 +++++++-
> 3 files changed, 27 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
> index f59baa9129709..d80346d567049 100644
> --- a/drivers/pci/hotplug/pciehp_core.c
> +++ b/drivers/pci/hotplug/pciehp_core.c
> @@ -79,7 +79,8 @@ static int init_slot(struct controller *ctrl)
> snprintf(name, SLOT_NAME_SIZE, "%u", PSN(ctrl));
>
> retval = pci_hp_initialize(&ctrl->hotplug_slot,
> - ctrl->pcie->port->subordinate, 0, name);
> + ctrl->pcie->port->subordinate,
> + PCI_SLOT_ALL_DEVICES, name);
> if (retval) {
> ctrl_err(ctrl, "pci_hp_initialize failed: error %d\n", retval);
> kfree(ops);
> diff --git a/drivers/pci/slot.c b/drivers/pci/slot.c
> index 50fb3eb595fe6..647f6d2cceef6 100644
> --- a/drivers/pci/slot.c
> +++ b/drivers/pci/slot.c
> @@ -41,6 +41,10 @@ static ssize_t address_read_file(struct pci_slot *slot, char *buf)
> return sysfs_emit(buf, "%04x:%02x\n",
> pci_domain_nr(slot->bus),
> slot->bus->number);
> + if (slot->number == PCI_SLOT_ALL_DEVICES)
> + return sysfs_emit(buf, "%04x:%02x:00\n",
> + pci_domain_nr(slot->bus),
> + slot->bus->number);
Why not treat the PCI_SLOT_ALL_DEVICES case the same as the 0xff case?
>
> return sysfs_emit(buf, "%04x:%02x:%02x\n",
> pci_domain_nr(slot->bus),
> @@ -73,7 +77,8 @@ static void pci_slot_release(struct kobject *kobj)
>
> down_read(&pci_bus_sem);
> list_for_each_entry(dev, &slot->bus->devices, bus_list)
> - if (PCI_SLOT(dev->devfn) == slot->number)
> + if (slot->number == PCI_SLOT_ALL_DEVICES ||
> + PCI_SLOT(dev->devfn) == slot->number)
Maybe add a pci_slot_matches() helper for this pattern?
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCHv3 2/4] pci: allow all bus devices to use the same slot
2026-02-10 20:00 ` dan.j.williams
@ 2026-02-10 20:28 ` Keith Busch
2026-02-10 20:51 ` dan.j.williams
0 siblings, 1 reply; 18+ messages in thread
From: Keith Busch @ 2026-02-10 20:28 UTC (permalink / raw)
To: dan.j.williams
Cc: Keith Busch, linux-pci, helgaas, alex, lukas, guojinhui.liam,
ilpo.jarvinen
On Tue, Feb 10, 2026 at 12:00:56PM -0800, dan.j.williams@intel.com wrote:
> Keith Busch wrote:
>
> > this new feature is 0xfe. This will not clash with any actual slot
> > number since they are limited to 5 bits.
>
> Can you say a bit more about why other hotplug controller drivers can
> continue to get away with a single device number as a hotplug slot
> target? For example, would you expect that the ACPI hotplug driver is
> broken if an ARI device is added?
Right, I don't think acpiphp would work correctly for such a device. I
didn't change that out an abundance of caution to not introduce
regressions to platforms I can't readily test.
I've limited this patch to pciehp since that's the only hotplug driver
that hard-codes the "devnr" value to 0, and ensures there's only one
pciehp "hotplug_slot" per bridge. pciehp is uniquely clear that this
patch is safe.
The remaining hotplug_slot users derive devnr and can make additional
hotplug_slots for each device on the bus, so I can't tell if it is safe
to make the others subscribe to this proposed behavior.
> It looks like the goal of this sentinel is to turn all
> pci_try_reset_slot() into pci_try_reset_bus() only for the native PCIe
> hotplug controller, or did I miss a detail?
Yes, pretty much that. Since every device on the subordinate bus belongs
to the same pciehp slot now, we'll get the same locking, notification,
and state restore as what pci_try_reset_bus() does.
> > @@ -41,6 +41,10 @@ static ssize_t address_read_file(struct pci_slot *slot, char *buf)
> > return sysfs_emit(buf, "%04x:%02x\n",
> > pci_domain_nr(slot->bus),
> > slot->bus->number);
> > + if (slot->number == PCI_SLOT_ALL_DEVICES)
> > + return sysfs_emit(buf, "%04x:%02x:00\n",
> > + pci_domain_nr(slot->bus),
> > + slot->bus->number);
>
> Why not treat the PCI_SLOT_ALL_DEVICES case the same as the 0xff case?
Just preserving backward compatibility with what this file would show
for pciehp slots prior to this patch.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCHv3 2/4] pci: allow all bus devices to use the same slot
2026-02-10 20:28 ` Keith Busch
@ 2026-02-10 20:51 ` dan.j.williams
0 siblings, 0 replies; 18+ messages in thread
From: dan.j.williams @ 2026-02-10 20:51 UTC (permalink / raw)
To: Keith Busch, dan.j.williams
Cc: Keith Busch, linux-pci, helgaas, alex, lukas, guojinhui.liam,
ilpo.jarvinen
Keith Busch wrote:
> On Tue, Feb 10, 2026 at 12:00:56PM -0800, dan.j.williams@intel.com wrote:
> > Keith Busch wrote:
> >
> > > this new feature is 0xfe. This will not clash with any actual slot
> > > number since they are limited to 5 bits.
> >
> > Can you say a bit more about why other hotplug controller drivers can
> > continue to get away with a single device number as a hotplug slot
> > target? For example, would you expect that the ACPI hotplug driver is
> > broken if an ARI device is added?
>
> Right, I don't think acpiphp would work correctly for such a device. I
> didn't change that out an abundance of caution to not introduce
> regressions to platforms I can't readily test.
>
> I've limited this patch to pciehp since that's the only hotplug driver
> that hard-codes the "devnr" value to 0, and ensures there's only one
> pciehp "hotplug_slot" per bridge. pciehp is uniquely clear that this
> patch is safe.
>
> The remaining hotplug_slot users derive devnr and can make additional
> hotplug_slots for each device on the bus, so I can't tell if it is safe
> to make the others subscribe to this proposed behavior.
Ok, I figured as much, maybe just put these notes in the changelog.
> > It looks like the goal of this sentinel is to turn all
> > pci_try_reset_slot() into pci_try_reset_bus() only for the native PCIe
> > hotplug controller, or did I miss a detail?
>
> Yes, pretty much that. Since every device on the subordinate bus belongs
> to the same pciehp slot now, we'll get the same locking, notification,
> and state restore as what pci_try_reset_bus() does.
>
> > > @@ -41,6 +41,10 @@ static ssize_t address_read_file(struct pci_slot *slot, char *buf)
> > > return sysfs_emit(buf, "%04x:%02x\n",
> > > pci_domain_nr(slot->bus),
> > > slot->bus->number);
> > > + if (slot->number == PCI_SLOT_ALL_DEVICES)
> > > + return sysfs_emit(buf, "%04x:%02x:00\n",
> > > + pci_domain_nr(slot->bus),
> > > + slot->bus->number);
> >
> > Why not treat the PCI_SLOT_ALL_DEVICES case the same as the 0xff case?
>
> Just preserving backward compatibility with what this file would show
> for pciehp slots prior to this patch.
Makes sense, perhaps a comment:
/*
* Preserve legacy ABI expectations that hotplug drivers that manage
* multiple devices per slot emit 0 for the device number.
*/
...just to remember that this is due to caution.
With that you can add:
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCHv3 3/4] pci: remove slot specific lock/unlock and save/restore
2026-02-05 21:25 [PATCHv3 0/4] pci slot reset handling fixes Keith Busch
2026-02-05 21:25 ` [PATCHv3 1/4] pci: rename __pci_bus_reset and __pci_slot_reset Keith Busch
2026-02-05 21:25 ` [PATCHv3 2/4] pci: allow all bus devices to use the same slot Keith Busch
@ 2026-02-05 21:25 ` Keith Busch
2026-02-10 22:03 ` dan.j.williams
2026-02-10 23:46 ` Alex Williamson
2026-02-05 21:25 ` [PATCHv3 4/4] pci: make reset_subordinate hotplug safe Keith Busch
3 siblings, 2 replies; 18+ messages in thread
From: Keith Busch @ 2026-02-05 21:25 UTC (permalink / raw)
To: linux-pci, helgaas
Cc: alex, lukas, dan.j.williams, guojinhui.liam, ilpo.jarvinen,
Keith Busch
From: Keith Busch <kbusch@kernel.org>
The Linux pci driver resolves a "slot" to the "D" in the B:D.f (see
PCI_SLOT()). A pcie "slot reset" is a secondary bus reset, which affects
every function on every "D", not just the ones with a matching "slot".
The slot lock/unlock and save/restore functions, however, are only
handling a subset of the functions, breaking the rest.
ARI devices with more than 8 functions fail because their state is not
properly handled, nor is the attached driver notified of the reset. In
the best case, the device will appear unresponsive to the driver,
resulting in unexpected errors. A worse possibility may panic the kernel
if in flight transactions trigger hardware reported errors like this
real observation:
vfio-pci 0000:01:00.0: resetting
vfio-pci 0000:01:00.0: reset done
{1}[Hardware Error]: Error 1, type: fatal
{1}[Hardware Error]: section_type: PCIe error
{1}[Hardware Error]: port_type: 0, PCIe end point
{1}[Hardware Error]: version: 0.2
{1}[Hardware Error]: command: 0x0140, status: 0x0010
{1}[Hardware Error]: device_id: 0000:01:01.0
{1}[Hardware Error]: slot: 0
{1}[Hardware Error]: secondary_bus: 0x00
{1}[Hardware Error]: vendor_id: 0x1d9b, device_id: 0x0207
{1}[Hardware Error]: class_code: 020000
{1}[Hardware Error]: bridge: secondary_status: 0x0000, control: 0x0000
{1}[Hardware Error]: aer_cor_status: 0x00008000, aer_cor_mask: 0x00002000
{1}[Hardware Error]: aer_uncor_status: 0x00010000, aer_uncor_mask: 0x00100000
{1}[Hardware Error]: aer_uncor_severity: 0x006f6030
{1}[Hardware Error]: TLP Header: 0a412800 00192080 60000004 00000004
GHES: Fatal hardware error but panic disabled
Kernel panic - not syncing: GHES: Fatal hardware error
Fix this by properly locking and notifying the entire affected bus
topology, not just specific matching slots. For architectures that
support "slot" specific resets, this patch potentially introduces an
insignificant amount of overhead, but is otherwise harmless.
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
drivers/pci/pci.c | 147 ++++------------------------------------------
1 file changed, 11 insertions(+), 136 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e00af20ea7376..df9ed73dad416 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5287,96 +5287,6 @@ static int pci_bus_trylock(struct pci_bus *bus)
return 0;
}
-/* Do any devices on or below this slot prevent a bus reset? */
-static bool pci_slot_resettable(struct pci_slot *slot)
-{
- struct pci_dev *dev, *bridge = slot->bus->self;
-
- if (bridge && (bridge->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET))
- return false;
-
- list_for_each_entry(dev, &slot->bus->devices, bus_list) {
- if (!dev->slot || dev->slot != slot)
- continue;
- if (dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET ||
- (dev->subordinate && !pci_bus_resettable(dev->subordinate)))
- return false;
- }
-
- return true;
-}
-
-/* Lock devices from the top of the tree down */
-static void pci_slot_lock(struct pci_slot *slot)
-{
- struct pci_dev *dev, *bridge = slot->bus->self;
-
- if (bridge)
- pci_dev_lock(bridge);
-
- list_for_each_entry(dev, &slot->bus->devices, bus_list) {
- if (!dev->slot || dev->slot != slot)
- continue;
- if (dev->subordinate)
- pci_bus_lock(dev->subordinate);
- else
- pci_dev_lock(dev);
- }
-}
-
-/* Unlock devices from the bottom of the tree up */
-static void pci_slot_unlock(struct pci_slot *slot)
-{
- struct pci_dev *dev, *bridge = slot->bus->self;
-
- list_for_each_entry(dev, &slot->bus->devices, bus_list) {
- if (!dev->slot || dev->slot != slot)
- continue;
- if (dev->subordinate)
- pci_bus_unlock(dev->subordinate);
- else
- pci_dev_unlock(dev);
- }
-
- if (bridge)
- pci_dev_unlock(bridge);
-}
-
-/* Return 1 on successful lock, 0 on contention */
-static int pci_slot_trylock(struct pci_slot *slot)
-{
- struct pci_dev *dev, *bridge = slot->bus->self;
-
- if (bridge && !pci_dev_trylock(bridge))
- return 0;
-
- list_for_each_entry(dev, &slot->bus->devices, bus_list) {
- if (!dev->slot || dev->slot != slot)
- continue;
- if (dev->subordinate) {
- if (!pci_bus_trylock(dev->subordinate))
- goto unlock;
- } else if (!pci_dev_trylock(dev))
- goto unlock;
- }
- return 1;
-
-unlock:
- list_for_each_entry_continue_reverse(dev,
- &slot->bus->devices, bus_list) {
- if (!dev->slot || dev->slot != slot)
- continue;
- if (dev->subordinate)
- pci_bus_unlock(dev->subordinate);
- else
- pci_dev_unlock(dev);
- }
-
- if (bridge)
- pci_dev_unlock(bridge);
- return 0;
-}
-
/*
* Save and disable devices from the top of the tree down while holding
* the @dev mutex lock for the entire tree.
@@ -5410,59 +5320,23 @@ static void pci_bus_restore_locked(struct pci_bus *bus)
}
}
-/*
- * Save and disable devices from the top of the tree down while holding
- * the @dev mutex lock for the entire tree.
- */
-static void pci_slot_save_and_disable_locked(struct pci_slot *slot)
-{
- struct pci_dev *dev;
-
- list_for_each_entry(dev, &slot->bus->devices, bus_list) {
- if (!dev->slot || dev->slot != slot)
- continue;
- pci_dev_save_and_disable(dev);
- if (dev->subordinate)
- pci_bus_save_and_disable_locked(dev->subordinate);
- }
-}
-
-/*
- * Restore devices from top of the tree down while holding @dev mutex lock
- * for the entire tree. Parent bridges need to be restored before we can
- * get to subordinate devices.
- */
-static void pci_slot_restore_locked(struct pci_slot *slot)
-{
- struct pci_dev *dev;
-
- list_for_each_entry(dev, &slot->bus->devices, bus_list) {
- if (!dev->slot || dev->slot != slot)
- continue;
- pci_dev_restore(dev);
- if (dev->subordinate) {
- pci_bridge_wait_for_secondary_bus(dev, "slot reset");
- pci_bus_restore_locked(dev->subordinate);
- }
- }
-}
-
static int pci_slot_reset(struct pci_slot *slot, bool probe)
{
+ struct pci_bus *bus = slot ? slot->bus : NULL;
int rc;
- if (!slot || !pci_slot_resettable(slot))
+ if (!slot || !bus || !pci_bus_resettable(bus))
return -ENOTTY;
if (!probe)
- pci_slot_lock(slot);
+ pci_bus_lock(bus);
might_sleep();
rc = pci_reset_hotplug_slot(slot->hotplug, probe);
if (!probe)
- pci_slot_unlock(slot);
+ pci_bus_unlock(bus);
return rc;
}
@@ -5489,25 +5363,26 @@ EXPORT_SYMBOL_GPL(pci_probe_reset_slot);
* wrap the bus reset to avoid spurious slot related events such as hotplug.
* Generally a slot reset should be attempted before a bus reset. All of the
* function of the slot and any subordinate buses behind the slot are reset
- * through this function. PCI config space of all devices in the slot and
- * behind the slot is saved before and restored after reset.
+ * through this function. PCI config space of all devices below the slot bus
+ * are saved before and restored after reset.
*
* Same as above except return -EAGAIN if the slot cannot be locked
*/
static int pci_try_reset_slot(struct pci_slot *slot)
{
+ struct pci_bus *bus = slot->bus;
int rc;
rc = pci_slot_reset(slot, PCI_RESET_PROBE);
if (rc)
return rc;
- if (pci_slot_trylock(slot)) {
- pci_slot_save_and_disable_locked(slot);
+ if (pci_bus_trylock(bus)) {
+ pci_bus_save_and_disable_locked(bus);
might_sleep();
rc = pci_reset_hotplug_slot(slot->hotplug, PCI_RESET_DO_RESET);
- pci_slot_restore_locked(slot);
- pci_slot_unlock(slot);
+ pci_bus_restore_locked(bus);
+ pci_bus_unlock(bus);
} else
rc = -EAGAIN;
--
2.47.3
^ permalink raw reply related [flat|nested] 18+ messages in thread* Re: [PATCHv3 3/4] pci: remove slot specific lock/unlock and save/restore
2026-02-05 21:25 ` [PATCHv3 3/4] pci: remove slot specific lock/unlock and save/restore Keith Busch
@ 2026-02-10 22:03 ` dan.j.williams
2026-02-10 23:25 ` Keith Busch
2026-02-10 23:46 ` Alex Williamson
1 sibling, 1 reply; 18+ messages in thread
From: dan.j.williams @ 2026-02-10 22:03 UTC (permalink / raw)
To: Keith Busch, linux-pci, helgaas
Cc: alex, lukas, dan.j.williams, guojinhui.liam, ilpo.jarvinen,
Keith Busch
Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> The Linux pci driver resolves a "slot" to the "D" in the B:D.f (see
> PCI_SLOT()). A pcie "slot reset" is a secondary bus reset, which affects
Maybe change "pci" and "pcie" to "pciehp" above to make it clear this problem is
specific to the native PCIe hotplug driver?
> every function on every "D", not just the ones with a matching "slot".
> The slot lock/unlock and save/restore functions, however, are only
> handling a subset of the functions, breaking the rest.
>
> ARI devices with more than 8 functions fail because their state is not
> properly handled, nor is the attached driver notified of the reset. In
> the best case, the device will appear unresponsive to the driver,
> resulting in unexpected errors. A worse possibility may panic the kernel
> if in flight transactions trigger hardware reported errors like this
> real observation:
>
> vfio-pci 0000:01:00.0: resetting
> vfio-pci 0000:01:00.0: reset done
> {1}[Hardware Error]: Error 1, type: fatal
> {1}[Hardware Error]: section_type: PCIe error
> {1}[Hardware Error]: port_type: 0, PCIe end point
> {1}[Hardware Error]: version: 0.2
> {1}[Hardware Error]: command: 0x0140, status: 0x0010
> {1}[Hardware Error]: device_id: 0000:01:01.0
> {1}[Hardware Error]: slot: 0
> {1}[Hardware Error]: secondary_bus: 0x00
> {1}[Hardware Error]: vendor_id: 0x1d9b, device_id: 0x0207
> {1}[Hardware Error]: class_code: 020000
> {1}[Hardware Error]: bridge: secondary_status: 0x0000, control: 0x0000
> {1}[Hardware Error]: aer_cor_status: 0x00008000, aer_cor_mask: 0x00002000
> {1}[Hardware Error]: aer_uncor_status: 0x00010000, aer_uncor_mask: 0x00100000
> {1}[Hardware Error]: aer_uncor_severity: 0x006f6030
> {1}[Hardware Error]: TLP Header: 0a412800 00192080 60000004 00000004
> GHES: Fatal hardware error but panic disabled
> Kernel panic - not syncing: GHES: Fatal hardware error
>
> Fix this by properly locking and notifying the entire affected bus
> topology, not just specific matching slots. For architectures that
> support "slot" specific resets, this patch potentially introduces an
> insignificant amount of overhead, but is otherwise harmless.
>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
> drivers/pci/pci.c | 147 ++++------------------------------------------
> 1 file changed, 11 insertions(+), 136 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index e00af20ea7376..df9ed73dad416 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
[..]
> @@ -5489,25 +5363,26 @@ EXPORT_SYMBOL_GPL(pci_probe_reset_slot);
> * wrap the bus reset to avoid spurious slot related events such as hotplug.
> * Generally a slot reset should be attempted before a bus reset. All of the
> * function of the slot and any subordinate buses behind the slot are reset
> - * through this function. PCI config space of all devices in the slot and
> - * behind the slot is saved before and restored after reset.
> + * through this function. PCI config space of all devices below the slot bus
> + * are saved before and restored after reset.
> *
> * Same as above except return -EAGAIN if the slot cannot be locked
> */
> static int pci_try_reset_slot(struct pci_slot *slot)
> {
> + struct pci_bus *bus = slot->bus;
Might ->bus be NULL here?
With that clarified / handled you can add:
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCHv3 3/4] pci: remove slot specific lock/unlock and save/restore
2026-02-10 22:03 ` dan.j.williams
@ 2026-02-10 23:25 ` Keith Busch
2026-02-10 23:48 ` dan.j.williams
0 siblings, 1 reply; 18+ messages in thread
From: Keith Busch @ 2026-02-10 23:25 UTC (permalink / raw)
To: dan.j.williams
Cc: Keith Busch, linux-pci, helgaas, alex, lukas, guojinhui.liam,
ilpo.jarvinen
On Tue, Feb 10, 2026 at 02:03:54PM -0800, dan.j.williams@intel.com wrote:
> Keith Busch wrote:
> > From: Keith Busch <kbusch@kernel.org>
> >
> > The Linux pci driver resolves a "slot" to the "D" in the B:D.f (see
> > PCI_SLOT()). A pcie "slot reset" is a secondary bus reset, which affects
>
> Maybe change "pci" and "pcie" to "pciehp" above to make it clear this problem is
> specific to the native PCIe hotplug driver?
Good point, and given that, should I just drop this patch? The previous
one fixes up this slot assignment issue for pciehp, so we could consider
taking either "2" or "3", but not necessarily both. I included both in
this series because I'm not sure if there is a preference. My best
reasoning to take both is if we want to proactively handle the entire
bus for all platforms requesting a slot reset, but also leave us an easy
revert if it causes a problem.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCHv3 3/4] pci: remove slot specific lock/unlock and save/restore
2026-02-10 23:25 ` Keith Busch
@ 2026-02-10 23:48 ` dan.j.williams
0 siblings, 0 replies; 18+ messages in thread
From: dan.j.williams @ 2026-02-10 23:48 UTC (permalink / raw)
To: Keith Busch, dan.j.williams
Cc: Keith Busch, linux-pci, helgaas, alex, lukas, guojinhui.liam,
ilpo.jarvinen
Keith Busch wrote:
> On Tue, Feb 10, 2026 at 02:03:54PM -0800, dan.j.williams@intel.com wrote:
> > Keith Busch wrote:
> > > From: Keith Busch <kbusch@kernel.org>
> > >
> > > The Linux pci driver resolves a "slot" to the "D" in the B:D.f (see
> > > PCI_SLOT()). A pcie "slot reset" is a secondary bus reset, which affects
> >
> > Maybe change "pci" and "pcie" to "pciehp" above to make it clear this problem is
> > specific to the native PCIe hotplug driver?
>
> Good point, and given that, should I just drop this patch? The previous
> one fixes up this slot assignment issue for pciehp, so we could consider
> taking either "2" or "3", but not necessarily both. I included both in
> this series because I'm not sure if there is a preference. My best
> reasoning to take both is if we want to proactively handle the entire
> bus for all platforms requesting a slot reset, but also leave us an easy
> revert if it causes a problem.
My only comment there would be that the bug fix that happened in "2" was
subtle. So I would say at a minimum clarify "2" as a fix for pciehp and
that "3" is a opportunistic fix for other reset scenarios. ...if I
understand the incremental value of "3" correctly.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCHv3 3/4] pci: remove slot specific lock/unlock and save/restore
2026-02-05 21:25 ` [PATCHv3 3/4] pci: remove slot specific lock/unlock and save/restore Keith Busch
2026-02-10 22:03 ` dan.j.williams
@ 2026-02-10 23:46 ` Alex Williamson
2026-02-11 0:12 ` Keith Busch
1 sibling, 1 reply; 18+ messages in thread
From: Alex Williamson @ 2026-02-10 23:46 UTC (permalink / raw)
To: Keith Busch
Cc: linux-pci, helgaas, lukas, dan.j.williams, guojinhui.liam,
ilpo.jarvinen, Keith Busch
On Thu, 5 Feb 2026 13:25:32 -0800
Keith Busch <kbusch@meta.com> wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> The Linux pci driver resolves a "slot" to the "D" in the B:D.f (see
> PCI_SLOT()). A pcie "slot reset" is a secondary bus reset, which affects
> every function on every "D", not just the ones with a matching "slot".
> The slot lock/unlock and save/restore functions, however, are only
> handling a subset of the functions, breaking the rest.
So are we deprecating conventional PCI hotplug slots? I've always
understood these to be a subset of the bus, where we find the devices
sharing the same physical PCI slot by comparing the pci_dev.slot across
the bus, as done in all the code removed here. pci_create_slot()
certainly is not simply a reflection of PCI_SLOT(). Is the error below
really a reflection that pci_dev.slot isn't carried over to the full
set of functions? Thanks,
Alex
> ARI devices with more than 8 functions fail because their state is not
> properly handled, nor is the attached driver notified of the reset. In
> the best case, the device will appear unresponsive to the driver,
> resulting in unexpected errors. A worse possibility may panic the kernel
> if in flight transactions trigger hardware reported errors like this
> real observation:
>
> vfio-pci 0000:01:00.0: resetting
> vfio-pci 0000:01:00.0: reset done
> {1}[Hardware Error]: Error 1, type: fatal
> {1}[Hardware Error]: section_type: PCIe error
> {1}[Hardware Error]: port_type: 0, PCIe end point
> {1}[Hardware Error]: version: 0.2
> {1}[Hardware Error]: command: 0x0140, status: 0x0010
> {1}[Hardware Error]: device_id: 0000:01:01.0
> {1}[Hardware Error]: slot: 0
> {1}[Hardware Error]: secondary_bus: 0x00
> {1}[Hardware Error]: vendor_id: 0x1d9b, device_id: 0x0207
> {1}[Hardware Error]: class_code: 020000
> {1}[Hardware Error]: bridge: secondary_status: 0x0000, control: 0x0000
> {1}[Hardware Error]: aer_cor_status: 0x00008000, aer_cor_mask: 0x00002000
> {1}[Hardware Error]: aer_uncor_status: 0x00010000, aer_uncor_mask: 0x00100000
> {1}[Hardware Error]: aer_uncor_severity: 0x006f6030
> {1}[Hardware Error]: TLP Header: 0a412800 00192080 60000004 00000004
> GHES: Fatal hardware error but panic disabled
> Kernel panic - not syncing: GHES: Fatal hardware error
>
> Fix this by properly locking and notifying the entire affected bus
> topology, not just specific matching slots. For architectures that
> support "slot" specific resets, this patch potentially introduces an
> insignificant amount of overhead, but is otherwise harmless.
>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
> drivers/pci/pci.c | 147 ++++------------------------------------------
> 1 file changed, 11 insertions(+), 136 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index e00af20ea7376..df9ed73dad416 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5287,96 +5287,6 @@ static int pci_bus_trylock(struct pci_bus *bus)
> return 0;
> }
>
> -/* Do any devices on or below this slot prevent a bus reset? */
> -static bool pci_slot_resettable(struct pci_slot *slot)
> -{
> - struct pci_dev *dev, *bridge = slot->bus->self;
> -
> - if (bridge && (bridge->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET))
> - return false;
> -
> - list_for_each_entry(dev, &slot->bus->devices, bus_list) {
> - if (!dev->slot || dev->slot != slot)
> - continue;
> - if (dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET ||
> - (dev->subordinate && !pci_bus_resettable(dev->subordinate)))
> - return false;
> - }
> -
> - return true;
> -}
> -
> -/* Lock devices from the top of the tree down */
> -static void pci_slot_lock(struct pci_slot *slot)
> -{
> - struct pci_dev *dev, *bridge = slot->bus->self;
> -
> - if (bridge)
> - pci_dev_lock(bridge);
> -
> - list_for_each_entry(dev, &slot->bus->devices, bus_list) {
> - if (!dev->slot || dev->slot != slot)
> - continue;
> - if (dev->subordinate)
> - pci_bus_lock(dev->subordinate);
> - else
> - pci_dev_lock(dev);
> - }
> -}
> -
> -/* Unlock devices from the bottom of the tree up */
> -static void pci_slot_unlock(struct pci_slot *slot)
> -{
> - struct pci_dev *dev, *bridge = slot->bus->self;
> -
> - list_for_each_entry(dev, &slot->bus->devices, bus_list) {
> - if (!dev->slot || dev->slot != slot)
> - continue;
> - if (dev->subordinate)
> - pci_bus_unlock(dev->subordinate);
> - else
> - pci_dev_unlock(dev);
> - }
> -
> - if (bridge)
> - pci_dev_unlock(bridge);
> -}
> -
> -/* Return 1 on successful lock, 0 on contention */
> -static int pci_slot_trylock(struct pci_slot *slot)
> -{
> - struct pci_dev *dev, *bridge = slot->bus->self;
> -
> - if (bridge && !pci_dev_trylock(bridge))
> - return 0;
> -
> - list_for_each_entry(dev, &slot->bus->devices, bus_list) {
> - if (!dev->slot || dev->slot != slot)
> - continue;
> - if (dev->subordinate) {
> - if (!pci_bus_trylock(dev->subordinate))
> - goto unlock;
> - } else if (!pci_dev_trylock(dev))
> - goto unlock;
> - }
> - return 1;
> -
> -unlock:
> - list_for_each_entry_continue_reverse(dev,
> - &slot->bus->devices, bus_list) {
> - if (!dev->slot || dev->slot != slot)
> - continue;
> - if (dev->subordinate)
> - pci_bus_unlock(dev->subordinate);
> - else
> - pci_dev_unlock(dev);
> - }
> -
> - if (bridge)
> - pci_dev_unlock(bridge);
> - return 0;
> -}
> -
> /*
> * Save and disable devices from the top of the tree down while holding
> * the @dev mutex lock for the entire tree.
> @@ -5410,59 +5320,23 @@ static void pci_bus_restore_locked(struct pci_bus *bus)
> }
> }
>
> -/*
> - * Save and disable devices from the top of the tree down while holding
> - * the @dev mutex lock for the entire tree.
> - */
> -static void pci_slot_save_and_disable_locked(struct pci_slot *slot)
> -{
> - struct pci_dev *dev;
> -
> - list_for_each_entry(dev, &slot->bus->devices, bus_list) {
> - if (!dev->slot || dev->slot != slot)
> - continue;
> - pci_dev_save_and_disable(dev);
> - if (dev->subordinate)
> - pci_bus_save_and_disable_locked(dev->subordinate);
> - }
> -}
> -
> -/*
> - * Restore devices from top of the tree down while holding @dev mutex lock
> - * for the entire tree. Parent bridges need to be restored before we can
> - * get to subordinate devices.
> - */
> -static void pci_slot_restore_locked(struct pci_slot *slot)
> -{
> - struct pci_dev *dev;
> -
> - list_for_each_entry(dev, &slot->bus->devices, bus_list) {
> - if (!dev->slot || dev->slot != slot)
> - continue;
> - pci_dev_restore(dev);
> - if (dev->subordinate) {
> - pci_bridge_wait_for_secondary_bus(dev, "slot reset");
> - pci_bus_restore_locked(dev->subordinate);
> - }
> - }
> -}
> -
> static int pci_slot_reset(struct pci_slot *slot, bool probe)
> {
> + struct pci_bus *bus = slot ? slot->bus : NULL;
> int rc;
>
> - if (!slot || !pci_slot_resettable(slot))
> + if (!slot || !bus || !pci_bus_resettable(bus))
> return -ENOTTY;
>
> if (!probe)
> - pci_slot_lock(slot);
> + pci_bus_lock(bus);
>
> might_sleep();
>
> rc = pci_reset_hotplug_slot(slot->hotplug, probe);
>
> if (!probe)
> - pci_slot_unlock(slot);
> + pci_bus_unlock(bus);
>
> return rc;
> }
> @@ -5489,25 +5363,26 @@ EXPORT_SYMBOL_GPL(pci_probe_reset_slot);
> * wrap the bus reset to avoid spurious slot related events such as hotplug.
> * Generally a slot reset should be attempted before a bus reset. All of the
> * function of the slot and any subordinate buses behind the slot are reset
> - * through this function. PCI config space of all devices in the slot and
> - * behind the slot is saved before and restored after reset.
> + * through this function. PCI config space of all devices below the slot bus
> + * are saved before and restored after reset.
> *
> * Same as above except return -EAGAIN if the slot cannot be locked
> */
> static int pci_try_reset_slot(struct pci_slot *slot)
> {
> + struct pci_bus *bus = slot->bus;
> int rc;
>
> rc = pci_slot_reset(slot, PCI_RESET_PROBE);
> if (rc)
> return rc;
>
> - if (pci_slot_trylock(slot)) {
> - pci_slot_save_and_disable_locked(slot);
> + if (pci_bus_trylock(bus)) {
> + pci_bus_save_and_disable_locked(bus);
> might_sleep();
> rc = pci_reset_hotplug_slot(slot->hotplug, PCI_RESET_DO_RESET);
> - pci_slot_restore_locked(slot);
> - pci_slot_unlock(slot);
> + pci_bus_restore_locked(bus);
> + pci_bus_unlock(bus);
> } else
> rc = -EAGAIN;
>
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCHv3 3/4] pci: remove slot specific lock/unlock and save/restore
2026-02-10 23:46 ` Alex Williamson
@ 2026-02-11 0:12 ` Keith Busch
2026-02-11 15:22 ` Alex Williamson
0 siblings, 1 reply; 18+ messages in thread
From: Keith Busch @ 2026-02-11 0:12 UTC (permalink / raw)
To: Alex Williamson
Cc: Keith Busch, linux-pci, helgaas, lukas, dan.j.williams,
guojinhui.liam, ilpo.jarvinen
On Tue, Feb 10, 2026 at 04:46:23PM -0700, Alex Williamson wrote:
> On Thu, 5 Feb 2026 13:25:32 -0800
> Keith Busch <kbusch@meta.com> wrote:
>
> > From: Keith Busch <kbusch@kernel.org>
> >
> > The Linux pci driver resolves a "slot" to the "D" in the B:D.f (see
> > PCI_SLOT()). A pcie "slot reset" is a secondary bus reset, which affects
> > every function on every "D", not just the ones with a matching "slot".
> > The slot lock/unlock and save/restore functions, however, are only
> > handling a subset of the functions, breaking the rest.
>
> So are we deprecating conventional PCI hotplug slots?
No, this still calls the slot's specific reset callback via
pci_reset_hotplug_slot. The only thing this does is add locking, config
state save/restore, and driver reset prepare/complete calls for the
entire bus rather than matching "slots" on that bus. In the worst case
scenario, this patch has the kernel do a little more work than it needed
to.
> I've always
> understood these to be a subset of the bus, where we find the devices
> sharing the same physical PCI slot by comparing the pci_dev.slot across
> the bus, as done in all the code removed here. pci_create_slot()
> certainly is not simply a reflection of PCI_SLOT().
pci_create_slot assigns devices that match PCI_SLOT() of the bus's
devices, so it certainly is a reflection of that, no? The problem is
that PCI_SLOT() macro doesn't actually reflect the true nature of what
devices belong in the slot: the devices in a particular "slot" does not
define the actual blast radius of resetting that "slot".
> Is the error below
> really a reflection that pci_dev.slot isn't carried over to the full
> set of functions?
Correct, the pci slot today in a pciehp hotplug slot is not getting
assigned to all the functions in that slot, so slot resets completely
miss handling those functions, which inevitably break them.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCHv3 3/4] pci: remove slot specific lock/unlock and save/restore
2026-02-11 0:12 ` Keith Busch
@ 2026-02-11 15:22 ` Alex Williamson
2026-02-11 15:54 ` Keith Busch
0 siblings, 1 reply; 18+ messages in thread
From: Alex Williamson @ 2026-02-11 15:22 UTC (permalink / raw)
To: Keith Busch
Cc: Keith Busch, linux-pci, helgaas, lukas, dan.j.williams,
guojinhui.liam, ilpo.jarvinen
On Tue, 10 Feb 2026 17:12:13 -0700
Keith Busch <kbusch@kernel.org> wrote:
> On Tue, Feb 10, 2026 at 04:46:23PM -0700, Alex Williamson wrote:
> > On Thu, 5 Feb 2026 13:25:32 -0800
> > Keith Busch <kbusch@meta.com> wrote:
> >
> > > From: Keith Busch <kbusch@kernel.org>
> > >
> > > The Linux pci driver resolves a "slot" to the "D" in the B:D.f (see
> > > PCI_SLOT()). A pcie "slot reset" is a secondary bus reset, which affects
> > > every function on every "D", not just the ones with a matching "slot".
> > > The slot lock/unlock and save/restore functions, however, are only
> > > handling a subset of the functions, breaking the rest.
> >
> > So are we deprecating conventional PCI hotplug slots?
>
> No, this still calls the slot's specific reset callback via
> pci_reset_hotplug_slot. The only thing this does is add locking, config
> state save/restore, and driver reset prepare/complete calls for the
> entire bus rather than matching "slots" on that bus. In the worst case
> scenario, this patch has the kernel do a little more work than it needed
> to.
But with this change, when vfio-pci calls pci_probe_reset_slot() we
test pci_bus_resettable() rather than pci_slot_resettable(). The
former is a superset of the latter. Another slot on the same bus might
have a bridge card or quirked device that previously wouldn't have
affected the ability to reset a separate physical, conventional slot.
The risk is small, but the semantics are different.
> > I've always
> > understood these to be a subset of the bus, where we find the devices
> > sharing the same physical PCI slot by comparing the pci_dev.slot across
> > the bus, as done in all the code removed here. pci_create_slot()
> > certainly is not simply a reflection of PCI_SLOT().
>
> pci_create_slot assigns devices that match PCI_SLOT() of the bus's
> devices, so it certainly is a reflection of that, no? The problem is
> that PCI_SLOT() macro doesn't actually reflect the true nature of what
> devices belong in the slot: the devices in a particular "slot" does not
> define the actual blast radius of resetting that "slot".
struct pci_slot, the thing created by pci_create_slot(), represents a
physical slot. PCI_SLOT() is the bus addressing. Not all pci_devs
have a struct pci_slot, not all pci_devs within the same physical slot
share the same B:D.F slot number (with ARI).
The vfio code does assume that 'slot' defines the blast radius of a
pci_slot_reset() and I'm confused how it should operate if we're now
going to conflate slot and bus reset scopes.
> > Is the error below
> > really a reflection that pci_dev.slot isn't carried over to the full
> > set of functions?
>
> Correct, the pci slot today in a pciehp hotplug slot is not getting
> assigned to all the functions in that slot, so slot resets completely
> miss handling those functions, which inevitably break them.
Isn't this then a pciehp hotplug issue that all children of the bus
should be linked via the struct pci_slot regardless of PCI_SLOT()?
Thanks,
Alex
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCHv3 3/4] pci: remove slot specific lock/unlock and save/restore
2026-02-11 15:22 ` Alex Williamson
@ 2026-02-11 15:54 ` Keith Busch
0 siblings, 0 replies; 18+ messages in thread
From: Keith Busch @ 2026-02-11 15:54 UTC (permalink / raw)
To: Alex Williamson
Cc: Keith Busch, linux-pci, helgaas, lukas, dan.j.williams,
guojinhui.liam, ilpo.jarvinen
On Wed, Feb 11, 2026 at 08:22:29AM -0700, Alex Williamson wrote:
> On Tue, 10 Feb 2026 17:12:13 -0700
> Keith Busch <kbusch@kernel.org> wrote:
> > No, this still calls the slot's specific reset callback via
> > pci_reset_hotplug_slot. The only thing this does is add locking, config
> > state save/restore, and driver reset prepare/complete calls for the
> > entire bus rather than matching "slots" on that bus. In the worst case
> > scenario, this patch has the kernel do a little more work than it needed
> > to.
>
> But with this change, when vfio-pci calls pci_probe_reset_slot() we
> test pci_bus_resettable() rather than pci_slot_resettable(). The
> former is a superset of the latter. Another slot on the same bus might
> have a bridge card or quirked device that previously wouldn't have
> affected the ability to reset a separate physical, conventional slot.
> The risk is small, but the semantics are different.
Thanks, good point.
> > > I've always
> > > understood these to be a subset of the bus, where we find the devices
> > > sharing the same physical PCI slot by comparing the pci_dev.slot across
> > > the bus, as done in all the code removed here. pci_create_slot()
> > > certainly is not simply a reflection of PCI_SLOT().
> >
> > pci_create_slot assigns devices that match PCI_SLOT() of the bus's
> > devices, so it certainly is a reflection of that, no? The problem is
> > that PCI_SLOT() macro doesn't actually reflect the true nature of what
> > devices belong in the slot: the devices in a particular "slot" does not
> > define the actual blast radius of resetting that "slot".
>
> struct pci_slot, the thing created by pci_create_slot(), represents a
> physical slot. PCI_SLOT() is the bus addressing. Not all pci_devs
> have a struct pci_slot, not all pci_devs within the same physical slot
> share the same B:D.F slot number (with ARI).
>
> The vfio code does assume that 'slot' defines the blast radius of a
> pci_slot_reset() and I'm confused how it should operate if we're now
> going to conflate slot and bus reset scopes.
Yes, the "struct pci_slot" is that represetation. The problem is that
its current form is missing semantics to bind to functions that don't
match PCI_SLOT(). I don't know why PCI_SLOT is that criteria, but it's
clearly the wrong criteria for pciehp, probably others too if they
support ARI devices. The previous patch in this series creates entirely
new semantics specifically for pciehp and fixes that issue. It also
makes this patch unnecessary, but *only* for pciehp.
> > > Is the error below
> > > really a reflection that pci_dev.slot isn't carried over to the full
> > > set of functions?
> >
> > Correct, the pci slot today in a pciehp hotplug slot is not getting
> > assigned to all the functions in that slot, so slot resets completely
> > miss handling those functions, which inevitably break them.
>
> Isn't this then a pciehp hotplug issue that all children of the bus
> should be linked via the struct pci_slot regardless of PCI_SLOT()?
It's not really a pciehp issue because pci_slot doesn't make it possible
to do anything else. See [PATCH 2/4] from this series.
In light of these concerns, I think the safest way forward is to drop
this patch and just enable something like what patch 2 provides.
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCHv3 4/4] pci: make reset_subordinate hotplug safe
2026-02-05 21:25 [PATCHv3 0/4] pci slot reset handling fixes Keith Busch
` (2 preceding siblings ...)
2026-02-05 21:25 ` [PATCHv3 3/4] pci: remove slot specific lock/unlock and save/restore Keith Busch
@ 2026-02-05 21:25 ` Keith Busch
2026-02-10 22:14 ` dan.j.williams
3 siblings, 1 reply; 18+ messages in thread
From: Keith Busch @ 2026-02-05 21:25 UTC (permalink / raw)
To: linux-pci, helgaas
Cc: alex, lukas, dan.j.williams, guojinhui.liam, ilpo.jarvinen,
Keith Busch
From: Keith Busch <kbusch@kernel.org>
Use the slot reset method when resetting the bridge if the bus contains
hot plug slots. This fixes spurious hot plug events that are triggered
by the secondary bus reset that bypasses the slot's detection disabling.
Resetting a bridge's subordinate bus can be done like this:
# echo 1 > /sys/bus/pci/devices/0000:50:01.0/reset_subordinate
Prior to this patch, an example kernel message may show something like:
pcieport 0000:50:01.0: pciehp: Slot(40): Link Down
With this change, the pciehp driver ignores the link event during the
reset, so may show this message instead:
pcieport 0000:50:01.0: pciehp: Slot(40): Link Down/Up ignored
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
drivers/pci/pci-sysfs.c | 3 +-
drivers/pci/pci.c | 71 ++++++++++++++++++++++++++++-------------
drivers/pci/pci.h | 2 +-
3 files changed, 50 insertions(+), 26 deletions(-)
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index c2df915ad2d29..039d3dc157006 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -553,7 +553,6 @@ static ssize_t reset_subordinate_store(struct device *dev,
const char *buf, size_t count)
{
struct pci_dev *pdev = to_pci_dev(dev);
- struct pci_bus *bus = pdev->subordinate;
unsigned long val;
if (!capable(CAP_SYS_ADMIN))
@@ -563,7 +562,7 @@ static ssize_t reset_subordinate_store(struct device *dev,
return -EINVAL;
if (val) {
- int ret = __pci_reset_bus(bus);
+ int ret = pci_try_reset_bridge(pdev);
if (ret)
return ret;
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index df9ed73dad416..27817c6603a5c 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -51,6 +51,10 @@ unsigned int pci_pm_d3hot_delay;
static void pci_pme_list_scan(struct work_struct *work);
+#define PCI_SLOT_RESET_MASKED true
+#define PCI_SLOT_RESET_UNMASKED false
+static int pci_reset_bridge(struct pci_dev *bridge, bool save);
+
static LIST_HEAD(pci_pme_list);
static DEFINE_MUTEX(pci_pme_list_mutex);
static DECLARE_DELAYED_WORK(pci_pme_work, pci_pme_list_scan);
@@ -5420,29 +5424,7 @@ static int pci_bus_reset(struct pci_bus *bus, bool probe)
*/
int pci_bus_error_reset(struct pci_dev *bridge)
{
- struct pci_bus *bus = bridge->subordinate;
- struct pci_slot *slot;
-
- if (!bus)
- return -ENOTTY;
-
- mutex_lock(&pci_slot_mutex);
- if (list_empty(&bus->slots))
- goto bus_reset;
-
- list_for_each_entry(slot, &bus->slots, list)
- if (pci_probe_reset_slot(slot))
- goto bus_reset;
-
- list_for_each_entry(slot, &bus->slots, list)
- if (pci_slot_reset(slot, PCI_RESET_DO_RESET))
- goto bus_reset;
-
- mutex_unlock(&pci_slot_mutex);
- return 0;
-bus_reset:
- mutex_unlock(&pci_slot_mutex);
- return pci_bus_reset(bridge->subordinate, PCI_RESET_DO_RESET);
+ return pci_reset_bridge(bridge, PCI_SLOT_RESET_UNMASKED);
}
/**
@@ -5483,6 +5465,49 @@ static int pci_try_reset_bus(struct pci_bus *bus)
return rc;
}
+static int pci_reset_bridge(struct pci_dev *bridge, bool masked)
+{
+ struct pci_bus *bus = bridge->subordinate;
+ struct pci_slot *slot;
+
+ if (!bus)
+ return -ENOTTY;
+
+ mutex_lock(&pci_slot_mutex);
+ if (list_empty(&bus->slots))
+ goto bus_reset;
+
+ list_for_each_entry(slot, &bus->slots, list)
+ if (pci_probe_reset_slot(slot))
+ goto bus_reset;
+
+ list_for_each_entry(slot, &bus->slots, list) {
+ int ret;
+
+ if (masked)
+ ret = pci_try_reset_slot(slot);
+ else
+ ret = pci_slot_reset(slot, PCI_RESET_DO_RESET);
+
+ if (ret)
+ goto bus_reset;
+ }
+
+ mutex_unlock(&pci_slot_mutex);
+ return 0;
+bus_reset:
+ mutex_unlock(&pci_slot_mutex);
+
+ if (masked)
+ return pci_try_reset_bus(bus);
+ return pci_bus_reset(bridge->subordinate, PCI_RESET_DO_RESET);
+}
+
+int pci_try_reset_bridge(struct pci_dev *bridge)
+{
+ return pci_reset_bridge(bridge, PCI_SLOT_RESET_MASKED);
+}
+
/**
* pci_reset_bus - Try to reset a PCI bus
* @pdev: top level PCI device to reset via slot/bus
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 0e67014aa0013..790f2350b890c 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -197,7 +197,7 @@ bool pci_reset_supported(struct pci_dev *dev);
void pci_init_reset_methods(struct pci_dev *dev);
int pci_bridge_secondary_bus_reset(struct pci_dev *dev);
int pci_bus_error_reset(struct pci_dev *dev);
-int __pci_reset_bus(struct pci_bus *bus);
+int pci_try_reset_bridge(struct pci_dev *dev);
struct pci_cap_saved_data {
u16 cap_nr;
--
2.47.3
^ permalink raw reply related [flat|nested] 18+ messages in thread* Re: [PATCHv3 4/4] pci: make reset_subordinate hotplug safe
2026-02-05 21:25 ` [PATCHv3 4/4] pci: make reset_subordinate hotplug safe Keith Busch
@ 2026-02-10 22:14 ` dan.j.williams
0 siblings, 0 replies; 18+ messages in thread
From: dan.j.williams @ 2026-02-10 22:14 UTC (permalink / raw)
To: Keith Busch, linux-pci, helgaas
Cc: alex, lukas, dan.j.williams, guojinhui.liam, ilpo.jarvinen,
Keith Busch
Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> Use the slot reset method when resetting the bridge if the bus contains
> hot plug slots. This fixes spurious hot plug events that are triggered
> by the secondary bus reset that bypasses the slot's detection disabling.
>
> Resetting a bridge's subordinate bus can be done like this:
>
> # echo 1 > /sys/bus/pci/devices/0000:50:01.0/reset_subordinate
>
> Prior to this patch, an example kernel message may show something like:
>
> pcieport 0000:50:01.0: pciehp: Slot(40): Link Down
>
> With this change, the pciehp driver ignores the link event during the
> reset, so may show this message instead:
>
> pcieport 0000:50:01.0: pciehp: Slot(40): Link Down/Up ignored
>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
> drivers/pci/pci-sysfs.c | 3 +-
> drivers/pci/pci.c | 71 ++++++++++++++++++++++++++++-------------
> drivers/pci/pci.h | 2 +-
> 3 files changed, 50 insertions(+), 26 deletions(-)
>
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index c2df915ad2d29..039d3dc157006 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -553,7 +553,6 @@ static ssize_t reset_subordinate_store(struct device *dev,
> const char *buf, size_t count)
> {
> struct pci_dev *pdev = to_pci_dev(dev);
> - struct pci_bus *bus = pdev->subordinate;
> unsigned long val;
>
> if (!capable(CAP_SYS_ADMIN))
> @@ -563,7 +562,7 @@ static ssize_t reset_subordinate_store(struct device *dev,
> return -EINVAL;
>
> if (val) {
> - int ret = __pci_reset_bus(bus);
> + int ret = pci_try_reset_bridge(pdev);
>
> if (ret)
> return ret;
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index df9ed73dad416..27817c6603a5c 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -51,6 +51,10 @@ unsigned int pci_pm_d3hot_delay;
>
> static void pci_pme_list_scan(struct work_struct *work);
>
> +#define PCI_SLOT_RESET_MASKED true
> +#define PCI_SLOT_RESET_UNMASKED false
> +static int pci_reset_bridge(struct pci_dev *bridge, bool save);
> +
> static LIST_HEAD(pci_pme_list);
> static DEFINE_MUTEX(pci_pme_list_mutex);
> static DECLARE_DELAYED_WORK(pci_pme_work, pci_pme_list_scan);
> @@ -5420,29 +5424,7 @@ static int pci_bus_reset(struct pci_bus *bus, bool probe)
> */
> int pci_bus_error_reset(struct pci_dev *bridge)
> {
> - struct pci_bus *bus = bridge->subordinate;
> - struct pci_slot *slot;
> -
> - if (!bus)
> - return -ENOTTY;
> -
> - mutex_lock(&pci_slot_mutex);
> - if (list_empty(&bus->slots))
> - goto bus_reset;
> -
> - list_for_each_entry(slot, &bus->slots, list)
> - if (pci_probe_reset_slot(slot))
> - goto bus_reset;
> -
> - list_for_each_entry(slot, &bus->slots, list)
> - if (pci_slot_reset(slot, PCI_RESET_DO_RESET))
> - goto bus_reset;
> -
> - mutex_unlock(&pci_slot_mutex);
> - return 0;
> -bus_reset:
> - mutex_unlock(&pci_slot_mutex);
> - return pci_bus_reset(bridge->subordinate, PCI_RESET_DO_RESET);
> + return pci_reset_bridge(bridge, PCI_SLOT_RESET_UNMASKED);
> }
>
> /**
> @@ -5483,6 +5465,49 @@ static int pci_try_reset_bus(struct pci_bus *bus)
> return rc;
> }
>
> +static int pci_reset_bridge(struct pci_dev *bridge, bool masked)
Perhaps a comment clarifying that @masked refers to hotplug events, or a
comment to say "see PCI_SLOT_RESET_MASKED", and put a clarification
there.
> +{
> + struct pci_bus *bus = bridge->subordinate;
> + struct pci_slot *slot;
> +
> + if (!bus)
> + return -ENOTTY;
> +
> + mutex_lock(&pci_slot_mutex);
> + if (list_empty(&bus->slots))
> + goto bus_reset;
> +
> + list_for_each_entry(slot, &bus->slots, list)
> + if (pci_probe_reset_slot(slot))
> + goto bus_reset;
> +
> + list_for_each_entry(slot, &bus->slots, list) {
> + int ret;
> +
> + if (masked)
> + ret = pci_try_reset_slot(slot);
> + else
> + ret = pci_slot_reset(slot, PCI_RESET_DO_RESET);
> +
> + if (ret)
> + goto bus_reset;
> + }
> +
> + mutex_unlock(&pci_slot_mutex);
This feels like it wants a helper with a guard() inside to avoid goto,
but I can be convinced that are already too PCI reset helpers. This
otherwise looks ok.
With the @masked comment fixed up at a minimum you can add:
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
^ permalink raw reply [flat|nested] 18+ messages in thread