* [PATCH v20 0/4] [PCI] Error recovery for vfio-pci devices on s390x
@ 2026-06-22 17:18 Farhan Ali
2026-06-22 17:18 ` [PATCH v20 1/4] PCI: Allow per function PCI slots to fix slot reset on s390 Farhan Ali
` (3 more replies)
0 siblings, 4 replies; 13+ messages in thread
From: Farhan Ali @ 2026-06-22 17:18 UTC (permalink / raw)
To: linux-s390, linux-kernel, linux-pci
Cc: helgaas, alex, alifm, schnelle, mjrosato
Hi Bjorn,
This patch set includes only the PCI patches of the original series for
error recovery for vfio-pci devices on s390x [1]. Breaking up the patch
series into PCI and VFIO only patches to make merging easier based on
discussion with Alex [2].
Thanks
Farhan
[1] https://lore.kernel.org/all/20260520171113.1111-1-alifm@linux.ibm.com/
[2] https://lore.kernel.org/all/20260602163344.1eda12d2@shazbot.org/
ChangeLog
---------
v19 https://lore.kernel.org/all/20260615183524.2880-1-alifm@linux.ibm.com/
v19 -> v20
- Unconditionally enable Memory bit while restoring MSI-X (patch 4).
Fixes an issue found with sashiko.
v18 https://lore.kernel.org/all/20260603181647.2215-1-alifm@linux.ibm.com/
v18 -> v19
- Move config space accessible check to pcie_flr() function (based on
discussion of Sashiko review)
- Fix a gap in MSI-X restoration (patch 4).
- Rebase on 7.1-rc7
v17 -> v18
- Rebase on 7.1-rc6.
Farhan Ali (4):
PCI: Allow per function PCI slots to fix slot reset on s390
PCI: Avoid saving config space state if inaccessible
PCI: Fail FLR when config space is inaccessible
PCI/MSI: Enable memory decoding before restoring MSI-X messages
drivers/pci/hotplug/rpaphp_slot.c | 2 +-
drivers/pci/msi/msi.c | 4 ++++
drivers/pci/pci.c | 32 ++++++++++++++++++++++++++++--
drivers/pci/slot.c | 33 +++++++++++++++++++++++--------
include/linux/pci.h | 8 ++++++--
5 files changed, 66 insertions(+), 13 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 13+ messages in thread* [PATCH v20 1/4] PCI: Allow per function PCI slots to fix slot reset on s390 2026-06-22 17:18 [PATCH v20 0/4] [PCI] Error recovery for vfio-pci devices on s390x Farhan Ali @ 2026-06-22 17:18 ` Farhan Ali 2026-06-22 17:32 ` sashiko-bot 2026-06-22 17:18 ` [PATCH v20 2/4] PCI: Avoid saving config space state if inaccessible Farhan Ali ` (2 subsequent siblings) 3 siblings, 1 reply; 13+ messages in thread From: Farhan Ali @ 2026-06-22 17:18 UTC (permalink / raw) To: linux-s390, linux-kernel, linux-pci Cc: helgaas, alex, alifm, schnelle, mjrosato, stable On s390 systems, which use a machine level hypervisor, PCI devices are always accessed through a form of PCI pass-through which fundamentally operates on a per PCI function granularity. This is also reflected in the s390 PCI hotplug driver which creates hotplug slots for individual PCI functions. Its reset_slot() function, which is a wrapper for zpci_hot_reset_device(), thus also resets individual functions. Currently, the kernel's PCI_SLOT() macro assigns the same pci_slot object to multifunction devices. This approach worked fine on s390 systems that only exposed virtual functions as individual PCI domains to the operating system. Since commit 44510d6fa0c0 ("s390/pci: Handling multifunctions") s390 supports exposing the topology of multifunction PCI devices by grouping them in a shared PCI domain. This creates a problem when resetting a function through the hotplug driver's slot_reset() interface. When attempting to reset a function through the hotplug driver, the shared slot assignment causes the wrong function to be reset instead of the intended one. It also leaks memory as we do create a pci_slot object for the function, but don't correctly free it in pci_slot_release(). Add a flag for struct pci_slot to allow per function PCI slots for functions managed through a hypervisor, which exposes individual PCI functions while retaining the topology. Since we can use all 8 bits for slot 'number' (for ARI devices), change slot 'number' u16 to account for special values -1 and PCI_SLOT_ALL_DEVICES. Fixes: 44510d6fa0c0 ("s390/pci: Handling multifunctions") Cc: stable@vger.kernel.org Suggested-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Farhan Ali <alifm@linux.ibm.com> --- drivers/pci/hotplug/rpaphp_slot.c | 2 +- drivers/pci/pci.c | 5 +++-- drivers/pci/slot.c | 33 +++++++++++++++++++++++-------- include/linux/pci.h | 8 ++++++-- 4 files changed, 35 insertions(+), 13 deletions(-) diff --git a/drivers/pci/hotplug/rpaphp_slot.c b/drivers/pci/hotplug/rpaphp_slot.c index 67362e5b9971..92eabf5f61b9 100644 --- a/drivers/pci/hotplug/rpaphp_slot.c +++ b/drivers/pci/hotplug/rpaphp_slot.c @@ -84,7 +84,7 @@ int rpaphp_register_slot(struct slot *slot) struct hotplug_slot *php_slot = &slot->hotplug_slot; u32 my_index; int retval; - int slotno = -1; + int slotno = PCI_SLOT_PLACEHOLDER; dbg("%s registering slot:path[%pOF] index[%x], name[%s] pdomain[%x] type[%d]\n", __func__, slot->dn, slot->index, slot->name, diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index d34266651ad0..f5f8291482b0 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -4865,8 +4865,9 @@ static int pci_reset_hotplug_slot(struct hotplug_slot *hotplug, bool probe) static int pci_dev_reset_slot_function(struct pci_dev *dev, bool probe) { - if (dev->multifunction || dev->subordinate || !dev->slot || - dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET) + if (dev->subordinate || !dev->slot || + dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET || + (dev->multifunction && !dev->slot->per_func_slot)) return -ENOTTY; return pci_reset_hotplug_slot(dev->slot->hotplug, probe); diff --git a/drivers/pci/slot.c b/drivers/pci/slot.c index 6d5cd37bfb1e..894d6213ed30 100644 --- a/drivers/pci/slot.c +++ b/drivers/pci/slot.c @@ -37,7 +37,7 @@ static const struct sysfs_ops pci_slot_sysfs_ops = { static ssize_t address_read_file(struct pci_slot *slot, char *buf) { - if (slot->number == 0xff) + if (slot->number == (u16)PCI_SLOT_PLACEHOLDER) return sysfs_emit(buf, "%04x:%02x\n", pci_domain_nr(slot->bus), slot->bus->number); @@ -72,6 +72,23 @@ static ssize_t cur_speed_read_file(struct pci_slot *slot, char *buf) return bus_speed_read(slot->bus->cur_bus_speed, buf); } +static bool pci_dev_matches_slot(struct pci_dev *dev, struct pci_slot *slot) +{ + if (slot->per_func_slot) + return dev->devfn == slot->number; + + return slot->number == PCI_SLOT_ALL_DEVICES || + PCI_SLOT(dev->devfn) == slot->number; +} + +static bool pci_slot_enabled_per_func(void) +{ + if (IS_ENABLED(CONFIG_S390)) + return true; + + return false; +} + static void pci_slot_release(struct kobject *kobj) { struct pci_dev *dev; @@ -82,8 +99,7 @@ static void pci_slot_release(struct kobject *kobj) down_read(&pci_bus_sem); list_for_each_entry(dev, &slot->bus->devices, bus_list) - if (slot->number == PCI_SLOT_ALL_DEVICES || - PCI_SLOT(dev->devfn) == slot->number) + if (pci_dev_matches_slot(dev, slot)) dev->slot = NULL; up_read(&pci_bus_sem); @@ -187,8 +203,7 @@ void pci_dev_assign_slot(struct pci_dev *dev) mutex_lock(&pci_slot_mutex); list_for_each_entry(slot, &dev->bus->slots, list) - if (slot->number == PCI_SLOT_ALL_DEVICES || - PCI_SLOT(dev->devfn) == slot->number) + if (pci_dev_matches_slot(dev, slot)) dev->slot = slot; mutex_unlock(&pci_slot_mutex); } @@ -267,7 +282,7 @@ struct pci_slot *pci_create_slot(struct pci_bus *parent, int slot_nr, mutex_lock(&pci_slot_mutex); - if (slot_nr == -1) + if (slot_nr == PCI_SLOT_PLACEHOLDER) goto placeholder; /* @@ -298,6 +313,9 @@ struct pci_slot *pci_create_slot(struct pci_bus *parent, int slot_nr, slot->bus = pci_bus_get(parent); slot->number = slot_nr; + if (pci_slot_enabled_per_func()) + slot->per_func_slot = 1; + slot->kobj.kset = pci_slots_kset; slot_name = make_slot_name(name); @@ -318,8 +336,7 @@ struct pci_slot *pci_create_slot(struct pci_bus *parent, int slot_nr, down_read(&pci_bus_sem); list_for_each_entry(dev, &parent->devices, bus_list) - if (slot_nr == PCI_SLOT_ALL_DEVICES || - PCI_SLOT(dev->devfn) == slot_nr) + if (pci_dev_matches_slot(dev, slot)) dev->slot = slot; up_read(&pci_bus_sem); diff --git a/include/linux/pci.h b/include/linux/pci.h index 2c4454583c11..d58982aa8730 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -78,14 +78,18 @@ * and, if ARI Forwarding is enabled, functions may appear to be on multiple * devices. */ -#define PCI_SLOT_ALL_DEVICES 0xfe +#define PCI_SLOT_ALL_DEVICES 0xfeff + +/* Used to identify a slot as a placeholder */ +#define PCI_SLOT_PLACEHOLDER -1 /* pci_slot represents a physical slot */ struct pci_slot { struct pci_bus *bus; /* Bus this slot is on */ struct list_head list; /* Node in list of slots */ struct hotplug_slot *hotplug; /* Hotplug info (move here) */ - unsigned char number; /* Device nr, or PCI_SLOT_ALL_DEVICES */ + u16 number; /* Device nr, or PCI_SLOT_ALL_DEVICES */ + unsigned int per_func_slot:1; /* Allow per function slot */ struct kobject kobj; }; -- 2.43.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v20 1/4] PCI: Allow per function PCI slots to fix slot reset on s390 2026-06-22 17:18 ` [PATCH v20 1/4] PCI: Allow per function PCI slots to fix slot reset on s390 Farhan Ali @ 2026-06-22 17:32 ` sashiko-bot 0 siblings, 0 replies; 13+ messages in thread From: sashiko-bot @ 2026-06-22 17:32 UTC (permalink / raw) To: Farhan Ali Cc: linux-pci, Alexander Gordeev, linux-s390, Vasily Gorbik, Heiko Carstens Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider: New issues: - [High] Unconditional `per_func_slot` on s390 breaks generic PCI hotplug drivers and allows destructive cross-function link resets. - [Medium] The sysfs `address` attribute outputs malformed PCI addresses for per-function slots. Pre-existing issues: - [High] `pci_slot_release` modifies the bus slots list without holding `pci_slot_mutex` when released asynchronously via sysfs, causing list corruption. - [High] Lockless access to `dev->slot` in `pci_dev_reset_slot_function` causes a Use-After-Free. - [High] Data race on the `dev->slot` pointer assignment leads to dangling pointers and Use-After-Free. -- commit 5ad46c8f890cdf7764a704f6e7d219e5cc4897bc Author: Farhan Ali <alifm@linux.ibm.com> PCI: Allow per function PCI slots to fix slot reset on s390 This commit adds a flag to struct pci_slot to allow per-function PCI slots on s390 systems. This prevents the wrong function from being reset when grouping multifunction devices in a shared domain, and modifies slot number tracking to support special placeholder values. > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 8f7cfcc000901..d0c9f0166af5c 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -4865,8 +4865,9 @@ static int pci_reset_hotplug_slot(struct hotplug_slot *hotplug, bool probe) > > static int pci_dev_reset_slot_function(struct pci_dev *dev, bool probe) > { > - if (dev->multifunction || dev->subordinate || !dev->slot || > - dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET) > + if (dev->subordinate || !dev->slot || > + dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET || > + (dev->multifunction && !dev->slot->per_func_slot)) > return -ENOTTY; [Severity: High] This is a pre-existing issue, but does lockless access to dev->slot in pci_dev_reset_slot_function() risk a use-after-free? If a concurrent asynchronous slot release happens via sysfs hotplug removal, could pci_slot_release() set dev->slot to NULL and free the slot memory exactly between the !dev->slot check and its subsequent dereference? > diff --git a/drivers/pci/slot.c b/drivers/pci/slot.c > index 6d5cd37bfb1e7..894d6213ed306 100644 > --- a/drivers/pci/slot.c > +++ b/drivers/pci/slot.c > @@ -37,7 +37,7 @@ static const struct sysfs_ops pci_slot_sysfs_ops = { > > static ssize_t address_read_file(struct pci_slot *slot, char *buf) > { > - if (slot->number == 0xff) > + if (slot->number == (u16)PCI_SLOT_PLACEHOLDER) > return sysfs_emit(buf, "%04x:%02x\n", > pci_domain_nr(slot->bus), > slot->bus->number); [Severity: Medium] Does this change cause address_read_file() to output malformed PCI addresses in sysfs for per-function slots? Since slot->number holds the full 8-bit devfn for per-function slots, will a device with devfn 0x81 (Device 16, Function 1) be emitted into the third field as 0000:00:81? Userspace tools expecting a standard 5-bit device number might fail to parse this properly, violating the domain:bus:device UAPI format. > @@ -72,6 +72,23 @@ static ssize_t cur_speed_read_file(struct pci_slot *slot, char *buf) > return bus_speed_read(slot->bus->cur_bus_speed, buf); > } > > +static bool pci_dev_matches_slot(struct pci_dev *dev, struct pci_slot *slot) > +{ > + if (slot->per_func_slot) > + return dev->devfn == slot->number; > + > + return slot->number == PCI_SLOT_ALL_DEVICES || > + PCI_SLOT(dev->devfn) == slot->number; > +} > + > +static bool pci_slot_enabled_per_func(void) > +{ > + if (IS_ENABLED(CONFIG_S390)) > + return true; > + > + return false; > +} [Severity: High] Will unconditionally enabling per_func_slot for all slots on s390 break generic PCI hotplug drivers like pciehp? Generic drivers create slots using PCI_SLOT_ALL_DEVICES. Because devfn can never equal 0xfeff, will matches against PCI_SLOT_ALL_DEVICES always fail? Additionally, if a match does occur, could this allow a destructive PCIe Hot Reset of an entire multifunction device, since pci_dev_reset_slot_function() now bypasses the dev->multifunction safeguard when per_func_slot is true? > static void pci_slot_release(struct kobject *kobj) > { > struct pci_dev *dev; > @@ -82,8 +99,7 @@ static void pci_slot_release(struct kobject *kobj) > > down_read(&pci_bus_sem); > list_for_each_entry(dev, &slot->bus->devices, bus_list) > - if (slot->number == PCI_SLOT_ALL_DEVICES || > - PCI_SLOT(dev->devfn) == slot->number) > + if (pci_dev_matches_slot(dev, slot)) > dev->slot = NULL; > up_read(&pci_bus_sem); [Severity: High] This is a pre-existing issue, but does pci_slot_release() modify the dev->bus->slots list without holding pci_slot_mutex? If userspace holds an active reference to the slot via sysfs, this function executes asynchronously when the sysfs file is closed. Calling list_del(&slot->list) without the mutex might cause list corruption if other paths like pci_dev_assign_slot() are concurrently iterating the list. [Severity: High] This isn't a bug introduced by this patch, but does the lack of mutual exclusion between pci_dev_assign_slot() and pci_slot_release() cause a data race on dev->slot? pci_dev_assign_slot() sets dev->slot = slot under pci_slot_mutex, while pci_slot_release() sets dev->slot = NULL under a read lock on pci_bus_sem. Could pci_dev_assign_slot() discover a slot that is mid-release and assign it to dev->slot right before the memory is freed, creating a dangling pointer? -- Sashiko AI review · https://sashiko.dev/#/patchset/20260622171840.1618-1-alifm@linux.ibm.com?part=1 ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v20 2/4] PCI: Avoid saving config space state if inaccessible 2026-06-22 17:18 [PATCH v20 0/4] [PCI] Error recovery for vfio-pci devices on s390x Farhan Ali 2026-06-22 17:18 ` [PATCH v20 1/4] PCI: Allow per function PCI slots to fix slot reset on s390 Farhan Ali @ 2026-06-22 17:18 ` Farhan Ali 2026-06-22 17:29 ` sashiko-bot 2026-06-22 17:18 ` [PATCH v20 3/4] PCI: Fail FLR when config space is inaccessible Farhan Ali 2026-06-22 17:18 ` [PATCH v20 4/4] PCI/MSI: Enable memory decoding before restoring MSI-X messages Farhan Ali 3 siblings, 1 reply; 13+ messages in thread From: Farhan Ali @ 2026-06-22 17:18 UTC (permalink / raw) To: linux-s390, linux-kernel, linux-pci Cc: helgaas, alex, alifm, schnelle, mjrosato, Bjorn Helgaas The current reset process saves the device's config space state before reset and restores it afterward. However errors may occur unexpectedly and it may then be impossible to save config space because the device may be inaccessible (e.g. DPC). This results in saving invalid values that get written back to the device during state restoration. With a reset we want to recover/restore the device into a functional state. So avoid saving the state of the config space when the device config space is inaccessible. Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Farhan Ali <alifm@linux.ibm.com> --- drivers/pci/pci.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index f5f8291482b0..973d23e41c48 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -722,6 +722,27 @@ u16 pci_find_dvsec_capability(struct pci_dev *dev, u16 vendor, u16 dvsec) } EXPORT_SYMBOL_GPL(pci_find_dvsec_capability); +static bool pci_dev_config_accessible(struct pci_dev *dev, char *msg) +{ + u32 val; + + /* + * If device's config space is inaccessible it can return ~0 for + * any reads. Since VFs can also return ~0 for Device and Vendor ID + * check Command and Status registers. Note that this is racy + * because the device may become inaccessible partway through + * next access. + */ + pci_read_config_dword(dev, PCI_COMMAND, &val); + if (PCI_POSSIBLE_ERROR(val)) { + pci_warn(dev, "Device config space inaccessible; unable to %s\n", + msg); + return false; + } + + return true; +} + /** * pci_find_parent_resource - return resource region of parent bus of given * region @@ -5027,6 +5048,9 @@ static void pci_dev_save_and_disable(struct pci_dev *dev) */ pci_set_power_state(dev, PCI_D0); + if (!pci_dev_config_accessible(dev, "save state")) + return; + pci_save_state(dev); /* * Disable the device by clearing the Command register, except for -- 2.43.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v20 2/4] PCI: Avoid saving config space state if inaccessible 2026-06-22 17:18 ` [PATCH v20 2/4] PCI: Avoid saving config space state if inaccessible Farhan Ali @ 2026-06-22 17:29 ` sashiko-bot 0 siblings, 0 replies; 13+ messages in thread From: sashiko-bot @ 2026-06-22 17:29 UTC (permalink / raw) To: Farhan Ali Cc: Heiko Carstens, linux-s390, linux-pci, Vasily Gorbik, Alexander Gordeev Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider: - [Low] The `msg` parameter in `pci_dev_config_accessible` is declared as `char *` but is passed string literals. -- commit 9495eb9f8f7d4d9784f1bb256b565bf60e0d2bb7 Author: Farhan Ali <alifm@linux.ibm.com> PCI: Avoid saving config space state if inaccessible This commit modifies the device reset process to check if the device's configuration space is accessible before attempting to save its state. If the space is inaccessible, it skips saving to prevent invalid values from being written back during state restoration. > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index d0c9f0166af5c..e71b814153921 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c [ ... ] > @@ -722,6 +722,27 @@ u16 pci_find_dvsec_capability(struct pci_dev *dev, u16 vendor, u16 dvsec) > } > EXPORT_SYMBOL_GPL(pci_find_dvsec_capability); > > +static bool pci_dev_config_accessible(struct pci_dev *dev, char *msg) [Severity: Low] This isn't a bug, but could the msg parameter be declared as const char * instead? This function is passed string literals like "save state" when called from pci_dev_save_and_disable(), and using a const pointer would ensure const-correctness and help prevent potential compiler warnings. > +{ > + u32 val; > + > + /* > + * If device's config space is inaccessible it can return ~0 for > + * any reads. Since VFs can also return ~0 for Device and Vendor ID > + * check Command and Status registers. Note that this is racy > + * because the device may become inaccessible partway through > + * next access. > + */ > + pci_read_config_dword(dev, PCI_COMMAND, &val); > + if (PCI_POSSIBLE_ERROR(val)) { > + pci_warn(dev, "Device config space inaccessible; unable to %s\n", > + msg); > + return false; > + } > + > + return true; > +} -- Sashiko AI review · https://sashiko.dev/#/patchset/20260622171840.1618-1-alifm@linux.ibm.com?part=2 ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v20 3/4] PCI: Fail FLR when config space is inaccessible 2026-06-22 17:18 [PATCH v20 0/4] [PCI] Error recovery for vfio-pci devices on s390x Farhan Ali 2026-06-22 17:18 ` [PATCH v20 1/4] PCI: Allow per function PCI slots to fix slot reset on s390 Farhan Ali 2026-06-22 17:18 ` [PATCH v20 2/4] PCI: Avoid saving config space state if inaccessible Farhan Ali @ 2026-06-22 17:18 ` Farhan Ali 2026-06-22 17:27 ` sashiko-bot 2026-06-22 17:18 ` [PATCH v20 4/4] PCI/MSI: Enable memory decoding before restoring MSI-X messages Farhan Ali 3 siblings, 1 reply; 13+ messages in thread From: Farhan Ali @ 2026-06-22 17:18 UTC (permalink / raw) To: linux-s390, linux-kernel, linux-pci Cc: helgaas, alex, alifm, schnelle, mjrosato, Benjamin Block If a device is in an error state, then it's config space may not be accssible. Add additional check to validate if a device's config space is accessible before doing an FLR reset. Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Farhan Ali <alifm@linux.ibm.com> --- drivers/pci/pci.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 973d23e41c48..521a4bb189d7 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -4352,6 +4352,9 @@ int pcie_flr(struct pci_dev *dev) { int ret; + if (!pci_dev_config_accessible(dev, "FLR")) + return -ENOTTY; + if (!pci_wait_for_pending_transaction(dev)) pci_err(dev, "timed out waiting for pending transaction; performing function level reset anyway\n"); -- 2.43.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v20 3/4] PCI: Fail FLR when config space is inaccessible 2026-06-22 17:18 ` [PATCH v20 3/4] PCI: Fail FLR when config space is inaccessible Farhan Ali @ 2026-06-22 17:27 ` sashiko-bot 0 siblings, 0 replies; 13+ messages in thread From: sashiko-bot @ 2026-06-22 17:27 UTC (permalink / raw) To: Farhan Ali Cc: Alexander Gordeev, Heiko Carstens, Vasily Gorbik, linux-s390, linux-pci > If a device is in an error state, then it's config space may not be > accssible. Add additional check to validate if a device's config space is > accessible before doing an FLR reset. > > Reviewed-by: Benjamin Block <bblock@linux.ibm.com> > Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> > Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Sashiko has reviewed this patch and found no issues. It looks great! -- Sashiko AI review · https://sashiko.dev/#/patchset/20260622171840.1618-1-alifm@linux.ibm.com?part=3 ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v20 4/4] PCI/MSI: Enable memory decoding before restoring MSI-X messages 2026-06-22 17:18 [PATCH v20 0/4] [PCI] Error recovery for vfio-pci devices on s390x Farhan Ali ` (2 preceding siblings ...) 2026-06-22 17:18 ` [PATCH v20 3/4] PCI: Fail FLR when config space is inaccessible Farhan Ali @ 2026-06-22 17:18 ` Farhan Ali 2026-06-22 17:30 ` sashiko-bot ` (2 more replies) 3 siblings, 3 replies; 13+ messages in thread From: Farhan Ali @ 2026-06-22 17:18 UTC (permalink / raw) To: linux-s390, linux-kernel, linux-pci Cc: helgaas, alex, alifm, schnelle, mjrosato The current MSI-X restoration path assumes the Command register Memory bit is enabled when writing MSI-X messages. But its possible the last saved and restored state of device may not have the Memory bit enabled, even if a device driver later enables Memory bit and MSI-X. Attempting to access Memory space without Memory bit enabled can lead to Unsupported Request (UR) from the device. Fix this by enabling Memory bit and restore it afterwards. Signed-off-by: Farhan Ali <alifm@linux.ibm.com> --- drivers/pci/msi/msi.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c index 81d24a270a79..46a0d9f68a57 100644 --- a/drivers/pci/msi/msi.c +++ b/drivers/pci/msi/msi.c @@ -874,6 +874,7 @@ void __pci_restore_msix_state(struct pci_dev *dev) { struct msi_desc *entry; bool write_msg; + u16 cmd; if (!dev->msix_enabled) return; @@ -882,6 +883,8 @@ void __pci_restore_msix_state(struct pci_dev *dev) pci_intx_for_msi(dev, 0); pci_msix_clear_and_set_ctrl(dev, 0, PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL); + pci_read_config_word(dev, PCI_COMMAND, &cmd); + pci_write_config_word(dev, PCI_COMMAND, cmd | PCI_COMMAND_MEMORY); write_msg = arch_restore_msi_irqs(dev); @@ -893,6 +896,7 @@ void __pci_restore_msix_state(struct pci_dev *dev) } } + pci_write_config_word(dev, PCI_COMMAND, cmd); pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0); } -- 2.43.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v20 4/4] PCI/MSI: Enable memory decoding before restoring MSI-X messages 2026-06-22 17:18 ` [PATCH v20 4/4] PCI/MSI: Enable memory decoding before restoring MSI-X messages Farhan Ali @ 2026-06-22 17:30 ` sashiko-bot 2026-06-22 18:54 ` Niklas Schnelle 2026-06-22 20:29 ` Thomas Gleixner 2 siblings, 0 replies; 13+ messages in thread From: sashiko-bot @ 2026-06-22 17:30 UTC (permalink / raw) To: Farhan Ali Cc: Alexander Gordeev, Vasily Gorbik, linux-s390, linux-pci, Heiko Carstens > The current MSI-X restoration path assumes the Command register Memory bit > is enabled when writing MSI-X messages. But its possible the last saved and > restored state of device may not have the Memory bit enabled, even if a > device driver later enables Memory bit and MSI-X. Attempting to access > Memory space without Memory bit enabled can lead to Unsupported Request > (UR) from the device. Fix this by enabling Memory bit and restore > it afterwards. > > Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Sashiko has reviewed this patch and found no issues. It looks great! -- Sashiko AI review · https://sashiko.dev/#/patchset/20260622171840.1618-1-alifm@linux.ibm.com?part=4 ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v20 4/4] PCI/MSI: Enable memory decoding before restoring MSI-X messages 2026-06-22 17:18 ` [PATCH v20 4/4] PCI/MSI: Enable memory decoding before restoring MSI-X messages Farhan Ali 2026-06-22 17:30 ` sashiko-bot @ 2026-06-22 18:54 ` Niklas Schnelle 2026-06-22 20:22 ` Farhan Ali 2026-06-22 20:29 ` Thomas Gleixner 2 siblings, 1 reply; 13+ messages in thread From: Niklas Schnelle @ 2026-06-22 18:54 UTC (permalink / raw) To: Farhan Ali, linux-s390, linux-kernel, linux-pci; +Cc: helgaas, alex, mjrosato On Mon, 2026-06-22 at 10:18 -0700, Farhan Ali wrote: > The current MSI-X restoration path assumes the Command register Memory bit > is enabled when writing MSI-X messages. But its possible the last saved and SpellingL "it's" > restored state of device may not have the Memory bit enabled, even if a Missing "a" in "of a device" > device driver later enables Memory bit and MSI-X. Attempting to access > Memory space without Memory bit enabled can lead to Unsupported Request > (UR) from the device. Fix this by enabling Memory bit and restore > it afterwards. The official spec term is "Memory Space Enable" and I think "can lead to" is actually underselling it, as the PCIe spec says that accesses are "caused to be handled as Unsupported Requests". Also as I understand you did see this in practice. Maybe we even want a Fixes tag? > > Signed-off-by: Farhan Ali <alifm@linux.ibm.com> > --- > drivers/pci/msi/msi.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c > index 81d24a270a79..46a0d9f68a57 100644 > --- a/drivers/pci/msi/msi.c > +++ b/drivers/pci/msi/msi.c > @@ -874,6 +874,7 @@ void __pci_restore_msix_state(struct pci_dev *dev) > { > struct msi_desc *entry; > bool write_msg; > + u16 cmd; > > if (!dev->msix_enabled) > return; > @@ -882,6 +883,8 @@ void __pci_restore_msix_state(struct pci_dev *dev) > pci_intx_for_msi(dev, 0); > pci_msix_clear_and_set_ctrl(dev, 0, > PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL); > + pci_read_config_word(dev, PCI_COMMAND, &cmd); > + pci_write_config_word(dev, PCI_COMMAND, cmd | PCI_COMMAND_MEMORY); > > write_msg = arch_restore_msi_irqs(dev); > > @@ -893,6 +896,7 @@ void __pci_restore_msix_state(struct pci_dev *dev) > } > } > > + pci_write_config_word(dev, PCI_COMMAND, cmd); > pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0); > } > Code wise this all looks good to me so feel free to add. Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Thanks, Niklas ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v20 4/4] PCI/MSI: Enable memory decoding before restoring MSI-X messages 2026-06-22 18:54 ` Niklas Schnelle @ 2026-06-22 20:22 ` Farhan Ali 0 siblings, 0 replies; 13+ messages in thread From: Farhan Ali @ 2026-06-22 20:22 UTC (permalink / raw) To: Niklas Schnelle, linux-s390, linux-kernel, linux-pci Cc: helgaas, alex, mjrosato On 6/22/2026 11:54 AM, Niklas Schnelle wrote: > On Mon, 2026-06-22 at 10:18 -0700, Farhan Ali wrote: >> The current MSI-X restoration path assumes the Command register Memory bit >> is enabled when writing MSI-X messages. But its possible the last saved and > SpellingL "it's" > >> restored state of device may not have the Memory bit enabled, even if a > Missing "a" in "of a device" Ah I missed correcting these. > >> device driver later enables Memory bit and MSI-X. Attempting to access >> Memory space without Memory bit enabled can lead to Unsupported Request >> (UR) from the device. Fix this by enabling Memory bit and restore >> it afterwards. > The official spec term is "Memory Space Enable" and I think "can lead > to" is actually underselling it, as the PCIe spec says that accesses > are "caused to be handled as Unsupported Requests". Also as I > understand you did see this in practice. Maybe we even want a Fixes > tag? This behavior has been present since 41017f0cac92 ("[PATCH] PCI: MSI(X) save/restore for suspend/resume"), so I am not sure a fixes tag makes sense? >> Signed-off-by: Farhan Ali <alifm@linux.ibm.com> >> --- >> drivers/pci/msi/msi.c | 4 ++++ >> 1 file changed, 4 insertions(+) >> >> diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c >> index 81d24a270a79..46a0d9f68a57 100644 >> --- a/drivers/pci/msi/msi.c >> +++ b/drivers/pci/msi/msi.c >> @@ -874,6 +874,7 @@ void __pci_restore_msix_state(struct pci_dev *dev) >> { >> struct msi_desc *entry; >> bool write_msg; >> + u16 cmd; >> >> if (!dev->msix_enabled) >> return; >> @@ -882,6 +883,8 @@ void __pci_restore_msix_state(struct pci_dev *dev) >> pci_intx_for_msi(dev, 0); >> pci_msix_clear_and_set_ctrl(dev, 0, >> PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL); >> + pci_read_config_word(dev, PCI_COMMAND, &cmd); >> + pci_write_config_word(dev, PCI_COMMAND, cmd | PCI_COMMAND_MEMORY); >> >> write_msg = arch_restore_msi_irqs(dev); >> >> @@ -893,6 +896,7 @@ void __pci_restore_msix_state(struct pci_dev *dev) >> } >> } >> >> + pci_write_config_word(dev, PCI_COMMAND, cmd); >> pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0); >> } >> > Code wise this all looks good to me so feel free to add. > > Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> > > Thanks, > Niklas Thanks for reviewing! Thanks Farhan ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v20 4/4] PCI/MSI: Enable memory decoding before restoring MSI-X messages 2026-06-22 17:18 ` [PATCH v20 4/4] PCI/MSI: Enable memory decoding before restoring MSI-X messages Farhan Ali 2026-06-22 17:30 ` sashiko-bot 2026-06-22 18:54 ` Niklas Schnelle @ 2026-06-22 20:29 ` Thomas Gleixner 2026-06-22 20:49 ` Farhan Ali 2 siblings, 1 reply; 13+ messages in thread From: Thomas Gleixner @ 2026-06-22 20:29 UTC (permalink / raw) To: Farhan Ali, linux-s390, linux-kernel, linux-pci Cc: helgaas, alex, alifm, schnelle, mjrosato On Mon, Jun 22 2026 at 10:18, Farhan Ali wrote: > The current MSI-X restoration path assumes the Command register Memory bit > is enabled when writing MSI-X messages. But its possible the last saved and > restored state of device may not have the Memory bit enabled, even if a > device driver later enables Memory bit and MSI-X. Attempting to access > Memory space without Memory bit enabled can lead to Unsupported Request > (UR) from the device. Fix this by enabling Memory bit and restore > it afterwards. > > Signed-off-by: Farhan Ali <alifm@linux.ibm.com> > --- > drivers/pci/msi/msi.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c > index 81d24a270a79..46a0d9f68a57 100644 > --- a/drivers/pci/msi/msi.c > +++ b/drivers/pci/msi/msi.c > @@ -874,6 +874,7 @@ void __pci_restore_msix_state(struct pci_dev *dev) > { > struct msi_desc *entry; > bool write_msg; > + u16 cmd; > > if (!dev->msix_enabled) > return; > @@ -882,6 +883,8 @@ void __pci_restore_msix_state(struct pci_dev *dev) > pci_intx_for_msi(dev, 0); > pci_msix_clear_and_set_ctrl(dev, 0, > PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL); > + pci_read_config_word(dev, PCI_COMMAND, &cmd); > + pci_write_config_word(dev, PCI_COMMAND, cmd | PCI_COMMAND_MEMORY); Can we please have a comment there which explains this? Three month down the road this will results in head scratching otherwise. I agree with Niklas that this wants a Fixes and a Cc:stable tag. Other than that: Reviewed-by: Thomas Gleixner <tglx@kernel.org> Thanks, tglx ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v20 4/4] PCI/MSI: Enable memory decoding before restoring MSI-X messages 2026-06-22 20:29 ` Thomas Gleixner @ 2026-06-22 20:49 ` Farhan Ali 0 siblings, 0 replies; 13+ messages in thread From: Farhan Ali @ 2026-06-22 20:49 UTC (permalink / raw) To: Thomas Gleixner, linux-s390, linux-kernel, linux-pci Cc: helgaas, alex, schnelle, mjrosato On 6/22/2026 1:29 PM, Thomas Gleixner wrote: > On Mon, Jun 22 2026 at 10:18, Farhan Ali wrote: >> The current MSI-X restoration path assumes the Command register Memory bit >> is enabled when writing MSI-X messages. But its possible the last saved and >> restored state of device may not have the Memory bit enabled, even if a >> device driver later enables Memory bit and MSI-X. Attempting to access >> Memory space without Memory bit enabled can lead to Unsupported Request >> (UR) from the device. Fix this by enabling Memory bit and restore >> it afterwards. >> >> Signed-off-by: Farhan Ali <alifm@linux.ibm.com> >> --- >> drivers/pci/msi/msi.c | 4 ++++ >> 1 file changed, 4 insertions(+) >> >> diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c >> index 81d24a270a79..46a0d9f68a57 100644 >> --- a/drivers/pci/msi/msi.c >> +++ b/drivers/pci/msi/msi.c >> @@ -874,6 +874,7 @@ void __pci_restore_msix_state(struct pci_dev *dev) >> { >> struct msi_desc *entry; >> bool write_msg; >> + u16 cmd; >> >> if (!dev->msix_enabled) >> return; >> @@ -882,6 +883,8 @@ void __pci_restore_msix_state(struct pci_dev *dev) >> pci_intx_for_msi(dev, 0); >> pci_msix_clear_and_set_ctrl(dev, 0, >> PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL); >> + pci_read_config_word(dev, PCI_COMMAND, &cmd); >> + pci_write_config_word(dev, PCI_COMMAND, cmd | PCI_COMMAND_MEMORY); > Can we please have a comment there which explains this? Three month down > the road this will results in head scratching otherwise. > > I agree with Niklas that this wants a Fixes and a Cc:stable tag. > > Other than that: > > Reviewed-by: Thomas Gleixner <tglx@kernel.org> > > Thanks, > > tglx > I can add a comment, how about something like below? "The restored device state may not have Memory decoding enabled in Command register. Since the MSI-X was enabled for the device, enable Memory decoding before restoring MSI-X" For the Fixes tag, do you have a suggestion for a commit? This behavior has been present since 41017f0cac92 ("[PATCH] PCI: MSI(X) save/restore for suspend/resume" which introduced these restore functions. So should be Fixes against 41017f0cac92? Thanks Farhan ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2026-06-22 20:49 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-06-22 17:18 [PATCH v20 0/4] [PCI] Error recovery for vfio-pci devices on s390x Farhan Ali 2026-06-22 17:18 ` [PATCH v20 1/4] PCI: Allow per function PCI slots to fix slot reset on s390 Farhan Ali 2026-06-22 17:32 ` sashiko-bot 2026-06-22 17:18 ` [PATCH v20 2/4] PCI: Avoid saving config space state if inaccessible Farhan Ali 2026-06-22 17:29 ` sashiko-bot 2026-06-22 17:18 ` [PATCH v20 3/4] PCI: Fail FLR when config space is inaccessible Farhan Ali 2026-06-22 17:27 ` sashiko-bot 2026-06-22 17:18 ` [PATCH v20 4/4] PCI/MSI: Enable memory decoding before restoring MSI-X messages Farhan Ali 2026-06-22 17:30 ` sashiko-bot 2026-06-22 18:54 ` Niklas Schnelle 2026-06-22 20:22 ` Farhan Ali 2026-06-22 20:29 ` Thomas Gleixner 2026-06-22 20:49 ` Farhan Ali
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox