linux-s390.vger.kernel.org archive mirror
* [PATCH v2 0/9] Error recovery for vfio-pci devices on s390x
@ 2025-08-25 17:12 Farhan Ali
  2025-08-25 17:12 ` [PATCH v2 1/9] PCI: Avoid restoring error values in config space Farhan Ali
                   ` (8 more replies)
  0 siblings, 9 replies; 17+ messages in thread
From: Farhan Ali @ 2025-08-25 17:12 UTC (permalink / raw)
  To: linux-s390, kvm, linux-kernel
  Cc: alex.williamson, helgaas, alifm, schnelle, mjrosato

Hi,

This Linux kernel patch series introduces support for error recovery for
passthrough PCI devices on System Z (s390x). 

Background
----------
For PCI devices on s390x, an operating system receives platform-specific
error events from firmware rather than through AER. Today, for
passthrough/userspace devices, we don't attempt any error recovery and
ignore any error events for the devices. The passthrough/userspace devices
are managed by the vfio-pci driver. The driver does register error handling
callbacks (error_detected) and, on an error, triggers an eventfd to
userspace. But we still need a mechanism to notify userspace
(QEMU/guest/userspace drivers) about the error event itself.

Proposal
--------
We can expose this error information (currently only the PCI Error Code)
via a device feature. Userspace can then obtain the error information
via the VFIO_DEVICE_FEATURE ioctl and take appropriate actions, such as
driving a device reset.

I would appreciate some feedback on this series.

Thanks
Farhan

ChangeLog
---------
v1 series https://lore.kernel.org/all/20250813170821.1115-1-alifm@linux.ibm.com/
v1 -> v2
   - Patches 1 and 2 add some additional checks for FLR/PM reset to
     try other function reset methods (suggested by Alex).

   - Patch 3 fixes a bug in s390 for resetting PCI devices with multiple
     functions.

   - Patch 7 adds a new device feature for zPCI devices for the
     VFIO_DEVICE_FEATURE ioctl. The ioctl is used by userspace to retrieve
     any PCI error information for the device (suggested by Alex).

   - Patch 8 adds a reset_done() callback for the vfio-pci driver, to
     restore the state of the device after a reset.

   - Patch 9 removes the pcie check for triggering VFIO_PCI_ERR_IRQ_INDEX.

Farhan Ali (9):
  PCI: Avoid restoring error values in config space
  PCI: Add additional checks for flr and pm reset
  PCI: Allow per function PCI slots for hypervisor isolated functions
  s390/pci: Restore airq unconditionally for the zPCI device
  s390/pci: Update the logic for detecting passthrough device
  s390/pci: Store PCI error information for passthrough devices
  vfio-pci/zdev: Add a device feature for error information
  vfio: Add a reset_done callback for vfio-pci driver
  vfio: Remove the pcie check for VFIO_PCI_ERR_IRQ_INDEX

 arch/s390/include/asm/pci.h       |  30 ++++++++-
 arch/s390/pci/pci.c               |   1 +
 arch/s390/pci/pci_event.c         | 107 +++++++++++++++++-------------
 arch/s390/pci/pci_irq.c           |   9 +--
 drivers/pci/pci.c                 |  10 +++
 drivers/pci/slot.c                |  19 +++++-
 drivers/vfio/pci/vfio_pci_core.c  |  20 ++++--
 drivers/vfio/pci/vfio_pci_intrs.c |   3 +-
 drivers/vfio/pci/vfio_pci_priv.h  |   8 +++
 drivers/vfio/pci/vfio_pci_zdev.c  |  45 ++++++++++++-
 include/uapi/linux/vfio.h         |  14 ++++
 11 files changed, 200 insertions(+), 66 deletions(-)

-- 
2.43.0



* [PATCH v2 1/9] PCI: Avoid restoring error values in config space
  2025-08-25 17:12 [PATCH v2 0/9] Error recovery for vfio-pci devices on s390x Farhan Ali
@ 2025-08-25 17:12 ` Farhan Ali
  2025-08-25 21:35   ` Alex Williamson
  2025-08-25 17:12 ` [PATCH v2 2/9] PCI: Add additional checks for flr and pm reset Farhan Ali
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Farhan Ali @ 2025-08-25 17:12 UTC (permalink / raw)
  To: linux-s390, kvm, linux-kernel
  Cc: alex.williamson, helgaas, alifm, schnelle, mjrosato

The current reset process saves the device's config space state before
reset and restores it afterward. However, when a device is in an error
state before reset, config space reads may return error values instead of
valid data. This results in saving corrupted values that get written back
to the device during state restoration. Add validation to prevent writing
error values to the device when restoring the config space state after
reset.

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
---
 drivers/pci/pci.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b0f4d98036cd..0dd95d782022 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1825,6 +1825,9 @@ static void pci_restore_config_dword(struct pci_dev *pdev, int offset,
 	if (!force && val == saved_val)
 		return;
 
+	if (PCI_POSSIBLE_ERROR(saved_val))
+		return;
+
 	for (;;) {
 		pci_dbg(pdev, "restore config %#04x: %#010x -> %#010x\n",
 			offset, val, saved_val);
-- 
2.43.0
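
The check added in patch 1/9 can be illustrated with a small userspace
model. This is only a sketch: struct fake_cfg_reg and both helper functions
are hypothetical stand-ins, and the real pci_restore_config_dword() reads
and writes config space in a retry loop rather than touching a plain
variable. The all-ones comparison mirrors what PCI_POSSIBLE_ERROR() does
for a 32-bit value.

```c
#include <stdbool.h>
#include <stdint.h>

/* All-ones is what a failed 32-bit config read typically returns. */
#define ERROR_RESPONSE_U32 UINT32_MAX

/* Hypothetical stand-in for one dword of a device's config space. */
struct fake_cfg_reg {
	uint32_t value;
	unsigned int writes;
};

/* Model of PCI_POSSIBLE_ERROR() for a u32. */
static bool possible_error_u32(uint32_t val)
{
	return val == ERROR_RESPONSE_U32;
}

/* Returns true if the saved value was written back to the register. */
static bool restore_config_dword(struct fake_cfg_reg *reg, uint32_t saved_val,
				 bool force)
{
	/* Nothing to do if the register already holds the saved value. */
	if (!force && reg->value == saved_val)
		return false;

	/* The check from this patch: never write back an error value. */
	if (possible_error_u32(saved_val))
		return false;

	reg->value = saved_val;
	reg->writes++;
	return true;
}
```

In this model a saved value of 0xffffffff is skipped even when force is
set, so a register that was unreadable at save time keeps whatever the
reset left in it.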



* [PATCH v2 2/9] PCI: Add additional checks for flr and pm reset
  2025-08-25 17:12 [PATCH v2 0/9] Error recovery for vfio-pci devices on s390x Farhan Ali
  2025-08-25 17:12 ` [PATCH v2 1/9] PCI: Avoid restoring error values in config space Farhan Ali
@ 2025-08-25 17:12 ` Farhan Ali
  2025-08-25 21:54   ` Alex Williamson
  2025-08-25 17:12 ` [PATCH v2 3/9] PCI: Allow per function PCI slots for hypervisor isolated functions Farhan Ali
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Farhan Ali @ 2025-08-25 17:12 UTC (permalink / raw)
  To: linux-s390, kvm, linux-kernel
  Cc: alex.williamson, helgaas, alifm, schnelle, mjrosato

If a device is in an error state, then any reads of device registers can
return an error value. Add additional checks to validate whether a device
is in an error state before doing an FLR or PM reset.

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
---
 drivers/pci/pci.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 0dd95d782022..a07bdb287cf3 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4560,12 +4560,17 @@ EXPORT_SYMBOL_GPL(pcie_flr);
  */
 int pcie_reset_flr(struct pci_dev *dev, bool probe)
 {
+	u32 reg;
+
 	if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
 		return -ENOTTY;
 
 	if (!(dev->devcap & PCI_EXP_DEVCAP_FLR))
 		return -ENOTTY;
 
+	if (pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &reg))
+		return -ENOTTY;
+
 	if (probe)
 		return 0;
 
@@ -4640,6 +4645,8 @@ static int pci_pm_reset(struct pci_dev *dev, bool probe)
 		return -ENOTTY;
 
 	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &csr);
+	if (PCI_POSSIBLE_ERROR(csr))
+		return -ENOTTY;
 	if (csr & PCI_PM_CTRL_NO_SOFT_RESET)
 		return -ENOTTY;
 
-- 
2.43.0



* [PATCH v2 3/9] PCI: Allow per function PCI slots for hypervisor isolated functions
  2025-08-25 17:12 [PATCH v2 0/9] Error recovery for vfio-pci devices on s390x Farhan Ali
  2025-08-25 17:12 ` [PATCH v2 1/9] PCI: Avoid restoring error values in config space Farhan Ali
  2025-08-25 17:12 ` [PATCH v2 2/9] PCI: Add additional checks for flr and pm reset Farhan Ali
@ 2025-08-25 17:12 ` Farhan Ali
  2025-08-27  7:50   ` Niklas Schnelle
  2025-08-25 17:12 ` [PATCH v2 4/9] s390/pci: Restore airq unconditionally for the zPCI device Farhan Ali
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Farhan Ali @ 2025-08-25 17:12 UTC (permalink / raw)
  To: linux-s390, kvm, linux-kernel
  Cc: alex.williamson, helgaas, alifm, schnelle, mjrosato

On s390 systems, which use a machine level hypervisor, PCI devices are
always accessed through a form of PCI pass-through which fundamentally
operates on a per PCI function granularity. This is also reflected in the
s390 PCI hotplug driver which creates hotplug slots for individual PCI
functions. Its reset_slot() function, which is a wrapper for
zpci_hot_reset_device(), thus also resets individual functions.

Currently, the kernel's PCI_SLOT() macro assigns the same pci_slot object
to multifunction devices. This approach worked fine on s390 systems that
only exposed virtual functions as individual PCI domains to the operating
system.  Since commit 44510d6fa0c0 ("s390/pci: Handling multifunctions")
s390 supports exposing the topology of multifunction PCI devices by
grouping them in a shared PCI domain. When attempting to reset a function
through the hotplug driver, the shared slot assignment causes the wrong
function to be reset instead of the intended one. It also leaks memory as
we do create a pci_slot object for the function, but don't correctly free
it in pci_slot_release().

This patch adds a helper function to allow per function PCI slots for
functions managed through a hypervisor which exposes individual PCI
functions while retaining the topology.

Fixes: 44510d6fa0c0 ("s390/pci: Handling multifunctions")
Co-developed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
---
 drivers/pci/slot.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/slot.c b/drivers/pci/slot.c
index 50fb3eb595fe..991526af0ffe 100644
--- a/drivers/pci/slot.c
+++ b/drivers/pci/slot.c
@@ -5,6 +5,7 @@
  *	Alex Chiang <achiang@hp.com>
  */
 
+#include <linux/hypervisor.h>
 #include <linux/kobject.h>
 #include <linux/slab.h>
 #include <linux/pci.h>
@@ -73,7 +74,7 @@ static void pci_slot_release(struct kobject *kobj)
 
 	down_read(&pci_bus_sem);
 	list_for_each_entry(dev, &slot->bus->devices, bus_list)
-		if (PCI_SLOT(dev->devfn) == slot->number)
+		if (dev->slot == slot)
 			dev->slot = NULL;
 	up_read(&pci_bus_sem);
 
@@ -160,13 +161,25 @@ static int rename_slot(struct pci_slot *slot, const char *name)
 	return result;
 }
 
+static bool pci_dev_matches_slot(struct pci_dev *dev, struct pci_slot *slot)
+{
+	if (hypervisor_isolated_pci_functions()) {
+		if (dev->devfn == slot->number)
+			return true;
+	} else {
+		if (PCI_SLOT(dev->devfn) == slot->number)
+			return true;
+	}
+	return false;
+}
+
 void pci_dev_assign_slot(struct pci_dev *dev)
 {
 	struct pci_slot *slot;
 
 	mutex_lock(&pci_slot_mutex);
 	list_for_each_entry(slot, &dev->bus->slots, list)
-		if (PCI_SLOT(dev->devfn) == slot->number)
+		if (pci_dev_matches_slot(dev, slot))
 			dev->slot = slot;
 	mutex_unlock(&pci_slot_mutex);
 }
@@ -285,7 +298,7 @@ struct pci_slot *pci_create_slot(struct pci_bus *parent, int slot_nr,
 
 	down_read(&pci_bus_sem);
 	list_for_each_entry(dev, &parent->devices, bus_list)
-		if (PCI_SLOT(dev->devfn) == slot_nr)
+		if (pci_dev_matches_slot(dev, slot))
 			dev->slot = slot;
 	up_read(&pci_bus_sem);
 
-- 
2.43.0
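
The matching rule that patch 3/9 introduces in pci_dev_matches_slot() can
be modeled in userspace as below. This is a sketch under assumptions: the
DEVFN()/SLOT_OF() macros use the same bit layout as the kernel's
PCI_DEVFN()/PCI_SLOT(), while dev_matches_slot() and its boolean parameter
are hypothetical and stand in for the hypervisor_isolated_pci_functions()
check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same encoding as PCI_DEVFN() and PCI_SLOT(): 5 bits slot, 3 bits function. */
#define DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define SLOT_OF(devfn)		(((devfn) >> 3) & 0x1f)

/*
 * With isolated functions, a slot is keyed by the full devfn, so each
 * function gets its own slot. Otherwise all functions of a device share
 * the slot identified by the device number.
 */
static bool dev_matches_slot(uint8_t devfn, uint8_t slot_number,
			     bool isolated_functions)
{
	if (isolated_functions)
		return devfn == slot_number;
	return SLOT_OF(devfn) == slot_number;
}
```

For functions 00.0 and 00.1, the classic rule maps both to slot 0, whereas
the per-function rule maps them to slots 0 and 1 respectively, which is
what lets a slot reset target a single function.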



* [PATCH v2 4/9] s390/pci: Restore airq unconditionally for the zPCI device
  2025-08-25 17:12 [PATCH v2 0/9] Error recovery for vfio-pci devices on s390x Farhan Ali
                   ` (2 preceding siblings ...)
  2025-08-25 17:12 ` [PATCH v2 3/9] PCI: Allow per function PCI slots for hypervisor isolated functions Farhan Ali
@ 2025-08-25 17:12 ` Farhan Ali
  2025-08-27 13:27   ` Niklas Schnelle
  2025-08-25 17:12 ` [PATCH v2 5/9] s390/pci: Update the logic for detecting passthrough device Farhan Ali
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Farhan Ali @ 2025-08-25 17:12 UTC (permalink / raw)
  To: linux-s390, kvm, linux-kernel
  Cc: alex.williamson, helgaas, alifm, schnelle, mjrosato

Commit c1e18c17bda6 ("s390/pci: add zpci_set_irq()/zpci_clear_irq()")
introduced zpci_set_irq() and zpci_clear_irq(), to be used while
resetting a zPCI device.

Commit da995d538d3a ("s390/pci: implement reset_slot for hotplug slot")
mentions zpci_clear_irq() being called in the path for
zpci_hot_reset_device(). But that is no longer the case, and these
functions are not called outside of this file.

However, after a CLP disable/enable reset (zpci_hot_reset_device()), the
airq setup of the device needs to be restored. Since we no longer call
zpci_clear_airq() in the reset path, we should restore the airq for the
device unconditionally.

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
---
 arch/s390/include/asm/pci.h | 1 -
 arch/s390/pci/pci_irq.c     | 9 +--------
 2 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index 41f900f693d9..aed19a1aa9d7 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -145,7 +145,6 @@ struct zpci_dev {
 	u8		has_resources	: 1;
 	u8		is_physfn	: 1;
 	u8		util_str_avail	: 1;
-	u8		irqs_registered	: 1;
 	u8		tid_avail	: 1;
 	u8		rtr_avail	: 1; /* Relaxed translation allowed */
 	unsigned int	devfn;		/* DEVFN part of the RID*/
diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
index 84482a921332..e73be96ce5fe 100644
--- a/arch/s390/pci/pci_irq.c
+++ b/arch/s390/pci/pci_irq.c
@@ -107,9 +107,6 @@ static int zpci_set_irq(struct zpci_dev *zdev)
 	else
 		rc = zpci_set_airq(zdev);
 
-	if (!rc)
-		zdev->irqs_registered = 1;
-
 	return rc;
 }
 
@@ -123,9 +120,6 @@ static int zpci_clear_irq(struct zpci_dev *zdev)
 	else
 		rc = zpci_clear_airq(zdev);
 
-	if (!rc)
-		zdev->irqs_registered = 0;
-
 	return rc;
 }
 
@@ -427,8 +421,7 @@ bool arch_restore_msi_irqs(struct pci_dev *pdev)
 {
 	struct zpci_dev *zdev = to_zpci(pdev);
 
-	if (!zdev->irqs_registered)
-		zpci_set_irq(zdev);
+	zpci_set_irq(zdev);
 	return true;
 }
 
-- 
2.43.0



* [PATCH v2 5/9] s390/pci: Update the logic for detecting passthrough device
  2025-08-25 17:12 [PATCH v2 0/9] Error recovery for vfio-pci devices on s390x Farhan Ali
                   ` (3 preceding siblings ...)
  2025-08-25 17:12 ` [PATCH v2 4/9] s390/pci: Restore airq unconditionally for the zPCI device Farhan Ali
@ 2025-08-25 17:12 ` Farhan Ali
  2025-08-25 17:12 ` [PATCH v2 6/9] s390/pci: Store PCI error information for passthrough devices Farhan Ali
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Farhan Ali @ 2025-08-25 17:12 UTC (permalink / raw)
  To: linux-s390, kvm, linux-kernel
  Cc: alex.williamson, helgaas, alifm, schnelle, mjrosato

We can now have userspace drivers (vfio-pci based) on s390x. The userspace
drivers will not have any KVM fd and so no kzdev associated with them. So
we need to update the logic for detecting passthrough devices to not depend
on struct kvm_zdev.

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
---
 arch/s390/include/asm/pci.h      |  1 +
 arch/s390/pci/pci_event.c        | 14 ++++----------
 drivers/vfio/pci/vfio_pci_zdev.c |  9 ++++++++-
 3 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index aed19a1aa9d7..f47f62fc3bfd 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -169,6 +169,7 @@ struct zpci_dev {
 
 	char res_name[16];
 	bool mio_capable;
+	bool mediated_recovery;
 	struct zpci_bar_struct bars[PCI_STD_NUM_BARS];
 
 	u64		start_dma;	/* Start of available DMA addresses */
diff --git a/arch/s390/pci/pci_event.c b/arch/s390/pci/pci_event.c
index d930416d4c90..541d536be052 100644
--- a/arch/s390/pci/pci_event.c
+++ b/arch/s390/pci/pci_event.c
@@ -61,16 +61,10 @@ static inline bool ers_result_indicates_abort(pci_ers_result_t ers_res)
 	}
 }
 
-static bool is_passed_through(struct pci_dev *pdev)
+static bool needs_mediated_recovery(struct pci_dev *pdev)
 {
 	struct zpci_dev *zdev = to_zpci(pdev);
-	bool ret;
-
-	mutex_lock(&zdev->kzdev_lock);
-	ret = !!zdev->kzdev;
-	mutex_unlock(&zdev->kzdev_lock);
-
-	return ret;
+	return zdev->mediated_recovery;
 }
 
 static bool is_driver_supported(struct pci_driver *driver)
@@ -194,7 +188,7 @@ static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev)
 	}
 	pdev->error_state = pci_channel_io_frozen;
 
-	if (is_passed_through(pdev)) {
+	if (needs_mediated_recovery(pdev)) {
 		pr_info("%s: Cannot be recovered in the host because it is a pass-through device\n",
 			pci_name(pdev));
 		status_str = "failed (pass-through)";
@@ -277,7 +271,7 @@ static void zpci_event_io_failure(struct pci_dev *pdev, pci_channel_state_t es)
 	 * we will inject the error event and let the guest recover the device
 	 * itself.
 	 */
-	if (is_passed_through(pdev))
+	if (needs_mediated_recovery(pdev))
 		goto out;
 	driver = to_pci_driver(pdev->dev.driver);
 	if (driver && driver->err_handler && driver->err_handler->error_detected)
diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
index 0990fdb146b7..a7bc23ce8483 100644
--- a/drivers/vfio/pci/vfio_pci_zdev.c
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -148,6 +148,8 @@ int vfio_pci_zdev_open_device(struct vfio_pci_core_device *vdev)
 	if (!zdev)
 		return -ENODEV;
 
+	zdev->mediated_recovery = true;
+
 	if (!vdev->vdev.kvm)
 		return 0;
 
@@ -161,7 +163,12 @@ void vfio_pci_zdev_close_device(struct vfio_pci_core_device *vdev)
 {
 	struct zpci_dev *zdev = to_zpci(vdev->pdev);
 
-	if (!zdev || !vdev->vdev.kvm)
+	if (!zdev)
+		return;
+
+	zdev->mediated_recovery = false;
+
+	if (!vdev->vdev.kvm)
 		return;
 
 	if (zpci_kvm_hook.kvm_unregister)
-- 
2.43.0



* [PATCH v2 6/9] s390/pci: Store PCI error information for passthrough devices
  2025-08-25 17:12 [PATCH v2 0/9] Error recovery for vfio-pci devices on s390x Farhan Ali
                   ` (4 preceding siblings ...)
  2025-08-25 17:12 ` [PATCH v2 5/9] s390/pci: Update the logic for detecting passthrough device Farhan Ali
@ 2025-08-25 17:12 ` Farhan Ali
  2025-08-25 17:12 ` [PATCH v2 7/9] vfio-pci/zdev: Add a device feature for error information Farhan Ali
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Farhan Ali @ 2025-08-25 17:12 UTC (permalink / raw)
  To: linux-s390, kvm, linux-kernel
  Cc: alex.williamson, helgaas, alifm, schnelle, mjrosato

For a passthrough device we need co-operation from user space to recover
the device. This requires bubbling up any error information to user
space. Let's store this error information for passthrough devices, so it
can be retrieved later.

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
---
 arch/s390/include/asm/pci.h      | 28 ++++++++++
 arch/s390/pci/pci.c              |  1 +
 arch/s390/pci/pci_event.c        | 95 +++++++++++++++++++-------------
 drivers/vfio/pci/vfio_pci_zdev.c |  2 +
 4 files changed, 88 insertions(+), 38 deletions(-)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index f47f62fc3bfd..72e05af90e08 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -116,6 +116,31 @@ struct zpci_bus {
 	enum pci_bus_speed	max_bus_speed;
 };
 
+/* Content Code Description for PCI Function Error */
+struct zpci_ccdf_err {
+	u32 reserved1;
+	u32 fh;                         /* function handle */
+	u32 fid;                        /* function id */
+	u32 ett         :  4;           /* expected table type */
+	u32 mvn         : 12;           /* MSI vector number */
+	u32 dmaas       :  8;           /* DMA address space */
+	u32 reserved2   :  6;
+	u32 q           :  1;           /* event qualifier */
+	u32 rw          :  1;           /* read/write */
+	u64 faddr;                      /* failing address */
+	u32 reserved3;
+	u16 reserved4;
+	u16 pec;                        /* PCI event code */
+} __packed;
+
+#define ZPCI_ERR_PENDING_MAX 16
+struct zpci_ccdf_pending {
+	size_t count;
+	int head;
+	int tail;
+	struct zpci_ccdf_err err[ZPCI_ERR_PENDING_MAX];
+};
+
 /* Private data per function */
 struct zpci_dev {
 	struct zpci_bus *zbus;
@@ -191,6 +216,8 @@ struct zpci_dev {
 	struct iommu_domain *s390_domain; /* attached IOMMU domain */
 	struct kvm_zdev *kzdev;
 	struct mutex kzdev_lock;
+	struct zpci_ccdf_pending pending_errs;
+	struct mutex pending_errs_lock;
 	spinlock_t dom_lock;		/* protect s390_domain change */
 };
 
@@ -316,6 +343,7 @@ void zpci_debug_exit_device(struct zpci_dev *);
 int zpci_report_error(struct pci_dev *, struct zpci_report_error_header *);
 int zpci_clear_error_state(struct zpci_dev *zdev);
 int zpci_reset_load_store_blocked(struct zpci_dev *zdev);
+void zpci_cleanup_pending_errors(struct zpci_dev *zdev);
 
 #ifdef CONFIG_NUMA
 
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index cd6676c2d602..f795e05b5001 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -823,6 +823,7 @@ struct zpci_dev *zpci_create_device(u32 fid, u32 fh, enum zpci_state state)
 	mutex_init(&zdev->state_lock);
 	mutex_init(&zdev->fmb_lock);
 	mutex_init(&zdev->kzdev_lock);
+	mutex_init(&zdev->pending_errs_lock);
 
 	return zdev;
 
diff --git a/arch/s390/pci/pci_event.c b/arch/s390/pci/pci_event.c
index 541d536be052..ac527410812b 100644
--- a/arch/s390/pci/pci_event.c
+++ b/arch/s390/pci/pci_event.c
@@ -18,23 +18,6 @@
 #include "pci_bus.h"
 #include "pci_report.h"
 
-/* Content Code Description for PCI Function Error */
-struct zpci_ccdf_err {
-	u32 reserved1;
-	u32 fh;				/* function handle */
-	u32 fid;			/* function id */
-	u32 ett		:  4;		/* expected table type */
-	u32 mvn		: 12;		/* MSI vector number */
-	u32 dmaas	:  8;		/* DMA address space */
-	u32		:  6;
-	u32 q		:  1;		/* event qualifier */
-	u32 rw		:  1;		/* read/write */
-	u64 faddr;			/* failing address */
-	u32 reserved3;
-	u16 reserved4;
-	u16 pec;			/* PCI event code */
-} __packed;
-
 /* Content Code Description for PCI Function Availability */
 struct zpci_ccdf_avail {
 	u32 reserved1;
@@ -76,6 +59,41 @@ static bool is_driver_supported(struct pci_driver *driver)
 	return true;
 }
 
+static void zpci_store_pci_error(struct pci_dev *pdev,
+				 struct zpci_ccdf_err *ccdf)
+{
+	struct zpci_dev *zdev = to_zpci(pdev);
+	int i;
+
+	mutex_lock(&zdev->pending_errs_lock);
+	if (zdev->pending_errs.count >= ZPCI_ERR_PENDING_MAX) {
+		pr_err("%s: Cannot store PCI error info for device",
+				pci_name(pdev));
+		mutex_unlock(&zdev->pending_errs_lock);
+		return;
+	}
+
+	i = zdev->pending_errs.tail % ZPCI_ERR_PENDING_MAX;
+	memcpy(&zdev->pending_errs.err[i], ccdf, sizeof(struct zpci_ccdf_err));
+	zdev->pending_errs.tail++;
+	zdev->pending_errs.count++;
+	mutex_unlock(&zdev->pending_errs_lock);
+}
+
+void zpci_cleanup_pending_errors(struct zpci_dev *zdev)
+{
+	struct pci_dev *pdev = NULL;
+
+	mutex_lock(&zdev->pending_errs_lock);
+	pdev = pci_get_slot(zdev->zbus->bus, zdev->devfn);
+	if (zdev->pending_errs.count)
+		pr_err("%s: Unhandled PCI error events count=%zu",
+				pci_name(pdev), zdev->pending_errs.count);
+	memset(&zdev->pending_errs, 0, sizeof(struct zpci_ccdf_pending));
+	mutex_unlock(&zdev->pending_errs_lock);
+}
+EXPORT_SYMBOL_GPL(zpci_cleanup_pending_errors);
+
 static pci_ers_result_t zpci_event_notify_error_detected(struct pci_dev *pdev,
 							 struct pci_driver *driver)
 {
@@ -169,7 +187,8 @@ static pci_ers_result_t zpci_event_do_reset(struct pci_dev *pdev,
  * and the platform determines which functions are affected for
  * multi-function devices.
  */
-static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev)
+static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev,
+							  struct zpci_ccdf_err *ccdf)
 {
 	pci_ers_result_t ers_res = PCI_ERS_RESULT_DISCONNECT;
 	struct zpci_dev *zdev = to_zpci(pdev);
@@ -188,13 +207,6 @@ static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev)
 	}
 	pdev->error_state = pci_channel_io_frozen;
 
-	if (needs_mediated_recovery(pdev)) {
-		pr_info("%s: Cannot be recovered in the host because it is a pass-through device\n",
-			pci_name(pdev));
-		status_str = "failed (pass-through)";
-		goto out_unlock;
-	}
-
 	driver = to_pci_driver(pdev->dev.driver);
 	if (!is_driver_supported(driver)) {
 		if (!driver) {
@@ -210,12 +222,22 @@ static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev)
 		goto out_unlock;
 	}
 
+	if (needs_mediated_recovery(pdev))
+		zpci_store_pci_error(pdev, ccdf);
+
 	ers_res = zpci_event_notify_error_detected(pdev, driver);
 	if (ers_result_indicates_abort(ers_res)) {
 		status_str = "failed (abort on detection)";
 		goto out_unlock;
 	}
 
+	if (needs_mediated_recovery(pdev)) {
+		pr_info("%s: Recovering passthrough device\n", pci_name(pdev));
+		ers_res = PCI_ERS_RESULT_RECOVERED;
+		status_str = "in progress";
+		goto out_unlock;
+	}
+
 	if (ers_res != PCI_ERS_RESULT_NEED_RESET) {
 		ers_res = zpci_event_do_error_state_clear(pdev, driver);
 		if (ers_result_indicates_abort(ers_res)) {
@@ -258,25 +280,20 @@ static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev)
  * @pdev: PCI function for which to report
  * @es: PCI channel failure state to report
  */
-static void zpci_event_io_failure(struct pci_dev *pdev, pci_channel_state_t es)
+static void zpci_event_io_failure(struct pci_dev *pdev, pci_channel_state_t es,
+				  struct zpci_ccdf_err *ccdf)
 {
 	struct pci_driver *driver;
 
 	pci_dev_lock(pdev);
 	pdev->error_state = es;
-	/**
-	 * While vfio-pci's error_detected callback notifies user-space QEMU
-	 * reacts to this by freezing the guest. In an s390 environment PCI
-	 * errors are rarely fatal so this is overkill. Instead in the future
-	 * we will inject the error event and let the guest recover the device
-	 * itself.
-	 */
+
 	if (needs_mediated_recovery(pdev))
-		goto out;
+		zpci_store_pci_error(pdev, ccdf);
 	driver = to_pci_driver(pdev->dev.driver);
 	if (driver && driver->err_handler && driver->err_handler->error_detected)
 		driver->err_handler->error_detected(pdev, pdev->error_state);
-out:
+
 	pci_dev_unlock(pdev);
 }
 
@@ -312,6 +329,7 @@ static void __zpci_event_error(struct zpci_ccdf_err *ccdf)
 	pr_err("%s: Event 0x%x reports an error for PCI function 0x%x\n",
 	       pdev ? pci_name(pdev) : "n/a", ccdf->pec, ccdf->fid);
 
+
 	if (!pdev)
 		goto no_pdev;
 
@@ -322,12 +340,13 @@ static void __zpci_event_error(struct zpci_ccdf_err *ccdf)
 		break;
 	case 0x0040: /* Service Action or Error Recovery Failed */
 	case 0x003b:
-		zpci_event_io_failure(pdev, pci_channel_io_perm_failure);
+		zpci_event_io_failure(pdev, pci_channel_io_perm_failure, ccdf);
 		break;
 	default: /* PCI function left in the error state attempt to recover */
-		ers_res = zpci_event_attempt_error_recovery(pdev);
+		ers_res = zpci_event_attempt_error_recovery(pdev, ccdf);
 		if (ers_res != PCI_ERS_RESULT_RECOVERED)
-			zpci_event_io_failure(pdev, pci_channel_io_perm_failure);
+			zpci_event_io_failure(pdev, pci_channel_io_perm_failure,
+					ccdf);
 		break;
 	}
 	pci_dev_put(pdev);
diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
index a7bc23ce8483..2be37eab9279 100644
--- a/drivers/vfio/pci/vfio_pci_zdev.c
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -168,6 +168,8 @@ void vfio_pci_zdev_close_device(struct vfio_pci_core_device *vdev)
 
 	zdev->mediated_recovery = false;
 
+	zpci_cleanup_pending_errors(zdev);
+
 	if (!vdev->vdev.kvm)
 		return;
 
-- 
2.43.0
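
The pending-error bookkeeping in patch 6/9 is a bounded FIFO indexed by
head and tail modulo ZPCI_ERR_PENDING_MAX. A minimal userspace model of
that queue is sketched below. The names are hypothetical, only the PEC is
stored instead of the whole struct zpci_ccdf_err, and the mutex that
protects the real structure is left out. The fetch side corresponds to how
patch 7/9 consumes entries.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ERR_PENDING_MAX 16	/* mirrors ZPCI_ERR_PENDING_MAX */

/* Simplified model of struct zpci_ccdf_pending (PEC only, no locking). */
struct pending_errs {
	size_t count;
	int head;
	int tail;
	uint16_t pec[ERR_PENDING_MAX];
};

/* Append an error; refuse when the queue is full, as the patch does. */
static bool pending_store(struct pending_errs *p, uint16_t pec)
{
	if (p->count >= ERR_PENDING_MAX)
		return false;
	p->pec[p->tail % ERR_PENDING_MAX] = pec;
	p->tail++;
	p->count++;
	return true;
}

/* Remove the oldest error; returns false when nothing is pending. */
static bool pending_fetch(struct pending_errs *p, uint16_t *pec)
{
	if (!p->count)
		return false;
	*pec = p->pec[p->head % ERR_PENDING_MAX];
	p->head++;
	p->count--;
	return true;
}
```

Because count bounds the number of live entries, tail can run ahead of
head by at most ERR_PENDING_MAX, so the modulo indexing never overwrites
an entry that has not been fetched yet.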



* [PATCH v2 7/9] vfio-pci/zdev: Add a device feature for error information
  2025-08-25 17:12 [PATCH v2 0/9] Error recovery for vfio-pci devices on s390x Farhan Ali
                   ` (5 preceding siblings ...)
  2025-08-25 17:12 ` [PATCH v2 6/9] s390/pci: Store PCI error information for passthrough devices Farhan Ali
@ 2025-08-25 17:12 ` Farhan Ali
  2025-08-25 17:12 ` [PATCH v2 8/9] vfio: Add a reset_done callback for vfio-pci driver Farhan Ali
  2025-08-25 17:12 ` [PATCH v2 9/9] vfio: Remove the pcie check for VFIO_PCI_ERR_IRQ_INDEX Farhan Ali
  8 siblings, 0 replies; 17+ messages in thread
From: Farhan Ali @ 2025-08-25 17:12 UTC (permalink / raw)
  To: linux-s390, kvm, linux-kernel
  Cc: alex.williamson, helgaas, alifm, schnelle, mjrosato

For zPCI devices, we have platform-specific error information. The platform
firmware provides this error information to the operating system through an
architecture-specific mechanism. To enable recovery from userspace for
these devices, we want to expose this error information to userspace. Add a
new device feature to expose this information.

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
---
 drivers/vfio/pci/vfio_pci_core.c |  2 ++
 drivers/vfio/pci/vfio_pci_priv.h |  8 ++++++++
 drivers/vfio/pci/vfio_pci_zdev.c | 34 ++++++++++++++++++++++++++++++++
 include/uapi/linux/vfio.h        | 14 +++++++++++++
 4 files changed, 58 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 7dcf5439dedc..378adb3226db 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1514,6 +1514,8 @@ int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
 		return vfio_pci_core_pm_exit(device, flags, arg, argsz);
 	case VFIO_DEVICE_FEATURE_PCI_VF_TOKEN:
 		return vfio_pci_core_feature_token(device, flags, arg, argsz);
+	case VFIO_DEVICE_FEATURE_ZPCI_ERROR:
+		return vfio_pci_zdev_feature_err(device, flags, arg, argsz);
 	default:
 		return -ENOTTY;
 	}
diff --git a/drivers/vfio/pci/vfio_pci_priv.h b/drivers/vfio/pci/vfio_pci_priv.h
index a9972eacb293..a4a7f97fdc2e 100644
--- a/drivers/vfio/pci/vfio_pci_priv.h
+++ b/drivers/vfio/pci/vfio_pci_priv.h
@@ -86,6 +86,8 @@ int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
 				struct vfio_info_cap *caps);
 int vfio_pci_zdev_open_device(struct vfio_pci_core_device *vdev);
 void vfio_pci_zdev_close_device(struct vfio_pci_core_device *vdev);
+int vfio_pci_zdev_feature_err(struct vfio_device *device, u32 flags,
+			      void __user *arg, size_t argsz);
 #else
 static inline int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
 					      struct vfio_info_cap *caps)
@@ -100,6 +102,12 @@ static inline int vfio_pci_zdev_open_device(struct vfio_pci_core_device *vdev)
 
 static inline void vfio_pci_zdev_close_device(struct vfio_pci_core_device *vdev)
 {}
+
+static inline int vfio_pci_zdev_feature_err(struct vfio_device *device, u32 flags,
+					    void __user *arg, size_t argsz)
+{
+	return -ENODEV;
+}
 #endif
 
 static inline bool vfio_pci_is_vga(struct pci_dev *pdev)
diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
index 2be37eab9279..261954039aa9 100644
--- a/drivers/vfio/pci/vfio_pci_zdev.c
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -141,6 +141,40 @@ int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
 	return ret;
 }
 
+int vfio_pci_zdev_feature_err(struct vfio_device *device, u32 flags,
+			      void __user *arg, size_t argsz)
+{
+	struct vfio_device_feature_zpci_err err = {};
+	struct vfio_pci_core_device *vdev =
+		container_of(device, struct vfio_pci_core_device, vdev);
+	struct zpci_dev *zdev = to_zpci(vdev->pdev);
+	int ret;
+	int head = 0;
+
+	if (!zdev)
+		return -ENODEV;
+
+	ret = vfio_check_feature(flags, argsz, VFIO_DEVICE_FEATURE_GET,
+				 sizeof(err));
+	if (ret != 1)
+		return ret;
+
+	mutex_lock(&zdev->pending_errs_lock);
+	if (zdev->pending_errs.count) {
+		head = zdev->pending_errs.head % ZPCI_ERR_PENDING_MAX;
+		err.pec = zdev->pending_errs.err[head].pec;
+		zdev->pending_errs.head++;
+		zdev->pending_errs.count--;
+		err.pending_errors = zdev->pending_errs.count;
+	}
+	mutex_unlock(&zdev->pending_errs_lock);
+
+	if (copy_to_user(arg, &err, sizeof(err)))
+		return -EFAULT;
+
+	return 0;
+}
+
 int vfio_pci_zdev_open_device(struct vfio_pci_core_device *vdev)
 {
 	struct zpci_dev *zdev = to_zpci(vdev->pdev);
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 75100bf009ba..a950c341602d 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -1478,6 +1478,20 @@ struct vfio_device_feature_bus_master {
 };
 #define VFIO_DEVICE_FEATURE_BUS_MASTER 10
 
+/**
+ * VFIO_DEVICE_FEATURE_ZPCI_ERROR feature provides PCI error information to
+ * userspace for vfio-pci devices on s390x. On s390x PCI error recovery involves
+ * platform firmware and notification to operating system is done by
+ * architecture specific mechanism.  Exposing this information to userspace
+ * allows userspace to take appropriate actions to handle an error on the
+ * device.
+ */
+struct vfio_device_feature_zpci_err {
+	__u16 pec;
+	int pending_errors;
+};
+#define VFIO_DEVICE_FEATURE_ZPCI_ERROR 11
+
 /* -------- API for Type1 VFIO IOMMU -------- */
 
 /**
-- 
2.43.0
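
To make the drain semantics of vfio_pci_zdev_feature_err() easier to follow, here is a minimal userspace model of the pending-error ring. It is only a sketch: ERR_PENDING_MAX, the tail index, and the push helper are assumptions for illustration (the real producer lives in patch 6, which is not shown here); only the pop path mirrors the handler above.

```c
#include <stdint.h>

/* Hypothetical ring size; the real ZPCI_ERR_PENDING_MAX comes from patch 6. */
#define ERR_PENDING_MAX 4

struct err_entry {
	uint16_t pec;		/* PCI Error Code */
};

struct pending_errs {
	struct err_entry err[ERR_PENDING_MAX];
	int head;		/* free-running index of the oldest unread entry */
	int tail;		/* free-running index of the next free slot (assumed) */
	int count;		/* number of unread entries */
};

/* Assumed producer: queue one error code, refuse when the ring is full. */
static int pending_errs_push(struct pending_errs *q, uint16_t pec)
{
	if (q->count == ERR_PENDING_MAX)
		return -1;
	q->err[q->tail % ERR_PENDING_MAX].pec = pec;
	q->tail++;
	q->count++;
	return 0;
}

/*
 * Consumer: same steps as the locked section of vfio_pci_zdev_feature_err().
 * Returns 1 if an entry was consumed, 0 if the ring was empty. The outputs
 * are zeroed first so an empty ring never hands back stale data.
 */
static int pending_errs_pop(struct pending_errs *q, uint16_t *pec,
			    int *remaining)
{
	*pec = 0;
	*remaining = 0;
	if (!q->count)
		return 0;
	*pec = q->err[q->head % ERR_PENDING_MAX].pec;
	q->head++;
	q->count--;
	*remaining = q->count;
	return 1;
}
```

Each GET consumes exactly one entry in FIFO order and reports how many are still queued, so a userspace caller would presumably loop until pending_errors reaches zero.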



* [PATCH v2 8/9] vfio: Add a reset_done callback for vfio-pci driver
  2025-08-25 17:12 [PATCH v2 0/9] Error recovery for vfio-pci devices on s390x Farhan Ali
                   ` (6 preceding siblings ...)
  2025-08-25 17:12 ` [PATCH v2 7/9] vfio-pci/zdev: Add a device feature for error information Farhan Ali
@ 2025-08-25 17:12 ` Farhan Ali
  2025-08-25 17:12 ` [PATCH v2 9/9] vfio: Remove the pcie check for VFIO_PCI_ERR_IRQ_INDEX Farhan Ali
  8 siblings, 0 replies; 17+ messages in thread
From: Farhan Ali @ 2025-08-25 17:12 UTC (permalink / raw)
  To: linux-s390, kvm, linux-kernel
  Cc: alex.williamson, helgaas, alifm, schnelle, mjrosato

On error recovery for a PCI device bound to vfio-pci driver, we want to
recover the state of the device to its last known saved state. The callback
restores the state of the device to its initial saved state.

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
---
 drivers/vfio/pci/vfio_pci_core.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 378adb3226db..f2fcb81b3e69 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -2241,6 +2241,17 @@ pci_ers_result_t vfio_pci_core_aer_err_detected(struct pci_dev *pdev,
 }
 EXPORT_SYMBOL_GPL(vfio_pci_core_aer_err_detected);
 
+static void vfio_pci_core_aer_reset_done(struct pci_dev *pdev)
+{
+	struct vfio_pci_core_device *vdev = dev_get_drvdata(&pdev->dev);
+
+	if (!vdev->pci_saved_state)
+		return;
+
+	pci_load_saved_state(pdev, vdev->pci_saved_state);
+	pci_restore_state(pdev);
+}
+
 int vfio_pci_core_sriov_configure(struct vfio_pci_core_device *vdev,
 				  int nr_virtfn)
 {
@@ -2305,6 +2316,7 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_sriov_configure);
 
 const struct pci_error_handlers vfio_pci_core_err_handlers = {
 	.error_detected = vfio_pci_core_aer_err_detected,
+	.reset_done = vfio_pci_core_aer_reset_done,
 };
 EXPORT_SYMBOL_GPL(vfio_pci_core_err_handlers);
 
-- 
2.43.0



* [PATCH v2 9/9] vfio: Remove the pcie check for VFIO_PCI_ERR_IRQ_INDEX
  2025-08-25 17:12 [PATCH v2 0/9] Error recovery for vfio-pci devices on s390x Farhan Ali
                   ` (7 preceding siblings ...)
  2025-08-25 17:12 ` [PATCH v2 8/9] vfio: Add a reset_done callback for vfio-pci driver Farhan Ali
@ 2025-08-25 17:12 ` Farhan Ali
  8 siblings, 0 replies; 17+ messages in thread
From: Farhan Ali @ 2025-08-25 17:12 UTC (permalink / raw)
  To: linux-s390, kvm, linux-kernel
  Cc: alex.williamson, helgaas, alifm, schnelle, mjrosato

We are configuring the error signaling on the vast majority of devices and
it's extremely rare that it fires anyway. This allows userspace to be
notified on errors for legacy PCI devices. The Internal Shared Memory (ISM)
device on s390x is one such device. For PCI devices on IBM s390x error
recovery involves platform firmware and notification to the operating system
is done in an architecture-specific way. So the ISM device can still be
recovered when notified of an error.

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
---
 drivers/vfio/pci/vfio_pci_core.c  | 6 ++----
 drivers/vfio/pci/vfio_pci_intrs.c | 3 +--
 2 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index f2fcb81b3e69..d125471fd5ea 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -749,8 +749,7 @@ static int vfio_pci_get_irq_count(struct vfio_pci_core_device *vdev, int irq_typ
 			return (flags & PCI_MSIX_FLAGS_QSIZE) + 1;
 		}
 	} else if (irq_type == VFIO_PCI_ERR_IRQ_INDEX) {
-		if (pci_is_pcie(vdev->pdev))
-			return 1;
+		return 1;
 	} else if (irq_type == VFIO_PCI_REQ_IRQ_INDEX) {
 		return 1;
 	}
@@ -1150,8 +1149,7 @@ static int vfio_pci_ioctl_get_irq_info(struct vfio_pci_core_device *vdev,
 	case VFIO_PCI_REQ_IRQ_INDEX:
 		break;
 	case VFIO_PCI_ERR_IRQ_INDEX:
-		if (pci_is_pcie(vdev->pdev))
-			break;
+		break;
 		fallthrough;
 	default:
 		return -EINVAL;
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 123298a4dc8f..f2d13b6eb28f 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -838,8 +838,7 @@ int vfio_pci_set_irqs_ioctl(struct vfio_pci_core_device *vdev, uint32_t flags,
 	case VFIO_PCI_ERR_IRQ_INDEX:
 		switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
 		case VFIO_IRQ_SET_ACTION_TRIGGER:
-			if (pci_is_pcie(vdev->pdev))
-				func = vfio_pci_set_err_trigger;
+			func = vfio_pci_set_err_trigger;
 			break;
 		}
 		break;
-- 
2.43.0



* Re: [PATCH v2 1/9] PCI: Avoid restoring error values in config space
  2025-08-25 17:12 ` [PATCH v2 1/9] PCI: Avoid restoring error values in config space Farhan Ali
@ 2025-08-25 21:35   ` Alex Williamson
  2025-08-25 22:13     ` Farhan Ali
  0 siblings, 1 reply; 17+ messages in thread
From: Alex Williamson @ 2025-08-25 21:35 UTC (permalink / raw)
  To: Farhan Ali; +Cc: linux-s390, kvm, linux-kernel, helgaas, schnelle, mjrosato

On Mon, 25 Aug 2025 10:12:18 -0700
Farhan Ali <alifm@linux.ibm.com> wrote:

> The current reset process saves the device's config space state before
> reset and restores it afterward. However, when a device is in an error
> state before reset, config space reads may return error values instead of
> valid data. This results in saving corrupted values that get written back
> to the device during state restoration. Add validation to prevent writing
> error values to the device when restoring the config space state after
> reset.
> 
> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
> ---
>  drivers/pci/pci.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b0f4d98036cd..0dd95d782022 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1825,6 +1825,9 @@ static void pci_restore_config_dword(struct pci_dev *pdev, int offset,
>  	if (!force && val == saved_val)
>  		return;
>  
> +	if (PCI_POSSIBLE_ERROR(saved_val))
> +		return;
> +
>  	for (;;) {
>  		pci_dbg(pdev, "restore config %#04x: %#010x -> %#010x\n",
>  			offset, val, saved_val);


The commit log makes this sound like more than it is.  We're really
only error checking the first 64 bytes of config space before restore,
the capabilities are not checked.  I suppose skipping the BARs and
whatnot is no worse than writing -1 to them, but this is only a
complete solution in the narrow case where we're relying on vfio-pci to
come in and restore the pre-open device state.

I had imagined that pci_save_state() might detect the error state of
the device, avoid setting state_saved, but we'd still perform the
restore callouts that only rely on internal kernel state, maybe adding a
fallback to restore the BARs from resource information.

This implementation serves a purpose, but the commit log should
describe the specific, narrow scenario this solves, and probably also
add a comment in the code about why we're not consistently checking the
saved state for errors.  Thanks,

Alex
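
For readers following along, the effect of the added check can be modelled in plain C. This is a simplified sketch, not the kernel code: possible_error32() stands in for PCI_POSSIBLE_ERROR() on a 32-bit value (all ones), and restore_header() is a hypothetical helper that leaves out the force flag and the retry loop of pci_restore_config_dword().

```c
#include <stdbool.h>
#include <stdint.h>

#define HEADER_DWORDS 16	/* 64-byte standard header, 4 bytes per dword */

/* Userspace stand-in for PCI_POSSIBLE_ERROR() applied to a u32. */
static bool possible_error32(uint32_t val)
{
	return val == UINT32_MAX;
}

/*
 * Hypothetical model of restoring the standard header: dwords that already
 * match are skipped, and so are saved values that look like an error
 * response. Returns the number of dwords actually written back.
 */
static int restore_header(uint32_t *dev, const uint32_t *saved)
{
	int written = 0;

	for (int i = 0; i < HEADER_DWORDS; i++) {
		if (dev[i] == saved[i])
			continue;
		if (possible_error32(saved[i]))
			continue;	/* never write an all-ones value back */
		dev[i] = saved[i];
		written++;
	}
	return written;
}
```

Note what the model does not cover: only the header dwords are filtered, so saved capability state captured from a device in the error state would still be restored as is, which is the gap pointed out above.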



* Re: [PATCH v2 2/9] PCI: Add additional checks for flr and pm reset
  2025-08-25 17:12 ` [PATCH v2 2/9] PCI: Add additional checks for flr and pm reset Farhan Ali
@ 2025-08-25 21:54   ` Alex Williamson
  2025-08-25 22:28     ` Farhan Ali
  0 siblings, 1 reply; 17+ messages in thread
From: Alex Williamson @ 2025-08-25 21:54 UTC (permalink / raw)
  To: Farhan Ali; +Cc: linux-s390, kvm, linux-kernel, helgaas, schnelle, mjrosato

On Mon, 25 Aug 2025 10:12:19 -0700
Farhan Ali <alifm@linux.ibm.com> wrote:

> If a device is in an error state, then any reads of device registers can
> return an error value. Add additional checks to validate if a device is in an
> error state before doing an flr or pm reset.

I think the thing we see in practice for a device that's wedged and
returning -1 from config space is that the FLR will timeout waiting for
a pending transaction.  So this should fix that, but should we log
something?

I'm assuming AF FLR is not needed here because we don't cache the
offset and therefore won't find the capability when we search the chain
for it.

> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
> ---
>  drivers/pci/pci.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 0dd95d782022..a07bdb287cf3 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4560,12 +4560,17 @@ EXPORT_SYMBOL_GPL(pcie_flr);
>   */
>  int pcie_reset_flr(struct pci_dev *dev, bool probe)
>  {
> +	u32 reg;
> +
>  	if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
>  		return -ENOTTY;
>  
>  	if (!(dev->devcap & PCI_EXP_DEVCAP_FLR))
>  		return -ENOTTY;
>  
> +	if (pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &reg))
> +		return -ENOTTY;
> +
>  	if (probe)
>  		return 0;
>  
> @@ -4640,6 +4645,8 @@ static int pci_pm_reset(struct pci_dev *dev, bool probe)
>  		return -ENOTTY;
>  
>  	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &csr);
> +	if (PCI_POSSIBLE_ERROR(csr))
> +		return -ENOTTY;

Doesn't this turn out to be redundant to the test below?

>  	if (csr & PCI_PM_CTRL_NO_SOFT_RESET)
>  		return -ENOTTY;
>  

Thanks,
Alex
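
The redundancy can be shown with a few lines of standalone C. This is only a sketch: PM_CTRL_NO_SOFT_RESET copies the value of PCI_PM_CTRL_NO_SOFT_RESET (0x0008), and pm_reset_rejected() is a hypothetical name for the existing test in pci_pm_reset().

```c
#include <stdbool.h>
#include <stdint.h>

/* Same bit as PCI_PM_CTRL_NO_SOFT_RESET in the PM control/status register. */
#define PM_CTRL_NO_SOFT_RESET 0x0008

/* Models the pre-existing check that follows the config read. */
static bool pm_reset_rejected(uint16_t csr)
{
	return (csr & PM_CTRL_NO_SOFT_RESET) != 0;
}
```

An all-ones read (0xffff) has every bit set, including the No_Soft_Reset bit, so the existing test already returns -ENOTTY for a device in the error state.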



* Re: [PATCH v2 1/9] PCI: Avoid restoring error values in config space
  2025-08-25 21:35   ` Alex Williamson
@ 2025-08-25 22:13     ` Farhan Ali
  2025-08-26 15:48       ` Alex Williamson
  0 siblings, 1 reply; 17+ messages in thread
From: Farhan Ali @ 2025-08-25 22:13 UTC (permalink / raw)
  To: Alex Williamson
  Cc: linux-s390, kvm, linux-kernel, helgaas, schnelle, mjrosato


On 8/25/2025 2:35 PM, Alex Williamson wrote:
> On Mon, 25 Aug 2025 10:12:18 -0700
> Farhan Ali <alifm@linux.ibm.com> wrote:
>
>> The current reset process saves the device's config space state before
>> reset and restores it afterward. However, when a device is in an error
>> state before reset, config space reads may return error values instead of
>> valid data. This results in saving corrupted values that get written back
>> to the device during state restoration. Add validation to prevent writing
>> error values to the device when restoring the config space state after
>> reset.
>>
>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
>> ---
>>   drivers/pci/pci.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index b0f4d98036cd..0dd95d782022 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -1825,6 +1825,9 @@ static void pci_restore_config_dword(struct pci_dev *pdev, int offset,
>>   	if (!force && val == saved_val)
>>   		return;
>>   
>> +	if (PCI_POSSIBLE_ERROR(saved_val))
>> +		return;
>> +
>>   	for (;;) {
>>   		pci_dbg(pdev, "restore config %#04x: %#010x -> %#010x\n",
>>   			offset, val, saved_val);
>
> The commit log makes this sound like more than it is.  We're really
> only error checking the first 64 bytes of config space before restore,
> the capabilities are not checked.  I suppose skipping the BARs and
> whatnot is no worse than writing -1 to them, but this is only a
> complete solution in the narrow case where we're relying on vfio-pci to
> come in and restore the pre-open device state.
>
> I had imagined that pci_save_state() might detect the error state of
> the device, avoid setting state_saved, but we'd still perform the
> restore callouts that only rely on internal kernel state, maybe adding a
> fallback to restore the BARs from resource information.

I initially started with pci_save_state() and avoided saving the state
altogether. But that would mean we don't restore the MSI-X state and,
for s390, don't call arch_restore_msi_irqs(). Would you prefer to avoid
saving the state at all? This change was small and sufficient to
avoid breaking the device in my testing.

>
> This implementation serves a purpose, but the commit log should
> describe the specific, narrow scenario this solves, and probably also
> add a comment in the code about why we're not consistently checking the
> saved state for errors.  Thanks,
>
> Alex
Yes, I can re-word the commit message.

Thanks
Farhan


* Re: [PATCH v2 2/9] PCI: Add additional checks for flr and pm reset
  2025-08-25 21:54   ` Alex Williamson
@ 2025-08-25 22:28     ` Farhan Ali
  0 siblings, 0 replies; 17+ messages in thread
From: Farhan Ali @ 2025-08-25 22:28 UTC (permalink / raw)
  To: Alex Williamson
  Cc: linux-s390, kvm, linux-kernel, helgaas, schnelle, mjrosato


On 8/25/2025 2:54 PM, Alex Williamson wrote:
> On Mon, 25 Aug 2025 10:12:19 -0700
> Farhan Ali <alifm@linux.ibm.com> wrote:
>
>> If a device is in an error state, then any reads of device registers can
>> return an error value. Add additional checks to validate if a device is in an
>> error state before doing an flr or pm reset.
> I think the thing we see in practice for a device that's wedged and
> returning -1 from config space is that the FLR will timeout waiting for
> a pending transaction.  So this should fix that, but should we log
> something?

I guess it makes sense to add a warn log.


>
> I'm assuming AF FLR is not needed here because we don't cache the
> offset and therefore won't find the capability when we search the chain
> for it.

Yes, based on my understanding of how we search for the capability
offset, we would return 0 if the config space read returns -1
(https://elixir.bootlin.com/linux/v6.16.3/source/drivers/pci/pci.c#L441).

>
>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
>> ---
>>   drivers/pci/pci.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 0dd95d782022..a07bdb287cf3 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -4560,12 +4560,17 @@ EXPORT_SYMBOL_GPL(pcie_flr);
>>    */
>>   int pcie_reset_flr(struct pci_dev *dev, bool probe)
>>   {
>> +	u32 reg;
>> +
>>   	if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
>>   		return -ENOTTY;
>>   
>>   	if (!(dev->devcap & PCI_EXP_DEVCAP_FLR))
>>   		return -ENOTTY;
>>   
>> +	if (pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &reg))
>> +		return -ENOTTY;
>> +
>>   	if (probe)
>>   		return 0;
>>   
>> @@ -4640,6 +4645,8 @@ static int pci_pm_reset(struct pci_dev *dev, bool probe)
>>   		return -ENOTTY;
>>   
>>   	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &csr);
>> +	if (PCI_POSSIBLE_ERROR(csr))
>> +		return -ENOTTY;
> Doesn't this turn out to be redundant to the test below?

Yup, I guess I was being extra cautious. Will remove the check.

Thanks
Farhan

>>   	if (csr & PCI_PM_CTRL_NO_SOFT_RESET)
>>   		return -ENOTTY;
>>   
> Thanks,
> Alex
>


* Re: [PATCH v2 1/9] PCI: Avoid restoring error values in config space
  2025-08-25 22:13     ` Farhan Ali
@ 2025-08-26 15:48       ` Alex Williamson
  0 siblings, 0 replies; 17+ messages in thread
From: Alex Williamson @ 2025-08-26 15:48 UTC (permalink / raw)
  To: Farhan Ali; +Cc: linux-s390, kvm, linux-kernel, helgaas, schnelle, mjrosato

On Mon, 25 Aug 2025 15:13:00 -0700
Farhan Ali <alifm@linux.ibm.com> wrote:

> On 8/25/2025 2:35 PM, Alex Williamson wrote:
> > On Mon, 25 Aug 2025 10:12:18 -0700
> > Farhan Ali <alifm@linux.ibm.com> wrote:
> >  
> >> The current reset process saves the device's config space state before
> >> reset and restores it afterward. However, when a device is in an error
> >> state before reset, config space reads may return error values instead of
> >> valid data. This results in saving corrupted values that get written back
> >> to the device during state restoration. Add validation to prevent writing
> >> error values to the device when restoring the config space state after
> >> reset.
> >>
> >> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
> >> ---
> >>   drivers/pci/pci.c | 3 +++
> >>   1 file changed, 3 insertions(+)
> >>
> >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> >> index b0f4d98036cd..0dd95d782022 100644
> >> --- a/drivers/pci/pci.c
> >> +++ b/drivers/pci/pci.c
> >> @@ -1825,6 +1825,9 @@ static void pci_restore_config_dword(struct pci_dev *pdev, int offset,
> >>   	if (!force && val == saved_val)
> >>   		return;
> >>   
> >> +	if (PCI_POSSIBLE_ERROR(saved_val))
> >> +		return;
> >> +
> >>   	for (;;) {
> >>   		pci_dbg(pdev, "restore config %#04x: %#010x -> %#010x\n",
> >>   			offset, val, saved_val);  
> >
> > The commit log makes this sound like more than it is.  We're really
> > only error checking the first 64 bytes of config space before restore,
> > the capabilities are not checked.  I suppose skipping the BARs and
> > whatnot is no worse than writing -1 to them, but this is only a
> > complete solution in the narrow case where we're relying on vfio-pci to
> > come in and restore the pre-open device state.
> >
> > I had imagined that pci_save_state() might detect the error state of
> > the device, avoid setting state_saved, but we'd still perform the
> > restore callouts that only rely on internal kernel state, maybe adding a
> > fallback to restore the BARs from resource information.  
> 
> I initially started with pci_save_state() and avoided saving the state
> altogether. But that would mean we don't restore the MSI-X state and,
> for s390, don't call arch_restore_msi_irqs(). Would you prefer to avoid
> saving the state at all? This change was small and sufficient to
> avoid breaking the device in my testing.

If we're only reading -1 from the device anyway, I'm not sure what
value we're adding to continue to save bogus data from the device.
There are also various restore sub-functions that don't need that saved
state, ex. PASID, PRI, ATS, REBAR, AER, MSI, MSIX, ACS, VF REBAR,
SRIOV.  We could push the state_saved check down into the functions
that do need the prior device state, add warnings and let the remaining
function proceed.  We really need to at least pull BAR values from
resources information for there to be a chance of a functional device
without relying on vfio-pci to restore that though.  Thanks,

Alex



* Re: [PATCH v2 3/9] PCI: Allow per function PCI slots for hypervisor isolated functions
  2025-08-25 17:12 ` [PATCH v2 3/9] PCI: Allow per function PCI slots for hypervisor isolated functions Farhan Ali
@ 2025-08-27  7:50   ` Niklas Schnelle
  0 siblings, 0 replies; 17+ messages in thread
From: Niklas Schnelle @ 2025-08-27  7:50 UTC (permalink / raw)
  To: Farhan Ali, linux-s390, kvm, linux-kernel
  Cc: alex.williamson, helgaas, mjrosato

On Mon, 2025-08-25 at 10:12 -0700, Farhan Ali wrote:
> On s390 systems, which use a machine level hypervisor, PCI devices are
> always accessed through a form of PCI pass-through which fundamentally
> operates on a per PCI function granularity. This is also reflected in the
> s390 PCI hotplug driver which creates hotplug slots for individual PCI
> functions. Its reset_slot() function, which is a wrapper for
> zpci_hot_reset_device(), thus also resets individual functions.
> 
> Currently, the kernel's PCI_SLOT() macro assigns the same pci_slot object
> to multifunction devices. This approach worked fine on s390 systems that
> only exposed virtual functions as individual PCI domains to the operating
> system.  Since commit 44510d6fa0c0 ("s390/pci: Handling multifunctions")
> s390 supports exposing the topology of multifunction PCI devices by
> grouping them in a shared PCI domain. When attempting to reset a function
> through the hotplug driver, the shared slot assignment causes the wrong
> function to be reset instead of the intended one. It also leaks memory as
> we do create a pci_slot object for the function, but don't correctly free
> it in pci_slot_release().
> 
> This patch adds a helper function to allow per function PCI slots for
> functions managed through a hypervisor which exposes individual PCI
> functions while retaining the topology.
> 
> Fixes: 44510d6fa0c0 ("s390/pci: Handling multifunctions")
> Co-developed-by: Niklas Schnelle <schnelle@linux.ibm.com>
> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
> ---
>  drivers/pci/slot.c | 19 ++++++++++++++++---
>  1 file changed, 16 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/slot.c b/drivers/pci/slot.c
> index 50fb3eb595fe..991526af0ffe 100644
> --- a/drivers/pci/slot.c
> +++ b/drivers/pci/slot.c
> @@ -5,6 +5,7 @@
>   *	Alex Chiang <achiang@hp.com>
>   */
>  
> +#include <linux/hypervisor.h>
>  #include <linux/kobject.h>
>  #include <linux/slab.h>
>  #include <linux/pci.h>
> @@ -73,7 +74,7 @@ static void pci_slot_release(struct kobject *kobj)
>  
>  	down_read(&pci_bus_sem);
>  	list_for_each_entry(dev, &slot->bus->devices, bus_list)
> -		if (PCI_SLOT(dev->devfn) == slot->number)
> +		if (dev->slot == slot)
>  			dev->slot = NULL;
>  	up_read(&pci_bus_sem);
>  
> @@ -160,13 +161,25 @@ static int rename_slot(struct pci_slot *slot, const char *name)
>  	return result;
>  }
>  
> +static bool pci_dev_matches_slot(struct pci_dev *dev, struct pci_slot *slot)
> +{
> +	if (hypervisor_isolated_pci_functions()) {
> +		if (dev->devfn == slot->number)
> +			return true;
> +	} else {
> +		if (PCI_SLOT(dev->devfn) == slot->number)
> +			return true;
> +	}
> +	return false;
> +}
> +
>  void pci_dev_assign_slot(struct pci_dev *dev)
>  {
>  	struct pci_slot *slot;
>  
>  	mutex_lock(&pci_slot_mutex);
>  	list_for_each_entry(slot, &dev->bus->slots, list)
> -		if (PCI_SLOT(dev->devfn) == slot->number)
> +		if (pci_dev_matches_slot(dev, slot))
>  			dev->slot = slot;
>  	mutex_unlock(&pci_slot_mutex);
>  }

Doing some more digging, I believe this also needs adjustment in
pci_dev_reset_slot_function(). Since commit 10791141a6cf ("PCI:
Simplify pci_dev_reset_slot_function()") that no longer directly looks
at the struct pci_slot linking but instead assumes that slot resets
don't work on multifunction devices. With per PCI function slots the
slot reset should work with pdev->multifunction set. I think adjusting
pci_dev_reset_slot_function() may be easier if instead of using the
hypervisor_isolated_pci_functions() helper we would set up a struct
pci_slot::per_func flag as we had considered as an option.

Thanks,
Niklas
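
To illustrate why the classic PCI_SLOT() comparison cannot tell functions apart, here is a small standalone sketch. SLOT_OF() and DEVFN() reproduce the arithmetic of PCI_SLOT() and PCI_DEVFN(); dev_matches_slot() and its per_function parameter are hypothetical and only model the two matching modes discussed in this patch.

```c
#include <stdbool.h>

/* Same bit layout as PCI_SLOT() and PCI_DEVFN(): 5 bits slot, 3 bits function. */
#define SLOT_OF(devfn)	(((devfn) >> 3) & 0x1f)
#define DEVFN(slot, fn)	((((slot) & 0x1f) << 3) | ((fn) & 0x07))

/*
 * per_function == false: classic matching, all functions of a device share
 * one slot number.
 * per_function == true: the slot number is the full devfn, so every function
 * gets its own slot.
 */
static bool dev_matches_slot(unsigned int devfn, unsigned int slot_number,
			     bool per_function)
{
	if (per_function)
		return devfn == slot_number;
	return SLOT_OF(devfn) == slot_number;
}
```

With classic matching, functions 0 and 1 of device 0 both match slot number 0, so a slot reset cannot single out one of them. With per-function matching, devfn 1 matches only slot number 1.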


* Re: [PATCH v2 4/9] s390/pci: Restore airq unconditionally for the zPCI device
  2025-08-25 17:12 ` [PATCH v2 4/9] s390/pci: Restore airq unconditionally for the zPCI device Farhan Ali
@ 2025-08-27 13:27   ` Niklas Schnelle
  0 siblings, 0 replies; 17+ messages in thread
From: Niklas Schnelle @ 2025-08-27 13:27 UTC (permalink / raw)
  To: Farhan Ali, linux-s390, kvm, linux-kernel
  Cc: alex.williamson, helgaas, mjrosato

On Mon, 2025-08-25 at 10:12 -0700, Farhan Ali wrote:
> Commit c1e18c17bda6 ("s390/pci: add zpci_set_irq()/zpci_clear_irq()"),
> introduced the zpci_set_irq() and zpci_clear_irq(), to be used while
> resetting a zPCI device.
> 
> Commit da995d538d3a ("s390/pci: implement reset_slot for hotplug slot"),
> mentions zpci_clear_irq() being called in the path for zpci_hot_reset_device().
> But that is not the case anymore and these functions are not called
> outside of this file.
> 
> However after a CLP disable/enable reset (zpci_hot_reset_device),the airq

Nit: missing space after ","

> setup of the device will need to be restored. Since we are no longer
> calling zpci_clear_airq() in the reset path, we should restore the airq for
> device unconditionally.
> 
> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
> ---
>  arch/s390/include/asm/pci.h | 1 -
>  arch/s390/pci/pci_irq.c     | 9 +--------
>  2 files changed, 1 insertion(+), 9 deletions(-)
> 
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index 41f900f693d9..aed19a1aa9d7 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -145,7 +145,6 @@ struct zpci_dev {
>  	u8		has_resources	: 1;
>  	u8		is_physfn	: 1;
>  	u8		util_str_avail	: 1;
> -	u8		irqs_registered	: 1;
>  	u8		tid_avail	: 1;
>  	u8		rtr_avail	: 1; /* Relaxed translation allowed */
>  	unsigned int	devfn;		/* DEVFN part of the RID*/
> diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
> index 84482a921332..e73be96ce5fe 100644
> --- a/arch/s390/pci/pci_irq.c
> +++ b/arch/s390/pci/pci_irq.c
> @@ -107,9 +107,6 @@ static int zpci_set_irq(struct zpci_dev *zdev)
>  	else
>  		rc = zpci_set_airq(zdev);
>  
> -	if (!rc)
> -		zdev->irqs_registered = 1;
> -
>  	return rc;
>  }
>  
> @@ -123,9 +120,6 @@ static int zpci_clear_irq(struct zpci_dev *zdev)
>  	else
>  		rc = zpci_clear_airq(zdev);
>  
> -	if (!rc)
> -		zdev->irqs_registered = 0;
> -
>  	return rc;
>  }
>  
> @@ -427,8 +421,7 @@ bool arch_restore_msi_irqs(struct pci_dev *pdev)
>  {
>  	struct zpci_dev *zdev = to_zpci(pdev);
>  
> -	if (!zdev->irqs_registered)
> -		zpci_set_irq(zdev);
> +	zpci_set_irq(zdev);
>  	return true;
>  }
>  

I dug a bit to see why this isn't a problem for the existing non-vfio
PCI recovery. It looks like the drivers end up calling
arch_teardown_msi_irqs() and then arch_setup_msi_irqs() as part of
their recovery handlers. For example nvme calls nvme_dev_disable() in
error_detected() which calls pci_free_irq_vectors() and ultimately
zpci_clear_irq().

Similarly zpci_set_irq() is ultimately called in
pci_alloc_irq_vectors() in nvme_pci_enable() as part of 
nvme_reset_work().

Additionally zpci_clear_irq() returns success and ignores errors when
the IRQs are already cleared, allowing zpci_clear_irq() to set
zdev->irqs_registered = 0 even if the device is in the error or
disabled state. On the other hand zpci_set_irq() would not ignore
trying to register IRQs if they are already registered.

So I think the commit description is somewhat confusing because the CLP
disable case works if, like with the existing recovery, IRQs get torn
down and setup anew after the reset and because the zpci_clear_irq()
isn't needed in zpci_hot_reset_device() because clp_disable_fh()
already does this. I believe the mention of that was because in an
earlier, never merged, version I had an explicit zpci_clear_irq() but
this was removed because it is redundant, except for flipping the flag
of course.

On the other hand I think the code change itself makes sense. The
zdev->irqs_registered flag hides when someone tries to register IRQs
twice, which I think we would want to know about. And more importantly
the flag doesn't correctly mirror the actual state because CLP disable
doesn't clear the flag but unregisters IRQs and then
arch_restore_msi_irqs() doesn't actually re-register IRQs because it
assumes the wrong state. And this is just hidden because none of the
relevant drivers seem to solely rely on pci_restore_state() but do tear
down / setup regardless. I think thus the commit description should
focus on the possibly inconsistent state and arch_restore_msi_irqs()
and then it all makes sense.

Thanks,
Niklas
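
The inconsistency can be captured in a tiny state model. This is a sketch with made-up names (struct model_zdev and the model_* helpers); it only tracks whether IRQs are registered with the hardware and what the cached flag claims.

```c
#include <stdbool.h>

struct model_zdev {
	bool hw_irqs;		/* what the device actually has registered */
	bool irqs_registered;	/* cached flag, removed by this patch */
};

/* Models zpci_set_irq(): register IRQs and remember that we did. */
static void model_set_irq(struct model_zdev *z)
{
	z->hw_irqs = true;
	z->irqs_registered = true;
}

/* Models a CLP disable/enable reset: IRQs are gone, the flag is not touched. */
static void model_clp_reset(struct model_zdev *z)
{
	z->hw_irqs = false;
}

/* Old arch_restore_msi_irqs(): trusts the stale flag. */
static void model_restore_old(struct model_zdev *z)
{
	if (!z->irqs_registered)
		model_set_irq(z);
}

/* New arch_restore_msi_irqs(): restores unconditionally. */
static void model_restore_new(struct model_zdev *z)
{
	model_set_irq(z);
}
```

After a reset, the old restore path sees irqs_registered still set and does nothing, leaving the device without IRQs. The unconditional variant re-registers them.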


Thread overview: 17+ messages
2025-08-25 17:12 [PATCH v2 0/9] Error recovery for vfio-pci devices on s390x Farhan Ali
2025-08-25 17:12 ` [PATCH v2 1/9] PCI: Avoid restoring error values in config space Farhan Ali
2025-08-25 21:35   ` Alex Williamson
2025-08-25 22:13     ` Farhan Ali
2025-08-26 15:48       ` Alex Williamson
2025-08-25 17:12 ` [PATCH v2 2/9] PCI: Add additional checks for flr and pm reset Farhan Ali
2025-08-25 21:54   ` Alex Williamson
2025-08-25 22:28     ` Farhan Ali
2025-08-25 17:12 ` [PATCH v2 3/9] PCI: Allow per function PCI slots for hypervisor isolated functions Farhan Ali
2025-08-27  7:50   ` Niklas Schnelle
2025-08-25 17:12 ` [PATCH v2 4/9] s390/pci: Restore airq unconditionally for the zPCI device Farhan Ali
2025-08-27 13:27   ` Niklas Schnelle
2025-08-25 17:12 ` [PATCH v2 5/9] s390/pci: Update the logic for detecting passthrough device Farhan Ali
2025-08-25 17:12 ` [PATCH v2 6/9] s390/pci: Store PCI error information for passthrough devices Farhan Ali
2025-08-25 17:12 ` [PATCH v2 7/9] vfio-pci/zdev: Add a device feature for error information Farhan Ali
2025-08-25 17:12 ` [PATCH v2 8/9] vfio: Add a reset_done callback for vfio-pci driver Farhan Ali
2025-08-25 17:12 ` [PATCH v2 9/9] vfio: Remove the pcie check for VFIO_PCI_ERR_IRQ_INDEX Farhan Ali
