public inbox for kvm@vger.kernel.org
From: Farhan Ali <alifm@linux.ibm.com>
To: linux-s390@vger.kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org
Cc: alex.williamson@redhat.com, helgaas@kernel.org,
	alifm@linux.ibm.com, schnelle@linux.ibm.com,
	mjrosato@linux.ibm.com
Subject: [PATCH v3 00/10] Error recovery for vfio-pci devices on s390x
Date: Thu, 11 Sep 2025 11:32:57 -0700
Message-ID: <20250911183307.1910-1-alifm@linux.ibm.com>

Hi,

This Linux kernel patch series introduces support for error recovery for
passthrough PCI devices on System Z (s390x). 

Background
----------
For PCI devices on s390x, an operating system receives platform-specific
error events from firmware rather than through AER. Today, for
passthrough/userspace devices, we don't attempt any error recovery and
ignore any error events for the devices. The passthrough/userspace devices
are managed by the vfio-pci driver. The driver does register error handling
callbacks (error_detected) and, on an error, triggers an eventfd to
userspace. But we need a mechanism to notify userspace
(QEMU/guest/userspace drivers) about the error event.
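
For readers less familiar with the existing eventfd path mentioned above,
here is a minimal userspace sketch of how it is typically used. It relies
only on the existing VFIO_DEVICE_SET_IRQS ioctl and VFIO_PCI_ERR_IRQ_INDEX
from <linux/vfio.h>. The function names (example_arm_err_eventfd,
example_wait_err_event) are made up for illustration and are not part of
this series, and device_fd is assumed to be an already opened VFIO device
file descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Illustrative helper (not from the series): ask vfio-pci to signal efd
 * when the device reports an error. Returns 0 on success, -1 with errno
 * set on failure.
 */
static int example_arm_err_eventfd(int device_fd, int efd)
{
	size_t argsz = sizeof(struct vfio_irq_set) + sizeof(int32_t);
	struct vfio_irq_set *set;
	int32_t fd = efd;
	int ret, saved;

	assert(efd >= 0);
	set = calloc(1, argsz);
	if (!set) {
		errno = ENOMEM;
		return -1;
	}
	set->argsz = (uint32_t)argsz;
	set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
	set->index = VFIO_PCI_ERR_IRQ_INDEX;
	set->start = 0;
	set->count = 1;
	memcpy(set->data, &fd, sizeof(fd));

	ret = ioctl(device_fd, VFIO_DEVICE_SET_IRQS, set);
	saved = errno;
	free(set);
	errno = saved;
	return ret ? -1 : 0;
}

/*
 * Illustrative helper: wait up to timeout_ms for the eventfd to fire.
 * Returns the number of signals consumed, 0 on timeout, -1 on error.
 */
static int64_t example_wait_err_event(int efd, int timeout_ms)
{
	struct pollfd pfd = { .fd = efd, .events = POLLIN };
	uint64_t count;
	int n;

	assert(efd >= 0);
	n = poll(&pfd, 1, timeout_ms);
	if (n < 0)
		return -1;
	if (n == 0)
		return 0;
	if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		return -1;
	return (int64_t)count;
}
```

The eventfd only tells userspace that something happened; it carries no
detail about the error, which is the gap described above.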

Proposal
--------
We can expose this error information (currently only the PCI Error Code)
via a device feature. Userspace can then obtain the error information 
via VFIO_DEVICE_FEATURE ioctl and take appropriate actions such as driving 
a device reset.
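
As a rough illustration of the userspace side, a query through the
existing VFIO_DEVICE_FEATURE ioctl could look like the sketch below. The
ioctl, struct vfio_device_feature and VFIO_DEVICE_FEATURE_GET are existing
uAPI; everything prefixed with example_/EXAMPLE_ is hypothetical. In
particular, EXAMPLE_ZPCI_ERROR_FEATURE is a placeholder value and not the
feature index defined by the series, and the payload is treated as opaque
bytes because its layout is defined by the patch, not here.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Placeholder only: the real feature index comes from the series'
 * change to include/uapi/linux/vfio.h and is not reproduced here.
 */
#define EXAMPLE_ZPCI_ERROR_FEATURE 0x7fffu

/* Build a GET request with room for payload_len bytes of feature data. */
static struct vfio_device_feature *
example_feature_get_alloc(uint32_t feature, size_t payload_len)
{
	struct vfio_device_feature *req;
	size_t argsz = sizeof(*req) + payload_len;

	req = calloc(1, argsz);
	if (!req)
		return NULL;
	req->argsz = (uint32_t)argsz;
	req->flags = VFIO_DEVICE_FEATURE_GET |
		     (feature & VFIO_DEVICE_FEATURE_MASK);
	return req;
}

/*
 * Fetch out_len bytes of feature data into out.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int example_feature_get(int device_fd, uint32_t feature,
			       void *out, size_t out_len)
{
	struct vfio_device_feature *req;
	int ret, saved;

	assert(out != NULL || out_len == 0);
	req = example_feature_get_alloc(feature, out_len);
	if (!req) {
		errno = ENOMEM;
		return -1;
	}
	ret = ioctl(device_fd, VFIO_DEVICE_FEATURE, req);
	saved = errno;
	if (ret == 0 && out_len)
		memcpy(out, req->data, out_len);
	free(req);
	errno = saved;
	return ret ? -1 : 0;
}
```

A caller would invoke example_feature_get() after being notified of an
error and, depending on the returned data, decide whether to reset the
device.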

I would appreciate some feedback on this series.

Thanks
Farhan

ChangeLog
---------
v2 series https://lore.kernel.org/all/20250825171226.1602-1-alifm@linux.ibm.com/
v2 -> v3
   - Patch 1 avoids saving any config space state if the device is in error
     (suggested by Alex).

   - Patch 2 adds an additional check, only for FLR reset, to try other
     function reset methods (suggested by Alex).

   - Patch 3 fixes a bug in s390 for resetting PCI devices with multiple
     functions. It creates a new flag, pci_slot, to allow per-function slots.

   - Patch 4 fixes a bug in s390 for resource to bus address translation.

   - Rebase on 6.17-rc5


v1 series https://lore.kernel.org/all/20250813170821.1115-1-alifm@linux.ibm.com/
v1 -> v2
   - Patches 1 and 2 add some additional checks for FLR/PM reset to
     try other function reset methods (suggested by Alex).

   - Patch 3 fixes a bug in s390 for resetting PCI devices with multiple
     functions.

   - Patch 7 adds a new device feature for zPCI devices for the
     VFIO_DEVICE_FEATURE ioctl. The ioctl is used by userspace to retrieve
     any PCI error information for the device (suggested by Alex).

   - Patch 8 adds a reset_done() callback for the vfio-pci driver, to
     restore the state of the device after a reset.

   - Patch 9 removes the pcie check for triggering VFIO_PCI_ERR_IRQ_INDEX.

Farhan Ali (10):
  PCI: Avoid saving error values for config space
  PCI: Add additional checks for flr reset
  PCI: Allow per function PCI slots
  s390/pci: Add architecture specific resource/bus address translation
  s390/pci: Restore IRQ unconditionally for the zPCI device
  s390/pci: Update the logic for detecting passthrough device
  s390/pci: Store PCI error information for passthrough devices
  vfio-pci/zdev: Add a device feature for error information
  vfio: Add a reset_done callback for vfio-pci driver
  vfio: Remove the pcie check for VFIO_PCI_ERR_IRQ_INDEX

 arch/s390/include/asm/pci.h        |  30 +++++++-
 arch/s390/pci/pci.c                |  74 ++++++++++++++++++++
 arch/s390/pci/pci_event.c          | 107 ++++++++++++++++-------------
 arch/s390/pci/pci_irq.c            |   9 +--
 drivers/pci/host-bridge.c          |   4 +-
 drivers/pci/hotplug/s390_pci_hpc.c |  10 ++-
 drivers/pci/pci.c                  |  40 +++++++++--
 drivers/pci/pcie/aer.c             |   5 ++
 drivers/pci/pcie/dpc.c             |   5 ++
 drivers/pci/pcie/ptm.c             |   5 ++
 drivers/pci/slot.c                 |  14 +++-
 drivers/pci/tph.c                  |   5 ++
 drivers/pci/vc.c                   |   5 ++
 drivers/vfio/pci/vfio_pci_core.c   |  20 ++++--
 drivers/vfio/pci/vfio_pci_intrs.c  |   3 +-
 drivers/vfio/pci/vfio_pci_priv.h   |   8 +++
 drivers/vfio/pci/vfio_pci_zdev.c   |  45 +++++++++++-
 include/linux/pci.h                |   1 +
 include/uapi/linux/vfio.h          |  14 ++++
 19 files changed, 330 insertions(+), 74 deletions(-)

-- 
2.43.0


Thread overview: 35+ messages
2025-09-11 18:32 Farhan Ali [this message]
2025-09-11 18:32 ` [PATCH v3 01/10] PCI: Avoid saving error values for config space Farhan Ali
2025-09-13  8:27   ` Alex Williamson
2025-09-15 17:15     ` Farhan Ali
2025-09-16 18:09   ` Bjorn Helgaas
2025-09-16 20:00     ` Farhan Ali
2025-09-19 18:17       ` Alex Williamson
2025-09-11 18:32 ` [PATCH v3 02/10] PCI: Add additional checks for flr reset Farhan Ali
2025-09-11 18:33 ` [PATCH v3 03/10] PCI: Allow per function PCI slots Farhan Ali
2025-09-12 12:23   ` Benjamin Block
2025-09-12 17:19     ` Farhan Ali
2025-09-16  6:52   ` Cédric Le Goater
2025-09-16 18:37     ` Farhan Ali
2025-09-17  6:21       ` Cédric Le Goater
2025-09-17 17:50         ` Farhan Ali
2025-09-11 18:33 ` [PATCH v3 04/10] s390/pci: Add architecture specific resource/bus address translation Farhan Ali
2025-09-17 14:48   ` Niklas Schnelle
2025-09-17 17:22     ` Farhan Ali
2025-09-11 18:33 ` [PATCH v3 05/10] s390/pci: Restore IRQ unconditionally for the zPCI device Farhan Ali
2025-09-15  8:39   ` Niklas Schnelle
2025-09-15 17:42     ` Farhan Ali
2025-09-16 10:59       ` Niklas Schnelle
2025-09-11 18:33 ` [PATCH v3 06/10] s390/pci: Update the logic for detecting passthrough device Farhan Ali
2025-09-15  9:22   ` Niklas Schnelle
2025-09-11 18:33 ` [PATCH v3 07/10] s390/pci: Store PCI error information for passthrough devices Farhan Ali
2025-09-15 11:42   ` Niklas Schnelle
2025-09-15 18:12     ` Farhan Ali
2025-09-16 10:54       ` Niklas Schnelle
2025-09-11 18:33 ` [PATCH v3 08/10] vfio-pci/zdev: Add a device feature for error information Farhan Ali
2025-09-13  9:04   ` Alex Williamson
2025-09-15 18:27     ` Farhan Ali
2025-09-15  6:26   ` Cédric Le Goater
2025-09-15 18:27     ` Farhan Ali
2025-09-11 18:33 ` [PATCH v3 09/10] vfio: Add a reset_done callback for vfio-pci driver Farhan Ali
2025-09-11 18:33 ` [PATCH v3 10/10] vfio: Remove the pcie check for VFIO_PCI_ERR_IRQ_INDEX Farhan Ali
