public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Farhan Ali <alifm@linux.ibm.com>
To: linux-s390@vger.kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org
Cc: alex.williamson@redhat.com, helgaas@kernel.org, clg@redhat.com,
	alifm@linux.ibm.com, schnelle@linux.ibm.com,
	mjrosato@linux.ibm.com
Subject: [PATCH v4 00/10] Error recovery for vfio-pci devices on s390x
Date: Wed, 24 Sep 2025 10:16:18 -0700	[thread overview]
Message-ID: <20250924171628.826-1-alifm@linux.ibm.com> (raw)

Hi,

This Linux kernel patch series introduces support for error recovery for
passthrough PCI devices on System Z (s390x). 

Background
----------
For PCI devices on s390x an operating system receives platform specific
error events from firmware rather than through AER.Today for
passthrough/userspace devices, we don't attempt any error recovery and
ignore any error events for the devices. The passthrough/userspace devices
are managed by the vfio-pci driver. The driver does register error handling
callbacks (error_detected), and on an error trigger an eventfd to
userspace.  But we need a mechanism to notify userspace
(QEMU/guest/userspace drivers) about the error event. 

Proposal
--------
We can expose this error information (currently only the PCI Error Code)
via a device feature. Userspace can then obtain the error information 
via VFIO_DEVICE_FEATURE ioctl and take appropriate actions such as driving 
a device reset.

I would appreciate some feedback on this series. Part of the series touches
PCI common code, so would like to get some feedback on those patches.

Thanks
Farhan

ChangeLog
---------
v3 series https://lore.kernel.org/all/20250911183307.1910-1-alifm@linux.ibm.com/
v3 -> v4
    - Remove warn messages for each PCI capability not restored (patch 1)

    - Check PCI_COMMAND and PCI_STATUS register for error value instead of device id 
    (patch 1)

    - Fix kernel crash in patch 3

    - Added reviewed by tags

    - Address comments from Niklas's (patches 4, 5, 7)

    - Fix compilation error non s390x system (patch 8)

    - Explicitly align struct vfio_device_feature_zpci_err (patch 8)


v2 series https://lore.kernel.org/all/20250825171226.1602-1-alifm@linux.ibm.com/
v2 -> v3
   - Patch 1 avoids saving any config space state if the device is in error
   (suggested by Alex)

   - Patch 2 adds additional check only for FLR reset to try other function 
     reset method (suggested by Alex).

   - Patch 3 fixes a bug in s390 for resetting PCI devices with multiple
     functions. Creates a new flag pci_slot to allow per function slot.

   - Patch 4 fixes a bug in s390 for resource to bus address translation.

   - Rebase on 6.17-rc5


v1 series https://lore.kernel.org/all/20250813170821.1115-1-alifm@linux.ibm.com/
v1 - > v2
   - Patches 1 and 2 adds some additional checks for FLR/PM reset to 
     try other function reset method (suggested by Alex).

   - Patch 3 fixes a bug in s390 for resetting PCI devices with multiple
     functions.

   - Patch 7 adds a new device feature for zPCI devices for the VFIO_DEVICE_FEATURE 
     ioctl. The ioctl is used by userspace to retriece any PCI error
     information for the device (suggested by Alex).

   - Patch 8 adds a reset_done() callback for the vfio-pci driver, to
     restore the state of the device after a reset.

   - Patch 9 removes the pcie check for triggering VFIO_PCI_ERR_IRQ_INDEX.


Farhan Ali (10):
  PCI: Avoid saving error values for config space
  PCI: Add additional checks for flr reset
  PCI: Allow per function PCI slots
  s390/pci: Add architecture specific resource/bus address translation
  s390/pci: Restore IRQ unconditionally for the zPCI device
  s390/pci: Update the logic for detecting passthrough device
  s390/pci: Store PCI error information for passthrough devices
  vfio-pci/zdev: Add a device feature for error information
  vfio: Add a reset_done callback for vfio-pci driver
  vfio: Remove the pcie check for VFIO_PCI_ERR_IRQ_INDEX

 arch/s390/include/asm/pci.h        |  30 +++++++-
 arch/s390/pci/pci.c                |  75 ++++++++++++++++++++
 arch/s390/pci/pci_event.c          | 107 ++++++++++++++++-------------
 arch/s390/pci/pci_irq.c            |   9 +--
 drivers/pci/host-bridge.c          |   4 +-
 drivers/pci/hotplug/s390_pci_hpc.c |  10 ++-
 drivers/pci/pci.c                  |  37 ++++++++--
 drivers/pci/pcie/aer.c             |   3 +
 drivers/pci/pcie/dpc.c             |   3 +
 drivers/pci/pcie/ptm.c             |   3 +
 drivers/pci/slot.c                 |  14 +++-
 drivers/pci/tph.c                  |   3 +
 drivers/pci/vc.c                   |   3 +
 drivers/vfio/pci/vfio_pci_core.c   |  20 ++++--
 drivers/vfio/pci/vfio_pci_intrs.c  |   3 +-
 drivers/vfio/pci/vfio_pci_priv.h   |   8 +++
 drivers/vfio/pci/vfio_pci_zdev.c   |  45 +++++++++++-
 include/linux/pci.h                |   1 +
 include/uapi/linux/vfio.h          |  15 ++++
 19 files changed, 318 insertions(+), 75 deletions(-)

-- 
2.43.0


             reply	other threads:[~2025-09-24 17:16 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-24 17:16 Farhan Ali [this message]
2025-09-24 17:16 ` [PATCH v4 01/10] PCI: Avoid saving error values for config space Farhan Ali
2025-10-01 15:15   ` Benjamin Block
2025-10-01 17:12     ` Farhan Ali
2025-10-02  9:16       ` Benjamin Block
2025-10-04 14:54       ` Lukas Wunner
2025-10-06 17:54         ` Farhan Ali
2025-10-06 19:26           ` Lukas Wunner
2025-10-06 21:35             ` Farhan Ali
2025-10-08 13:34               ` Lukas Wunner
2025-10-08 17:56                 ` Farhan Ali
2025-10-08 18:14                   ` Lukas Wunner
2025-10-08 21:55                     ` Farhan Ali
2025-10-09  4:52                       ` Lukas Wunner
2025-10-09 17:02                         ` Farhan Ali
2025-10-12  6:43                           ` Lukas Wunner
2025-10-09  9:12                     ` Niklas Schnelle
2025-10-12  6:34                       ` Lukas Wunner
2025-10-14 12:07                         ` Niklas Schnelle
2025-10-16 21:00                           ` Farhan Ali
2025-10-19 14:34                           ` Lukas Wunner
2025-10-20  8:59                             ` Niklas Schnelle
2025-11-22 10:58                               ` Lukas Wunner
2025-09-24 17:16 ` [PATCH v4 02/10] PCI: Add additional checks for flr reset Farhan Ali
2025-09-30 10:03   ` Benjamin Block
2025-09-30 17:04     ` Farhan Ali
2025-10-01  8:33       ` Benjamin Block
2025-10-01 14:37   ` Benjamin Block
2025-09-24 17:16 ` [PATCH v4 03/10] PCI: Allow per function PCI slots Farhan Ali
2025-10-01 14:34   ` Benjamin Block
2025-09-24 17:16 ` [PATCH v4 04/10] s390/pci: Add architecture specific resource/bus address translation Farhan Ali
2025-09-25 10:54   ` Niklas Schnelle
2025-10-01 16:04     ` Benjamin Block
2025-10-01 18:01       ` Farhan Ali
2025-10-02 12:58   ` Niklas Schnelle
2025-10-02 17:00     ` Bjorn Helgaas
2025-10-02 17:16       ` Ilpo Järvinen
2025-10-02 18:14       ` Niklas Schnelle
2025-09-24 17:16 ` [PATCH v4 05/10] s390/pci: Restore IRQ unconditionally for the zPCI device Farhan Ali
2025-09-24 17:16 ` [PATCH v4 06/10] s390/pci: Update the logic for detecting passthrough device Farhan Ali
2025-09-24 17:16 ` [PATCH v4 07/10] s390/pci: Store PCI error information for passthrough devices Farhan Ali
2025-09-25 14:28   ` Niklas Schnelle
2025-09-25 16:29     ` Farhan Ali
2025-09-24 17:16 ` [PATCH v4 08/10] vfio-pci/zdev: Add a device feature for error information Farhan Ali
2025-09-25  8:04   ` kernel test robot
2025-09-24 17:16 ` [PATCH v4 09/10] vfio: Add a reset_done callback for vfio-pci driver Farhan Ali
2025-09-24 17:16 ` [PATCH v4 10/10] vfio: Remove the pcie check for VFIO_PCI_ERR_IRQ_INDEX Farhan Ali

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250924171628.826-1-alifm@linux.ibm.com \
    --to=alifm@linux.ibm.com \
    --cc=alex.williamson@redhat.com \
    --cc=clg@redhat.com \
    --cc=helgaas@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=mjrosato@linux.ibm.com \
    --cc=schnelle@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox