public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Lukas Wunner <lukas@wunner.de>
To: Niklas Schnelle <schnelle@linux.ibm.com>
Cc: Farhan Ali <alifm@linux.ibm.com>,
	Benjamin Block <bblock@linux.ibm.com>,
	linux-s390@vger.kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	alex.williamson@redhat.com, helgaas@kernel.org, clg@redhat.com,
	mjrosato@linux.ibm.com
Subject: Re: [PATCH v4 01/10] PCI: Avoid saving error values for config space
Date: Sun, 19 Oct 2025 16:34:17 +0200	[thread overview]
Message-ID: <aPT26UZ41DsN5C01@wunner.de> (raw)
In-Reply-To: <bb59edee909ceb09527cedec10896d45126f0027.camel@linux.ibm.com>

On Tue, Oct 14, 2025 at 02:07:57PM +0200, Niklas Schnelle wrote:
> On Sun, 2025-10-12 at 08:34 +0200, Lukas Wunner wrote:
> > If you do want to stick with your alternative approach,
> > maybe doing the error handling in the ->mmio_enabled() phase
> > instead of ->error_detected() would make more sense.
> > In that phase you're allowed to access the device,
> > you can also attempt a local reset and return
> > PCI_ERS_RESULT_RECOVERED on success.
> > 
> > You'd have to return PCI_ERS_RESULT_CAN_RECOVER though
> > from the ->error_detected() callback in order to progress
> > to the ->mmio_enabled() step.
> 
> The problem with using ->mmio_enabled() is two fold. For one we
> sometimes have to do a reset instead of clearing the error state, for
> example if the device was not only put in the error state but also
> disabled, or if the guest driver wants it,

Well in that case you could reset the device in the ->mmio_enabled() step
from the guest using the vfio reset ioctl.

> Second and more
> importantly this would break the guests assumption that the device will
> be in the error state with MMIO and DMA blocked when it gets an error
> event. On the other hand, that's exactly the state it is in if we
> report the error in the ->error_detected() callback

At the risk of continuously talking past each other:

How about this, the host notifies the guest of the error in the
->error_detected() callback.  The guest notifies the driver and
collects the result (whether a reset is requested or not), then
returns PCI_ERS_RESULT_CAN_RECOVER to the host.

The host re-enables I/O to the device, invokes the ->mmio_detected()
callback.  The guest then resets the device based on the result it
collected earlier or invokes the driver's ->mmio_enabled() callback.

If the driver returns PCI_ERS_RESULT_NEED_RESET from the
->mmio_enabled() callback, you can likewise reset the device from
the guest using the ioctl method.

My concern is that by insisting that you handle device recovery
completely in the ->error_detected() phase, you're not complying
with the protocol specified in Documentation/PCI/pci-error-recovery.rst
and as a result, you have to amend the reset code in the PCI core
because it assumes that all arches adheres to the protocol.
In my view, that suggests that the approach needs to be reworked
to comply with the protocol.  Then the workarounds for performing
a reset while I/O is blocked become unnecessary.

Thanks,

Lukas

  parent reply	other threads:[~2025-10-19 14:34 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-24 17:16 [PATCH v4 00/10] Error recovery for vfio-pci devices on s390x Farhan Ali
2025-09-24 17:16 ` [PATCH v4 01/10] PCI: Avoid saving error values for config space Farhan Ali
2025-10-01 15:15   ` Benjamin Block
2025-10-01 17:12     ` Farhan Ali
2025-10-02  9:16       ` Benjamin Block
2025-10-04 14:54       ` Lukas Wunner
2025-10-06 17:54         ` Farhan Ali
2025-10-06 19:26           ` Lukas Wunner
2025-10-06 21:35             ` Farhan Ali
2025-10-08 13:34               ` Lukas Wunner
2025-10-08 17:56                 ` Farhan Ali
2025-10-08 18:14                   ` Lukas Wunner
2025-10-08 21:55                     ` Farhan Ali
2025-10-09  4:52                       ` Lukas Wunner
2025-10-09 17:02                         ` Farhan Ali
2025-10-12  6:43                           ` Lukas Wunner
2025-10-09  9:12                     ` Niklas Schnelle
2025-10-12  6:34                       ` Lukas Wunner
2025-10-14 12:07                         ` Niklas Schnelle
2025-10-16 21:00                           ` Farhan Ali
2025-10-19 14:34                           ` Lukas Wunner [this message]
2025-10-20  8:59                             ` Niklas Schnelle
2025-11-22 10:58                               ` Lukas Wunner
2025-09-24 17:16 ` [PATCH v4 02/10] PCI: Add additional checks for flr reset Farhan Ali
2025-09-30 10:03   ` Benjamin Block
2025-09-30 17:04     ` Farhan Ali
2025-10-01  8:33       ` Benjamin Block
2025-10-01 14:37   ` Benjamin Block
2025-09-24 17:16 ` [PATCH v4 03/10] PCI: Allow per function PCI slots Farhan Ali
2025-10-01 14:34   ` Benjamin Block
2025-09-24 17:16 ` [PATCH v4 04/10] s390/pci: Add architecture specific resource/bus address translation Farhan Ali
2025-09-25 10:54   ` Niklas Schnelle
2025-10-01 16:04     ` Benjamin Block
2025-10-01 18:01       ` Farhan Ali
2025-10-02 12:58   ` Niklas Schnelle
2025-10-02 17:00     ` Bjorn Helgaas
2025-10-02 17:16       ` Ilpo Järvinen
2025-10-02 18:14       ` Niklas Schnelle
2025-09-24 17:16 ` [PATCH v4 05/10] s390/pci: Restore IRQ unconditionally for the zPCI device Farhan Ali
2025-09-24 17:16 ` [PATCH v4 06/10] s390/pci: Update the logic for detecting passthrough device Farhan Ali
2025-09-24 17:16 ` [PATCH v4 07/10] s390/pci: Store PCI error information for passthrough devices Farhan Ali
2025-09-25 14:28   ` Niklas Schnelle
2025-09-25 16:29     ` Farhan Ali
2025-09-24 17:16 ` [PATCH v4 08/10] vfio-pci/zdev: Add a device feature for error information Farhan Ali
2025-09-25  8:04   ` kernel test robot
2025-09-24 17:16 ` [PATCH v4 09/10] vfio: Add a reset_done callback for vfio-pci driver Farhan Ali
2025-09-24 17:16 ` [PATCH v4 10/10] vfio: Remove the pcie check for VFIO_PCI_ERR_IRQ_INDEX Farhan Ali

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aPT26UZ41DsN5C01@wunner.de \
    --to=lukas@wunner.de \
    --cc=alex.williamson@redhat.com \
    --cc=alifm@linux.ibm.com \
    --cc=bblock@linux.ibm.com \
    --cc=clg@redhat.com \
    --cc=helgaas@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=mjrosato@linux.ibm.com \
    --cc=schnelle@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox