From: Lukas Wunner <lukas@wunner.de>
To: Farhan Ali <alifm@linux.ibm.com>
Cc: Benjamin Block <bblock@linux.ibm.com>,
linux-s390@vger.kernel.org, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
alex.williamson@redhat.com, helgaas@kernel.org, clg@redhat.com,
schnelle@linux.ibm.com, mjrosato@linux.ibm.com
Subject: Re: [PATCH v4 01/10] PCI: Avoid saving error values for config space
Date: Wed, 8 Oct 2025 15:34:16 +0200
Message-ID: <aOZoWDQV0TNh-NiM@wunner.de>
In-Reply-To: <8c14d648-453c-4426-af69-4e911a1128c1@linux.ibm.com>

On Mon, Oct 06, 2025 at 02:35:49PM -0700, Farhan Ali wrote:
> On 10/6/2025 12:26 PM, Lukas Wunner wrote:
> > On Mon, Oct 06, 2025 at 10:54:51AM -0700, Farhan Ali wrote:
> > > On 10/4/2025 7:54 AM, Lukas Wunner wrote:
> > > > I believe this also makes patch [01/10] in your series unnecessary.
> > > I tested your patches + patches 2-10 of this series. It unfortunately didn't
> > > completely help with the s390x use case. We still need the check in
> > > pci_save_state() from this patch to make sure we are not saving error
> > > values, which can be written back to the device in pci_restore_state().
> > What's the caller of pci_save_state() that needs this?
> >
> > Can you move the check for PCI_POSSIBLE_ERROR() to the caller?
> > I think plenty of other callers don't need this, so it adds
> > extra overhead for them and down the road it'll be difficult
> > to untangle which caller needs it and which doesn't.
>
> The caller would be pci_dev_save_and_disable(). Are you suggesting moving
> the PCI_POSSIBLE_ERROR() prior to calling pci_save_state()?

I'm not sure yet. Let's back up a little: I'm missing an
architectural description of how you intend to do error
recovery in the VM. If I understand correctly, you're
informing the VM of the error via the ->error_detected() callback.

You're saying you need to check for accessibility of the device
prior to resetting it from the VM. Does that mean you're attempting
a reset from the ->error_detected() callback?

According to Documentation/PCI/pci-error-recovery.rst, the device
isn't supposed to be considered accessible in ->error_detected().
The first callback which allows access is ->mmio_enabled().

I also don't quite understand why the VM needs to perform a reset.
Why can't you just let the VM tell the host that a reset is needed
(PCI_ERS_RESULT_NEED_RESET) and then the host resets the device on
behalf of the VM?
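
To illustrate the flow described above, here is a minimal userspace
sketch of an ->error_detected() handler that only reports a verdict and
never touches the device. All mock_* names and enum values are
hypothetical stand-ins for pci_channel_state_t and pci_ers_result_t;
this is not code from the series or from the kernel.

```c
/*
 * Hypothetical stand-ins for the kernel's channel state and recovery
 * result types; the numeric values are illustrative only.
 */
enum mock_channel_state {
	MOCK_IO_NORMAL,
	MOCK_IO_FROZEN,
	MOCK_IO_PERM_FAILURE,
};

enum mock_ers_result {
	MOCK_ERS_CAN_RECOVER,
	MOCK_ERS_NEED_RESET,
	MOCK_ERS_DISCONNECT,
};

/*
 * Model of an ->error_detected() handler: it looks only at the channel
 * state passed in by the recovery core and performs no device access.
 */
static enum mock_ers_result mock_error_detected(enum mock_channel_state state)
{
	switch (state) {
	case MOCK_IO_PERM_FAILURE:
		return MOCK_ERS_DISCONNECT;	/* device is lost for good */
	case MOCK_IO_FROZEN:
		return MOCK_ERS_NEED_RESET;	/* ask the core to reset */
	default:
		return MOCK_ERS_CAN_RECOVER;	/* I/O is still possible */
	}
}
```

In this model the reset itself is left to whoever receives the
NEED_RESET verdict, which would be the host in the scenario above.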

Furthermore, I'm thinking that you should be using pci_channel_offline()
to detect accessibility of the device, rather than reading from
Config Space and checking for PCI_POSSIBLE_ERROR().
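
The difference between the two checks can be sketched in a small
userspace model. The sim_* names are hypothetical, and the model assumes
that reads from an inaccessible device return all ones; it only mirrors
the idea behind pci_channel_offline() and PCI_POSSIBLE_ERROR(), not
their actual implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical channel states for the model. */
enum sim_state { SIM_NORMAL, SIM_FROZEN, SIM_PERM_FAILURE };

struct sim_dev {
	enum sim_state state;	/* tracked error state of the device */
	uint32_t reg;		/* contents of one Config Space dword */
};

/* Assumption: a read from an inaccessible device yields all ones. */
static uint32_t sim_read_config(const struct sim_dev *dev)
{
	return dev->state == SIM_NORMAL ? dev->reg : UINT32_MAX;
}

/* Value-based heuristic: all ones may be an error or may be real data. */
static bool sim_possible_error(uint32_t val)
{
	return val == UINT32_MAX;
}

/* State-based check: needs no Config Space access at all. */
static bool sim_channel_offline(const struct sim_dev *dev)
{
	return dev->state != SIM_NORMAL;
}
```

A healthy device whose register happens to hold 0xffffffff trips the
value-based heuristic in this model, while the state-based check
reports it as online.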

> > The state saved on device addition is just the initial state and
> > it is fine if later on it gets updated (which is a nicer term than
> > "overwritten"). E.g. when portdrv.c instantiates port services
> > and drivers are bound to them, various registers in Config Space
> > are changed, hence pcie_portdrv_probe() calls pci_save_state()
> > again.
> >
> > However we can discuss whether pci_save_state() is still needed
> > in pci_dev_save_and_disable().
>
> The commit 8dd7f8036c12 ("PCI: add support for function level reset")
> introduced the logic of saving/restoring the device state after an FLR. My
> assumption is it was done to save the most recent state of the device (as
> the state could be updated by drivers). So I think it would still make sense
> to save the device state in pci_dev_save_and_disable() if the Config Space
> is still accessible?

Yes, right now we can't assume that drivers call pci_save_state()
in their probe hook if they modified Config Space. They may rely
on the state being saved prior to reset or a D3hot/D3cold transition.
So we need to keep the pci_dev_save_and_disable() call for now.
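
Why the save before reset matters can be shown with a toy userspace
model. The toy_* names and the four-dword Config Space are hypothetical,
and the reset is assumed to simply clear all registers; this is a
sketch of the reasoning, not of the kernel's save/restore code.

```c
#include <stdint.h>
#include <string.h>

#define TOY_CFG_DWORDS 4

struct toy_dev {
	uint32_t cfg[TOY_CFG_DWORDS];	/* live Config Space */
	uint32_t saved[TOY_CFG_DWORDS];	/* most recently saved copy */
};

/* Copy the live registers into the saved state. */
static void toy_save_state(struct toy_dev *d)
{
	memcpy(d->saved, d->cfg, sizeof(d->cfg));
}

/* Write the saved state back to the live registers. */
static void toy_restore_state(struct toy_dev *d)
{
	memcpy(d->cfg, d->saved, sizeof(d->cfg));
}

/* Assumption: a reset clears every register in this model. */
static void toy_reset(struct toy_dev *d)
{
	memset(d->cfg, 0, sizeof(d->cfg));
}

/* Reset the device, optionally refreshing the saved state first. */
static void toy_reset_function(struct toy_dev *d, int save_first)
{
	if (save_first)
		toy_save_state(d);
	toy_reset(d);
	toy_restore_state(d);
}
```

If a driver changes cfg[] after the initial toy_save_state() and does
not save again, only the save_first path carries that change across the
reset; without it, the older saved values come back.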

Generally the expectation is that Config Space is accessible when
performing a reset with pci_try_reset_function(). Since that's
apparently not guaranteed for your use case, I'm wondering if you
might be using the function in a context where it's not supposed to
be used.

Thanks,
Lukas