From: Alex Williamson <alex.williamson@redhat.com>
To: Farhan Ali <alifm@linux.ibm.com>
Cc: linux-s390@vger.kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	helgaas@kernel.org, schnelle@linux.ibm.com,
	mjrosato@linux.ibm.com
Subject: Re: [PATCH v3 01/10] PCI: Avoid saving error values for config space
Date: Sat, 13 Sep 2025 09:27:09 +0100
Message-ID: <20250913092709.2e58782d.alex.williamson@redhat.com>
In-Reply-To: <20250911183307.1910-2-alifm@linux.ibm.com>

On Thu, 11 Sep 2025 11:32:58 -0700
Farhan Ali <alifm@linux.ibm.com> wrote:

> The current reset process saves the device's config space state before
> reset and restores it afterward. However, when a device is in an error
> state before reset, config space reads may return error values instead of
> valid data. This results in saving corrupted values that get written back
> to the device during state restoration.
>
> Avoid saving the state of the config space when the device is in error.
> While restoring we only restorei the state that can be restored through

s/restorei/restore/

> kernel data such as BARs or doesn't depend on the saved state.
>
> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
> ---
>  drivers/pci/pci.c      | 29 ++++++++++++++++++++++++++---
>  drivers/pci/pcie/aer.c |  5 +++++
>  drivers/pci/pcie/dpc.c |  5 +++++
>  drivers/pci/pcie/ptm.c |  5 +++++
>  drivers/pci/tph.c      |  5 +++++
>  drivers/pci/vc.c       |  5 +++++
>  6 files changed, 51 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b0f4d98036cd..4b67d22faf0a 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1720,6 +1720,11 @@ static void pci_restore_pcie_state(struct pci_dev *dev)
>  	struct pci_cap_saved_state *save_state;
>  	u16 *cap;
>
> +	if (!dev->state_saved) {
> +		pci_warn(dev, "Not restoring pcie state, no saved state");
> +		return;
> +	}
> +
>  	/*
>  	 * Restore max latencies (in the LTR capability) before enabling
>  	 * LTR itself in PCI_EXP_DEVCTL2.
> @@ -1775,6 +1780,11 @@ static void pci_restore_pcix_state(struct pci_dev *dev)
>  	struct pci_cap_saved_state *save_state;
>  	u16 *cap;
>
> +	if (!dev->state_saved) {
> +		pci_warn(dev, "Not restoring pcix state, no saved state");
> +		return;
> +	}
> +
>  	save_state = pci_find_saved_cap(dev, PCI_CAP_ID_PCIX);
>  	pos = pci_find_capability(dev, PCI_CAP_ID_PCIX);
>  	if (!save_state || !pos)
> @@ -1792,6 +1802,14 @@ static void pci_restore_pcix_state(struct pci_dev *dev)
>  int pci_save_state(struct pci_dev *dev)
>  {
>  	int i;
> +	u16 val;
> +
> +	pci_read_config_word(dev, PCI_DEVICE_ID, &val);
> +	if (PCI_POSSIBLE_ERROR(val)) {
> +		pci_warn(dev, "Device in error, not saving config space state\n");
> +		return -EIO;
> +	}
> +

I don't think this works with standard VFs; per the spec, the device ID
register of a VF returns 0xFFFF. We likely need to look for a CRS or
error status across both the vendor and device ID registers.
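
To make the concern concrete, here is a minimal user-space sketch (not
kernel code, and not a proposed replacement for the hunk above) of
classifying the dword at config offset 0x00 instead of testing the
device ID alone. All names in it (classify_id_dword(),
id_regs_suggest_error(), the enum) are hypothetical. It assumes that a
failed read returns all ones, that a CRS completion shows up as vendor
ID 0x0001, and that the caller already knows whether the function is a
VF, since a VF's ID registers cannot tell us anything.

```c
#include <stdbool.h>
#include <stdint.h>

#define ALL_ONES_DWORD	0xffffffffu	/* assumed error response for a failed read */
#define CRS_VENDOR_ID	0x0001		/* vendor ID reported for a CRS completion */

enum id_status { ID_VALID, ID_CRS, ID_ALL_ONES };

/* Classify the dword at offset 0x00: vendor ID in bits 15:0, device ID in 31:16. */
static enum id_status classify_id_dword(uint32_t id)
{
	if (id == ALL_ONES_DWORD)
		return ID_ALL_ONES;
	if ((id & 0xffff) == CRS_VENDOR_ID)
		return ID_CRS;
	return ID_VALID;
}

/*
 * Return true only when the ID registers positively suggest an error.
 * Assumption: for a VF the ID registers read as all ones by design, so
 * they are ignored here and some other check would be needed.
 */
static bool id_regs_suggest_error(uint32_t id, bool is_virtfn)
{
	if (is_virtfn)
		return false;
	return classify_id_dword(id) != ID_VALID;
}
```

With this model a healthy PF reading 0x12348086 is classified as valid,
0xffffffff on a PF is flagged, and the same value on a VF is not, which
is the case the device-ID-only test gets wrong.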

We could also be a little more formal and specific in describing the
skipped states, e.g. "PCIe capability", "PCI-X capability", "PCI AER
capability", etc. Thanks,

Alex

>  	/* XXX: 100% dword access ok here? */
>  	for (i = 0; i < 16; i++) {
>  		pci_read_config_dword(dev, i * 4, &dev->saved_config_space[i]);
> @@ -1854,6 +1872,14 @@ static void pci_restore_config_space_range(struct pci_dev *pdev,
>
>  static void pci_restore_config_space(struct pci_dev *pdev)
>  {
> +	if (!pdev->state_saved) {
> +		pci_warn(pdev, "No saved config space, restoring BARs\n");
> +		pci_restore_bars(pdev);
> +		pci_write_config_word(pdev, PCI_COMMAND,
> +				PCI_COMMAND_MEMORY | PCI_COMMAND_IO);
> +		return;
> +	}
> +
>  	if (pdev->hdr_type == PCI_HEADER_TYPE_NORMAL) {
>  		pci_restore_config_space_range(pdev, 10, 15, 0, false);
>  		/* Restore BARs before the command register. */
> @@ -1906,9 +1932,6 @@ static void pci_restore_rebar_state(struct pci_dev *pdev)
>   */
>  void pci_restore_state(struct pci_dev *dev)
>  {
> -	if (!dev->state_saved)
> -		return;
> -
>  	pci_restore_pcie_state(dev);
>  	pci_restore_pasid_state(dev);
>  	pci_restore_pri_state(dev);
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index e286c197d716..dca3502ef669 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -361,6 +361,11 @@ void pci_restore_aer_state(struct pci_dev *dev)
>  	if (!aer)
>  		return;
>
> +	if (!dev->state_saved) {
> +		pci_warn(dev, "Not restoring aer state, no saved state");
> +		return;
> +	}
> +
>  	save_state = pci_find_saved_ext_cap(dev, PCI_EXT_CAP_ID_ERR);
>  	if (!save_state)
>  		return;
> diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
> index fc18349614d7..62c520af71a7 100644
> --- a/drivers/pci/pcie/dpc.c
> +++ b/drivers/pci/pcie/dpc.c
> @@ -67,6 +67,11 @@ void pci_restore_dpc_state(struct pci_dev *dev)
>  	if (!pci_is_pcie(dev))
>  		return;
>
> +	if (!dev->state_saved) {
> +		pci_warn(dev, "Not restoring dpc state, no saved state");
> +		return;
> +	}
> +
>  	save_state = pci_find_saved_ext_cap(dev, PCI_EXT_CAP_ID_DPC);
>  	if (!save_state)
>  		return;
> diff --git a/drivers/pci/pcie/ptm.c b/drivers/pci/pcie/ptm.c
> index 65e4b008be00..7b5bcc23000d 100644
> --- a/drivers/pci/pcie/ptm.c
> +++ b/drivers/pci/pcie/ptm.c
> @@ -112,6 +112,11 @@ void pci_restore_ptm_state(struct pci_dev *dev)
>  	if (!ptm)
>  		return;
>
> +	if (!dev->state_saved) {
> +		pci_warn(dev, "Not restoring ptm state, no saved state");
> +		return;
> +	}
> +
>  	save_state = pci_find_saved_ext_cap(dev, PCI_EXT_CAP_ID_PTM);
>  	if (!save_state)
>  		return;
> diff --git a/drivers/pci/tph.c b/drivers/pci/tph.c
> index cc64f93709a4..f0f1bae46736 100644
> --- a/drivers/pci/tph.c
> +++ b/drivers/pci/tph.c
> @@ -435,6 +435,11 @@ void pci_restore_tph_state(struct pci_dev *pdev)
>  	if (!pdev->tph_enabled)
>  		return;
>
> +	if (!pdev->state_saved) {
> +		pci_warn(pdev, "Not restoring tph state, no saved state");
> +		return;
> +	}
> +
>  	save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_TPH);
>  	if (!save_state)
>  		return;
> diff --git a/drivers/pci/vc.c b/drivers/pci/vc.c
> index a4ff7f5f66dd..fda435cd49c1 100644
> --- a/drivers/pci/vc.c
> +++ b/drivers/pci/vc.c
> @@ -391,6 +391,11 @@ void pci_restore_vc_state(struct pci_dev *dev)
>  {
>  	int i;
>
> +	if (!dev->state_saved) {
> +		pci_warn(dev, "Not restoring vc state, no saved state");
> +		return;
> +	}
> +
>  	for (i = 0; i < ARRAY_SIZE(vc_caps); i++) {
>  		int pos;
>  		struct pci_cap_saved_state *save_state;
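
For readers following the overall flow of the quoted patch, the
save/restore gating can be modeled in user space roughly as below. This
is only a sketch under stated assumptions: struct mock_dev and the
mock_*() helpers are hypothetical stand-ins rather than kernel APIs, a
device in error is assumed to return all ones on every read, and only
the first 64 bytes of config space are modeled. The save step refuses
to snapshot a device in error. The restore step then falls back to
kernel-tracked BAR values plus the memory and I/O enable bits when no
snapshot exists.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CFG_DWORDS	16	/* first 64 bytes of config space */
#define CMD_DWORD	1	/* command register is the low half of dword 1 */
#define BAR0_DWORD	4	/* BAR0..BAR5 occupy dwords 4..9 */
#define CMD_IO		0x1
#define CMD_MEMORY	0x2

/* Hypothetical stand-in for struct pci_dev; not the kernel structure. */
struct mock_dev {
	uint32_t cfg[CFG_DWORDS];	/* what the device currently holds */
	uint32_t saved[CFG_DWORDS];	/* snapshot taken by mock_save_state() */
	uint32_t kernel_bars[6];	/* BAR values tracked outside the device */
	bool in_error;			/* reads return all ones while set */
	bool state_saved;
};

static uint32_t mock_read(const struct mock_dev *d, int i)
{
	return d->in_error ? 0xffffffffu : d->cfg[i];
}

/* Mirrors the patch: bail out with -EIO if the device ID reads as 0xFFFF. */
static int mock_save_state(struct mock_dev *d)
{
	int i;

	if ((mock_read(d, 0) >> 16) == 0xffff)
		return -EIO;
	for (i = 0; i < CFG_DWORDS; i++)
		d->saved[i] = mock_read(d, i);
	d->state_saved = true;
	return 0;
}

static void mock_restore_state(struct mock_dev *d)
{
	if (!d->state_saved) {
		/* No snapshot: restore BARs from kernel data, enable decoding. */
		memcpy(&d->cfg[BAR0_DWORD], d->kernel_bars,
		       sizeof(d->kernel_bars));
		d->cfg[CMD_DWORD] = (d->cfg[CMD_DWORD] & 0xffff0000u) |
				    CMD_MEMORY | CMD_IO;
		return;
	}
	memcpy(d->cfg, d->saved, sizeof(d->cfg));
	d->state_saved = false;
}
```

In this model, a save attempted while in_error is set leaves
state_saved clear, so a later restore writes only the BARs and the
command bits instead of replaying all-ones values into the device.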