From: Bjorn Helgaas <helgaas@kernel.org>
To: Lukas Wunner <lukas@wunner.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
Riana Tauro <riana.tauro@intel.com>,
"Sean C. Dardis" <sean.c.dardis@intel.com>,
Farhan Ali <alifm@linux.ibm.com>,
Benjamin Block <bblock@linux.ibm.com>,
Niklas Schnelle <schnelle@linux.ibm.com>,
Alek Du <alek.du@intel.com>,
Mahesh J Salgaonkar <mahesh@linux.ibm.com>,
Oliver OHalloran <oohall@gmail.com>,
linuxppc-dev@lists.ozlabs.org, linux-pci@vger.kernel.org,
Giovanni Cabiddu <giovanni.cabiddu@intel.com>,
qat-linux@intel.com, Dave Jiang <dave.jiang@intel.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Jiri Slaby <jirislaby@kernel.org>,
"James E.J. Bottomley" <James.Bottomley@hansenpartnership.com>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
Subject: Re: [PATCH 0/2] PCI: Universal error recoverability of devices
Date: Fri, 14 Nov 2025 17:45:43 -0600
Message-ID: <20251114234543.GA2350415@bhelgaas>
In-Reply-To: <cover.1760274044.git.lukas@wunner.de>

On Sun, Oct 12, 2025 at 03:25:00PM +0200, Lukas Wunner wrote:
> When PCI devices are reset -- either to recover from an error or
> after a D3hot/D3cold transition -- their Config Space needs to be
> restored.
>
> D3hot/D3cold transitions happen under the control of the kernel,
> hence it is able to save Config Space before and restore it afterwards.
>
> However errors may occur unexpectedly and it may then be impossible
> to save Config Space because the device may be inaccessible (e.g. DPC)
> or Config Space may be corrupted. So it must be saved ahead of time.
>
> This isn't done consistently because the PCI core doesn't take care
> of it and only a subset of drivers do. The situation is aggravated
> by the behavior of pci_restore_state(), which only allows restoring
> Config Space once and invalidates the saved copy afterwards.
>
> Solve all these problems by saving an initial copy of Config Space
> on device addition which drivers may update if they change registers.
> Modify pci_restore_state() to allow using the saved copy indefinitely
> and drop all the workarounds for its previous behavior that have
> accumulated in the tree.
>
> Lukas Wunner (2):
> PCI: Ensure error recoverability at all times
> treewide: Drop pci_save_state() after pci_restore_state()
>
> drivers/crypto/intel/qat/qat_common/adf_aer.c | 2 --
> drivers/dma/ioat/init.c | 1 -
> drivers/net/ethernet/broadcom/bnx2.c | 2 --
> drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 1 -
> drivers/net/ethernet/broadcom/tg3.c | 1 -
> drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c | 1 -
> drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 2 --
> drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c | 1 -
> drivers/net/ethernet/intel/e1000e/netdev.c | 1 -
> drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 6 ------
> drivers/net/ethernet/intel/i40e/i40e_main.c | 1 -
> drivers/net/ethernet/intel/ice/ice_main.c | 2 --
> drivers/net/ethernet/intel/igb/igb_main.c | 2 --
> drivers/net/ethernet/intel/igc/igc_main.c | 2 --
> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 1 -
> drivers/net/ethernet/mellanox/mlx4/main.c | 1 -
> drivers/net/ethernet/mellanox/mlx5/core/main.c | 1 -
> drivers/net/ethernet/meta/fbnic/fbnic_pci.c | 1 -
> drivers/net/ethernet/microchip/lan743x_main.c | 1 -
> drivers/net/ethernet/myricom/myri10ge/myri10ge.c | 4 ----
> drivers/net/ethernet/neterion/s2io.c | 1 -
> drivers/pci/bus.c | 7 +++++++
> drivers/pci/pci.c | 3 ---
> drivers/pci/pcie/portdrv.c | 1 -
> drivers/pci/probe.c | 2 --
> drivers/scsi/bfa/bfad.c | 1 -
> drivers/scsi/csiostor/csio_init.c | 1 -
> drivers/scsi/ipr.c | 1 -
> drivers/scsi/lpfc/lpfc_init.c | 6 ------
> drivers/scsi/qla2xxx/qla_os.c | 5 -----
> drivers/scsi/qla4xxx/ql4_os.c | 5 -----
> drivers/tty/serial/8250/8250_pci.c | 1 -
> drivers/tty/serial/jsm/jsm_driver.c | 1 -
> 33 files changed, 7 insertions(+), 62 deletions(-)
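
For readers unfamiliar with the pattern that patch 2/2 removes, here is
a minimal user-space sketch of the behavior described in the cover
letter above. It is not kernel code: struct model_dev,
model_save_state(), model_restore_state(), model_reset() and
model_count_recoveries() are hypothetical names made up for
illustration, and the single_use flag merely models "restore
invalidates the saved copy" versus "the saved copy stays valid
indefinitely".

```c
/* User-space model only; none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CFG_DWORDS 16

struct model_dev {
	uint32_t config[CFG_DWORDS];	/* stands in for live Config Space */
	uint32_t saved[CFG_DWORDS];	/* the saved copy */
	bool state_saved;		/* is the saved copy usable? */
};

/* Copy live Config Space into the saved copy and mark it valid. */
static void model_save_state(struct model_dev *dev)
{
	assert(dev);
	memcpy(dev->saved, dev->config, sizeof(dev->saved));
	dev->state_saved = true;
}

/*
 * Restore from the saved copy. Returns true if a restore happened.
 * single_use == true models the behavior the cover letter complains
 * about: the saved copy is invalidated after one restore.
 */
static bool model_restore_state(struct model_dev *dev, bool single_use)
{
	assert(dev);
	if (!dev->state_saved)
		return false;
	memcpy(dev->config, dev->saved, sizeof(dev->config));
	if (single_use)
		dev->state_saved = false;
	return true;
}

/* Model a reset: the device loses its Config Space contents. */
static void model_reset(struct model_dev *dev)
{
	assert(dev);
	memset(dev->config, 0, sizeof(dev->config));
}

/*
 * Save once, as if on device addition, then reset the device
 * `resets` times without ever re-saving. Returns how many of those
 * resets were followed by a successful restore.
 */
int model_count_recoveries(bool single_use, int resets)
{
	struct model_dev dev = { .config = { [1] = 0x00100007 } };
	int restored = 0;

	model_save_state(&dev);
	for (int i = 0; i < resets; i++) {
		model_reset(&dev);
		if (model_restore_state(&dev, single_use) &&
		    dev.config[1] == 0x00100007)
			restored++;
	}
	return restored;
}
```

In this model, model_count_recoveries(true, 3) recovers only the first
reset, which is why drivers re-saved state after every restore, while
model_count_recoveries(false, 3) recovers all three from the single
copy saved up front.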

Applied to pci/err, maybe for v6.19?

It touches a lot of drivers, so it'd be nice to have more time in
-next, but it is mostly in error recovery paths that aren't going to
be exercised much anyway.

I'll watch for a minor update of comments and update if I see it.

Thanks a lot for your work and description of this. It's a big step
in my understanding of PM and error recovery. Which still leaves me
mostly ignorant, just slightly less so.

Bjorn

Thread overview: 17+ messages
2025-10-12 13:25 [PATCH 0/2] PCI: Universal error recoverability of devices Lukas Wunner
2025-10-12 13:25 ` [PATCH 1/2] PCI: Ensure error recoverability at all times Lukas Wunner
2025-11-12 22:38 ` Bjorn Helgaas
2025-11-13 9:38 ` Lukas Wunner
2025-11-13 16:15 ` Bjorn Helgaas
2025-11-14 18:58 ` Lukas Wunner
2025-11-14 23:39 ` Bjorn Helgaas
2025-11-19 10:02 ` Lukas Wunner
2025-11-21 17:40 ` Lukas Wunner
2025-11-24 22:11 ` Bjorn Helgaas
2025-11-13 20:49 ` Rafael J. Wysocki
2025-11-13 21:03 ` Rafael J. Wysocki
2025-10-12 13:25 ` [PATCH 2/2] treewide: Drop pci_save_state() after pci_restore_state() Lukas Wunner
2025-11-05 14:22 ` Dave Jiang
2025-11-05 14:33 ` Giovanni Cabiddu
2025-11-24 23:13 ` Bjorn Helgaas
2025-11-14 23:45 ` Bjorn Helgaas [this message]