All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lukas Wunner <lukas@wunner.de>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
	Riana Tauro <riana.tauro@intel.com>,
	"Sean C. Dardis" <sean.c.dardis@intel.com>,
	Farhan Ali <alifm@linux.ibm.com>,
	Benjamin Block <bblock@linux.ibm.com>,
	Niklas Schnelle <schnelle@linux.ibm.com>,
	Alek Du <alek.du@intel.com>,
	Mahesh J Salgaonkar <mahesh@linux.ibm.com>,
	Oliver OHalloran <oohall@gmail.com>,
	linuxppc-dev@lists.ozlabs.org, linux-pci@vger.kernel.org,
	Giovanni Cabiddu <giovanni.cabiddu@intel.com>,
	qat-linux@intel.com, Dave Jiang <dave.jiang@intel.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Jiri Slaby <jirislaby@kernel.org>,
	"James E.J. Bottomley" <James.Bottomley@hansenpartnership.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
Subject: Re: [PATCH 1/2] PCI: Ensure error recoverability at all times
Date: Thu, 13 Nov 2025 10:38:09 +0100	[thread overview]
Message-ID: <aRWnAd-PZuHMqBwd@wunner.de> (raw)
In-Reply-To: <20251112223831.GA2245026@bhelgaas>

On Wed, Nov 12, 2025 at 04:38:31PM -0600, Bjorn Helgaas wrote:
> On Sun, Oct 12, 2025 at 03:25:01PM +0200, Lukas Wunner wrote:
> > Despite these workarounds, recoverability at all times is not guaranteed:
> > E.g. when a PCIe port goes through a runtime suspend and resume cycle,
> > the "saved_state" flag is cleared by:
> > 
> >   pci_pm_runtime_resume()
> >     pci_pm_default_resume_early()
> >       pci_restore_state()
> > 
> > ... and hence on a subsequent AER event, the port's Config Space cannot be
> > restored.  
> 
> I guess this restore would be done by a driver's
> pci_error_handlers.slot_reset() or .reset_done() calling
> pci_restore_state()?

Yes.  Restoration of config space after an error-recovery-induced reset
is currently always the job of the device driver.

E.g. in the case of portdrv, it happens in pcie_portdrv_slot_reset().

We could revisit this design decision and change the behavior to have
pcie_do_recovery() call pci_restore_state(), thus reducing boilerplate
in the drivers.  But that would be a separate effort, orthogonal to the
present patch.

> > +++ b/drivers/pci/bus.c
> > @@ -358,6 +358,13 @@ void pci_bus_add_device(struct pci_dev *dev)
> >  	pci_bridge_d3_update(dev);
> >  
> >  	/*
> > +	 * Save config space for error recoverability.  Clear state_saved
> > +	 * to detect whether drivers invoked pci_save_state() on suspend.
> 
> Can we expand this a little to explain how this is detected and what
> drivers *should* be doing?

That is documented in Documentation/power/pci.rst, "3.1.2. suspend()":

   "This callback is expected to quiesce the device and prepare it to be
    put into a low-power state by the PCI subsystem.  It is not required
    (in fact it even is not recommended) that a PCI driver's suspend()
    callback save the standard configuration registers of the device [...]

    However, in some rare case it is convenient to carry out these
    operations in a PCI driver.  Then, pci_save_state() [...] should be
    used to save the device's standard configuration registers [...].
    Moreover, if the driver calls pci_save_state(), the PCI subsystem will
    not execute either pci_prepare_to_sleep(), or pci_set_power_state()
    for its device, so the driver is then responsible for handling the
    device as appropriate."

> I think the reason is that the PCI core
> can invoke pci_save_state() on suspend if the driver did not.

Right.  By calling pci_save_state(), a driver signals to the PCI core
that it assumes responsibility for putting the device into a low power
state.  If a driver wants to keep a device in D0, it could call
pci_save_state() and thus prevent the PCI core from putting it e.g.
into D3.

> I assume:
> 
>   - PCI core always calls pci_save_state() and clears state_saved when
>     device is enumerated (below)
> 
>   - When it has configured the device to the state it wants restore,
>     the driver may call pci_save_state() again, which will set
>     state_saved
> 
>   - If driver has not called pci_save_state(), i.e., state_saved is
>     still clear, we want the PCI core to call pci_save_state() during
>     suspend

Right.

> This sounds sensible to me.  It would be nice if there were a few more
> words about pci_save_state() and pci_restore_state() in
> Documentation/.
> 
> pci_save_state() isn't mentioned at all in Documentation/PCI

Right, it's documented in the Documentation/power directory. :)

The "state_saved" flag in struct pci_dev is an internal flag used by
the PCI core to keep track of whether a driver called pci_save_state()
on suspend.

The logic to update the flag is not modified by the patch, deliberately so
to avoid any breakage.  The flag is currently initialized to false in
pci_device_add() (even though it already is false due to kzalloc() zeroing
the memory).  I'm now later calling pci_save_state() in pci_bus_add_device(),
which sets the flag to true.  To preserve the existing logic, I am resetting
the flag to false again.

The only change made by the patch is to not invalidate the saved state
upon pci_restore_state() and thus allow re-using it for error recovery.
The patch seeks to avoid changing the behavior of suspend/resume.
I wanted to keep this minimal, non-intrusive and as low risk as possible.

Thanks,

Lukas

  reply	other threads:[~2025-11-13  9:38 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-12 13:25 [PATCH 0/2] PCI: Universal error recoverability of devices Lukas Wunner
2025-10-12 13:25 ` [PATCH 1/2] PCI: Ensure error recoverability at all times Lukas Wunner
2025-11-12 22:38   ` Bjorn Helgaas
2025-11-13  9:38     ` Lukas Wunner [this message]
2025-11-13 16:15       ` Bjorn Helgaas
2025-11-14 18:58         ` Lukas Wunner
2025-11-14 23:39           ` Bjorn Helgaas
2025-11-19 10:02             ` Lukas Wunner
2025-11-21 17:40         ` Lukas Wunner
2025-11-24 22:11           ` Bjorn Helgaas
2025-11-13 20:49   ` Rafael J. Wysocki
2025-11-13 21:03     ` Rafael J. Wysocki
2025-10-12 13:25 ` [PATCH 2/2] treewide: Drop pci_save_state() after pci_restore_state() Lukas Wunner
2025-11-05 14:22   ` Dave Jiang
2025-11-05 14:33   ` Giovanni Cabiddu
2025-11-24 23:13   ` Bjorn Helgaas
2025-11-14 23:45 ` [PATCH 0/2] PCI: Universal error recoverability of devices Bjorn Helgaas
2025-11-19  9:26   ` Lukas Wunner
2025-11-24 21:42     ` Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aRWnAd-PZuHMqBwd@wunner.de \
    --to=lukas@wunner.de \
    --cc=James.Bottomley@hansenpartnership.com \
    --cc=alek.du@intel.com \
    --cc=alifm@linux.ibm.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=bblock@linux.ibm.com \
    --cc=dave.jiang@intel.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=giovanni.cabiddu@intel.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=helgaas@kernel.org \
    --cc=jirislaby@kernel.org \
    --cc=kuba@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mahesh@linux.ibm.com \
    --cc=martin.petersen@oracle.com \
    --cc=oohall@gmail.com \
    --cc=pabeni@redhat.com \
    --cc=qat-linux@intel.com \
    --cc=rafael@kernel.org \
    --cc=riana.tauro@intel.com \
    --cc=schnelle@linux.ibm.com \
    --cc=sean.c.dardis@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.