From: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
To: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Bjorn Helgaas <helgaas@kernel.org>,
linux-pci@vger.kernel.org, linux-pm@vger.kernel.org,
Mahesh J Salgaonkar <mahesh@linux.ibm.com>,
Oliver O'Halloran <oohall@gmail.com>,
Kuppuswamy Sathyanarayanan
<sathyanarayanan.kuppuswamy@linux.intel.com>
Subject: Re: [RFC] PCI/AER: Block runtime suspend when handling errors
Date: Thu, 8 Feb 2024 14:49:47 +0100 [thread overview]
Message-ID: <ZcTb+32aciVfGKxH@linux.intel.com> (raw)
In-Reply-To: <ZcJSQFF6XCgPjwx/@linux.intel.com>
On Tue, Feb 06, 2024 at 04:37:36PM +0100, Stanislaw Gruszka wrote:
> > > > If this is a real possibility (I mean, device in a low-power state and
> > > > in an error state at the same time), it would be better to call
> > > > __pm_runtime_disable(dev, false) instead of pm_runtime_get_sync(), as
> > > > that would prevent runtime PM from changing the device state.
> > >
> > > __pm_runtime_disable(dev, false) does not work as reliable in my
> > > test as pm_runtime_get_sync(), the
> > >
> > > igc 0000:02:00.0: Unable to change power state from D3hot to D0, device inaccessible
> > >
> > > message disappears, but sill have:
> > >
> > > igc 0000:02:00.0: not ready 65535ms after bus reset; giving up
> > > pcieport 0000:00:1c.2: AER: Root Port link has been reset (-25)
> > > pcieport 0000:00:1c.2: AER: subordinate device reset failed
> > > pcieport 0000:00:1c.2: AER: device recovery fail
> >
> > But what exactly do you do?
> >
> > (1) __pm_runtime_disable(dev, false)
> > (2) Check power state
> > (a) If D0 (and device runtime-active), proceed
> > (b) If > D0, remove power (if possible) and put into D0
> >
> > or something else?
>
> I just did point (1), did not check power state (2).
> I tested below patch with replaced:
>
> pm_runtime_get_sync -> __pm_runtime_disable(false)
> pm_runtime_put -> pm_runtime_enable()
>
> I could try to test with (1)+(2), but now sure how do do step (2b),
> by:
>
> pci_set_power_state(D3cold)
> pci_set_power_state(D0)
The problematic case is indeed when after __pm_runtime_disable(), device
is in D3hot state. In such case we need to do the same operations as
pci_pm_runtime_resume() does, otherwise AER code is not able to work.
I think, just doing pm_runtime_get_sync() is better.
While I'm able to reproduce D3hot & error state using aer_inject
on the same smae device, more practical case is recovery running on all
devices connected to a port (else case in pcie_do_recovery
type == PCIE_EXP_TYPE* check). On such case some devices can be suspend,
and from AER code comments I conclude they have to be reset.
> +static int pci_pm_runtime_put(struct pci_dev *pdev, void *data)
> +{
> + pm_runtime_put(&pdev->dev);
> + return 0;
> +}
> +
<snip>
> +
> + pci_walk_bridge(bridge, pci_pm_runtime_put, NULL);
This can happen after driver is unbind from device. I had concern about
that, but after drivers/base/ code examination, seems to be fine do
do pm_runtime_put() after unbind.
Regards
Stanislaw
prev parent reply other threads:[~2024-02-08 13:49 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-23 9:00 [RFC] PCI/AER: Block runtime suspend when handling errors Stanislaw Gruszka
2024-01-30 0:14 ` Bjorn Helgaas
2024-02-01 9:36 ` Stanislaw Gruszka
2024-02-01 14:10 ` Rafael J. Wysocki
2024-02-05 19:19 ` Stanislaw Gruszka
2024-02-05 21:00 ` Rafael J. Wysocki
2024-02-06 15:37 ` Stanislaw Gruszka
2024-02-08 13:49 ` Stanislaw Gruszka [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZcTb+32aciVfGKxH@linux.intel.com \
--to=stanislaw.gruszka@linux.intel.com \
--cc=helgaas@kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=mahesh@linux.ibm.com \
--cc=oohall@gmail.com \
--cc=rafael@kernel.org \
--cc=sathyanarayanan.kuppuswamy@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.