public inbox for linux-pci@vger.kernel.org
 help / color / mirror / Atom feed
From: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
To: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Bjorn Helgaas <helgaas@kernel.org>,
	linux-pci@vger.kernel.org, linux-pm@vger.kernel.org,
	Mahesh J Salgaonkar <mahesh@linux.ibm.com>,
	Oliver O'Halloran <oohall@gmail.com>,
	Kuppuswamy Sathyanarayanan
	<sathyanarayanan.kuppuswamy@linux.intel.com>
Subject: Re: [RFC] PCI/AER: Block runtime suspend when handling errors
Date: Thu, 8 Feb 2024 14:49:47 +0100	[thread overview]
Message-ID: <ZcTb+32aciVfGKxH@linux.intel.com> (raw)
In-Reply-To: <ZcJSQFF6XCgPjwx/@linux.intel.com>

On Tue, Feb 06, 2024 at 04:37:36PM +0100, Stanislaw Gruszka wrote:
> > > > If this is a real possibility (I mean, device in a low-power state and
> > > > in an error state at the same time), it would be better to call
> > > > __pm_runtime_disable(dev, false) instead of pm_runtime_get_sync(), as
> > > > that would prevent runtime PM from changing the device state.
> > >
> > > __pm_runtime_disable(dev, false) does not work as reliable in my
> > > test as pm_runtime_get_sync(), the
> > >
> > > igc 0000:02:00.0: Unable to change power state from D3hot to D0, device inaccessible
> > >
> > > message disappears, but sill have:
> > >
> > > igc 0000:02:00.0: not ready 65535ms after bus reset; giving up
> > > pcieport 0000:00:1c.2: AER: Root Port link has been reset (-25)
> > > pcieport 0000:00:1c.2: AER: subordinate device reset failed
> > > pcieport 0000:00:1c.2: AER: device recovery fail
> > 
> > But what exactly do you do?
> > 
> > (1) __pm_runtime_disable(dev, false)
> > (2) Check power state
> >      (a) If D0 (and device runtime-active), proceed
> >      (b) If > D0, remove power (if possible) and put into D0
> > 
> > or something else?
> 
> I just did point (1), did not check power state (2).
> I tested below patch with replaced:
> 
>   pm_runtime_get_sync -> __pm_runtime_disable(false)
>   pm_runtime_put -> pm_runtime_enable()
> 
> I could try to test with (1)+(2), but now sure how do do step (2b),
> by:
> 
> pci_set_power_state(D3cold)
> pci_set_power_state(D0)

The problematic case is indeed when after __pm_runtime_disable(), device
is in D3hot state. In such case we need to do the same operations as
pci_pm_runtime_resume() does, otherwise AER code is not able to work.
I think, just doing pm_runtime_get_sync() is better.

While I'm able to reproduce D3hot & error state using aer_inject
on the same smae device, more practical case is recovery running on all
devices connected to a port (else case in pcie_do_recovery 
type == PCIE_EXP_TYPE* check). On such case some devices can be suspend,
and from AER code comments I conclude they have to be reset.

> +static int pci_pm_runtime_put(struct pci_dev *pdev, void *data)
> +{
> +	pm_runtime_put(&pdev->dev);
> +	return 0;
> +}
> +
<snip>
> +
> +	pci_walk_bridge(bridge, pci_pm_runtime_put, NULL);

This can happen after driver is unbind from device. I had concern about
that, but after drivers/base/ code examination, seems to be fine do
do pm_runtime_put() after unbind.

Regards
Stanislaw

      reply	other threads:[~2024-02-08 13:49 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-23  9:00 [RFC] PCI/AER: Block runtime suspend when handling errors Stanislaw Gruszka
2024-01-30  0:14 ` Bjorn Helgaas
2024-02-01  9:36   ` Stanislaw Gruszka
2024-02-01 14:10     ` Rafael J. Wysocki
2024-02-05 19:19       ` Stanislaw Gruszka
2024-02-05 21:00         ` Rafael J. Wysocki
2024-02-06 15:37           ` Stanislaw Gruszka
2024-02-08 13:49             ` Stanislaw Gruszka [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZcTb+32aciVfGKxH@linux.intel.com \
    --to=stanislaw.gruszka@linux.intel.com \
    --cc=helgaas@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mahesh@linux.ibm.com \
    --cc=oohall@gmail.com \
    --cc=rafael@kernel.org \
    --cc=sathyanarayanan.kuppuswamy@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox