Re: [PATCH v4] PCI: Prevent power state transition of erroneous device

From: "Rafael J. Wysocki" <rafael@kernel.org>
To: Raag Jadav <raag.jadav@intel.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
	Mario Limonciello <superm1@kernel.org>,
	 Denis Benato <benato.denis96@gmail.com>,
	mahesh@linux.ibm.com, oohall@gmail.com,  bhelgaas@google.com,
	linux-pci@vger.kernel.org, linux-pm@vger.kernel.org,
	 linux-kernel@vger.kernel.org, ilpo.jarvinen@linux.intel.com,
	lukas@wunner.de,  aravind.iddamsetty@linux.intel.com,
	 "amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>,
	Alex Deucher <alexander.deucher@amd.com>
Subject: Re: [PATCH v4] PCI: Prevent power state transition of erroneous device
Date: Fri, 30 May 2025 19:49:26 +0200
Message-ID: <CAJZ5v0gjA2B4AnaYpfYpaNDo49k4LM2FGSrPFFuOCJ62bCMmkA@mail.gmail.com>
In-Reply-To: <aDnpfKvLwRZsKxhH@black.fi.intel.com>

On Fri, May 30, 2025 at 7:23 PM Raag Jadav <raag.jadav@intel.com> wrote:
>
> On Fri, May 23, 2025 at 05:23:10PM +0200, Rafael J. Wysocki wrote:
> > On Wed, May 21, 2025 at 1:27 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > > On Wed, May 21, 2025 at 10:54 AM Raag Jadav <raag.jadav@intel.com> wrote:
> > > > On Tue, May 20, 2025 at 01:56:28PM -0500, Mario Limonciello wrote:
> > > > > On 5/20/2025 1:42 PM, Raag Jadav wrote:
> > > > > > On Tue, May 20, 2025 at 12:39:12PM -0500, Mario Limonciello wrote:
> > > > > > > On 5/20/2025 12:22 PM, Denis Benato wrote:
> > > > > > > > On 5/20/25 17:49, Mario Limonciello wrote:
> > > > > > > > > On 5/20/2025 10:47 AM, Raag Jadav wrote:
> > > > > > > > > > On Tue, May 20, 2025 at 10:23:57AM -0500, Mario Limonciello wrote:
> > > > > > > > > > > On 5/20/2025 4:48 AM, Raag Jadav wrote:
> > > > > > > > > > > > On Mon, May 19, 2025 at 11:42:31PM +0200, Denis Benato wrote:
> > > > > > > > > > > > > On 5/19/25 12:41, Raag Jadav wrote:
> > > > > > > > > > > > > > On Mon, May 19, 2025 at 03:58:08PM +0530, Raag Jadav wrote:
> > > > > > > > > > > > > > > If error status is set on an AER capable device, most likely either the
> > > > > > > > > > > > > > > device recovery is in progress or has already failed. Neither case is
> > > > > > > > > > > > > > > well suited for a power state transition of the device, since this can
> > > > > > > > > > > > > > > lead to unpredictable consequences like resume failure or, in the worst
> > > > > > > > > > > > > > > case, loss of the device. Leave the device in its existing power state
> > > > > > > > > > > > > > > to avoid such issues.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> > > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > v2: Synchronize AER handling with PCI PM (Rafael)
> > > > > > > > > > > > > > > v3: Move pci_aer_in_progress() to pci_set_low_power_state() (Rafael)
> > > > > > > > > > > > > > >         Elaborate "why" (Bjorn)
> > > > > > > > > > > > > > > v4: Rely on error status instead of device status
> > > > > > > > > > > > > > >         Condense comment (Lukas)
> > > > > > > > > > > > > > Since pci_aer_in_progress() has changed, I've not included Rafael's tag, on
> > > > > > > > > > > > > > the understanding that this needs a revisit. If this was a mistake, please
> > > > > > > > > > > > > > let me know.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Denis, Mario, does this fix your issue?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > Hello,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Unfortunately no. I have prepared a dmesg, but had to trim the boot sequence because the log was a few kB too long for pastebin: https://pastebin.com/1uBEA1FL
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for the test. It seems there's no hotplug event this time around
> > > > > > > > > > > > and endpoint device is still intact without any PCI related failure.
> > > > > > > > > > > >
> > > > > > > > > > > > Also,
> > > > > > > > > > > >
> > > > > > > > > > > > amdgpu 0000:09:00.0: PCI PM: Suspend power state: D3hot
> > > > > > > > > > > >
> > > > > > > > > > > > Which means whatever you're facing is either not related to this patch,
> > > > > > > > > > > > or at best exposed some nasty side-effect that's not handled correctly
> > > > > > > > > > > > by the driver.
> > > > > > > > > > > >
> > > > > > > > > > > > I'd say amdgpu folks would be of better help for your case.
> > > > > > > > > > > >
> > > > > > > > > > > > Raag
> > > > > > > > > > >
> > > > > > > > > > > So according to the logs Denis shared with v4
> > > > > > > > > > > (https://pastebin.com/1uBEA1FL) the GPU should have been going to BOCO,
> > > > > > > > > > > which stands for "Bus Off, Chip Off".
> > > > > > > > > > >
> > > > > > > > > > > amdgpu 0000:09:00.0: amdgpu: Using BOCO for runtime pm
> > > > > > > > > > >
> > > > > > > > > > > If it's going to D3hot - that's not going to be BOCO, it should be going to
> > > > > > > > > > > D3cold.
> > > > > > > > > >
> > > > > > > > > > Yes, because upstream port is in D0 for some reason (might be this patch
> > > > > > > > > > but not sure) and so will be the root port.
> > > > > > > > > >
> > > > > > > > > > pcieport 0000:07:00.0: PCI PM: Suspend power state: D0
> > > > > > > > > > pcieport 0000:07:00.0: PCI PM: Skipped
> > > > > > > > > >
> > > > > > > > > > and my best guess is the driver is not able to cope with the lack of D3cold.
> > > > > > > > >
> > > > > > > > > Yes; if the driver is configured to expect BOCO (D3cold) and doesn't get it, chaos ensues.
> > > > > > > > >
> > > > > > > > > I guess let's double check the behavior with CONFIG_PCI_DEBUG to verify this patch is what is changing that upstream port behavior.
> > > > > > > >
> > > > > > > >
> > > > > > > > This is the very same exact kernel, minus the patch in question:  https://pastebin.com/rwMYgG7C
> > > > > > > >
> > > > > > > >
> > > > > > > > Both previous kernel and this one have CONFIG_PCI_DEBUG=y.
> > > > > > > >
> > > > > > > > Removed the initial bootup sequence to be able to use pastebin.
> > > > > > >
> > > > > > > Thanks - this confirms that the problem is the root port not going to D3.
> > > > > > > This new log shows:
> > > > > > >
> > > > > > > pcieport 0000:07:00.0: PCI PM: Suspend power state: D3hot
> > > > > > >
> > > > > > > So I feel we should fixate on solving that.
> > > > > >
> > > > > > Which means what you're looking for is error flag being set somewhere in
> > > > > > the hierarchy that is preventing suspend.
> > > > >
> > > > > Is the issue perhaps that this is now gated on both correctable and
> > > > > uncorrectable errors?
> > > > >
> > > > > Perhaps *correctable errors* should only emit a warning and
> > > > > *uncorrectable errors* be fatal?
> > > >
> > > > That'd be more or less in line with hiding the issue, and it can also race
> > > > with the err_handler callback if the driver has registered one.
> > > >
> > > > > > But regardless of it, my understanding is that root port suspend depends
> > > > > > on a lot of factors (now error flags being one of them with this patch)
> > > > > > and the endpoint driver can't possibly enforce or guarantee it - the best
> > > > > > it can do is try.
> > > > > >
> > > > > > What's probably needed is D3cold failure handling on driver side, but I'm
> > > > > > no PCI PM expert and perhaps Rafael can comment on it.
> > > > > >
> > > > > > Raag
> > > > >
> > > > > From the driver perspective it does have expectations that the parts outside
> > > > > the driver did the right thing.  If the driver was expecting the root port
> > > > > to be powered down at suspend and it wasn't, there are hardware components
> > > > > that didn't power cycle and that's what we're seeing here.
> > > >
> > > > Which means the expectation set by the driver is the opposite of the
> > > > purpose of this patch, and it's going to fail if any kind of error is
> > > > detected under root port during suspend.
> > >
> > > And IMV this driver's expectation is questionable at least.
> > >
> > > There is no promise whatsoever that the device will always be put into
> > > D3cold during system suspend.
> >
> > For instance, user space may disable D3cold for any PCI device via the
> > d3cold_allowed attribute in sysfs.
> >
> > If the driver cannot handle this, it needs to be fixed.
>
> Thanks for confirming. So should we consider this patch to be valid
> and worth moving forward?

It doesn't do anything that would be invalid in principle IMV.

You need to consider one more thing, though: It may be necessary to
power-cycle the device in order to kick it out of the erroneous state
and the patch effectively blocks this if I'm not mistaken.

But admittedly I'm not sure if this really matters.
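
To make the trade-off above concrete, here is a minimal user-space sketch
in C. It is neither the patch nor kernel code: the struct, the enum and the
function names (model_pci_dev, model_aer_error_pending(),
model_suspend_state()) are hypothetical and exist only for illustration.
It assumes, as the discussion suggests but does not confirm, that the check
covers both correctable and uncorrectable AER status, and that a device
with d3cold_allowed cleared falls back to D3hot.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model types; these are NOT the kernel's definitions. */
enum model_pci_power { MODEL_D0, MODEL_D3HOT, MODEL_D3COLD };

struct model_pci_dev {
	uint32_t aer_uncor_status;  /* models the AER Uncorrectable Error Status register */
	uint32_t aer_cor_status;    /* models the AER Correctable Error Status register */
	bool d3cold_allowed;        /* models the d3cold_allowed sysfs attribute */
	enum model_pci_power state; /* current power state */
};

/* Assumption: any set bit in either status register counts as an error. */
static bool model_aer_error_pending(const struct model_pci_dev *dev)
{
	return dev->aer_uncor_status != 0 || dev->aer_cor_status != 0;
}

/*
 * Pick the state a suspend attempt would end up in under this model.
 * An error pins the device in its current state, which also rules out
 * the D3cold power cycle that might otherwise clear the error.
 */
static enum model_pci_power
model_suspend_state(const struct model_pci_dev *dev, enum model_pci_power wanted)
{
	if (model_aer_error_pending(dev))
		return dev->state;

	/* User space may veto D3cold, so a driver cannot rely on getting it. */
	if (wanted == MODEL_D3COLD && !dev->d3cold_allowed)
		return MODEL_D3HOT;

	return wanted;
}
```

In this model a device with any error bit set never leaves its current
state, so a driver asking for D3cold has to cope with not getting it, both
in the error case and when d3cold_allowed has been cleared.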
