All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
To: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Bjorn Helgaas <helgaas@kernel.org>,
	linux-pci@vger.kernel.org, linux-pm@vger.kernel.org,
	Mahesh J Salgaonkar <mahesh@linux.ibm.com>,
	Oliver O'Halloran <oohall@gmail.com>,
	Kuppuswamy Sathyanarayanan
	<sathyanarayanan.kuppuswamy@linux.intel.com>
Subject: Re: [RFC] PCI/AER: Block runtime suspend when handling errors
Date: Tue, 6 Feb 2024 16:37:36 +0100	[thread overview]
Message-ID: <ZcJSQFF6XCgPjwx/@linux.intel.com> (raw)
In-Reply-To: <CAJZ5v0hLXS1EJZgUPn_i6Sp1RNJ2tH1oJ6AKvAQAM4Um_bwHPA@mail.gmail.com>

On Mon, Feb 05, 2024 at 10:00:45PM +0100, Rafael J. Wysocki wrote:
> On Mon, Feb 5, 2024 at 9:11 PM Stanislaw Gruszka
> <stanislaw.gruszka@linux.intel.com> wrote:
> >
> > On Thu, Feb 01, 2024 at 03:10:35PM +0100, Rafael J. Wysocki wrote:
> > > On Thu, Feb 1, 2024 at 10:36 AM Stanislaw Gruszka
> > > <stanislaw.gruszka@linux.intel.com> wrote:
> > > >
> > > > On Mon, Jan 29, 2024 at 06:14:36PM -0600, Bjorn Helgaas wrote:
> > > > > On Tue, Jan 23, 2024 at 10:00:16AM +0100, Stanislaw Gruszka wrote:
> > > > > > PM runtime can be done simultaneously with AER error handling.
> > > > > > Avoid that by using pm_runtime_get_sync() just after pci_dev_get()
> > > > > > and pm_runtime_put() just before pci_dev_put() in AER recovery
> > > > > > procedures.
> > > > >
> > > > > I guess there must be a general rule here, like "PCI core must use
> > > > > pm_runtime_get_sync() whenever touching the device asynchronously,
> > > > > i.e., when it's doing something unrelated to a call from the driver"?
> > > >
> > > > Clear rules would be nice, that's for sure.
> > > >
> > > > > Probably would apply to all subsystem cores, not just PCI.
> > > >
> > > > If they have something similar like AER.
> > > >
> > > > > > I'm not sure about DPC case since I do not see get/put there. It
> > > > > > just call pci_do_recovery() from threaded irq dcd_handler().
> > > > > > I think pm_runtime* should be added to this handler as well.
> > > > >
> > > > > s/dcd_handler/dpc_handler/
> > > > >
> > > > > I'm guessing the "threaded" part really doesn't matter; just the fact
> > > > > that this is in response to an interrupt, not something directly
> > > > > called by a driver?
> > > >
> > > > Yes. I added "threaded" just to emphasis that it's safe to add sleeping
> > > > functions there. In context of possible solution, not related to
> > > > the problem itself.
> > > >
> > > > > > pm_runtime_get_sync() will increase dev->power.usage_count counter to
> > > > > > prevent any rpm actives. When there is suspending pending, it will wait
> > > > > > for it and do the rpm resume. Not sure if that problem, on my testing
> > > > > > I did not encounter issues with that.
> > > > >
> > > > > Sorry, I didn't catch your meaning here.
> > > > I tired to write two things:
> > > >
> > > > First, pm_runtime_get_sync() after exit prevents possibility of
> > > > runtime suspend, so we are sure device will not be powered off
> > > >
> > > > Second, if during pm_runtime_get_sync(), there is pending attempt
> > > > to suspend device, it will be synchronized and device will
> > > > be resumed.
> > >
> > > Not exactly.  pm_runtime_get_sync() will resume the device
> > > synchronously no matter what.
> > >
> > > > This can be problematic as device is in error state.
> > >
> > > If this is a real possibility (I mean, device in a low-power state and
> > > in an error state at the same time), it would be better to call
> > > __pm_runtime_disable(dev, false) instead of pm_runtime_get_sync(), as
> > > that would prevent runtime PM from changing the device state.
> >
> > __pm_runtime_disable(dev, false) does not work as reliable in my
> > test as pm_runtime_get_sync(), the
> >
> > igc 0000:02:00.0: Unable to change power state from D3hot to D0, device inaccessible
> >
> > message disappears, but sill have:
> >
> > igc 0000:02:00.0: not ready 65535ms after bus reset; giving up
> > pcieport 0000:00:1c.2: AER: Root Port link has been reset (-25)
> > pcieport 0000:00:1c.2: AER: subordinate device reset failed
> > pcieport 0000:00:1c.2: AER: device recovery fail
> 
> But what exactly do you do?
> 
> (1) __pm_runtime_disable(dev, false)
> (2) Check power state
>      (a) If D0 (and device runtime-active), proceed
>      (b) If > D0, remove power (if possible) and put into D0
> 
> or something else?

I just did point (1), did not check power state (2).
I tested below patch with replaced:

  pm_runtime_get_sync -> __pm_runtime_disable(false)
  pm_runtime_put -> pm_runtime_enable()

I could try to test with (1)+(2), but now sure how do do step (2b),
by:

pci_set_power_state(D3cold)
pci_set_power_state(D0)

?

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 59c90d04a609..705893b5f7b0 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -13,6 +13,7 @@
 #define dev_fmt(fmt) "AER: " fmt
 
 #include <linux/pci.h>
+#include <linux/pm_runtime.h>
 #include <linux/module.h>
 #include <linux/kernel.h>
 #include <linux/errno.h>
@@ -85,6 +86,18 @@ static int report_error_detected(struct pci_dev *dev,
 	return 0;
 }
 
+static int pci_pm_runtime_get_sync(struct pci_dev *pdev, void *data)
+{
+	pm_runtime_get_sync(&pdev->dev);
+	return 0;
+}
+
+static int pci_pm_runtime_put(struct pci_dev *pdev, void *data)
+{
+	pm_runtime_put(&pdev->dev);
+	return 0;
+}
+
 static int report_frozen_detected(struct pci_dev *dev, void *data)
 {
 	return report_error_detected(dev, pci_channel_io_frozen, data);
@@ -207,6 +220,8 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
 	else
 		bridge = pci_upstream_bridge(dev);
 
+	pci_walk_bridge(bridge, pci_pm_runtime_get_sync, NULL);
+
 	pci_dbg(bridge, "broadcast error_detected message\n");
 	if (state == pci_channel_io_frozen) {
 		pci_walk_bridge(bridge, report_frozen_detected, &status);
@@ -251,10 +266,15 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
 		pcie_clear_device_status(dev);
 		pci_aer_clear_nonfatal_status(dev);
 	}
+
+	pci_walk_bridge(bridge, pci_pm_runtime_put, NULL);
+
 	pci_info(bridge, "device recovery successful\n");
 	return status;
 
 failed:
+	pci_walk_bridge(bridge, pci_pm_runtime_put, NULL);
+
 	pci_uevent_ers(bridge, PCI_ERS_RESULT_DISCONNECT);
 
 	/* TODO: Should kernel panic here? */

  reply	other threads:[~2024-02-06 15:37 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-23  9:00 [RFC] PCI/AER: Block runtime suspend when handling errors Stanislaw Gruszka
2024-01-30  0:14 ` Bjorn Helgaas
2024-02-01  9:36   ` Stanislaw Gruszka
2024-02-01 14:10     ` Rafael J. Wysocki
2024-02-05 19:19       ` Stanislaw Gruszka
2024-02-05 21:00         ` Rafael J. Wysocki
2024-02-06 15:37           ` Stanislaw Gruszka [this message]
2024-02-08 13:49             ` Stanislaw Gruszka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZcJSQFF6XCgPjwx/@linux.intel.com \
    --to=stanislaw.gruszka@linux.intel.com \
    --cc=helgaas@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mahesh@linux.ibm.com \
    --cc=oohall@gmail.com \
    --cc=rafael@kernel.org \
    --cc=sathyanarayanan.kuppuswamy@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.