From: Peter Wu <peter@lekensteyn.nl>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>,
Bjorn Helgaas <bhelgaas@google.com>,
"Rafael J. Wysocki" <rjw@rjwysocki.net>,
Lukas Wunner <lukas@wunner.de>,
linux-pci@vger.kernel.org, linux-pm@vger.kernel.org,
Valdis Kletnieks <valdis.kletnieks@vt.edu>,
Dave Airlie <airlied@gmail.com>
Subject: Re: [PATCH] PCI: Power on bridges before scanning new devices
Date: Wed, 25 May 2016 01:46:08 +0200 [thread overview]
Message-ID: <20160524234608.GA1093@al> (raw)
In-Reply-To: <20160524163848.GB30762@localhost>
On Tue, May 24, 2016 at 11:38:48AM -0500, Bjorn Helgaas wrote:
> On Tue, May 24, 2016 at 04:27:44PM +0200, Peter Wu wrote:
> > On Tue, May 24, 2016 at 03:53:23PM +0300, Mika Westerberg wrote:
> > > On Tue, May 24, 2016 at 07:23:57AM -0500, Bjorn Helgaas wrote:
> > > > On Mon, May 23, 2016 at 04:50:15PM -0500, Bjorn Helgaas wrote:
> > > > > [+cc Valdis, Dave]
> > > > >
> > > > > On Mon, May 23, 2016 at 03:00:42PM -0500, Bjorn Helgaas wrote:
> > > > > > On Mon, May 23, 2016 at 11:20:48AM +0300, Mika Westerberg wrote:
> > > > > > > When a PCI device is removed through sysfs interface the upstream bridge
> > > > > > > (PCIe port) can be runtime suspended if it was the last device on that bus.
> > > > > > > Now, if the bridge is in D3 we cannot find devices below the bridge
> > > > > > > anymore. For example following fails to find the removed device again:
> > > > > > >
> > > > > > > # echo 1 > /sys/bus/pci/devices/0000:00:01.0/0000:01:00.0/remove
> > > > > > > # echo 1 > /sys/bus/pci/devices/0000:00:01.0/rescan
> > > > > > >
> > > > > > > Where 0000:00:01.0 is the bridge device.
> > > > > > >
> > > > > > > In order to be able to rescan devices below the bridge add
> > > > > > > pm_runtime_get_sync()/pm_runtime_put() calls to pci_scan_bridge(). This
> > > > > > > should keep bridges powered on while their children devices are being
> > > > > > > scanned.
> > > > > > >
> > > > > > > Reported-by: Peter Wu <peter@lekensteyn.nl>
> > > > > > > Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> > > > > >
> > > > > > This looks like basically the same idea as "ACPI / hotplug / PCI:
> > > > > > Runtime resume bridge before rescan".
> > > > > >
> > > > > > The hotplug_event() path modified by that patch eventually calls
> > > > > > pci_scan_bridge():
> > > > > >
> > > > > > hotplug_event
> > > > > > enable_slot
> > > > > > pci_scan_bridge
> > > > > >
> > > > > > so this patch looks a little more general. Does it make "ACPI /
> > > > > > hotplug / PCI: Runtime resume bridge before rescan" unnecessary?
> > > > > > Can I just replace that patch with this one?
> > > > >
> > > > > I speculatively replaced "ACPI / hotplug / PCI: Runtime resume bridge
> > > > > before rescan" with this one and pushed the result to
> > > > >
> > > > > https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/pm
> > > > >
> > > > > Please take a look, test it, and let me know if I need to add the ACPI
> > > > > patch back.
> > > > >
> > > > > This branch also includes the fix for the lockdep splat reported by
> > > > > Valdis. This is what I hope to get into v4.7-rc1.
> > > >
> > > > Ping? I'd like to ask Linus to pull this pci/pm branch before v4.7-rc1.
> > > > It currently has these changes:
> > > >
> > > > 8b71f5652eea PCI: Add runtime PM support for PCIe ports
> > > > af81f0fa638b PCI: Power on bridges before scanning new devices
> > > > 9741a01c9f55 PCI: Put PCIe ports into D3 during suspend
> > > > b3a63ff7baf1 PCI: Don't clear d3cold_allowed for PCIe ports
> > >
> > > Looks good to me. I've also tested those here and seems to work fine.
> >
> > I have tested these patches for some time now on top of v4.6 (just
> > dropped the ACPI hotplug patch and re-tested just to be sure) and it
> > works for nouveau, but only if that one is patched to avoid calling the
> > device-specific Optimus method.
>
> I assume this nouveau issue is a driver problem unrelated to the
> pci/pm changes, and it wouldn't make any difference if I included the
> ACPI hotplug patch. Right?
This is a nouveau-specific driver problem that is triggered by the
adding runtime PM support for PCIe ports.
> > Without that patch (WIP below, I plan to
> > rebase it on a refactoring patch), the nvidia card stays disabled even
> > after the bridge returns into D0:
> >
> > nouveau 0000:01:00.0: power state changed by ACPI to D0
> > nouveau 0000:01:00.0: Refused to change power state, currently in D3
> > nouveau 0000:01:00.0: restoring config space at offset 0x3c (was 0xffffffff, writing 0x1ff)
> > nouveau 0000:01:00.0: restoring config space at offset 0x38 (was 0xffffffff, writing 0x0)
> > nouveau 0000:01:00.0: restoring config space at offset 0x34 (was 0xffffffff, writing 0x60)
> > nouveau 0000:01:00.0: restoring config space at offset 0x30 (was 0xffffffff, writing 0xdf000000)
> > ...
> > INFO: rcu_sched self-detected stall on CPU
> >
> > A workaround (until nouveau is fixed) is to disable runtime PM on the
> > bridge (or on nouveau with nouveau.runpm=0):
> >
> > # echo on > /sys/bus/pci/devices/0000:00:01.0/power/control
> >
> > This was tested on a Clevo P651RA laptop (with acpi_osi="!Windows 2013",
> > there is a weird PCIe PM issue for which I will fill a report later).
>
> This seems like a pretty significant problem: if we enable runtime PM
> on bridges by default, and Nvidia cards stay disabled when the bridge
> returns to D0, that sounds like a bad regression. I don't think I
> could ask Linus to pull these changes with a known issue like that.
> Or am I misunderstanding something?
That would indeed be a regression, please see this nouveau patch series
for a suggestion:
https://lists.freedesktop.org/archives/nouveau/2016-May/025116.html
By the way, there is also currently a RPM refcounting issue in nouveau
(possibly fixed by
https://lists.freedesktop.org/archives/nouveau/2016-May/025107.html)
which resulted in the following problem:
1. Initially RPM refcount is 1 when no driver is loaded for a device.
2. Load nouveau which will RPM suspend the device. refcount becomes 0.
3. Unload nouveau (bug: refcount stays 0).
4. As a result, the PCIe ports are runtime suspended (should probably
not happen as there is no bound driver).
5. When nouveau is loaded again, the power resources are enabled again,
*but* it does not put the device into D0 (it expected that this was
already the case). (with ACPI debugging on I see that _PS0 is not
called).
6. As a result, reading registers failed, "unknown chipset (ffffffff)"
is printed to dmesg and the probe fails.
(Maybe the PCI core should have a WARN when the RPM is "auto" and the
refcount drops below 1 after remove is called with no driver left?)
Kind regards,
Peter Wu
> > ---
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> > index cdf5227..531d6be 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> > @@ -45,6 +45,7 @@
> > static struct nouveau_dsm_priv {
> > bool dsm_detected;
> > bool optimus_detected;
> > + bool use_pr3;
> > acpi_handle dhandle;
> > acpi_handle rom_handle;
> > } nouveau_dsm_priv;
> > @@ -246,6 +247,28 @@ static int nouveau_dsm_pci_probe(struct pci_dev *pdev)
> > return retval;
> > }
> >
> > +/* Windows 8/8.1/10 do not use DSM to put the device in D3cold state,
> > + * instead it disables power resources on the parent PCIe port device. */
> > +static bool nouveau_check_pr3(struct pci_dev *dis_dev)
> > +{
> > + acpi_handle parent_handle;
> > + struct acpi_device *ad = NULL;
> > +
> > + if (ACPI_FAILURE(acpi_get_parent(nouveau_dsm_priv.dhandle,
> > + &parent_handle))) {
> > + pr_warn("Failed to obtain the parent device\n");
> > + return false;
> > + }
> > +
> > + acpi_bus_get_device(parent_handle, &ad);
> > + if (!ad) {
> > + pr_warn("Failed to obtain an ACPI device for handle\n");
> > + return false;
> > + }
> > +
> > + return ad->power.flags.power_resources;
> > +}
> > +
> > static bool nouveau_dsm_detect(void)
> > {
> > char acpi_method_name[255] = { 0 };
> > @@ -253,6 +276,7 @@ static bool nouveau_dsm_detect(void)
> > struct pci_dev *pdev = NULL;
> > int has_dsm = 0;
> > int has_optimus = 0;
> > + int has_pr3 = 0;
> > int vga_count = 0;
> > bool guid_valid;
> > int retval;
> > @@ -271,8 +295,10 @@ static bool nouveau_dsm_detect(void)
> > retval = nouveau_dsm_pci_probe(pdev);
> > if (retval & NOUVEAU_DSM_HAS_MUX)
> > has_dsm |= 1;
> > - if (retval & NOUVEAU_DSM_HAS_OPT)
> > + if (retval & NOUVEAU_DSM_HAS_OPT) {
> > has_optimus = 1;
> > + has_pr3 |= nouveau_check_pr3(pdev);
> > + }
> > }
> >
> > while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_3D << 8, pdev)) != NULL) {
> > @@ -281,8 +307,10 @@ static bool nouveau_dsm_detect(void)
> > retval = nouveau_dsm_pci_probe(pdev);
> > if (retval & NOUVEAU_DSM_HAS_MUX)
> > has_dsm |= 1;
> > - if (retval & NOUVEAU_DSM_HAS_OPT)
> > + if (retval & NOUVEAU_DSM_HAS_OPT) {
> > has_optimus = 1;
> > + has_pr3 |= nouveau_check_pr3(pdev);
> > + }
> > }
> >
> > /* find the optimus DSM or the old v1 DSM */
> > @@ -292,6 +320,10 @@ static bool nouveau_dsm_detect(void)
> > printk(KERN_INFO "VGA switcheroo: detected Optimus DSM method %s handle\n",
> > acpi_method_name);
> > nouveau_dsm_priv.optimus_detected = true;
> > + if (has_pr3) {
> > + pr_info("detected PR3 support\n");
> > + nouveau_dsm_priv.use_pr3 = 1;
> > + }
> > ret = true;
> > } else if (vga_count == 2 && has_dsm && guid_valid) {
> > acpi_get_name(nouveau_dsm_priv.dhandle, ACPI_FULL_PATHNAME,
> > @@ -321,7 +353,7 @@ void nouveau_register_dsm_handler(void)
> > void nouveau_switcheroo_optimus_dsm(void)
> > {
> > u32 result = 0;
> > - if (!nouveau_dsm_priv.optimus_detected)
> > + if (!nouveau_dsm_priv.optimus_detected || nouveau_dsm_priv.use_pr3)
> > return;
> >
> > nouveau_optimus_dsm(nouveau_dsm_priv.dhandle, NOUVEAU_DSM_OPTIMUS_FLAGS,
next prev parent reply other threads:[~2016-05-24 23:46 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-05-18 17:14 Rescanning is broken with runtime PM for PCIe ports Peter Wu
2016-05-19 7:42 ` Mika Westerberg
2016-05-19 11:36 ` Mika Westerberg
2016-05-20 8:45 ` Peter Wu
2016-05-23 8:20 ` [PATCH] PCI: Power on bridges before scanning new devices Mika Westerberg
2016-05-23 20:00 ` Bjorn Helgaas
2016-05-23 21:50 ` Bjorn Helgaas
2016-05-24 12:23 ` Bjorn Helgaas
2016-05-24 12:52 ` Lukas Wunner
2016-05-24 12:53 ` Mika Westerberg
2016-05-24 14:27 ` Peter Wu
2016-05-24 15:06 ` Lukas Wunner
2016-05-24 16:38 ` Bjorn Helgaas
2016-05-24 23:46 ` Peter Wu [this message]
2016-05-24 16:28 ` Bjorn Helgaas
2016-05-25 15:04 ` [PATCH] PCI: Wait for 50ms after bridge is powered up Mika Westerberg
2016-05-25 20:44 ` Rafael J. Wysocki
2016-05-26 10:10 ` Lukas Wunner
2016-05-26 10:25 ` Mika Westerberg
2016-05-26 10:45 ` Lukas Wunner
2016-05-26 11:03 ` Mika Westerberg
2016-05-28 12:29 ` Rafael J. Wysocki
2016-05-30 9:33 ` Mika Westerberg
2016-05-30 14:44 ` Mika Westerberg
2016-05-30 15:19 ` Andreas Noever
2016-05-31 8:33 ` Mika Westerberg
2016-05-31 8:58 ` Mika Westerberg
2016-05-31 10:40 ` Lukas Wunner
2016-05-31 10:47 ` Mika Westerberg
2016-05-31 11:07 ` Lukas Wunner
2016-06-01 9:11 ` Mika Westerberg
2016-06-01 11:42 ` Lukas Wunner
2016-05-24 21:13 ` [PATCH] PCI: Power on bridges before scanning new devices Mika Westerberg
2016-05-25 0:03 ` Rafael J. Wysocki
2016-05-25 13:19 ` Mika Westerberg
2016-05-25 20:45 ` Rafael J. Wysocki
2016-05-26 8:16 ` Mika Westerberg
2016-05-28 12:21 ` Rafael J. Wysocki
2016-05-30 9:35 ` Mika Westerberg
2016-05-25 12:16 ` Lukas Wunner
2016-05-25 13:25 ` Mika Westerberg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160524234608.GA1093@al \
--to=peter@lekensteyn.nl \
--cc=airlied@gmail.com \
--cc=bhelgaas@google.com \
--cc=helgaas@kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=lukas@wunner.de \
--cc=mika.westerberg@linux.intel.com \
--cc=rjw@rjwysocki.net \
--cc=valdis.kletnieks@vt.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).