linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Lukas Wunner <lukas@wunner.de>
To: Mario Limonciello <superm1@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	"open list:PCI SUBSYSTEM" <linux-pci@vger.kernel.org>,
	linux-pm@vger.kernel.org,
	"Rafael J . Wysocki" <rjw@rjwysocki.net>,
	Mario Limonciello <mario.limonciello@amd.com>
Subject: Re: [PATCH v3 2/2] PCI: Fix runtime PM usage count underflow on device unplug
Date: Sun, 22 Jun 2025 06:43:31 +0200	[thread overview]
Message-ID: <aFeJ83O9PRUrM2Ir@wunner.de> (raw)
In-Reply-To: <8d4d98b6-fec5-466f-bd2c-059d702c7860@kernel.org>

On Sat, Jun 21, 2025 at 02:56:08PM -0500, Mario Limonciello wrote:
> On 6/21/25 2:05 PM, Lukas Wunner wrote:
> > In the dmesg output attached to...
> > 
> > https://bugzilla.kernel.org/show_bug.cgi?id=220216
> > 
> > ... the device exhibiting the refcount underflow is a PCIe port.
> > Are you also seeing this on a PCIe port or is it a different device?
> 
> The device with the underflow is the disconnected PCIe bridge.
> 
> In my case it was this bridge that was downstream.

Okay, in both cases the refcount underflow occurs on a PCIe port.
So it seems likely the gratuitous refcount decrement is in portdrv.c
or one of the port services drivers.

Your patch changes the code path for *all* PCI devices.
Not just PCIe ports.  Hence it's likely not the right fix.

It may fix the issue on this particular PCIe port but
I strongly suspect it'll leak a runtime PM ref on all other devices.


> > So the refcount decrement happens in pcie_portdrv_probe() and
> > the refcount increment happens in pcie_portdrv_remove().
> > Both times it's conditional on pci_bridge_d3_possible().
> > Does that return a different value on probe versus remove?

Could you please answer this?


> > Does any of the port service drivers decrement the refcount
> > once too often?  I've just looked through pciehp but cannot
> > find anything out of the ordinary.
> > 
> > Looking through recent changes, 002bf2fbc00e and bca84a7b93fd
> > look like potential candidates causing a regression, but the
> > former is for AER (which isn't used in the dmesg attached to
> > the bugzilla) and the latter touches suspend on system sleep,
> > not runtime suspend.
> > 
> > Can you maybe instrument the pm_runtime_{get,put}*() functions
> > with a printk() and/or dump_stack() to see where a gratuitous
> > refcount decrement occurs?
> 
> That's exactly what I did to conclude this call was an extra one.
> 
> Here's the drop to 0:

The drop to 0 is uninteresting.  You need to record *all*
refcount increments/decrements so that we can see where the
gratuitous one occurs.  It happens earlier than the drop to 0.

However, please first check whether the pci_bridge_d3_possible()
return value changes on probe versus remove of the PCIe port
in question.  If it does, then that's the root cause and there's
no need to look any further.

Thanks,

Lukas

  reply	other threads:[~2025-06-22  4:43 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-20  2:55 [PATCH v3 0/2] Don't make noise about disconnected USB4 devices Mario Limonciello
2025-06-20  2:55 ` [PATCH v3 1/2] PCI/PM: Skip resuming to D0 if disconnected Mario Limonciello
2025-06-23 17:48   ` Lukas Wunner
2025-06-20  2:55 ` [PATCH v3 2/2] PCI: Fix runtime PM usage count underflow on device unplug Mario Limonciello
2025-06-21 19:05   ` Lukas Wunner
2025-06-21 19:56     ` Mario Limonciello
2025-06-22  4:43       ` Lukas Wunner [this message]
2025-06-22 18:39         ` Mario Limonciello
2025-06-23  1:47           ` Mario Limonciello
2025-06-23  6:53             ` Lukas Wunner
2025-06-23  6:43           ` Lukas Wunner
2025-06-23  7:37             ` Lukas Wunner
2025-06-23 10:05               ` Lukas Wunner
2025-06-23 10:11                 ` Rafael J. Wysocki
2025-06-23 11:37                   ` Mario Limonciello
2025-06-23 12:19                     ` Lukas Wunner
2025-06-23 12:45                       ` Mario Limonciello
2025-06-23 17:23                     ` Lukas Wunner
2025-06-23 17:25                       ` Mario Limonciello

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aFeJ83O9PRUrM2Ir@wunner.de \
    --to=lukas@wunner.de \
    --cc=bhelgaas@google.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mario.limonciello@amd.com \
    --cc=rjw@rjwysocki.net \
    --cc=superm1@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).