* Warning due to "ALSA: hda: intel: More comprehensive PM runtime setup for controller driver"
@ 2021-11-18 20:33 Heiner Kallweit
2021-11-18 21:28 ` Takashi Iwai
0 siblings, 1 reply; 7+ messages in thread
From: Heiner Kallweit @ 2021-11-18 20:33 UTC (permalink / raw)
To: Takashi Iwai; +Cc: alsa-devel, Linux PM
I get the following warning caused by 4f66a9ef37d3 ("ALSA: hda: intel: More
comprehensive PM runtime setup for controller driver"):
snd_hda_intel 0000:00:1f.3: Unbalanced pm_runtime_enable!
Not sure how this patch was tested because the warning is obvious.
The patch doesn't consider what the PCI sub-system does with regard to
RPM. Have a look at pci_pm_init().
I'd understand to add the call to pm_runtime_dont_use_autosuspend(),
but for all other added calls I see no justification.
If being unsure about when to use which RPM call best involve
linux-pm@vger.kernel.org.
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: Warning due to "ALSA: hda: intel: More comprehensive PM runtime setup for controller driver" 2021-11-18 20:33 Warning due to "ALSA: hda: intel: More comprehensive PM runtime setup for controller driver" Heiner Kallweit @ 2021-11-18 21:28 ` Takashi Iwai 2021-11-18 22:13 ` Heiner Kallweit 0 siblings, 1 reply; 7+ messages in thread From: Takashi Iwai @ 2021-11-18 21:28 UTC (permalink / raw) To: Heiner Kallweit; +Cc: alsa-devel, Linux PM On Thu, 18 Nov 2021 21:33:34 +0100, Heiner Kallweit wrote: > > I get the following warning caused by 4f66a9ef37d3 ("ALSA: hda: intel: More > comprehensive PM runtime setup for controller driver"): > > snd_hda_intel 0000:00:1f.3: Unbalanced pm_runtime_enable! > > Not sure how this patch was tested because the warning is obvious. > The patch doesn't consider what the PCI sub-system does with regard to > RPM. Have a look at pci_pm_init(). > > I'd understand to add the call to pm_runtime_dont_use_autosuspend(), > but for all other added calls I see no justification. > > If being unsure about when to use which RPM call best involve > linux-pm@vger.kernel.org. Thanks for the notice. It's been through Intel CI and tests on a few local machines, maybe we haven't checked carefully those errors but only concentrated on the other issues, as it seems. There were two problems: one was the runtime PM being kicked off even during the PCI driver remove call, and another was the proper runtime PM setup after re-binding. For avoiding the former, only the pm_runtime_forbid() (and maybe pm_runtime_dont_use_autosuspend(), too) would suffice? Also, for PCI device, no need for pm_runtime_set_supended() at remove, right? thanks, Takashi ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Warning due to "ALSA: hda: intel: More comprehensive PM runtime setup for controller driver" 2021-11-18 21:28 ` Takashi Iwai @ 2021-11-18 22:13 ` Heiner Kallweit 2021-11-19 13:51 ` Takashi Iwai 0 siblings, 1 reply; 7+ messages in thread From: Heiner Kallweit @ 2021-11-18 22:13 UTC (permalink / raw) To: Takashi Iwai; +Cc: alsa-devel, Linux PM On 18.11.2021 22:28, Takashi Iwai wrote: > On Thu, 18 Nov 2021 21:33:34 +0100, > Heiner Kallweit wrote: >> >> I get the following warning caused by 4f66a9ef37d3 ("ALSA: hda: intel: More >> comprehensive PM runtime setup for controller driver"): >> >> snd_hda_intel 0000:00:1f.3: Unbalanced pm_runtime_enable! >> >> Not sure how this patch was tested because the warning is obvious. >> The patch doesn't consider what the PCI sub-system does with regard to >> RPM. Have a look at pci_pm_init(). >> >> I'd understand to add the call to pm_runtime_dont_use_autosuspend(), >> but for all other added calls I see no justification. >> >> If being unsure about when to use which RPM call best involve >> linux-pm@vger.kernel.org. > > Thanks for the notice. It's been through Intel CI and tests on a few > local machines, maybe we haven't checked carefully those errors but > only concentrated on the other issues, as it seems. > > There were two problems: one was the runtime PM being kicked off even > during the PCI driver remove call, and another was the proper runtime > PM setup after re-binding. > Having a look at the commit message of "ALSA: hda: fix general protection fault in azx_runtime_idle" the following sounds weird: - pci-driver.c:pm_runtime_put_sync() leads to a call to rpm_idle() which again calls azx_runtime_idle() rpm_idle() is only called if usage_count is 1 when entering pm_runtime_put_sync. And this should not be the case. pm_runtime_get_sync() increments the usage counter before remove() is called, and remove() should also increment the usage counter. This doesn't seem to happen. Maybe for whatever reason pm_runtime_get_noresume() isn't called in azx_free(), or azx_free() isn't called from remove(). I think you should trace the call chain from the PCI core calling remove() to pm_runtime_get_noresume() getting called or not. > For avoiding the former, only the pm_runtime_forbid() (and maybe > pm_runtime_dont_use_autosuspend(), too) would suffice? Also, for PCI > device, no need for pm_runtime_set_supended() at remove, right? > > > thanks, > > Takashi > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Warning due to "ALSA: hda: intel: More comprehensive PM runtime setup for controller driver" 2021-11-18 22:13 ` Heiner Kallweit @ 2021-11-19 13:51 ` Takashi Iwai 2021-11-19 15:10 ` Kai Vehmanen 2021-11-19 20:57 ` Heiner Kallweit 0 siblings, 2 replies; 7+ messages in thread From: Takashi Iwai @ 2021-11-19 13:51 UTC (permalink / raw) To: Heiner Kallweit; +Cc: alsa-devel, Linux PM On Thu, 18 Nov 2021 23:13:50 +0100, Heiner Kallweit wrote: > > On 18.11.2021 22:28, Takashi Iwai wrote: > > On Thu, 18 Nov 2021 21:33:34 +0100, > > Heiner Kallweit wrote: > >> > >> I get the following warning caused by 4f66a9ef37d3 ("ALSA: hda: intel: More > >> comprehensive PM runtime setup for controller driver"): > >> > >> snd_hda_intel 0000:00:1f.3: Unbalanced pm_runtime_enable! > >> > >> Not sure how this patch was tested because the warning is obvious. > >> The patch doesn't consider what the PCI sub-system does with regard to > >> RPM. Have a look at pci_pm_init(). > >> > >> I'd understand to add the call to pm_runtime_dont_use_autosuspend(), > >> but for all other added calls I see no justification. > >> > >> If being unsure about when to use which RPM call best involve > >> linux-pm@vger.kernel.org. > > > > Thanks for the notice. It's been through Intel CI and tests on a few > > local machines, maybe we haven't checked carefully those errors but > > only concentrated on the other issues, as it seems. > > > > There were two problems: one was the runtime PM being kicked off even > > during the PCI driver remove call, and another was the proper runtime > > PM setup after re-binding. > > > > Having a look at the commit message of "ALSA: hda: fix general protection > fault in azx_runtime_idle" the following sounds weird: > > - pci-driver.c:pm_runtime_put_sync() leads to a call > to rpm_idle() which again calls azx_runtime_idle() > > rpm_idle() is only called if usage_count is 1 when entering > pm_runtime_put_sync. And this should not be the case. > pm_runtime_get_sync() increments the usage counter before remove() > is called, and remove() should also increment the usage counter. > This doesn't seem to happen. Maybe for whatever reason > pm_runtime_get_noresume() isn't called in azx_free(), or azx_free() > isn't called from remove(). > I think you should trace the call chain from the PCI core calling > remove() to pm_runtime_get_noresume() getting called or not. Neither of them, supposedly. Now I took a deeper look at the code around it and dug into the git log, and found that the likely problem was the recent PCI core code refactoring (removal of pci->driver, etc) that have been already reverted; that was why linux-next-20211109 was broken and linux-next-20211110 worked. With the leftover pci->driver, the stale runtime PM callback was called at the pm_runtime_put_sync() call in pci_device_remove(). In anyway, I'll drop the invalid calls of pm_runtime_enable() / disable() & co. Maybe keeping pm_runtime_forbid() and pm_runtime_dont_use_autosuspend() at remove still makes some sense as a counter-part for the probe calls, though. thanks, Takashi ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Warning due to "ALSA: hda: intel: More comprehensive PM runtime setup for controller driver" 2021-11-19 13:51 ` Takashi Iwai @ 2021-11-19 15:10 ` Kai Vehmanen 2021-11-19 20:57 ` Heiner Kallweit 1 sibling, 0 replies; 7+ messages in thread From: Kai Vehmanen @ 2021-11-19 15:10 UTC (permalink / raw) To: Takashi Iwai; +Cc: alsa-devel, Linux PM, Heiner Kallweit Hi, On Fri, 19 Nov 2021, Takashi Iwai wrote: > On Thu, 18 Nov 2021 23:13:50 +0100, Heiner Kallweit wrote: > > On 18.11.2021 22:28, Takashi Iwai wrote: > > >> snd_hda_intel 0000:00:1f.3: Unbalanced pm_runtime_enable! > > >> > > >> Not sure how this patch was tested because the warning is obvious. > > >> The patch doesn't consider what the PCI sub-system does with regard to > > >> RPM. Have a look at pci_pm_init(). [...] > > > > > > Thanks for the notice. It's been through Intel CI and tests on a few > > > local machines, maybe we haven't checked carefully those errors but > > > only concentrated on the other issues, as it seems. ack on this. This did not go through our full CI, but rather I tested with the local setup I used to bisect this problem that was reported with linux-next. We specifically got the "Unbalanced pm_runtime" warning on the first iteration of the patch, but it was clean in the submitted version. But yeah, a wider test round would have caught this, sorry about that. > > Having a look at the commit message of "ALSA: hda: fix general protection > > fault in azx_runtime_idle" the following sounds weird: > > > > - pci-driver.c:pm_runtime_put_sync() leads to a call > > to rpm_idle() which again calls azx_runtime_idle() > > > > rpm_idle() is only called if usage_count is 1 when entering [...] > > This doesn't seem to happen. Maybe for whatever reason > > pm_runtime_get_noresume() isn't called in azx_free(), or azx_free() > > isn't called from remove(). > > I think you should trace the call chain from the PCI core calling > > remove() to pm_runtime_get_noresume() getting called or not. > > Neither of them, supposedly. Now I took a deeper look at the code > around it and dug into the git log, and found that the likely problem > was the recent PCI core code refactoring (removal of pci->driver, etc) > that have been already reverted; that was why linux-next-20211109 was > broken and linux-next-20211110 worked. With the leftover pci->driver, > the stale runtime PM callback was called at the pm_runtime_put_sync() > call in pci_device_remove(). That probably explains it. I specifically saw runtime idle callback _after_ the PCI remove driver callback (azx_remove()) was run to completion. And this happened within execution of pci_device_remove(). But alas I could not hit this problem with post 20211110 linux-next. > In anyway, I'll drop the invalid calls of pm_runtime_enable() / > disable() & co. Maybe keeping pm_runtime_forbid() and > pm_runtime_dont_use_autosuspend() at remove still makes some sense as > a counter-part for the probe calls, though. Hmm, I was about to port this change to the SOF driver as well. But if the outcome is that driver can safely assume RPM callbacks are _not_ called after remove, then maybe we can keep the SOF implementation of remove as is. Br, Kai ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Warning due to "ALSA: hda: intel: More comprehensive PM runtime setup for controller driver" @ 2021-11-19 15:10 ` Kai Vehmanen 0 siblings, 0 replies; 7+ messages in thread From: Kai Vehmanen @ 2021-11-19 15:10 UTC (permalink / raw) To: Takashi Iwai; +Cc: Heiner Kallweit, alsa-devel, Linux PM Hi, On Fri, 19 Nov 2021, Takashi Iwai wrote: > On Thu, 18 Nov 2021 23:13:50 +0100, Heiner Kallweit wrote: > > On 18.11.2021 22:28, Takashi Iwai wrote: > > >> snd_hda_intel 0000:00:1f.3: Unbalanced pm_runtime_enable! > > >> > > >> Not sure how this patch was tested because the warning is obvious. > > >> The patch doesn't consider what the PCI sub-system does with regard to > > >> RPM. Have a look at pci_pm_init(). [...] > > > > > > Thanks for the notice. It's been through Intel CI and tests on a few > > > local machines, maybe we haven't checked carefully those errors but > > > only concentrated on the other issues, as it seems. ack on this. This did not go through our full CI, but rather I tested with the local setup I used to bisect this problem that was reported with linux-next. We specifically got the "Unbalanced pm_runtime" warning on the first iteration of the patch, but it was clean in the submitted version. But yeah, a wider test round would have caught this, sorry about that. > > Having a look at the commit message of "ALSA: hda: fix general protection > > fault in azx_runtime_idle" the following sounds weird: > > > > - pci-driver.c:pm_runtime_put_sync() leads to a call > > to rpm_idle() which again calls azx_runtime_idle() > > > > rpm_idle() is only called if usage_count is 1 when entering [...] > > This doesn't seem to happen. Maybe for whatever reason > > pm_runtime_get_noresume() isn't called in azx_free(), or azx_free() > > isn't called from remove(). > > I think you should trace the call chain from the PCI core calling > > remove() to pm_runtime_get_noresume() getting called or not. > > Neither of them, supposedly. Now I took a deeper look at the code > around it and dug into the git log, and found that the likely problem > was the recent PCI core code refactoring (removal of pci->driver, etc) > that have been already reverted; that was why linux-next-20211109 was > broken and linux-next-20211110 worked. With the leftover pci->driver, > the stale runtime PM callback was called at the pm_runtime_put_sync() > call in pci_device_remove(). That probably explains it. I specifically saw runtime idle callback _after_ the PCI remove driver callback (azx_remove()) was run to completion. And this happened within execution of pci_device_remove(). But alas I could not hit this problem with post 20211110 linux-next. > In anyway, I'll drop the invalid calls of pm_runtime_enable() / > disable() & co. Maybe keeping pm_runtime_forbid() and > pm_runtime_dont_use_autosuspend() at remove still makes some sense as > a counter-part for the probe calls, though. Hmm, I was about to port this change to the SOF driver as well. But if the outcome is that driver can safely assume RPM callbacks are _not_ called after remove, then maybe we can keep the SOF implementation of remove as is. Br, Kai ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Warning due to "ALSA: hda: intel: More comprehensive PM runtime setup for controller driver" 2021-11-19 13:51 ` Takashi Iwai 2021-11-19 15:10 ` Kai Vehmanen @ 2021-11-19 20:57 ` Heiner Kallweit 1 sibling, 0 replies; 7+ messages in thread From: Heiner Kallweit @ 2021-11-19 20:57 UTC (permalink / raw) To: Takashi Iwai; +Cc: alsa-devel, Linux PM On 19.11.2021 14:51, Takashi Iwai wrote: > On Thu, 18 Nov 2021 23:13:50 +0100, > Heiner Kallweit wrote: >> >> On 18.11.2021 22:28, Takashi Iwai wrote: >>> On Thu, 18 Nov 2021 21:33:34 +0100, >>> Heiner Kallweit wrote: >>>> >>>> I get the following warning caused by 4f66a9ef37d3 ("ALSA: hda: intel: More >>>> comprehensive PM runtime setup for controller driver"): >>>> >>>> snd_hda_intel 0000:00:1f.3: Unbalanced pm_runtime_enable! >>>> >>>> Not sure how this patch was tested because the warning is obvious. >>>> The patch doesn't consider what the PCI sub-system does with regard to >>>> RPM. Have a look at pci_pm_init(). >>>> >>>> I'd understand to add the call to pm_runtime_dont_use_autosuspend(), >>>> but for all other added calls I see no justification. >>>> >>>> If being unsure about when to use which RPM call best involve >>>> linux-pm@vger.kernel.org. >>> >>> Thanks for the notice. It's been through Intel CI and tests on a few >>> local machines, maybe we haven't checked carefully those errors but >>> only concentrated on the other issues, as it seems. >>> >>> There were two problems: one was the runtime PM being kicked off even >>> during the PCI driver remove call, and another was the proper runtime >>> PM setup after re-binding. >>> >> >> Having a look at the commit message of "ALSA: hda: fix general protection >> fault in azx_runtime_idle" the following sounds weird: >> >> - pci-driver.c:pm_runtime_put_sync() leads to a call >> to rpm_idle() which again calls azx_runtime_idle() >> >> rpm_idle() is only called if usage_count is 1 when entering >> pm_runtime_put_sync. And this should not be the case. >> pm_runtime_get_sync() increments the usage counter before remove() >> is called, and remove() should also increment the usage counter. >> This doesn't seem to happen. Maybe for whatever reason >> pm_runtime_get_noresume() isn't called in azx_free(), or azx_free() >> isn't called from remove(). >> I think you should trace the call chain from the PCI core calling >> remove() to pm_runtime_get_noresume() getting called or not. > > Neither of them, supposedly. Now I took a deeper look at the code > around it and dug into the git log, and found that the likely problem > was the recent PCI core code refactoring (removal of pci->driver, etc) > that have been already reverted; that was why linux-next-20211109 was > broken and linux-next-20211110 worked. With the leftover pci->driver, > the stale runtime PM callback was called at the pm_runtime_put_sync() > call in pci_device_remove(). > I also noticed that partially I was on the wrong path. > In anyway, I'll drop the invalid calls of pm_runtime_enable() / > disable() & co. Maybe keeping pm_runtime_forbid() and > pm_runtime_dont_use_autosuspend() at remove still makes some sense as > a counter-part for the probe calls, though. > The call to pm_runtime_forbid() in pci_pm_init() is a heritage from early ACPI times when broken ACPI implementations had problems with RPM. There's a discussion (w/o result yet) to enable RPM per default for newer ACPI versions. Calling pm_runtime_forbid() in the driver removal path isn't strictly needed but it doesn't hurt. > > thanks, > > Takashi > ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2021-11-22 7:05 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2021-11-18 20:33 Warning due to "ALSA: hda: intel: More comprehensive PM runtime setup for controller driver" Heiner Kallweit 2021-11-18 21:28 ` Takashi Iwai 2021-11-18 22:13 ` Heiner Kallweit 2021-11-19 13:51 ` Takashi Iwai 2021-11-19 15:10 ` Kai Vehmanen 2021-11-19 15:10 ` Kai Vehmanen 2021-11-19 20:57 ` Heiner Kallweit
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.