* Warning due to "ALSA: hda: intel: More comprehensive PM runtime setup for controller driver"
@ 2021-11-18 20:33 Heiner Kallweit
  2021-11-18 21:28 ` Takashi Iwai
  0 siblings, 1 reply; 6+ messages in thread
From: Heiner Kallweit @ 2021-11-18 20:33 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Linux PM

I get the following warning caused by 4f66a9ef37d3 ("ALSA: hda: intel: More
comprehensive PM runtime setup for controller driver"):

snd_hda_intel 0000:00:1f.3: Unbalanced pm_runtime_enable!

Not sure how this patch was tested because the warning is obvious.
The patch doesn't consider what the PCI sub-system does with regard to
RPM. Have a look at pci_pm_init().

I'd understand adding the call to pm_runtime_dont_use_autosuspend(),
but for all other added calls I see no justification.

If you're unsure about when to use which RPM call, it's best to involve
linux-pm@vger.kernel.org.
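
For reference, the PCI core has already set up runtime PM for the device
before any driver probe runs.  Roughly, the start of pci_pm_init() in
drivers/pci/pci.c does the following (a simplified sketch from memory,
not the verbatim kernel code):

  #include <linux/pci.h>
  #include <linux/pm_runtime.h>

  /* Simplified sketch of the runtime PM part of pci_pm_init();
   * the rest of the function is omitted. */
  static void pci_pm_init_rpm_sketch(struct pci_dev *dev)
  {
          pm_runtime_forbid(&dev->dev);     /* userspace must opt in via power/control */
          pm_runtime_set_active(&dev->dev); /* the device starts out powered up */
          pm_runtime_enable(&dev->dev);     /* the core enables runtime PM, not the driver */
  }

Presumably the extra pm_runtime_enable() added by the patch then finds
runtime PM already enabled (disable_depth == 0), which is exactly what
produces the warning above.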


* Re: Warning due to "ALSA: hda: intel: More comprehensive PM runtime setup for controller driver"
  2021-11-18 20:33 Warning due to "ALSA: hda: intel: More comprehensive PM runtime setup for controller driver" Heiner Kallweit
@ 2021-11-18 21:28 ` Takashi Iwai
  2021-11-18 22:13   ` Heiner Kallweit
  0 siblings, 1 reply; 6+ messages in thread
From: Takashi Iwai @ 2021-11-18 21:28 UTC (permalink / raw)
  To: Heiner Kallweit; +Cc: alsa-devel, Linux PM

On Thu, 18 Nov 2021 21:33:34 +0100,
Heiner Kallweit wrote:
> 
> I get the following warning caused by 4f66a9ef37d3 ("ALSA: hda: intel: More
> comprehensive PM runtime setup for controller driver"):
> 
> snd_hda_intel 0000:00:1f.3: Unbalanced pm_runtime_enable!
> 
> Not sure how this patch was tested because the warning is obvious.
> The patch doesn't consider what the PCI sub-system does with regard to
> RPM. Have a look at pci_pm_init().
> 
> I'd understand adding the call to pm_runtime_dont_use_autosuspend(),
> but for all other added calls I see no justification.
> 
> If you're unsure about when to use which RPM call, it's best to involve
> linux-pm@vger.kernel.org.

Thanks for the notice.  It went through Intel CI and tests on a few
local machines; it seems we didn't check those errors carefully and
concentrated only on the other issues.

There were two problems: one was the runtime PM being kicked off even
during the PCI driver remove call, and another was the proper runtime
PM setup after re-binding.

To avoid the former, would only pm_runtime_forbid() (and maybe
pm_runtime_dont_use_autosuspend(), too) suffice?  Also, for a PCI
device, there is no need for pm_runtime_set_suspended() at remove, right?


thanks,

Takashi


* Re: Warning due to "ALSA: hda: intel: More comprehensive PM runtime setup for controller driver"
  2021-11-18 21:28 ` Takashi Iwai
@ 2021-11-18 22:13   ` Heiner Kallweit
  2021-11-19 13:51     ` Takashi Iwai
  0 siblings, 1 reply; 6+ messages in thread
From: Heiner Kallweit @ 2021-11-18 22:13 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Linux PM

On 18.11.2021 22:28, Takashi Iwai wrote:
> On Thu, 18 Nov 2021 21:33:34 +0100,
> Heiner Kallweit wrote:
>>
>> I get the following warning caused by 4f66a9ef37d3 ("ALSA: hda: intel: More
>> comprehensive PM runtime setup for controller driver"):
>>
>> snd_hda_intel 0000:00:1f.3: Unbalanced pm_runtime_enable!
>>
>> Not sure how this patch was tested because the warning is obvious.
>> The patch doesn't consider what the PCI sub-system does with regard to
>> RPM. Have a look at pci_pm_init().
>>
>> I'd understand adding the call to pm_runtime_dont_use_autosuspend(),
>> but for all other added calls I see no justification.
>>
>> If you're unsure about when to use which RPM call, it's best to involve
>> linux-pm@vger.kernel.org.
> 
> Thanks for the notice.  It went through Intel CI and tests on a few
> local machines; it seems we didn't check those errors carefully and
> concentrated only on the other issues.
> 
> There were two problems: one was the runtime PM being kicked off even
> during the PCI driver remove call, and another was the proper runtime
> PM setup after re-binding.
> 

Having a look at the commit message of "ALSA: hda: fix general protection
fault in azx_runtime_idle" the following sounds weird:

  - pci-driver.c:pm_runtime_put_sync() leads to a call
    to rpm_idle() which again calls azx_runtime_idle()

rpm_idle() is only called if usage_count is 1 when entering
pm_runtime_put_sync. And this should not be the case.
pm_runtime_get_sync() increments the usage counter before remove()
is called, and remove() should also increment the usage counter.
This doesn't seem to happen. Maybe for whatever reason 
pm_runtime_get_noresume() isn't called in azx_free(), or azx_free()
isn't called from remove().
I think you should trace the call chain from the PCI core calling
remove() to pm_runtime_get_noresume() getting called or not.
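
For context, a rough sketch of the relevant part of pci_device_remove()
(drivers/pci/pci-driver.c) as I read it; simplified and from memory, so
details may differ:

  #include <linux/pci.h>
  #include <linux/pm_runtime.h>

  /* Simplified sketch, not the verbatim kernel code. */
  static void pci_device_remove_sketch(struct device *dev)
  {
          struct pci_dev *pci_dev = to_pci_dev(dev);
          struct pci_driver *drv = pci_dev->driver;

          pm_runtime_get_sync(dev);  /* on top of the reference kept by local_pci_probe() */
          drv->remove(pci_dev);      /* azx_remove(); expected to end up in azx_free(),
                                      * which should call pm_runtime_get_noresume() */
          pm_runtime_put_noidle(dev);

          /* Drops the probe-time reference.  rpm_idle() and thus
           * azx_runtime_idle() can only run here if the usage count falls
           * to zero, i.e. if the get_noresume() above did not happen. */
          pm_runtime_put_sync(dev);
  }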


> To avoid the former, would only pm_runtime_forbid() (and maybe
> pm_runtime_dont_use_autosuspend(), too) suffice?  Also, for a PCI
> device, there is no need for pm_runtime_set_suspended() at remove, right?
> 
> 
> thanks,
> 
> Takashi
> 



* Re: Warning due to "ALSA: hda: intel: More comprehensive PM runtime setup for controller driver"
  2021-11-18 22:13   ` Heiner Kallweit
@ 2021-11-19 13:51     ` Takashi Iwai
  2021-11-19 15:10       ` Kai Vehmanen
  2021-11-19 20:57       ` Heiner Kallweit
  0 siblings, 2 replies; 6+ messages in thread
From: Takashi Iwai @ 2021-11-19 13:51 UTC (permalink / raw)
  To: Heiner Kallweit; +Cc: alsa-devel, Linux PM

On Thu, 18 Nov 2021 23:13:50 +0100,
Heiner Kallweit wrote:
> 
> On 18.11.2021 22:28, Takashi Iwai wrote:
> > On Thu, 18 Nov 2021 21:33:34 +0100,
> > Heiner Kallweit wrote:
> >>
> >> I get the following warning caused by 4f66a9ef37d3 ("ALSA: hda: intel: More
> >> comprehensive PM runtime setup for controller driver"):
> >>
> >> snd_hda_intel 0000:00:1f.3: Unbalanced pm_runtime_enable!
> >>
> >> Not sure how this patch was tested because the warning is obvious.
> >> The patch doesn't consider what the PCI sub-system does with regard to
> >> RPM. Have a look at pci_pm_init().
> >>
> >> I'd understand adding the call to pm_runtime_dont_use_autosuspend(),
> >> but for all other added calls I see no justification.
> >>
> >> If you're unsure about when to use which RPM call, it's best to involve
> >> linux-pm@vger.kernel.org.
> > 
> > Thanks for the notice.  It went through Intel CI and tests on a few
> > local machines; it seems we didn't check those errors carefully and
> > concentrated only on the other issues.
> > 
> > There were two problems: one was the runtime PM being kicked off even
> > during the PCI driver remove call, and another was the proper runtime
> > PM setup after re-binding.
> > 
> 
> Having a look at the commit message of "ALSA: hda: fix general protection
> fault in azx_runtime_idle" the following sounds weird:
> 
>   - pci-driver.c:pm_runtime_put_sync() leads to a call
>     to rpm_idle() which again calls azx_runtime_idle()
> 
> rpm_idle() is only called if usage_count is 1 when entering
> pm_runtime_put_sync. And this should not be the case.
> pm_runtime_get_sync() increments the usage counter before remove()
> is called, and remove() should also increment the usage counter.
> This doesn't seem to happen. Maybe for whatever reason 
> pm_runtime_get_noresume() isn't called in azx_free(), or azx_free()
> isn't called from remove().
> I think you should trace the call chain from the PCI core calling
> remove() to pm_runtime_get_noresume() getting called or not.

Neither of them, supposedly.  Now I took a deeper look at the code
around it and dug into the git log, and found that the likely problem
was the recent PCI core code refactoring (removal of pci->driver, etc.)
that has already been reverted; that was why linux-next-20211109 was
broken and linux-next-20211110 worked.  With the leftover pci->driver,
the stale runtime PM callback was called at the pm_runtime_put_sync()
call in pci_device_remove().

Anyway, I'll drop the invalid calls of pm_runtime_enable() /
disable() & co.  Maybe keeping pm_runtime_forbid() and
pm_runtime_dont_use_autosuspend() at remove still makes some sense as
a counterpart for the probe calls, though.
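
In other words, only the optional bits would be touched in probe /
remove, while enable/disable stay with the PCI core.  Just a sketch of
the intended shape (the function names and the delay value are
placeholders, not the actual hda_intel.c code):

  #include <linux/pci.h>
  #include <linux/pm_runtime.h>

  static void example_probe_rpm_setup(struct pci_dev *pci)
  {
          pm_runtime_set_autosuspend_delay(&pci->dev, 3000); /* placeholder value */
          pm_runtime_use_autosuspend(&pci->dev);
          pm_runtime_allow(&pci->dev);  /* undo the pm_runtime_forbid() from pci_pm_init() */
  }

  static void example_remove_rpm_teardown(struct pci_dev *pci)
  {
          pm_runtime_forbid(&pci->dev);               /* counterpart of pm_runtime_allow() */
          pm_runtime_dont_use_autosuspend(&pci->dev); /* counterpart of pm_runtime_use_autosuspend() */
          /* no pm_runtime_disable() or pm_runtime_set_suspended() here;
           * the PCI core owns those */
  }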


thanks,

Takashi


* Re: Warning due to "ALSA: hda: intel: More comprehensive PM runtime setup for controller driver"
  2021-11-19 13:51     ` Takashi Iwai
@ 2021-11-19 15:10       ` Kai Vehmanen
  2021-11-19 20:57       ` Heiner Kallweit
  1 sibling, 0 replies; 6+ messages in thread
From: Kai Vehmanen @ 2021-11-19 15:10 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: Heiner Kallweit, alsa-devel, Linux PM

Hi,

On Fri, 19 Nov 2021, Takashi Iwai wrote:

> On Thu, 18 Nov 2021 23:13:50 +0100, Heiner Kallweit wrote:
> > On 18.11.2021 22:28, Takashi Iwai wrote:
> > >> snd_hda_intel 0000:00:1f.3: Unbalanced pm_runtime_enable!
> > >>
> > >> Not sure how this patch was tested because the warning is obvious.
> > >> The patch doesn't consider what the PCI sub-system does with regard to
> > >> RPM. Have a look at pci_pm_init().
[...]
> > > 
> > > Thanks for the notice.  It went through Intel CI and tests on a few
> > > local machines; it seems we didn't check those errors carefully and
> > > concentrated only on the other issues.

Ack on this. This did not go through our full CI; rather, I tested with
the local setup I used to bisect the problem that was reported with
linux-next. We did get the "Unbalanced pm_runtime" warning on the first
iteration of the patch, but the submitted version was clean. But yeah,
a wider test round would have caught this, sorry about that.

> > Having a look at the commit message of "ALSA: hda: fix general protection
> > fault in azx_runtime_idle" the following sounds weird:
> > 
> >   - pci-driver.c:pm_runtime_put_sync() leads to a call
> >     to rpm_idle() which again calls azx_runtime_idle()
> > 
> > rpm_idle() is only called if usage_count is 1 when entering
[...]
> > This doesn't seem to happen. Maybe for whatever reason 
> > pm_runtime_get_noresume() isn't called in azx_free(), or azx_free()
> > isn't called from remove().
> > I think you should trace the call chain from the PCI core calling
> > remove() to pm_runtime_get_noresume() getting called or not.
> 
> Neither of them, supposedly.  Now I took a deeper look at the code
> around it and dug into the git log, and found that the likely problem
> was the recent PCI core code refactoring (removal of pci->driver, etc.)
> that has already been reverted; that was why linux-next-20211109 was
> broken and linux-next-20211110 worked.  With the leftover pci->driver,
> the stale runtime PM callback was called at the pm_runtime_put_sync()
> call in pci_device_remove().

That probably explains it. I specifically saw the runtime idle callback
run _after_ the PCI driver remove callback (azx_remove()) had completed,
and this happened within the execution of pci_device_remove(). But alas,
I could not hit this problem with post-20211110 linux-next.

> Anyway, I'll drop the invalid calls of pm_runtime_enable() /
> disable() & co.  Maybe keeping pm_runtime_forbid() and
> pm_runtime_dont_use_autosuspend() at remove still makes some sense as
> a counterpart for the probe calls, though.

Hmm, I was about to port this change to the SOF driver as well. But if the 
outcome is that the driver can safely assume RPM callbacks are _not_ called
after remove, then maybe we can keep the SOF implementation of 
remove as is.

Br, Kai


* Re: Warning due to "ALSA: hda: intel: More comprehensive PM runtime setup for controller driver"
  2021-11-19 13:51     ` Takashi Iwai
  2021-11-19 15:10       ` Kai Vehmanen
@ 2021-11-19 20:57       ` Heiner Kallweit
  1 sibling, 0 replies; 6+ messages in thread
From: Heiner Kallweit @ 2021-11-19 20:57 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Linux PM

On 19.11.2021 14:51, Takashi Iwai wrote:
> On Thu, 18 Nov 2021 23:13:50 +0100,
> Heiner Kallweit wrote:
>>
>> On 18.11.2021 22:28, Takashi Iwai wrote:
>>> On Thu, 18 Nov 2021 21:33:34 +0100,
>>> Heiner Kallweit wrote:
>>>>
>>>> I get the following warning caused by 4f66a9ef37d3 ("ALSA: hda: intel: More
>>>> comprehensive PM runtime setup for controller driver"):
>>>>
>>>> snd_hda_intel 0000:00:1f.3: Unbalanced pm_runtime_enable!
>>>>
>>>> Not sure how this patch was tested because the warning is obvious.
>>>> The patch doesn't consider what the PCI sub-system does with regard to
>>>> RPM. Have a look at pci_pm_init().
>>>>
>>>> I'd understand adding the call to pm_runtime_dont_use_autosuspend(),
>>>> but for all other added calls I see no justification.
>>>>
>>>> If you're unsure about when to use which RPM call, it's best to involve
>>>> linux-pm@vger.kernel.org.
>>>
>>> Thanks for the notice.  It went through Intel CI and tests on a few
>>> local machines; it seems we didn't check those errors carefully and
>>> concentrated only on the other issues.
>>>
>>> There were two problems: one was the runtime PM being kicked off even
>>> during the PCI driver remove call, and another was the proper runtime
>>> PM setup after re-binding.
>>>
>>
>> Having a look at the commit message of "ALSA: hda: fix general protection
>> fault in azx_runtime_idle" the following sounds weird:
>>
>>   - pci-driver.c:pm_runtime_put_sync() leads to a call
>>     to rpm_idle() which again calls azx_runtime_idle()
>>
>> rpm_idle() is only called if usage_count is 1 when entering
>> pm_runtime_put_sync. And this should not be the case.
>> pm_runtime_get_sync() increments the usage counter before remove()
>> is called, and remove() should also increment the usage counter.
>> This doesn't seem to happen. Maybe for whatever reason 
>> pm_runtime_get_noresume() isn't called in azx_free(), or azx_free()
>> isn't called from remove().
>> I think you should trace the call chain from the PCI core calling
>> remove() to pm_runtime_get_noresume() getting called or not.
> 
> Neither of them, supposedly.  Now I took a deeper look at the code
> around it and dug into the git log, and found that the likely problem
> was the recent PCI core code refactoring (removal of pci->driver, etc.)
> that has already been reverted; that was why linux-next-20211109 was
> broken and linux-next-20211110 worked.  With the leftover pci->driver,
> the stale runtime PM callback was called at the pm_runtime_put_sync()
> call in pci_device_remove().
> 
I also noticed that I was partially on the wrong path.

> Anyway, I'll drop the invalid calls of pm_runtime_enable() /
> disable() & co.  Maybe keeping pm_runtime_forbid() and
> pm_runtime_dont_use_autosuspend() at remove still makes some sense as
> a counterpart for the probe calls, though.
> 
The call to pm_runtime_forbid() in pci_pm_init() is a legacy from early
ACPI times, when broken ACPI implementations had problems with RPM.
There's an ongoing discussion (without a result yet) about enabling RPM
by default for newer ACPI versions.

Calling pm_runtime_forbid() in the driver removal path isn't strictly
needed, but it doesn't hurt.
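
For completeness, pm_runtime_forbid() roughly does the following
(simplified from my reading of drivers/base/power/runtime.c; the real
function holds dev->power.lock and calls the internal rpm_resume()):

  #include <linux/pm_runtime.h>

  /* Simplified sketch of the effect of pm_runtime_forbid(),
   * not the verbatim kernel code. */
  static void pm_runtime_forbid_sketch(struct device *dev)
  {
          if (!dev->power.runtime_auto)
                  return;                       /* already forbidden: nothing to do */
          dev->power.runtime_auto = false;      /* power/control now reads "on" */
          atomic_inc(&dev->power.usage_count);  /* hold a usage reference ... */
          pm_runtime_resume(dev);               /* ... and resume the device */
  }

So at remove it's a no-op unless runtime PM was allowed in the meantime
(by the driver or by userspace writing "auto" to power/control), and in
that case it just restores the state pci_pm_init() set up.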

> 
> thanks,
> 
> Takashi
> 



end of thread, other threads:[~2021-11-19 20:57 UTC | newest]

Thread overview: 6+ messages
2021-11-18 20:33 Warning due to "ALSA: hda: intel: More comprehensive PM runtime setup for controller driver" Heiner Kallweit
2021-11-18 21:28 ` Takashi Iwai
2021-11-18 22:13   ` Heiner Kallweit
2021-11-19 13:51     ` Takashi Iwai
2021-11-19 15:10       ` Kai Vehmanen
2021-11-19 20:57       ` Heiner Kallweit
