* hda codec unbind refcount hang
@ 2022-09-09 15:45 Ville Syrjälä
2022-09-09 15:59 ` Takashi Iwai
From: Ville Syrjälä @ 2022-09-09 15:45 UTC
To: Takashi Iwai; +Cc: alsa-devel

Hi Takashi,

commit 7206998f578d ("ALSA: hda: Fix potential deadlock at codec
unbinding") introduced a problem on at least one of my older machines.

The problem happens when hda_codec_driver_remove() encounters a
codec without any pcms (and thus the refcount is 1) and tries to
call refcount_dec(). Turns out refcount_dec() doesn't like to be
used for dropping the refcount to 0, and instead it spews a warning
and does its saturate thing. The subsequent wait_event() is then
permanently stuck waiting on the saturated refcount.

I've definitely seen the same kind of pattern used elsewhere
in the kernel as well, so the fact that refcount_t can't be used
to implement it is a bit of a surprise to me. I guess most other
places still use atomic_t instead.
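
To illustrate the behaviour described above, here is a minimal
single-threaded user-space model in C. It is only a sketch under stated
assumptions: struct model_ref, model_dec(), model_dec_and_test(),
model_read() and MODEL_SATURATED are hypothetical names, and the plain
int plus fprintf() stand in for the atomics and warning machinery that
the kernel's refcount_t actually uses.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the value a saturated counter is pinned to. */
#define MODEL_SATURATED (INT_MIN / 2)

/* Plain, non-atomic counter; an assumption made for illustration only. */
struct model_ref {
	int refs;
};

/*
 * Models a plain decrement that must never reach zero: if the old value
 * was 1 or less, warn and pin the counter to the saturated value.
 */
static void model_dec(struct model_ref *r)
{
	int old = r->refs;

	r->refs = old - 1;
	if (old <= 1) {
		fprintf(stderr, "model: decrement hit 0, saturating\n");
		r->refs = MODEL_SATURATED;
	}
}

/*
 * Models decrement-and-test: dropping the last reference is legal here,
 * and the caller is told about it through the return value.
 */
static bool model_dec_and_test(struct model_ref *r)
{
	int old = r->refs;

	r->refs = old - 1;
	if (old == 1)
		return true;
	if (old <= 0) {
		fprintf(stderr, "model: underflow, saturating\n");
		r->refs = MODEL_SATURATED;
	}
	return false;
}

/* Models reading the counter, as the wait condition does. */
static int model_read(const struct model_ref *r)
{
	return r->refs;
}
```

In this model, a counter that starts at 1 and goes through model_dec()
never reads as 0 afterwards, so a wait condition of the form
"!model_read(&ref)" can never become true. The same counter passed
through model_dec_and_test() reaches 0 and returns true.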
--
Ville Syrjälä
Intel

* Re: hda codec unbind refcount hang
From: Takashi Iwai @ 2022-09-09 15:59 UTC
To: Ville Syrjälä; +Cc: alsa-devel

On Fri, 09 Sep 2022 17:45:25 +0200,
Ville Syrjälä wrote:
>
> Hi Takashi,
>
> commit 7206998f578d ("ALSA: hda: Fix potential deadlock at codec
> unbinding") introduced a problem on at least one of my older machines.
>
> The problem happens when hda_codec_driver_remove() encounters a
> codec without any pcms (and thus the refcount is 1) and tries to
> call refcount_dec(). Turns out refcount_dec() doesn't like to be
> used for dropping the refcount to 0, and instead it spews a warning
> and does its saturate thing. The subsequent wait_event() is then
> permanently stuck waiting on the saturated refcount.
>
> I've definitely seen the same kind of pattern used elsewhere
> in the kernel as well, so the fact that refcount_t can't be used
> to implement it is a bit of a surprise to me. I guess most other
> places still use atomic_t instead.

Does the patch below work around it?  It seems to be a subtle
difference between refcount_dec() and refcount_dec_and_test().

thanks,

Takashi

-- 8< --
--- a/sound/pci/hda/hda_bind.c
+++ b/sound/pci/hda/hda_bind.c
@@ -157,10 +157,11 @@ static int hda_codec_driver_remove(struct device *dev)
 		return codec->bus->core.ext_ops->hdev_detach(&codec->core);
 	}
 
-	refcount_dec(&codec->pcm_ref);
-	snd_hda_codec_disconnect_pcms(codec);
-	snd_hda_jack_tbl_disconnect(codec);
-	wait_event(codec->remove_sleep, !refcount_read(&codec->pcm_ref));
+	if (!refcount_dec_and_test(&codec->pcm_ref)) {
+		snd_hda_codec_disconnect_pcms(codec);
+		snd_hda_jack_tbl_disconnect(codec);
+		wait_event(codec->remove_sleep, !refcount_read(&codec->pcm_ref));
+	}
 	snd_power_sync_ref(codec->bus->card);
 
 	if (codec->patch_ops.free)

* Re: hda codec unbind refcount hang
From: Ville Syrjälä @ 2022-09-09 19:39 UTC
To: Takashi Iwai; +Cc: alsa-devel

On Fri, Sep 09, 2022 at 05:59:47PM +0200, Takashi Iwai wrote:
> On Fri, 09 Sep 2022 17:45:25 +0200,
> Ville Syrjälä wrote:
> >
> > Hi Takashi,
> >
> > commit 7206998f578d ("ALSA: hda: Fix potential deadlock at codec
> > unbinding") introduced a problem on at least one of my older machines.
> >
> > The problem happens when hda_codec_driver_remove() encounters a
> > codec without any pcms (and thus the refcount is 1) and tries to
> > call refcount_dec(). Turns out refcount_dec() doesn't like to be
> > used for dropping the refcount to 0, and instead it spews a warning
> > and does its saturate thing. The subsequent wait_event() is then
> > permanently stuck waiting on the saturated refcount.
> >
> > I've definitely seen the same kind of pattern used elsewhere
> > in the kernel as well, so the fact that refcount_t can't be used
> > to implement it is a bit of a surprise to me. I guess most other
> > places still use atomic_t instead.
>
> Does the patch below work around it?  It seems to be a subtle
> difference between refcount_dec() and refcount_dec_and_test().

Aye, this works.

Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

>
>
> thanks,
>
> Takashi
>
> -- 8< --
> --- a/sound/pci/hda/hda_bind.c
> +++ b/sound/pci/hda/hda_bind.c
> @@ -157,10 +157,11 @@ static int hda_codec_driver_remove(struct device *dev)
>  		return codec->bus->core.ext_ops->hdev_detach(&codec->core);
>  	}
>  
> -	refcount_dec(&codec->pcm_ref);
> -	snd_hda_codec_disconnect_pcms(codec);
> -	snd_hda_jack_tbl_disconnect(codec);
> -	wait_event(codec->remove_sleep, !refcount_read(&codec->pcm_ref));
> +	if (!refcount_dec_and_test(&codec->pcm_ref)) {
> +		snd_hda_codec_disconnect_pcms(codec);
> +		snd_hda_jack_tbl_disconnect(codec);
> +		wait_event(codec->remove_sleep, !refcount_read(&codec->pcm_ref));
> +	}
>  	snd_power_sync_ref(codec->bus->card);
>  
>  	if (codec->patch_ops.free)

-- 
Ville Syrjälä
Intel

* Re: hda codec unbind refcount hang
From: Takashi Iwai @ 2022-09-10 10:22 UTC
To: Ville Syrjälä; +Cc: alsa-devel

On Fri, 09 Sep 2022 21:39:19 +0200,
Ville Syrjälä wrote:
>
> On Fri, Sep 09, 2022 at 05:59:47PM +0200, Takashi Iwai wrote:
> > On Fri, 09 Sep 2022 17:45:25 +0200,
> > Ville Syrjälä wrote:
> > >
> > > Hi Takashi,
> > >
> > > commit 7206998f578d ("ALSA: hda: Fix potential deadlock at codec
> > > unbinding") introduced a problem on at least one of my older machines.
> > >
> > > The problem happens when hda_codec_driver_remove() encounters a
> > > codec without any pcms (and thus the refcount is 1) and tries to
> > > call refcount_dec(). Turns out refcount_dec() doesn't like to be
> > > used for dropping the refcount to 0, and instead it spews a warning
> > > and does its saturate thing. The subsequent wait_event() is then
> > > permanently stuck waiting on the saturated refcount.
> > >
> > > I've definitely seen the same kind of pattern used elsewhere
> > > in the kernel as well, so the fact that refcount_t can't be used
> > > to implement it is a bit of a surprise to me. I guess most other
> > > places still use atomic_t instead.
> >
> > Does the patch below work around it?  It seems to be a subtle
> > difference between refcount_dec() and refcount_dec_and_test().
>
> Aye, this works.
>
> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

Good to hear.

I think the below is slightly safer, ensuring that the other
*_disconnect() calls are still made.

Could you give it a try again?  Once you confirm it works, I'll
re-submit and merge to my tree.

thanks,

Takashi

-- 8< --
From: Takashi Iwai <tiwai@suse.de>
Subject: [PATCH] ALSA: hda: Fix hang at HD-audio codec unbinding due to refcount saturation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

We fixed the potential deadlock at dynamic unbinding of the HD-audio
codec in commit 7206998f578d ("ALSA: hda: Fix potential deadlock
at codec unbinding"), but ironically, this caused another potential
deadlock.  The current code uses refcount_dec() and then waits with
wait_event() for pending tasks to drop the refcount to 0.  This
works fine when PCMs are assigned and the code actually waits for the
refcount drop.

Meanwhile, when no PCM is assigned, the refcount_dec() call itself
was supposed to drop the refcount to zero -- alas, it doesn't in
reality; refcount_dec() complains, spews a kernel warning and
saturates instead of dropping to 0, due to the nature of the
refcount_dec() implementation.  This eventually blocks the
wait_event() wakeup and the code gets stuck there.

To avoid the problem, we call refcount_dec_and_test() and skip
the sync-wait if the refcount already reaches zero.

The patch also does a slight code reshuffling to make sure the other
disconnect calls are invoked before the sync-wait.

Fixes: 7206998f578d ("ALSA: hda: Fix potential deadlock at codec unbinding")
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/YxtflWQnslMHVlU7@intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/pci/hda/hda_bind.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/sound/pci/hda/hda_bind.c b/sound/pci/hda/hda_bind.c
index cae9a975cbcc..1a868dd9dc4b 100644
--- a/sound/pci/hda/hda_bind.c
+++ b/sound/pci/hda/hda_bind.c
@@ -157,10 +157,10 @@ static int hda_codec_driver_remove(struct device *dev)
 		return codec->bus->core.ext_ops->hdev_detach(&codec->core);
 	}
 
-	refcount_dec(&codec->pcm_ref);
 	snd_hda_codec_disconnect_pcms(codec);
 	snd_hda_jack_tbl_disconnect(codec);
-	wait_event(codec->remove_sleep, !refcount_read(&codec->pcm_ref));
+	if (!refcount_dec_and_test(&codec->pcm_ref))
+		wait_event(codec->remove_sleep, !refcount_read(&codec->pcm_ref));
 	snd_power_sync_ref(codec->bus->card);
 
 	if (codec->patch_ops.free)
-- 
2.35.3
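
The difference between the two proposed orderings can be sketched with
a small user-space model in C. Everything here is a hypothetical
illustration, not the driver code: struct model_codec, model_disconnect(),
model_wait_for_zero(), remove_v1() and remove_v2() are made-up names, a
plain int stands in for the refcount, and the "wait" only records that
it would have been entered instead of sleeping.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state touched on the remove path. */
struct model_codec {
	int pcm_ref;          /* plain int standing in for the refcount */
	int disconnect_calls; /* how often the disconnect step ran */
	bool waited;          /* whether the sync-wait would be entered */
};

/* Drop one reference; report whether it was the last one. */
static bool model_dec_and_test(int *ref)
{
	return --(*ref) == 0;
}

/* Stand-in for the PCM and jack disconnect calls. */
static void model_disconnect(struct model_codec *c)
{
	c->disconnect_calls++;
}

/* A real driver would sleep here; the model only records the attempt. */
static void model_wait_for_zero(struct model_codec *c)
{
	c->waited = true;
}

/* Ordering of the first proposal: disconnect only when a wait is needed. */
static void remove_v1(struct model_codec *c)
{
	if (!model_dec_and_test(&c->pcm_ref)) {
		model_disconnect(c);
		model_wait_for_zero(c);
	}
}

/* Ordering of the second proposal: always disconnect, then maybe wait. */
static void remove_v2(struct model_codec *c)
{
	model_disconnect(c);
	if (!model_dec_and_test(&c->pcm_ref))
		model_wait_for_zero(c);
}
```

With pcm_ref starting at 1 (no PCM holding a reference), remove_v1()
skips the disconnect step entirely, while remove_v2() still runs it and
only skips the wait. With pcm_ref above 1, both orderings disconnect
and then wait.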

* Re: hda codec unbind refcount hang
From: Ville Syrjälä @ 2022-09-10 11:54 UTC
To: Takashi Iwai; +Cc: alsa-devel

On Sat, Sep 10, 2022 at 12:22:04PM +0200, Takashi Iwai wrote:
> On Fri, 09 Sep 2022 21:39:19 +0200,
> Ville Syrjälä wrote:
> >
> > On Fri, Sep 09, 2022 at 05:59:47PM +0200, Takashi Iwai wrote:
> > > On Fri, 09 Sep 2022 17:45:25 +0200,
> > > Ville Syrjälä wrote:
> > > >
> > > > Hi Takashi,
> > > >
> > > > commit 7206998f578d ("ALSA: hda: Fix potential deadlock at codec
> > > > unbinding") introduced a problem on at least one of my older machines.
> > > >
> > > > The problem happens when hda_codec_driver_remove() encounters a
> > > > codec without any pcms (and thus the refcount is 1) and tries to
> > > > call refcount_dec(). Turns out refcount_dec() doesn't like to be
> > > > used for dropping the refcount to 0, and instead it spews a warning
> > > > and does its saturate thing. The subsequent wait_event() is then
> > > > permanently stuck waiting on the saturated refcount.
> > > >
> > > > I've definitely seen the same kind of pattern used elsewhere
> > > > in the kernel as well, so the fact that refcount_t can't be used
> > > > to implement it is a bit of a surprise to me. I guess most other
> > > > places still use atomic_t instead.
> > >
> > > Does the patch below work around it?  It seems to be a subtle
> > > difference between refcount_dec() and refcount_dec_and_test().
> >
> > Aye, this works.
> >
> > Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Good to hear.
>
> I think the below is slightly safer, ensuring that the other
> *_disconnect() calls are still made.
>
> Could you give it a try again?  Once you confirm it works, I'll
> re-submit and merge to my tree.

This works too.
Thanks

Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

>
>
> thanks,
>
> Takashi
>
> -- 8< --
> From: Takashi Iwai <tiwai@suse.de>
> Subject: [PATCH] ALSA: hda: Fix hang at HD-audio codec unbinding due to refcount saturation
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> We fixed the potential deadlock at dynamic unbinding of the HD-audio
> codec in commit 7206998f578d ("ALSA: hda: Fix potential deadlock
> at codec unbinding"), but ironically, this caused another potential
> deadlock.  The current code uses refcount_dec() and then waits with
> wait_event() for pending tasks to drop the refcount to 0.  This
> works fine when PCMs are assigned and the code actually waits for the
> refcount drop.
>
> Meanwhile, when no PCM is assigned, the refcount_dec() call itself
> was supposed to drop the refcount to zero -- alas, it doesn't in
> reality; refcount_dec() complains, spews a kernel warning and
> saturates instead of dropping to 0, due to the nature of the
> refcount_dec() implementation.  This eventually blocks the
> wait_event() wakeup and the code gets stuck there.
>
> To avoid the problem, we call refcount_dec_and_test() and skip
> the sync-wait if the refcount already reaches zero.
>
> The patch also does a slight code reshuffling to make sure the other
> disconnect calls are invoked before the sync-wait.
>
> Fixes: 7206998f578d ("ALSA: hda: Fix potential deadlock at codec unbinding")
> Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: <stable@vger.kernel.org>
> Link: https://lore.kernel.org/r/YxtflWQnslMHVlU7@intel.com
> Signed-off-by: Takashi Iwai <tiwai@suse.de>
> ---
>  sound/pci/hda/hda_bind.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/sound/pci/hda/hda_bind.c b/sound/pci/hda/hda_bind.c
> index cae9a975cbcc..1a868dd9dc4b 100644
> --- a/sound/pci/hda/hda_bind.c
> +++ b/sound/pci/hda/hda_bind.c
> @@ -157,10 +157,10 @@ static int hda_codec_driver_remove(struct device *dev)
>  		return codec->bus->core.ext_ops->hdev_detach(&codec->core);
>  	}
>  
> -	refcount_dec(&codec->pcm_ref);
>  	snd_hda_codec_disconnect_pcms(codec);
>  	snd_hda_jack_tbl_disconnect(codec);
> -	wait_event(codec->remove_sleep, !refcount_read(&codec->pcm_ref));
> +	if (!refcount_dec_and_test(&codec->pcm_ref))
> +		wait_event(codec->remove_sleep, !refcount_read(&codec->pcm_ref));
>  	snd_power_sync_ref(codec->bus->card);
>  
>  	if (codec->patch_ops.free)
> -- 
> 2.35.3

-- 
Ville Syrjälä
Intel