All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jiada Wang <jiada_wang@mentor.com>
To: Takashi Iwai <tiwai@suse.de>
Cc: perex@perex.cz, alsa-devel@alsa-project.org,
	apape@de.adit-jv.com, Mark_Craske@mentor.com,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/3 v3] ALSA: usb-audio: fix race in snd_usb_endpoint_stop
Date: Mon, 5 Dec 2016 02:10:59 -0800	[thread overview]
Message-ID: <58453D33.5000006@mentor.com> (raw)
In-Reply-To: <s5hvav52vg8.wl-tiwai@suse.de>

Hi, Takashi

On 11/30/2016 01:00 AM, Takashi Iwai wrote:
> On Wed, 30 Nov 2016 08:59:23 +0100,
> Jiada Wang wrote:
>> From: Mark Craske<Mark_Craske@mentor.com>
>>
>> Kernel crash seen once:
>>
>> Unable to handle kernel NULL pointer dereference at virtual address 00000008
>> pgd = a1d7c000
>> [00000008] *pgd=31c93831, *pte=00000000, *ppte=00000000
>> Internal error: Oops: 17 [#1] PREEMPT SMP ARM
>> CPU: 0 PID: 250 Comm: dbus-daemon Not tainted 3.14.51-03479-gf50bdf4 #1
>> task: a3ae61c0 ti: a08c8000 task.ti: a08c8000
>> PC is at retire_capture_urb+0x10/0x1f4 [snd_usb_audio]
>> LR is at snd_complete_urb+0x140/0x1f0 [snd_usb_audio]
>> pc : [<7f0eb22c>]    lr : [<7f0e57fc>]    psr: 200e0193
>> sp : a08c9c98  ip : a08c9ce8  fp : a08c9ce4
>> r10: 0000000a  r9 : 00000102  r8 : 94cb3000
>> r7 : 94cb3000  r6 : 94d0f000  r5 : 94d0e8e8  r4 : 94d0e000
>> r3 : 7f0eb21c  r2 : 00000000  r1 : 94cb3000  r0 : 00000000
>> Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
>> Control: 10c5387d  Table: 31d7c04a  DAC: 00000015
>> Process dbus-daemon (pid: 250, stack limit = 0xa08c8238)
>> Stack: (0xa08c9c98 to 0xa08ca000)
>> ...
>> Backtrace:
>> [<7f0eb21c>] (retire_capture_urb [snd_usb_audio]) from [<7f0e57fc>] (snd_complete_urb+0x140/0x1f0 [snd_usb_audio])
>> [<7f0e56bc>] (snd_complete_urb [snd_usb_audio]) from [<80371118>] (__usb_hcd_giveback_urb+0x78/0xf4)
>> [<803710a0>] (__usb_hcd_giveback_urb) from [<80371514>] (usb_giveback_urb_bh+0x8c/0xc0)
>> [<80371488>] (usb_giveback_urb_bh) from [<80028e3c>] (tasklet_hi_action+0xc4/0x148)
>> [<80028d78>] (tasklet_hi_action) from [<80028358>] (__do_softirq+0x190/0x380)
>> [<800281c8>] (__do_softirq) from [<80028858>] (irq_exit+0x8c/0xfc)
>> [<800287cc>] (irq_exit) from [<8000ea88>] (handle_IRQ+0x8c/0xc8)
>> [<8000e9fc>] (handle_IRQ) from [<800085e8>] (gic_handle_irq+0xbc/0xf8)
>> [<8000852c>] (gic_handle_irq) from [<80509044>] (__irq_svc+0x44/0x78)
>> [<80508820>] (_raw_spin_unlock_irq) from [<8004b880>] (finish_task_switch+0x5c/0x100)
>> [<8004b824>] (finish_task_switch) from [<805052f0>] (__schedule+0x48c/0x6d8)
>> [<80504e64>] (__schedule) from [<805055d4>] (schedule+0x98/0x9c)
>> [<8050553c>] (schedule) from [<800116c8>] (do_work_pending+0x30/0xd0)
>> [<80011698>] (do_work_pending) from [<8000e160>] (work_pending+0xc/0x20)
>> Code: e1a0c00d e92ddff0 e24cb004 e24dd024 (e5902008)
>> Kernel panic - not syncing: Fatal exception in interrupt
>>
>> There is a race between retire_capture_urb()&  stop_endpoints() which is
>> setting ep->data_subs to NULL. The underlying cause is in
>> snd_usb_endpoint_stop(), which sets
>>    ep->data_subs = NULL;
>>    ep->sync_slave = NULL;
>>    ep->retire_data_urb = NULL;
>>    ep->prepare_data_urb = NULL;
>>
>> An improvement, suggested by Andreas Pape (ADIT) is to read parameters
>> into local variables. This should solve race between stop and retire
>> where it is legal to store the pointers to local variable as they are
>> not freed in stop path but just set to NULL.
> Well, it's tricky.  A saner way would be to clear the stuff really
> after all users are gone.
>
> An untested patch is below.  Let me know if it really works.
Thanks for your proposal, I am afraid, we only found the race issue once 
during
our automation test, so I can't test for your patch,
but from what I can see, your patch makes sense to me.

The only difference when apply with your patch is, now in case 
stop_endpoints() is called
from TRIGGER_STOP, these stuff won't get cleared, but I think this isn't 
an issue, is it?

Thanks,
Jiada
>
>> However, additional races may still happen in close+hw_free against
>> retire, as there pointers may be freed in addition. For example, while
>> in retire_capture_urb() runtime->dma_area might be freed/nulled.
> This shouldn't be a problem, as stop_endpoints() waits until all
> active URBS get killed.
>
>
> thanks,
>
> Takashi
>
> ---
> diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
> index c470251cea4b..f3fce9abece9 100644
> --- a/sound/usb/endpoint.c
> +++ b/sound/usb/endpoint.c
> @@ -384,6 +384,9 @@ static void snd_complete_urb(struct urb *urb)
>   	if (unlikely(atomic_read(&ep->chip->shutdown)))
>   		goto exit_clear;
>
> +	if (unlikely(!test_bit(EP_FLAG_RUNNING,&ep->flags)))
> +		goto exit_clear;
> +
>   	if (usb_pipeout(ep->pipe)) {
>   		retire_outbound_urb(ep, ctx);
>   		/* can be stopped during retire callback */
> @@ -534,6 +537,11 @@ static int wait_clear_urbs(struct snd_usb_endpoint *ep)
>   			alive, ep->ep_num);
>   	clear_bit(EP_FLAG_STOPPING,&ep->flags);
>
> +	ep->data_subs = NULL;
> +	ep->sync_slave = NULL;
> +	ep->retire_data_urb = NULL;
> +	ep->prepare_data_urb = NULL;
> +
>   	return 0;
>   }
>
> @@ -1006,10 +1014,6 @@ void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep)
>
>   	if (--ep->use_count == 0) {
>   		deactivate_urbs(ep, false);
> -		ep->data_subs = NULL;
> -		ep->sync_slave = NULL;
> -		ep->retire_data_urb = NULL;
> -		ep->prepare_data_urb = NULL;
>   		set_bit(EP_FLAG_STOPPING,&ep->flags);
>   	}
>   }

WARNING: multiple messages have this Message-ID (diff)
From: Jiada Wang <jiada_wang@mentor.com>
To: Takashi Iwai <tiwai@suse.de>
Cc: <perex@perex.cz>, <alsa-devel@alsa-project.org>,
	<apape@de.adit-jv.com>, <Mark_Craske@mentor.com>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 3/3 v3] ALSA: usb-audio: fix race in snd_usb_endpoint_stop
Date: Mon, 5 Dec 2016 02:10:59 -0800	[thread overview]
Message-ID: <58453D33.5000006@mentor.com> (raw)
In-Reply-To: <s5hvav52vg8.wl-tiwai@suse.de>

Hi, Takashi

On 11/30/2016 01:00 AM, Takashi Iwai wrote:
> On Wed, 30 Nov 2016 08:59:23 +0100,
> Jiada Wang wrote:
>> From: Mark Craske<Mark_Craske@mentor.com>
>>
>> Kernel crash seen once:
>>
>> Unable to handle kernel NULL pointer dereference at virtual address 00000008
>> pgd = a1d7c000
>> [00000008] *pgd=31c93831, *pte=00000000, *ppte=00000000
>> Internal error: Oops: 17 [#1] PREEMPT SMP ARM
>> CPU: 0 PID: 250 Comm: dbus-daemon Not tainted 3.14.51-03479-gf50bdf4 #1
>> task: a3ae61c0 ti: a08c8000 task.ti: a08c8000
>> PC is at retire_capture_urb+0x10/0x1f4 [snd_usb_audio]
>> LR is at snd_complete_urb+0x140/0x1f0 [snd_usb_audio]
>> pc : [<7f0eb22c>]    lr : [<7f0e57fc>]    psr: 200e0193
>> sp : a08c9c98  ip : a08c9ce8  fp : a08c9ce4
>> r10: 0000000a  r9 : 00000102  r8 : 94cb3000
>> r7 : 94cb3000  r6 : 94d0f000  r5 : 94d0e8e8  r4 : 94d0e000
>> r3 : 7f0eb21c  r2 : 00000000  r1 : 94cb3000  r0 : 00000000
>> Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
>> Control: 10c5387d  Table: 31d7c04a  DAC: 00000015
>> Process dbus-daemon (pid: 250, stack limit = 0xa08c8238)
>> Stack: (0xa08c9c98 to 0xa08ca000)
>> ...
>> Backtrace:
>> [<7f0eb21c>] (retire_capture_urb [snd_usb_audio]) from [<7f0e57fc>] (snd_complete_urb+0x140/0x1f0 [snd_usb_audio])
>> [<7f0e56bc>] (snd_complete_urb [snd_usb_audio]) from [<80371118>] (__usb_hcd_giveback_urb+0x78/0xf4)
>> [<803710a0>] (__usb_hcd_giveback_urb) from [<80371514>] (usb_giveback_urb_bh+0x8c/0xc0)
>> [<80371488>] (usb_giveback_urb_bh) from [<80028e3c>] (tasklet_hi_action+0xc4/0x148)
>> [<80028d78>] (tasklet_hi_action) from [<80028358>] (__do_softirq+0x190/0x380)
>> [<800281c8>] (__do_softirq) from [<80028858>] (irq_exit+0x8c/0xfc)
>> [<800287cc>] (irq_exit) from [<8000ea88>] (handle_IRQ+0x8c/0xc8)
>> [<8000e9fc>] (handle_IRQ) from [<800085e8>] (gic_handle_irq+0xbc/0xf8)
>> [<8000852c>] (gic_handle_irq) from [<80509044>] (__irq_svc+0x44/0x78)
>> [<80508820>] (_raw_spin_unlock_irq) from [<8004b880>] (finish_task_switch+0x5c/0x100)
>> [<8004b824>] (finish_task_switch) from [<805052f0>] (__schedule+0x48c/0x6d8)
>> [<80504e64>] (__schedule) from [<805055d4>] (schedule+0x98/0x9c)
>> [<8050553c>] (schedule) from [<800116c8>] (do_work_pending+0x30/0xd0)
>> [<80011698>] (do_work_pending) from [<8000e160>] (work_pending+0xc/0x20)
>> Code: e1a0c00d e92ddff0 e24cb004 e24dd024 (e5902008)
>> Kernel panic - not syncing: Fatal exception in interrupt
>>
>> There is a race between retire_capture_urb()&  stop_endpoints() which is
>> setting ep->data_subs to NULL. The underlying cause is in
>> snd_usb_endpoint_stop(), which sets
>>    ep->data_subs = NULL;
>>    ep->sync_slave = NULL;
>>    ep->retire_data_urb = NULL;
>>    ep->prepare_data_urb = NULL;
>>
>> An improvement, suggested by Andreas Pape (ADIT) is to read parameters
>> into local variables. This should solve race between stop and retire
>> where it is legal to store the pointers to local variable as they are
>> not freed in stop path but just set to NULL.
> Well, it's tricky.  A saner way would be to clear the stuff really
> after all users are gone.
>
> An untested patch is below.  Let me know if it really works.
Thanks for your proposal, I am afraid, we only found the race issue once 
during
our automation test, so I can't test for your patch,
but from what I can see, your patch makes sense to me.

The only difference when apply with your patch is, now in case 
stop_endpoints() is called
from TRIGGER_STOP, these stuff won't get cleared, but I think this isn't 
an issue, is it?

Thanks,
Jiada
>
>> However, additional races may still happen in close+hw_free against
>> retire, as there pointers may be freed in addition. For example, while
>> in retire_capture_urb() runtime->dma_area might be freed/nulled.
> This shouldn't be a problem, as stop_endpoints() waits until all
> active URBS get killed.
>
>
> thanks,
>
> Takashi
>
> ---
> diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
> index c470251cea4b..f3fce9abece9 100644
> --- a/sound/usb/endpoint.c
> +++ b/sound/usb/endpoint.c
> @@ -384,6 +384,9 @@ static void snd_complete_urb(struct urb *urb)
>   	if (unlikely(atomic_read(&ep->chip->shutdown)))
>   		goto exit_clear;
>
> +	if (unlikely(!test_bit(EP_FLAG_RUNNING,&ep->flags)))
> +		goto exit_clear;
> +
>   	if (usb_pipeout(ep->pipe)) {
>   		retire_outbound_urb(ep, ctx);
>   		/* can be stopped during retire callback */
> @@ -534,6 +537,11 @@ static int wait_clear_urbs(struct snd_usb_endpoint *ep)
>   			alive, ep->ep_num);
>   	clear_bit(EP_FLAG_STOPPING,&ep->flags);
>
> +	ep->data_subs = NULL;
> +	ep->sync_slave = NULL;
> +	ep->retire_data_urb = NULL;
> +	ep->prepare_data_urb = NULL;
> +
>   	return 0;
>   }
>
> @@ -1006,10 +1014,6 @@ void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep)
>
>   	if (--ep->use_count == 0) {
>   		deactivate_urbs(ep, false);
> -		ep->data_subs = NULL;
> -		ep->sync_slave = NULL;
> -		ep->retire_data_urb = NULL;
> -		ep->prepare_data_urb = NULL;
>   		set_bit(EP_FLAG_STOPPING,&ep->flags);
>   	}
>   }

  reply	other threads:[~2016-12-05 10:10 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-30  7:59 [PATCH 0/3 v1] usb-misc fix Jiada Wang
2016-11-30  7:59 ` Jiada Wang
2016-11-30  7:59 ` [PATCH 1/3 v1] ALSA: usb-audio: more tolerant packetsize Jiada Wang
2016-11-30  7:59   ` Jiada Wang
2016-11-30  8:54   ` Takashi Iwai
2016-11-30  8:54     ` Takashi Iwai
2016-12-01  7:04     ` Jiada Wang
2016-12-01  7:04       ` Jiada Wang
2016-12-01  7:41   ` [alsa-devel] " Clemens Ladisch
2016-12-01  8:58     ` Takashi Iwai
2016-12-01 11:16       ` Clemens Ladisch
2016-12-01 11:23         ` Takashi Iwai
2016-12-01 11:50           ` Clemens Ladisch
2016-12-01 11:36     ` Jiada Wang
2016-12-01 12:15       ` [alsa-devel] " Clemens Ladisch
2016-12-02  5:53         ` Jiada Wang
2016-11-30  7:59 ` [PATCH 2/3 v2] ALSA: usb-audio: avoid setting of sample rate multiple times on bus Jiada Wang
2016-11-30  7:59   ` Jiada Wang
2016-11-30  8:51   ` Takashi Iwai
2016-11-30  8:51     ` Takashi Iwai
2016-12-01  7:07     ` Jiada Wang
2016-12-01  7:07       ` Jiada Wang
2016-11-30 10:45   ` Takashi Sakamoto
2016-11-30 22:19     ` Takashi Sakamoto
2016-12-05  7:32     ` Jiada Wang
2016-12-05  7:32       ` Jiada Wang
2016-12-05  9:58       ` Takashi Sakamoto
2016-11-30  7:59 ` [PATCH 3/3 v3] ALSA: usb-audio: fix race in snd_usb_endpoint_stop Jiada Wang
2016-11-30  7:59   ` Jiada Wang
2016-11-30  9:00   ` Takashi Iwai
2016-11-30  9:00     ` Takashi Iwai
2016-12-05 10:10     ` Jiada Wang [this message]
2016-12-05 10:10       ` Jiada Wang
2016-12-05 10:30       ` Takashi Iwai
2016-12-05 10:30         ` Takashi Iwai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=58453D33.5000006@mentor.com \
    --to=jiada_wang@mentor.com \
    --cc=Mark_Craske@mentor.com \
    --cc=alsa-devel@alsa-project.org \
    --cc=apape@de.adit-jv.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=perex@perex.cz \
    --cc=tiwai@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.