From: Eugene Shatokhin <eugene.shatokhin@rosalab.ru>
To: David Miller <davem@davemloft.net>
Cc: oneukum@suse.com, netdev@vger.kernel.org,
linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] usbnet: Fix two races between usbnet_stop() and the BH
Date: Wed, 19 Aug 2015 10:57:09 +0300
Message-ID: <55D436D5.6010105@rosalab.ru>
In-Reply-To: <20150818.185407.1667358232705414236.davem@davemloft.net>
19.08.2015 04:54, David Miller wrote:
> From: Eugene Shatokhin <eugene.shatokhin@rosalab.ru>
> Date: Fri, 14 Aug 2015 19:58:36 +0300
>
>> 2. The second race is on dev->flags.
>>
>> dev->flags is set to 0 here:
>> *0 usbnet_stop (usbnet.c:816)
>> /* deferred work (task, timer, softirq) must also stop.
>> * can't flush_scheduled_work() until we drop rtnl (later),
>> * else workers could deadlock; so make workers a NOP.
>> */
>> dev->flags = 0;
>> del_timer_sync (&dev->delay);
>> tasklet_kill (&dev->bh);
>>
>> And here, the code clears EVENT_RX_KILL bit in dev->flags, which may
>> execute concurrently with the above operation:
>> *0 clear_bit (bitops.h:113, inlined)
>> *1 usbnet_bh (usbnet.c:1475)
>> /* restart RX again after disabling due to high error rate */
>> clear_bit(EVENT_RX_KILL, &dev->flags);
>>
>> It seems, setting dev->flags to 0 is not necessarily atomic w.r.t.
>> clear_bit() and other bit operations with dev->flags. It is safer to
>> make it atomic and this way, make the race harmless.
>>
>> While at it, the checking of EVENT_NO_RUNTIME_PM bit of dev->flags in
>> usbnet_stop() was fixed too: the bit should be checked before dev->flags
>> is cleared.
>
> The fix for this is excessive.
>
> Instead of all of this madness, looping over expensive clear_bit()
> atomics, just do whatever it takes to make sure that usbnet_bh() is
> quiesced and cannot execute any more. Then you can safely clear
> dev->flags normally.
>
If I understand it correctly, dev->flags has to be set to 0 first, one
way or another, precisely so that usbnet_bh() is not scheduled again.
That is what this madness is for.

tasklet_kill() will then wait for the already running instance of
usbnet_bh() (if one is running). After that, it is guaranteed that the
BH is not running and will not be re-scheduled.
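
To illustrate the ordering, here is a minimal userspace sketch, not the
actual patch. It models dev->flags with C11 atomics; the model_* names
and the bit numbers are made up for the illustration and do not come
from usbnet.h. It only shows the two points from the patch description:
sample EVENT_NO_RUNTIME_PM before the flags are wiped, and clear each
bit with an atomic read-modify-write.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative bit numbers (assumption), not the real usbnet values. */
enum { EVENT_RX_KILL = 0, EVENT_NO_RUNTIME_PM = 1, EVENT_NBITS = 2 };

/* Hypothetical stand-in for struct usbnet; only the flags word is modelled. */
struct model_dev {
	atomic_ulong flags;
};

/* Atomic single-bit clear, similar in spirit to the kernel's clear_bit(). */
static void model_clear_bit(int nr, atomic_ulong *addr)
{
	assert(nr >= 0 && nr < EVENT_NBITS);
	atomic_fetch_and(addr, ~(1UL << nr));
}

/* Atomic single-bit read, similar in spirit to test_bit(). */
static bool model_test_bit(int nr, atomic_ulong *addr)
{
	return (atomic_load(addr) >> nr) & 1UL;
}

/*
 * Model of the stop path. Returns whether EVENT_NO_RUNTIME_PM was set
 * at the time of the call.
 */
static bool model_stop(struct model_dev *dev)
{
	/* Sample the PM bit before the flags are cleared. */
	bool no_pm = model_test_bit(EVENT_NO_RUNTIME_PM, &dev->flags);
	int i;

	/*
	 * Clear the bits one by one; each step is atomic with respect to
	 * a concurrent model_clear_bit() on the same word.
	 */
	for (i = 0; i < EVENT_NBITS; i++)
		model_clear_bit(i, &dev->flags);

	/* In the driver, del_timer_sync() and tasklet_kill() follow here. */
	return no_pm;
}
```

A caller would initialise the flags word, call model_stop() once, and
use the returned value in place of a later test of the already cleared
EVENT_NO_RUNTIME_PM bit.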

As for the performance concerns, I doubt that usbnet_stop() is anywhere
on the critical path. I have been testing this patch for some time and
have not seen any new performance issues with it so far.

If needed, the time taken by usbnet_stop() can be measured before and
after this patch to estimate the impact on overall performance.

Regards,
Eugene