netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eugene Shatokhin <eugene.shatokhin@rosalab.ru>
To: "Bjørn Mork" <bjorn@mork.no>
Cc: David Miller <davem@davemloft.net>,
	oneukum@suse.com, netdev@vger.kernel.org,
	linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] usbnet: Fix two races between usbnet_stop() and the BH
Date: Mon, 24 Aug 2015 20:00:48 +0300	[thread overview]
Message-ID: <55DB4DC0.7030807@rosalab.ru> (raw)
In-Reply-To: <87io84byr8.fsf@nemi.mork.no>

24.08.2015 16:29, Bjørn Mork пишет:
> Eugene Shatokhin <eugene.shatokhin@rosalab.ru> writes:
>
>> 19.08.2015 15:31, Bjørn Mork пишет:
>>> Eugene Shatokhin <eugene.shatokhin@rosalab.ru> writes:
>>>
>>>> The problem is not in the reordering but rather in the fact that
>>>> "dev->flags = 0" is not necessarily atomic
>>>> w.r.t. "clear_bit(EVENT_RX_KILL, &dev->flags)", and vice versa.
>>>>
>>>> So the following might be possible, although unlikely:
>>>>
>>>> CPU0             CPU1
>>>>                    clear_bit: read dev->flags
>>>>                    clear_bit: clear EVENT_RX_KILL in the read value
>>>>
>>>> dev->flags=0;
>>>>
>>>>                    clear_bit: write updated dev->flags
>>>>
>>>> As a result, dev->flags may become non-zero again.
>>>
>>> Ah, right.  Thanks for explaining.
>>>
>>>> I cannot prove yet that this is an impossible situation. If anyone
>>>> can, please explain. If so, this part of the patch will not be needed.
>>>
>>> I wonder if we could simply move the dev->flags = 0 down a few lines to
>>> fix both issues?  It doesn't seem to do anything useful except for
>>> resetting the flags to a sane initial state after the device is down.
>>>
>>> Stopping the tasklet rescheduling etc depends only on netif_running(),
>>> which will be false when usbnet_stop is called.  There is no need to
>>> touch dev->flags for this to happen.
>>
>> That was one of the first ideas we discussed here. Unfortunately, it
>> is probably not so simple.
>>
>> Setting dev->flags to 0 makes some delayed operations do nothing and,
>> among other things, not to reschedule usbnet_bh().
>
> Yes, but I believe that is merely a side effect.  You should never need
> to clear multiple flags to get the desired behaviour.
>
>> As you can see in drivers/net/usb/usbnet.c, usbnet_bh() can be called
>> as a tasklet function and as a timer function in a number of
>> situations (look for the usage of dev->bh and dev->delay there).
>>
>> netif_running() is indeed false when usbnet_stop() runs, usbnet_stop()
>> also disables Tx. This seems to be enough for many cases where
>> usbnet_bh() is scheduled, but I am not so sure about the remaining
>> ones, namely:
>>
>> 1. A work function, usbnet_deferred_kevent(), may reschedule
>> usbnet_bh(). Looks like the workqueue is only stopped in
>> usbnet_disconnect(), so a work item might be processed while
>> usbnet_stop() works. Setting dev->flags to 0 makes the work function
>> do nothing, by the way. See also the comment in usbnet_stop() about
>> this.
>>
>> A work item may be placed to this workqueue in a number of ways, by
>> both usbnet module and the mini-drivers. It is not too easy to track
>> all these situations.
>
> That's an understatement :)
>
>
>
>> 2. rx_complete() and tx_complete() may schedule execution of
>> usbnet_bh() as a tasklet or a timer function. These two are URB
>> completion callbacks.
>>
>> It seems, new Rx and Tx URBs cannot be submitted when usbnet_stop()
>> clears dev->flags, indeed. But it does not prevent the completion
>> handlers for the previously submitted URBs from running concurrently
>> with usbnet_stop(). The latter waits for them to complete (via
>> usbnet_terminate_urbs(dev)) but only if FLAG_AVOID_UNLINK_URBS is not
>> set in info->flags. rndis_wlan, however, sets this flag for a few
>> hardware models. So - no guarantees here as well.
>
> FLAG_AVOID_UNLINK_URBS looks like it should be replaced by the newer
> ability to keep the status urb active. I believe that must have been the
> real reason for adding it, based on the commit message and the effect
> the flag will have:
>
>   commit 1487cd5e76337555737cbc55d7d83f41460d198f
>   Author: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
>   Date:   Thu Jul 30 19:41:20 2009 +0300
>
>      usbnet: allow "minidriver" to prevent urb unlinking on usbnet_stop
>
>      rndis_wlan devices freeze after running usbnet_stop several times. It appears
>      that firmware freezes in state where it does not respond to any RNDIS commands
>      and device have to be physically unplugged/replugged. This patch lets
>      minidrivers to disable unlink_urbs on usbnet_stop through new info flag.
>
>      Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
>      Cc: David Brownell <dbrownell@users.sourceforge.net>
>      Signed-off-by: John W. Linville <linville@tuxdriver.com>
>
>
>
> The rx urbs will not be resubmitted in any case, and there are of course
> no tx urbs being submitted.  So the only effect of this flag is on the
> status/interrupt urb, which I can imagine some RNDIS devices wants
> active all the time.
>
> So FLAG_AVOID_UNLINK_URBS should probably be removed and replaced calls
> to usbnet_status_start() and usbnet_status_stop().  This will require
> testing on some of the devices with the original firmware problem
> however.
>
> In any case: I do not think this flag should be considered when trying
> to make usbnet_stop behaviour saner.  It's only purpose is to
> deliberately break usbnet_stop by not actually stopping.
>
>
>> If someone could list the particular bits of dev->flags that should be
>> cleared to make sure no deferred call could reschedule usbnet_bh(),
>> etc... Well, it would be enough to clear these first and use
>> dev->flags = 0 later, after tasklet_kill() and del_timer_sync(). I
>> cannot point out these particular bits now.
>
>
> I don't think any of the flags must be cleared.  The sequence
>
>          dev_close(dev->net);
> 	usbnet_terminate_urbs(dev);
> 	usbnet_purge_paused_rxq(dev);
> 	del_timer_sync (&dev->delay);
> 	tasklet_kill (&dev->bh);
>
> should prevent any rescheduling of usbnet_bh

If so, then, I suppose, one could ignore that FLAG_AVOID_UNLINK_URBS for 
now and just move dev->flags = 0 down as you suggested and as we thought 
before.

The patch will become simpler, indeed.

>
>> Besides, it is possible, although unlikely, that new event bits will
>> be added to dev->flags in the future. And one will need to keep track
>> of these to see if they should be cleared as well. I'd prever to play
>> safer for now and clear them all.
>
> I don't think we should ever make a flag which will _have_ to be reset
> for usbnet_stop.  The only reason for clearing all flags is to reset the
> state before the next open.
>
> Yes, I see that we currently need to clear EVENT_DEV_OPEN in
> usbnet_stop, but I really don't see what this flag gives us which isn't
> already provided by netif_running().  It looks like a duplicate.
>

Regards,
Eugene

  reply	other threads:[~2015-08-24 17:00 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-20 18:13 Several races in "usbnet" module (kernel 4.1.x) Eugene Shatokhin
2015-07-21 12:04 ` Oliver Neukum
     [not found]   ` <1437480243.3823.5.camel-IBi9RG/b67k@public.gmane.org>
2015-07-24 17:38     ` Eugene Shatokhin
2015-07-27 12:29       ` Oliver Neukum
2015-07-27 13:53         ` Eugene Shatokhin
2015-07-21 13:07 ` Oliver Neukum
     [not found] ` <55AD3A41.2040100-irhHPgl+04UvJsYlp49lxw@public.gmane.org>
2015-07-21 14:22   ` Oliver Neukum
2015-07-22 18:33     ` Eugene Shatokhin
2015-07-23  9:15       ` Oliver Neukum
2015-07-24 14:41         ` Eugene Shatokhin
2015-07-27 10:00           ` Oliver Neukum
2015-07-27 14:23             ` Eugene Shatokhin
2015-08-14 16:55     ` Eugene Shatokhin
2015-08-14 16:58       ` [PATCH] usbnet: Fix two races between usbnet_stop() and the BH Eugene Shatokhin
2015-08-19  1:54         ` David Miller
     [not found]           ` <20150818.185407.1667358232705414236.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2015-08-19  7:57             ` Eugene Shatokhin
2015-08-19 10:54               ` Bjørn Mork
2015-08-19 11:59                 ` Eugene Shatokhin
2015-08-19 12:31                   ` Bjørn Mork
2015-08-24 12:20                     ` Eugene Shatokhin
2015-08-24 13:29                       ` Bjørn Mork
2015-08-24 17:00                         ` Eugene Shatokhin [this message]
2015-08-25 12:31                         ` Oliver Neukum
2015-08-24 17:43                   ` David Miller
     [not found]                     ` <20150824.104328.554582952440857559.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2015-08-24 18:06                       ` Alan Stern
2015-08-24 18:21                         ` Alan Stern
2015-08-25 12:36                           ` Oliver Neukum
2015-08-24 18:35                         ` David Miller
2015-08-24 18:12                     ` Eugene Shatokhin
2015-07-23  9:43   ` Several races in "usbnet" module (kernel 4.1.x) Oliver Neukum
2015-07-23 11:39     ` Eugene Shatokhin
2015-08-24 20:13 ` [PATCH 0/2] usbnet: Fix 2 problems in usbnet_stop() Eugene Shatokhin
2015-08-24 20:13   ` [PATCH 1/2] usbnet: Get EVENT_NO_RUNTIME_PM bit before it is cleared Eugene Shatokhin
2015-08-25 13:01     ` Oliver Neukum
     [not found]       ` <1440507709.13824.6.camel-IBi9RG/b67k@public.gmane.org>
2015-08-25 14:16         ` Bjørn Mork
2015-08-25 14:22     ` Oliver Neukum
2015-08-26  2:44     ` David Miller
2015-08-24 20:13   ` [PATCH 2/2] usbnet: Fix a race between usbnet_stop() and the BH Eugene Shatokhin
2015-08-24 21:01     ` Bjørn Mork
2015-08-28  8:09       ` Eugene Shatokhin
2015-08-28  8:55         ` Bjørn Mork
2015-08-28 10:42           ` Eugene Shatokhin
2015-08-31  7:32             ` Bjørn Mork
2015-08-31  8:50               ` Eugene Shatokhin
2015-09-01  7:58                 ` Oliver Neukum
2015-09-01 13:54                   ` Eugene Shatokhin
2015-09-01 14:05                   ` [PATCH] " Eugene Shatokhin
2015-09-08  7:24                     ` Eugene Shatokhin
2015-09-08  7:37                       ` Bjørn Mork
2015-09-08  7:48                         ` Oliver Neukum
2015-09-08 20:18                     ` David Miller
2015-09-01  7:57         ` [PATCH 2/2] " Oliver Neukum
2015-08-26  2:45     ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55DB4DC0.7030807@rosalab.ru \
    --to=eugene.shatokhin@rosalab.ru \
    --cc=bjorn@mork.no \
    --cc=davem@davemloft.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=oneukum@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).