From: ebiederm@xmission.com (Eric W. Biederman)
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>,
Rolf Neugebauer <rolf.neugebauer@docker.com>,
LKML <linux-kernel@vger.kernel.org>,
Linux Kernel Network Developers <netdev@vger.kernel.org>,
Justin Cormack <justin.cormack@docker.com>,
Ian Campbell <ian.campbell@docker.com>, <netdev@vger.kernel.org>,
Eric Dumazet <edumazet@google.com>
Subject: Re: Long delays creating a netns after deleting one (possibly RCU related)
Date: Mon, 14 Nov 2016 16:12:54 -0600
Message-ID: <87y40lhfrt.fsf@xmission.com>
In-Reply-To: <20161114181425.GN4127@linux.vnet.ibm.com> (Paul E. McKenney's message of "Mon, 14 Nov 2016 10:14:25 -0800")
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
> On Mon, Nov 14, 2016 at 09:44:35AM -0800, Cong Wang wrote:
>> On Mon, Nov 14, 2016 at 8:24 AM, Paul E. McKenney
>> <paulmck@linux.vnet.ibm.com> wrote:
>> > On Sun, Nov 13, 2016 at 10:47:01PM -0800, Cong Wang wrote:
>> >> On Fri, Nov 11, 2016 at 4:55 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> >> > On Fri, Nov 11, 2016 at 4:23 PM, Paul E. McKenney
>> >> > <paulmck@linux.vnet.ibm.com> wrote:
>> >> >>
>> >> >> Ah! This net_mutex is different than RTNL. Should synchronize_net() be
>> >> >> modified to check for net_mutex being held in addition to the current
>> >> >> checks for RTNL being held?
>> >> >>
>> >> >
>> >> > Good point!
>> >> >
>> >> > Like commit be3fc413da9eb17cce0991f214ab0, checking
>> >> > for net_mutex for this case seems to be an optimization, I assume
>> >> > synchronize_rcu_expedited() and synchronize_rcu() have the same
>> >> > behavior...
>> >>
>> >> Thinking a bit more, I think commit be3fc413da9eb17cce0991f
>> >> gets wrong on rtnl_is_locked(), the lock could be locked by other
>> >> process not by the current one, therefore it should be
>> >> lockdep_rtnl_is_held() which, however, is defined only when LOCKDEP
>> >> is enabled... Sigh.
>> >>
>> >> I don't see any better way than letting callers decide if they want the
>> >> expedited version or not, but this requires changes of all callers of
>> >> synchronize_net(). Hm.
>> >
>> > I must confess that I don't understand how it would help to use an
>> > expedited grace period when some other process is holding RTNL.
>> > In contrast, I do well understand how it helps when the current process
>> > is holding RTNL.
>>
>> Yeah, this is exactly my point. And same for ASSERT_RTNL() which checks
>> rtnl_is_locked(), clearly we need to assert "it is held by the current process"
>> rather than "it is locked by whatever process".
>>
>> But given *_is_held() is always defined by LOCKDEP, so we probably need
>> mutex to provide such a helper directly, mutex->owner is not always defined
>> either. :-/
>
> There is always the option of making acquisition and release set a per-task
> variable that can be tested. (Where did I put that asbestos suit, anyway?)
>
> Thanx, Paul
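To make the distinction in the quoted discussion concrete, here is a small user-space toy model, not kernel code. It assumes a made-up lock (toy_mutex) that only records "locked by someone", plus the per-task flag Paul suggests, set on acquisition and cleared on release. Every name in it (struct task, toy_mutex, choose_gp() and so on) is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a task; holds_lock is the per-task variable. */
struct task {
	const char *name;
	bool holds_lock;
};

/* Hypothetical stand-in for a mutex such as RTNL: it only knows that
 * somebody holds it, not who. */
struct toy_mutex {
	bool locked;
};

/* Acquire the lock if it is free and record ownership in the task. */
static bool toy_trylock(struct toy_mutex *m, struct task *t)
{
	if (m->locked)
		return false;
	m->locked = true;
	t->holds_lock = true;
	return true;
}

/* Release the lock; only the task that acquired it may do so. */
static void toy_unlock(struct toy_mutex *m, struct task *t)
{
	assert(t->holds_lock);
	t->holds_lock = false;
	m->locked = false;
}

/* "Locked by whatever process": true for every caller while held. */
static bool toy_is_locked(const struct toy_mutex *m)
{
	return m->locked;
}

/* "Held by the current process": true only for the owner. */
static bool toy_is_held_by(const struct task *t)
{
	return t->holds_lock;
}

enum gp_kind { GP_NORMAL, GP_EXPEDITED };

/* Pick an expedited grace period only when the caller itself holds the
 * lock, which is the case where expediting actually helps. */
static enum gp_kind choose_gp(const struct task *current_task)
{
	return toy_is_held_by(current_task) ? GP_EXPEDITED : GP_NORMAL;
}
```

With task A holding the lock, toy_is_locked() is true for everyone, but choose_gp() returns GP_EXPEDITED only for A. A second task B that merely observes the lock as taken still gets GP_NORMAL.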
synchronize_rcu_expedited() is not enough if you have multiple network
devices in play.

Looking at the code, it comes down to this commit, and it appears there
is a promise from Eric Dumazet to add RCU grace period combining.

Eric, since people are hitting noticeable stalls because the RCU grace
period takes a long time, do you think you could look at this code path
a bit more?

commit 93d05d4a320cb16712bb3d57a9658f395d8cecb9
Author: Eric Dumazet <edumazet@google.com>
Date:   Wed Nov 18 06:31:03 2015 -0800

    net: provide generic busy polling to all NAPI drivers

    NAPI drivers no longer need to observe a particular protocol
    to benefit from busy polling (CONFIG_NET_RX_BUSY_POLL=y)

    napi_hash_add() and napi_hash_del() are automatically called
    from core networking stack, respectively from
    netif_napi_add() and netif_napi_del()

    This patch depends on free_netdev() and netif_napi_del() being
    called from process context, which seems to be the norm.

    Drivers might still prefer to call napi_hash_del() on their
    own, since they might combine all the rcu grace periods into
    a single one, knowing their NAPI structures lifetime, while
    core networking stack has no idea of a possible combining.

    Once this patch proves to not bring serious regressions,
    we will cleanup drivers to either remove napi_hash_del()
    or provide appropriate rcu grace periods combining.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
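
For anyone following along, the "combining" the commit message talks about can be sketched with a user-space toy model. This is not the kernel code: toy_synchronize() merely counts how many grace periods would be waited for, and struct toy_napi, toy_hash_del() and the two teardown helpers are hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Number of simulated grace-period waits so far. */
static unsigned long grace_periods;

/* Stand-in for waiting out one RCU grace period (assumption: each call
 * costs one full grace period). */
static void toy_synchronize(void)
{
	grace_periods++;
}

/* Hypothetical stand-in for a NAPI instance that is visible to readers
 * through a hash table while 'hashed' is nonzero. */
struct toy_napi {
	int hashed;
};

/* Unpublish the object; readers that started earlier may still see it
 * until a grace period has elapsed. */
static void toy_hash_del(struct toy_napi *n)
{
	assert(n->hashed);
	n->hashed = 0;
}

/* Uncombined teardown: the generic code knows nothing about the other
 * objects, so it waits for one grace period per object. */
static void teardown_one_by_one(struct toy_napi *v, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		toy_hash_del(&v[i]);
		toy_synchronize();
	}
}

/* Combined teardown: a caller that knows the lifetime of all its
 * objects unpublishes them first and then waits once. */
static void teardown_combined(struct toy_napi *v, size_t n)
{
	for (size_t i = 0; i < n; i++)
		toy_hash_del(&v[i]);
	if (n > 0)
		toy_synchronize();
}
```

In this model, tearing down 8 objects costs 8 grace periods one by one and a single grace period when combined. That is why per-object waits add up once many NAPI instances or network devices go away together.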
Eric