From: ebiederm@xmission.com (Eric W. Biederman)
To: paulmck@linux.vnet.ibm.com
Cc: chiluk@canonical.com, Rafael Tinoco <rafael.tinoco@canonical.com>,
linux-kernel@vger.kernel.org, davem@davemloft.net,
Christopher Arges <chris.j.arges@canonical.com>,
Jay Vosburgh <jay.vosburgh@canonical.com>
Subject: Re: Possible netns creation and execution performance/scalability regression since v3.8 due to rcu callbacks being offloaded to multiple cpus
Date: Wed, 11 Jun 2014 16:12:15 -0700
Message-ID: <87ioo7vy5s.fsf@x220.int.ebiederm.org>
In-Reply-To: <20140611225228.GO4581@linux.vnet.ibm.com> (Paul E. McKenney's message of "Wed, 11 Jun 2014 15:52:28 -0700")
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
> On Wed, Jun 11, 2014 at 01:46:08PM -0700, Eric W. Biederman wrote:
>> On the chance it is dropping the old nsproxy which calls synchronize_rcu
>> in switch_task_namespaces that is causing you problems I have attached
>> a patch that changes from rcu_read_lock to task_lock for code that
>> calls task_nsproxy from a different task. The code should be safe
>> and it should be an unquestionable performance improvement but I have only
>> compile tested it.
>>
>> If you can try the patch it will tell us if the problem is the rcu
>> access in switch_task_namespaces (the only one I am aware of in network
>> namespace creation) or if the problem rcu case is somewhere else.
>>
>> If nothing else knowing which rcu accesses are causing the slow down
>> seems important at the end of the day.
>>
>> Eric
>>
>
> If this is the culprit, another approach would be to use workqueues from
> RCU callbacks. The following (untested, probably does not even build)
> patch illustrates one such approach.
For reference, the only reason we are using rcu_read_lock today for nsproxy
is an old lock ordering problem that does not exist anymore.

I can say that in some workloads setns is a bit heavy today because of
the synchronize_rcu, and setns is more important than I had previously
thought because pthreads break the classic unix ability to do things in
your process after fork() (sigh).

Today daemonize is gone, and notifying the parent process with a signal
relies on task_active_pid_ns, which does not use nsproxy. So the old
lock ordering problem/race is gone.
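
As a rough illustration of the task_lock direction (a userspace model only,
not kernel code and not the attached patch): a pthread mutex stands in for
task_lock(), and every struct and function name below is hypothetical. A
reader in a different task pins the namespace under the lock, and the
writer swaps the pointer under the same lock, so the writer has no grace
period to wait for.

```c
/*
 * Userspace model only. pthread_mutex_t stands in for the kernel's
 * task_lock(); all names ending in _model are hypothetical.
 */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

struct net_model {
	atomic_int refcount;		/* models the namespace refcount */
};

struct nsproxy_model {
	struct net_model *net_ns;
};

struct task_model {
	pthread_mutex_t alloc_lock;	/* models task_lock()/task_unlock() */
	struct nsproxy_model *nsproxy;	/* NULL once the task is almost dead */
};

/* Reader running in a different task: lock the target, pin its netns. */
static struct net_model *get_net_model(struct task_model *tsk)
{
	struct net_model *net = NULL;

	assert(tsk != NULL);
	pthread_mutex_lock(&tsk->alloc_lock);
	if (tsk->nsproxy != NULL) {
		net = tsk->nsproxy->net_ns;
		atomic_fetch_add(&net->refcount, 1);
	}
	pthread_mutex_unlock(&tsk->alloc_lock);
	return net;
}

/*
 * Writer: swap the pointer under the same lock and hand back the old
 * nsproxy. No reader can still be using it once the lock is dropped,
 * so the caller may release it immediately.
 */
static struct nsproxy_model *switch_nsproxy_model(struct task_model *tsk,
						  struct nsproxy_model *new)
{
	struct nsproxy_model *old;

	pthread_mutex_lock(&tsk->alloc_lock);
	old = tsk->nsproxy;
	tsk->nsproxy = new;
	pthread_mutex_unlock(&tsk->alloc_lock);
	return old;
}
```

The trade-off in this model is that readers now serialize on the per-task
lock, while the update path becomes cheap, which is the side that matters
for setns-heavy workloads.
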
Below is the description of what was happening when the code switched
from task_lock to rcu_read_lock to protect nsproxy:
commit cf7b708c8d1d7a27736771bcf4c457b332b0f818
Author: Pavel Emelyanov <xemul@openvz.org>
Date: Thu Oct 18 23:39:54 2007 -0700
Make access to task's nsproxy lighter
When someone wants to deal with some other task's namespaces it has to lock
the task and then to get the desired namespace if the one exists. This is
slow on read-only paths and may be impossible in some cases.
E.g. Oleg recently noticed a race between unshare() and the (sent for
review in cgroups) pid namespaces - when the task notifies the parent it
has to know the parent's namespace, but taking the task_lock() is
impossible there - the code is under write locked tasklist lock.
On the other hand switching the namespace on task (daemonize) and releasing
the namespace (after the last task exit) is rather rare operation and we
can sacrifice its speed to solve the issues above.
The access to other task namespaces is proposed to be performed
like this:
	rcu_read_lock();
	nsproxy = task_nsproxy(tsk);
	if (nsproxy != NULL) {
		/*
		 * work with the namespaces here
		 * e.g. get the reference on one of them
		 */
	} /*
	   * NULL task_nsproxy() means that this task is
	   * almost dead (zombie)
	   */
	rcu_read_unlock();
This patch has passed the review by Eric and Oleg :) and,
of course, tested.
[clg@fr.ibm.com: fix unshare()]
[ebiederm@xmission.com: Update get_net_ns_by_pid]
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric