From: Andrei Vagin <avagin@virtuozzo.com>
To: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: davem@davemloft.net, vyasevic@redhat.com,
kstewart@linuxfoundation.org, pombredanne@nexb.com,
vyasevich@gmail.com, mark.rutland@arm.com,
gregkh@linuxfoundation.org, adobriyan@gmail.com, fw@strlen.de,
nicolas.dichtel@6wind.com, xiyou.wangcong@gmail.com,
roman.kapl@sysgo.com, paul@paul-moore.com, dsahern@gmail.com,
daniel@iogearbox.net, lucien.xin@gmail.com,
mschiffer@universe-factory.net, rshearma@brocade.com,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
ebiederm@xmission.com, gorcunov@virtuozzo.com
Subject: Re: [PATCH] net: Convert net_mutex into rw_semaphore and down read it on net->init/->exit
Date: Tue, 14 Nov 2017 09:44:55 -0800
Message-ID: <20171114174454.GA11452@outlook.office365.com>
In-Reply-To: <151066759055.14465.9783879083192000862.stgit@localhost.localdomain>
On Tue, Nov 14, 2017 at 04:53:33PM +0300, Kirill Tkhai wrote:
> Currently a mutex is used to protect the pernet operations list. It makes
> cleanup_net() execute the ->exit methods of the same operations set
> that was used at the time of ->init, even after the net namespace is
> unlinked from net_namespace_list.
>
> But the problem is that synchronize_rcu() is needed after net is removed
> from net_namespace_list:
>
> Destroy net_ns:
>     cleanup_net()
>         mutex_lock(&net_mutex)
>         list_del_rcu(&net->list)
>         synchronize_rcu()                  <--- Sleep there for ages
>         list_for_each_entry_reverse(ops, &pernet_list, list)
>             ops_exit_list(ops, &net_exit_list)
>         list_for_each_entry_reverse(ops, &pernet_list, list)
>             ops_free_list(ops, &net_exit_list)
>         mutex_unlock(&net_mutex)
>
> This primitive is not fast, especially on systems with many processors
> and/or when preemptible RCU is enabled in the config. So for the whole time
> cleanup_net() is waiting for the RCU grace period, creation of new net
> namespaces is not possible; the tasks that attempt it sleep on the same mutex:
>
> Create net_ns:
>     copy_net_ns()
>         mutex_lock_killable(&net_mutex)    <--- Sleep there for ages
>
> The solution is to convert net_mutex to an rw_semaphore. Then the
> pernet_operations::init/::exit methods, which modify the net-related data,
> will require down_read() locking only, while down_write() will be used
> for changing pernet_list.
>
> This gives a significant performance increase, as you may see below.
> Measured is sequential net namespace creation in a loop, in a single
> thread, without other tasks (single user mode):
>
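> 1)
> #define _GNU_SOURCE	/* for unshare() and CLONE_NEWNET */
> #include <sched.h>
> #include <stdio.h>
> #include <stdlib.h>
>
> int main(int argc, char *argv[])
> {
> 	unsigned nr;
>
> 	if (argc < 2) {
> 		fprintf(stderr, "Provide nr iterations arg\n");
> 		return 1;
> 	}
> 	nr = atoi(argv[1]);
> 	/* Each iteration creates a new net ns and drops the previous one */
> 	while (nr-- > 0) {
> 		if (unshare(CLONE_NEWNET)) {
> 			perror("Can't unshare");
> 			return 1;
> 		}
> 	}
> 	return 0;
> }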
>
> Origin, 100000 unshare():
> 0.03user 23.14system 1:39.85elapsed 23%CPU
>
> Patched, 100000 unshare():
> 0.03user 67.49system 1:08.34elapsed 98%CPU
>
> 2) for i in {1..10000}; do unshare -n bash -c exit; done
Hi Kirill,

This mutex has another role. You know that net namespaces are destroyed
asynchronously, and net_mutex guarantees that the backlog will not grow
big. If we have something in the backlog, we know that it will be handled
before a new net ns is created.

As far as I remember, net namespaces are created much faster than
they are destroyed, so with this change we can create a really big
backlog, can't we?

There was a discussion a few months ago:
https://lists.onap.org/pipermail/containers/2016-October/037509.html
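
To illustrate the backlog concern, here is a toy model in user-space C.
It is only a sketch, not kernel code: the function name peak_backlog()
and the per-tick rates are made up for illustration, and it assumes that
creation either always waits for pending cleanup (the side effect of
net_mutex) or never waits at all.

```c
#include <assert.h>

/*
 * Toy model, not kernel code. Per tick, `create_rate` namespaces are
 * created (each one leaves an old namespace queued for cleanup) and the
 * cleanup worker can destroy at most `destroy_rate` of the queued ones.
 *
 * serialized != 0: creation waits until the queue is drained, which is
 *                  the throttling side effect of net_mutex.
 * serialized == 0: creation never waits for cleanup.
 *
 * Returns the largest backlog seen over `ticks` ticks.
 */
unsigned long peak_backlog(unsigned int ticks, unsigned int create_rate,
			   unsigned int destroy_rate, int serialized)
{
	unsigned long backlog = 0, peak = 0;
	unsigned int t;

	assert(destroy_rate > 0);

	for (t = 0; t < ticks; t++) {
		if (serialized)
			backlog = 0;	/* creator slept until cleanup finished */
		else if (backlog > destroy_rate)
			backlog -= destroy_rate;
		else
			backlog = 0;

		backlog += create_rate;	/* newly released namespaces queue up */
		if (backlog > peak)
			peak = backlog;
	}
	return peak;
}
```

With the hypothetical rates of 10 creations and 1 destruction per tick,
peak_backlog(1000, 10, 1, 1) stays at 10, while
peak_backlog(1000, 10, 1, 0) reaches 9001: without the throttling, the
backlog grows linearly for as long as creation outpaces destruction.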
>
> Origin:
> real 1m24,190s
> user 0m6,225s
> sys 0m15,132s

Here you measure the time of both creating and destroying net namespaces.

>
> Patched:
> real 0m18,235s (4.6 times faster)
> user 0m4,544s
> sys 0m13,796s

But here you measure only the time of creating namespaces, and you know
nothing about when they will be destroyed.

Thanks,
Andrei