From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
davem@davemloft.net, akpm@linux-foundation.org,
adobriyan@gmail.com, rui.xiang@huawei.com,
viro@zeniv.linux.org.uk, oleg@redhat.com, gorcunov@openvz.org,
kirill.shutemov@linux.intel.com, grant.likely@secretlab.ca,
tytso@mit.edu, Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH linux v3 1/1] fs/proc: use a rb tree for the directory entries
Date: Wed, 15 Oct 2014 11:02:55 +0200 [thread overview]
Message-ID: <543E383F.5010907@6wind.com> (raw)
In-Reply-To: <87y4sis9wk.fsf@x220.int.ebiederm.org>
Le 14/10/2014 21:56, Eric W. Biederman a écrit :
> Nicolas Dichtel <nicolas.dichtel@6wind.com> writes:
>
>> Le 07/10/2014 11:02, Nicolas Dichtel a écrit :
>>> The current implementation for the directories in /proc is using a single
>>> linked list. This is slow when handling directories with large numbers of
>>> entries (eg netdevice-related entries when lots of tunnels are opened).
>>>
>>> This patch replaces this linked list by a red-black tree.
>>>
>>> Here are some numbers:
>>>
>>> dummy30000.batch contains 30 000 times 'link add type dummy'.
>>>
>>> Before the patch:
>>> $ time ip -b dummy30000.batch
>>> real 2m31.950s
>>> user 0m0.440s
>>> sys 2m21.440s
>>> $ time rmmod dummy
>>> real 1m35.764s
>>> user 0m0.000s
>>> sys 1m24.088s
>>>
>>> After the patch:
>>> $ time ip -b dummy30000.batch
>>> real 2m0.874s
>>> user 0m0.448s
>>> sys 1m49.720s
>>> $ time rmmod dummy
>>> real 1m13.988s
>>> user 0m0.000s
>>> sys 1m1.008s
>>>
>>> The idea of improving this part was suggested by
>>> Thierry Herbelot <thierry.herbelot@6wind.com>.
>>>
>>> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
>>> Acked-by: David S. Miller <davem@davemloft.net>
> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
>>> ---
>>
>> I'm not sure who is in charge of taking this patch. Should I resend it to
>> someone else or is it already included in a tree?
>
> There are a couple of things going on here.
>
> This patch came out at the beginning of the merge window which is a time
> when everything that was ready and well tested ahead of time gets
> merged.
>
> Your numbers don't look too bad, so I expect this code is ready to go
> (although I am a smidge disappointed in the small size of the
> performance improvement), my quick read through earlier did not show
> anything scary. Do you have any idea why going from O(N^2) algorithm
> to a O(NlogN) algorithm showed such a small performance improvement with
> 30,000 entries?
perf top shows that a lot of time was wasted in vsscanf().
With my previous test file (dummy30000.batch), kernel needs to calculate
the interface name (iproute2 command was : 'link add type dummy'). Here is
another bench:
Files dummywithnameX.batch are created like this:
for i in `seq 1 X`; do echo "link add dummy$i type dummy" >>
dummywithnameX.batch; done
Before the patch:
$ time ip -b dummywithname10000.batch
real 0m22.496s
user 0m0.196s
sys 0m5.924s
$ rmmod dummy
$ time ip -b dummywithname15000.batch
real 0m37.903s
user 0m0.348s
sys 0m13.160s
$ rmmod dummy
$ time ip -b dummywithname20000.batch
real 0m52.941s
user 0m0.396s
sys 0m20.164s
$ rmmod dummy
$ time ip -b dummywithname30000.batch
real 1m32.447s
user 0m0.660s
sys 0m43.436s
After the patch:
$ time ip -b dummywithname10000.batch
real 0m19.648s
user 0m0.180s
sys 0m2.260s
$ rmmod dummy
$ time ip -b dummywithname15000.batch
real 0m29.149s
user 0m0.256s
sys 0m3.616s
$ rmmod dummy
$ time ip -b dummywithname20000.batch
real 0m39.133s
user 0m0.440s
sys 0m4.756s
$ rmmod dummy
$ time ip -b dummywithname30000.batch
real 0m59.837s
user 0m0.520s
sys 0m7.780s
Thus an improvement of ~35% for 30k ifaces, but more importantly, after the
patch, it scales linearly.
perf top output after the patch:
4.30% [kernel] [k] arch_local_irq_restore
2.92% [kernel] [k] format_decode
2.10% [kernel] [k] vsnprintf
2.08% [kernel] [k] arch_local_irq_enable
1.82% [kernel] [k] number.isra.1
1.81% [kernel] [k] arch_local_irq_enable
1.41% [kernel] [k] kmem_cache_alloc
1.33% [kernel] [k] unmap_single_vma
1.10% [kernel] [k] do_raw_spin_lock
1.09% [kernel] [k] clear_page
1.00% [kernel] [k] arch_local_irq_enable
>
> Normally proc is looked at by a group of folks me, Alexey Dobriyan, and
> Al Viro all sort of tag team taking care of the proc infrastructure with
> (except for Al) Andrew Morton typically taking the patches and merging
> them.
>
> I am currently in the middle of a move so I don't have the time to carry
> this change or do much of anything until I am settled again.
>
> What I would recommend is verifying your patch works against v3.18-rc1
> at the begginning of next week and sending the code to Andrew Morton.
Ok, thank you. I will do this.
Nicolas
next prev parent reply other threads:[~2014-10-15 9:03 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-10-03 13:28 [PATCH net-next] dev: add support of flag IFF_NOPROC Nicolas Dichtel
2013-10-03 13:30 ` [PATCH iproute2 net-next-3.11] ip: add support of link " Nicolas Dichtel
2013-10-03 17:46 ` [PATCH net-next] dev: add support of " Stephen Hemminger
2013-10-03 19:09 ` David Miller
2013-10-04 12:07 ` Nicolas Dichtel
2013-10-04 17:29 ` David Miller
2014-10-02 15:24 ` [RFC PATCH linux 0/2] Optimize network interfaces creation Nicolas Dichtel
2014-10-02 15:25 ` [RFC PATCH linux 1/2] proc_net: declare /proc/net as a directory Nicolas Dichtel
2014-10-02 15:25 ` [RFC PATCH linux 2/2] fs/proc: use a hash table for the directory entries Nicolas Dichtel
2014-10-02 16:46 ` Stephen Hemminger
2014-10-03 13:10 ` Nicolas Dichtel
2014-10-02 17:28 ` Alexey Dobriyan
2014-10-03 13:07 ` Nicolas Dichtel
2014-10-02 18:01 ` Eric W. Biederman
2014-10-02 20:06 ` Alexey Dobriyan
2014-10-02 21:07 ` Eric W. Biederman
2014-10-02 21:27 ` Stephen Hemminger
2014-10-03 7:28 ` Nicolas Dichtel
2014-10-03 13:09 ` Nicolas Dichtel
2014-10-06 14:30 ` [PATCH linux v2 0/1] Optimize network interfaces creation Nicolas Dichtel
2014-10-06 14:30 ` [PATCH linux v2 1/1] fs/proc: use a rb tree for the directory entries Nicolas Dichtel
2014-10-06 22:14 ` David Miller
2014-10-07 9:02 ` [PATCH linux v3 0/1] Optimize network interfaces creation Nicolas Dichtel
2014-10-07 9:02 ` [PATCH linux v3 1/1] fs/proc: use a rb tree for the directory entries Nicolas Dichtel
2014-10-13 11:14 ` Nicolas Dichtel
2014-10-14 19:30 ` David Miller
2014-10-14 19:56 ` Eric W. Biederman
2014-10-15 9:02 ` Nicolas Dichtel [this message]
2014-10-15 21:37 ` Andrew Morton
2014-10-03 10:55 ` [RFC PATCH linux 2/2] fs/proc: use a hash table " Alexey Dobriyan
2014-10-03 13:07 ` Nicolas Dichtel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=543E383F.5010907@6wind.com \
--to=nicolas.dichtel@6wind.com \
--cc=adobriyan@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=davem@davemloft.net \
--cc=ebiederm@xmission.com \
--cc=gorcunov@openvz.org \
--cc=grant.likely@secretlab.ca \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=oleg@redhat.com \
--cc=rui.xiang@huawei.com \
--cc=torvalds@linux-foundation.org \
--cc=tytso@mit.edu \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.