All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	davem@davemloft.net, akpm@linux-foundation.org,
	adobriyan@gmail.com, rui.xiang@huawei.com,
	viro@zeniv.linux.org.uk, oleg@redhat.com, gorcunov@openvz.org,
	kirill.shutemov@linux.intel.com, grant.likely@secretlab.ca,
	tytso@mit.edu, Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH linux v3 1/1] fs/proc: use a rb tree for the directory entries
Date: Wed, 15 Oct 2014 11:02:55 +0200	[thread overview]
Message-ID: <543E383F.5010907@6wind.com> (raw)
In-Reply-To: <87y4sis9wk.fsf@x220.int.ebiederm.org>

Le 14/10/2014 21:56, Eric W. Biederman a écrit :
> Nicolas Dichtel <nicolas.dichtel@6wind.com> writes:
>
>> Le 07/10/2014 11:02, Nicolas Dichtel a écrit :
>>> The current implementation for the directories in /proc is using a single
>>> linked list. This is slow when handling directories with large numbers of
>>> entries (eg netdevice-related entries when lots of tunnels are opened).
>>>
>>> This patch replaces this linked list by a red-black tree.
>>>
>>> Here are some numbers:
>>>
>>> dummy30000.batch contains 30 000 times 'link add type dummy'.
>>>
>>> Before the patch:
>>> $ time ip -b dummy30000.batch
>>> real	2m31.950s
>>> user	0m0.440s
>>> sys	2m21.440s
>>> $ time rmmod dummy
>>> real	1m35.764s
>>> user	0m0.000s
>>> sys	1m24.088s
>>>
>>> After the patch:
>>> $ time ip -b dummy30000.batch
>>> real	2m0.874s
>>> user	0m0.448s
>>> sys	1m49.720s
>>> $ time rmmod dummy
>>> real	1m13.988s
>>> user	0m0.000s
>>> sys	1m1.008s
>>>
>>> The idea of improving this part was suggested by
>>> Thierry Herbelot <thierry.herbelot@6wind.com>.
>>>
>>> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
>>> Acked-by: David S. Miller <davem@davemloft.net>
> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
>>> ---
>>
>> I'm not sure who is in charge of taking this patch. Should I resend it to
>> someone else or is it already included in a tree?
>
> There are a couple of things going on here.
>
> This patch came out at the beginning of the merge window which is a time
> when everything that was ready and well tested ahead of time gets
> merged.
>
> Your numbers don't look too bad, so I expect this code is ready to go
> (although I am a smidge disappointed in the small size of the
> performance improvement), my quick read through earlier did not show
> anything scary.   Do you have any idea why going from O(N^2) algorithm
> to a O(NlogN) algorithm showed such a small performance improvement with
> 30,000 entries?
perf top shows that a lot of time was wasted in vsscanf().
With my previous test file (dummy30000.batch), kernel needs to calculate
the interface name (iproute2 command was : 'link add type dummy'). Here is
another bench:

Files dummywithnameX.batch are created like this:
for i in `seq 1 X`; do echo "link add dummy$i type dummy" >> 
dummywithnameX.batch; done

Before the patch:
$ time ip -b dummywithname10000.batch
real	0m22.496s
user	0m0.196s
sys	0m5.924s
$ rmmod dummy
$ time ip -b dummywithname15000.batch
real	0m37.903s
user	0m0.348s
sys	0m13.160s
$ rmmod dummy
$ time ip -b dummywithname20000.batch
real	0m52.941s
user	0m0.396s
sys	0m20.164s
$ rmmod dummy
$ time ip -b dummywithname30000.batch
real	1m32.447s
user	0m0.660s
sys	0m43.436s

After the patch:
$ time ip -b dummywithname10000.batch
real	0m19.648s
user	0m0.180s
sys	0m2.260s
$ rmmod dummy
$ time ip -b dummywithname15000.batch
real	0m29.149s
user	0m0.256s
sys	0m3.616s
$ rmmod dummy
$ time ip -b dummywithname20000.batch
real	0m39.133s
user	0m0.440s
sys	0m4.756s
$ rmmod dummy
$ time ip -b dummywithname30000.batch
real    0m59.837s
user    0m0.520s
sys     0m7.780s

Thus an improvement of ~35% for 30k ifaces, but more importantly, after the
patch, it scales linearly.

perf top output after the patch:
      4.30%  [kernel]          [k] arch_local_irq_restore
      2.92%  [kernel]          [k] format_decode
      2.10%  [kernel]          [k] vsnprintf
      2.08%  [kernel]          [k] arch_local_irq_enable
      1.82%  [kernel]          [k] number.isra.1
      1.81%  [kernel]          [k] arch_local_irq_enable
      1.41%  [kernel]          [k] kmem_cache_alloc
      1.33%  [kernel]          [k] unmap_single_vma
      1.10%  [kernel]          [k] do_raw_spin_lock
      1.09%  [kernel]          [k] clear_page
      1.00%  [kernel]          [k] arch_local_irq_enable

>
> Normally proc is looked at by a group of folks me, Alexey Dobriyan, and
> Al Viro all sort of tag team taking care of the proc infrastructure with
> (except for Al) Andrew Morton typically taking the patches and merging
> them.
>
> I am currently in the middle of a move so I don't have the time to carry
> this change or do much of anything until I am settled again.
>
> What I would recommend is verifying your patch works against v3.18-rc1
> at the begginning of next week and sending the code to Andrew Morton.
Ok, thank you. I will do this.

Nicolas

  reply	other threads:[~2014-10-15  9:03 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-10-03 13:28 [PATCH net-next] dev: add support of flag IFF_NOPROC Nicolas Dichtel
2013-10-03 13:30 ` [PATCH iproute2 net-next-3.11] ip: add support of link " Nicolas Dichtel
2013-10-03 17:46 ` [PATCH net-next] dev: add support of " Stephen Hemminger
2013-10-03 19:09   ` David Miller
2013-10-04 12:07     ` Nicolas Dichtel
2013-10-04 17:29       ` David Miller
2014-10-02 15:24     ` [RFC PATCH linux 0/2] Optimize network interfaces creation Nicolas Dichtel
2014-10-02 15:25       ` [RFC PATCH linux 1/2] proc_net: declare /proc/net as a directory Nicolas Dichtel
2014-10-02 15:25       ` [RFC PATCH linux 2/2] fs/proc: use a hash table for the directory entries Nicolas Dichtel
2014-10-02 16:46         ` Stephen Hemminger
2014-10-03 13:10           ` Nicolas Dichtel
2014-10-02 17:28         ` Alexey Dobriyan
2014-10-03 13:07           ` Nicolas Dichtel
2014-10-02 18:01         ` Eric W. Biederman
2014-10-02 20:06           ` Alexey Dobriyan
2014-10-02 21:07             ` Eric W. Biederman
2014-10-02 21:27               ` Stephen Hemminger
2014-10-03  7:28               ` Nicolas Dichtel
2014-10-03 13:09           ` Nicolas Dichtel
2014-10-06 14:30             ` [PATCH linux v2 0/1] Optimize network interfaces creation Nicolas Dichtel
2014-10-06 14:30               ` [PATCH linux v2 1/1] fs/proc: use a rb tree for the directory entries Nicolas Dichtel
2014-10-06 22:14                 ` David Miller
2014-10-07  9:02                   ` [PATCH linux v3 0/1] Optimize network interfaces creation Nicolas Dichtel
2014-10-07  9:02                     ` [PATCH linux v3 1/1] fs/proc: use a rb tree for the directory entries Nicolas Dichtel
2014-10-13 11:14                       ` Nicolas Dichtel
2014-10-14 19:30                         ` David Miller
2014-10-14 19:56                         ` Eric W. Biederman
2014-10-15  9:02                           ` Nicolas Dichtel [this message]
2014-10-15 21:37                       ` Andrew Morton
2014-10-03 10:55         ` [RFC PATCH linux 2/2] fs/proc: use a hash table " Alexey Dobriyan
2014-10-03 13:07           ` Nicolas Dichtel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=543E383F.5010907@6wind.com \
    --to=nicolas.dichtel@6wind.com \
    --cc=adobriyan@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=davem@davemloft.net \
    --cc=ebiederm@xmission.com \
    --cc=gorcunov@openvz.org \
    --cc=grant.likely@secretlab.ca \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=rui.xiang@huawei.com \
    --cc=torvalds@linux-foundation.org \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.