From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
    davem@davemloft.net, akpm@linux-foundation.org,
    adobriyan@gmail.com, rui.xiang@huawei.com,
    viro@zeniv.linux.org.uk, oleg@redhat.com, gorcunov@openvz.org,
    kirill.shutemov@linux.intel.com, grant.likely@secretlab.ca,
    tytso@mit.edu, Thierry Herbelot <thierry.herbelot@6wind.com>
Subject: Re: [RFC PATCH linux 2/2] fs/proc: use a hash table for the directory entries
Date: Fri, 03 Oct 2014 15:09:29 +0200
Message-ID: <542EA009.4060009@6wind.com>
In-Reply-To: <87h9zmpcz5.fsf@x220.int.ebiederm.org>

On 02/10/2014 20:01, Eric W. Biederman wrote:
> Nicolas Dichtel <nicolas.dichtel@6wind.com> writes:
>
>> From: Thierry Herbelot <thierry.herbelot@6wind.com>
>>
>> The current implementation of the directories in /proc uses a single
>> linked list. This is slow when handling directories with large numbers of
>> entries (e.g. netdevice-related entries when lots of tunnels are opened).
>>
>> This patch enables multiple linked lists. A hash based on the entry name is
>> used to select the linked list for one given entry.
>>
>> Creating netdevices is faster because shorter linked lists must be
>> scanned when adding a new netdevice.
>
> Is the directory of primary concern /proc/net/dev/snmp6 ?

Yes.

>
> Unless I have configured my networking stack weird by mistake that
> is the only directory under /proc/net that grows when we add an
> interface.
>
> I just want to make certain I am seeing the same things that you are
> seeing.
>
> I feel silly for overlooking this directory when the rest of the
> scalability work was done.
>
>> Here are some numbers:
>>
>> dummy30000.batch contains 30 000 times 'link add type dummy'.
>>
>> Before the patch:
>> time ip -b dummy30000.batch
>> real 2m32.221s
>> user 0m0.380s
>> sys 2m30.610s
>>
>> After the patch:
>> time ip -b dummy30000.batch
>> real 1m57.190s
>> user 0m0.350s
>> sys 1m56.120s
>>
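For anyone wanting to repeat the measurement quoted above, a batch file of that shape can be produced with a few lines of Python. This is only a sketch: the helper names are made up here, and the only things taken from the description are the file name and the repeated 'link add type dummy' line.

```python
def dummy_batch_text(count=30000):
    """Return `count` lines of 'link add type dummy', one command per line."""
    return "link add type dummy\n" * count


def write_dummy_batch(path="dummy30000.batch", count=30000):
    """Write the batch text to `path` (hypothetical helper, not part of iproute2)."""
    with open(path, "w") as f:
        f.write(dummy_batch_text(count))
```

After calling write_dummy_batch(), the timing would be taken as in the quoted numbers, with 'time ip -b dummy30000.batch', run as root and preferably on a scratch machine.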
>> The single 'subdir' list head is replaced by a subdir hash table. The subdir
>> hash buckets are only allocated for directories. The number of hash buckets
>> is a compile-time parameter.
>
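The scheme described in the quoted paragraph (several chains, selected by a hash of the entry name) can be sketched in userspace as follows. This is Python for illustration only, not the kernel code from the patch: the class name, the bucket count of 16 and the use of crc32 as the hash are all assumptions made for the sketch.

```python
import zlib

NUM_BUCKETS = 16  # assumed value; the patch makes this a compile-time parameter


class HashedDirSketch:
    """A directory whose entries are spread over several chains by name hash."""

    def __init__(self, num_buckets=NUM_BUCKETS):
        self.buckets = [[] for _ in range(num_buckets)]

    def _chain(self, name):
        # crc32 is a stand-in; the hash used by the patch is not shown here.
        index = zlib.crc32(name.encode()) % len(self.buckets)
        return self.buckets[index]

    def add(self, name):
        chain = self._chain(name)
        # The duplicate check scans one chain instead of every entry.
        if name in chain:
            raise FileExistsError(name)
        chain.append(name)

    def lookup(self, name):
        return name in self._chain(name)

    def longest_chain(self):
        return max(len(chain) for chain in self.buckets)
```

With n entries and a fixed number of buckets, the longest chain holds at least n divided by the bucket count, so each scan gets shorter by a constant factor but still grows linearly with the number of entries.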
> That looks like a nice speed up. A couple of things.
>
> With sysfs and sysctl, when faced with this class of challenge, we used
> an rbtree instead of a hash table. That should use less memory and scale
> better.
>
> I am concerned about a fixed sized hash table moving the location where
> we fall off a cliff but not removing the cliff itself.
>
> I suppose it would be possible to use the new fancy resizable hash
> tables but previous work on sysctl and sysfs suggests that we don't look
> up these entries sufficiently to require a hash table. We just need a
> data structure that doesn't fall over at scale, and the rbtrees seem to
> do that very nicely.

Ok, I will have a look at it.

Thank you,
Nicolas
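
To illustrate the rbtree suggestion quoted above, here is a small userspace sketch of a name-ordered balanced tree. It is Python, not kernel code, and it uses a left-leaning red-black tree (a compact variant picked for brevity) rather than the kernel's rbtree library; all names in it are hypothetical. The point it shows is that the search path stays logarithmic in the number of entries, with no bucket count to tune.

```python
class _Node:
    def __init__(self, name):
        self.name = name
        self.left = None
        self.right = None
        self.red = True  # new nodes are attached with a red link


def _is_red(node):
    return node is not None and node.red


def _rotate_left(h):
    x = h.right
    h.right, x.left = x.left, h
    x.red, h.red = h.red, True
    return x


def _rotate_right(h):
    x = h.left
    h.left, x.right = x.right, h
    x.red, h.red = h.red, True
    return x


def _insert(h, name):
    if h is None:
        return _Node(name)
    if name < h.name:
        h.left = _insert(h.left, name)
    elif name > h.name:
        h.right = _insert(h.right, name)
    else:
        raise FileExistsError(name)
    # Restore the left-leaning red-black invariants on the way back up.
    if _is_red(h.right) and not _is_red(h.left):
        h = _rotate_left(h)
    if _is_red(h.left) and _is_red(h.left.left):
        h = _rotate_right(h)
    if _is_red(h.left) and _is_red(h.right):
        h.red, h.left.red, h.right.red = True, False, False
    return h


def _height(node):
    return 0 if node is None else 1 + max(_height(node.left), _height(node.right))


class TreeDirSketch:
    """A directory whose entries are kept in a balanced tree ordered by name."""

    def __init__(self):
        self.root = None

    def add(self, name):
        self.root = _insert(self.root, name)
        self.root.red = False  # the root is always black

    def lookup(self, name):
        node = self.root
        while node is not None and node.name != name:
            node = node.left if name < node.name else node.right
        return node is not None

    def height(self):
        return _height(self.root)
```

For n entries the height of such a tree is at most 2*log2(n+1) nodes, so adding or finding an entry among 30 000 names visits a few dozen nodes at most.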