From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
davem@davemloft.net, akpm@linux-foundation.org,
adobriyan@gmail.com, rui.xiang@huawei.com,
viro@zeniv.linux.org.uk, oleg@redhat.com, gorcunov@openvz.org,
kirill.shutemov@linux.intel.com, grant.likely@secretlab.ca,
tytso@mit.edu, Thierry Herbelot <thierry.herbelot@6wind.com>
Subject: Re: [RFC PATCH linux 2/2] fs/proc: use a hash table for the directory entries
Date: Fri, 03 Oct 2014 15:09:29 +0200
Message-ID: <542EA009.4060009@6wind.com>
In-Reply-To: <87h9zmpcz5.fsf@x220.int.ebiederm.org>
On 02/10/2014 20:01, Eric W. Biederman wrote:
> Nicolas Dichtel <nicolas.dichtel@6wind.com> writes:
>
>> From: Thierry Herbelot <thierry.herbelot@6wind.com>
>>
>> The current implementation of the directories in /proc uses a single
>> linked list. This is slow when handling directories with large numbers of
>> entries (e.g. netdevice-related entries when lots of tunnels are opened).
>>
>> This patch enables multiple linked lists. A hash based on the entry name is
>> used to select the linked list for a given entry.
>>
>> Netdevice creation is faster because shorter linked lists must be
>> scanned when adding a new netdevice.
>
> Is the directory of primary concern /proc/net/dev_snmp6 ?
Yes.
>
> Unless I have configured my networking stack weird by mistake that
> is the only directory under /proc/net that grows when we add an
> interface.
>
> I just want to make certain I am seeing the same things that you are
> seeing.
>
> I feel silly for overlooking this directory when the rest of the
> scalability work was done.
>
>> Here are some numbers:
>>
>> dummy30000.batch contains 'link add type dummy' repeated 30 000 times.
>>
>> Before the patch:
>> time ip -b dummy30000.batch
>> real 2m32.221s
>> user 0m0.380s
>> sys 2m30.610s
>>
>> After the patch:
>> time ip -b dummy30000.batch
>> real 1m57.190s
>> user 0m0.350s
>> sys 1m56.120s
>>
>> The single 'subdir' list head is replaced by a subdir hash table. The subdir
>> hash buckets are only allocated for directories. The number of hash buckets
>> is a compile-time parameter.
>
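The bucket selection described in the patch text above can be sketched in userspace C. This is only an illustration under stated assumptions, not the patch itself: `NBUCKETS`, `struct dir`, `struct entry`, `name_hash()`, `dir_lookup()` and `dir_add()` are hypothetical names, and the djb2-style string hash is picked for brevity rather than taken from the kernel.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 16 /* hypothetical compile-time bucket count */

struct entry {
	const char *name;
	struct entry *next; /* next entry in the same bucket */
};

struct dir {
	struct entry *buckets[NBUCKETS]; /* one short list per bucket */
};

/* Map an entry name to a bucket index (djb2-style hash, assumption). */
unsigned int name_hash(const char *name)
{
	unsigned int h = 5381;

	assert(name != NULL);
	while (*name)
		h = h * 33 + (unsigned char)*name++;
	return h % NBUCKETS;
}

/* Scan only the bucket selected by the name, not every entry. */
struct entry *dir_lookup(struct dir *d, const char *name)
{
	struct entry *e;

	for (e = d->buckets[name_hash(name)]; e; e = e->next)
		if (strcmp(e->name, name) == 0)
			return e;
	return NULL;
}

/* Add an entry; returns 0 on success, -1 on duplicate or allocation failure. */
int dir_add(struct dir *d, const char *name)
{
	struct entry *e;
	unsigned int b = name_hash(name);

	if (dir_lookup(d, name))
		return -1;
	e = malloc(sizeof(*e));
	if (!e)
		return -1;
	e->name = name; /* caller keeps the string alive */
	e->next = d->buckets[b];
	d->buckets[b] = e;
	return 0;
}
```

Calling `dir_add(&d, "dummy0")` on a zero-initialized `struct dir d` walks one bucket for the duplicate check, so each scan covers roughly 1/NBUCKETS of the entries when names hash evenly.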
> That looks like a nice speed up. A couple of things.
>
> With sysfs and sysctl, when faced with this class of challenge, we used an
> rbtree instead of a hash table. That should use less memory and scale
> better.
>
> I am concerned about a fixed sized hash table moving the location where
> we fall off a cliff but not removing the cliff itself.
>
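The "cliff" concern quoted above can be made concrete with a rough count of name comparisons. The estimate assumes that each of the N insertions scans the existing entries for a duplicate name, that names spread evenly over B buckets (B is a hypothetical bucket count, not a value from the patch), and it ignores every other cost of creating an interface, so it does not predict the measured wall-clock times.

\[
C_{\text{list}}(N) = \sum_{k=0}^{N-1} k = \frac{N(N-1)}{2},
\qquad
C_{\text{hash}}(N, B) \approx \frac{N^2}{2B},
\qquad
C_{\text{tree}}(N) \approx N \log_2 N
\]

\[
N = 30\,000:\quad
C_{\text{list}} = \frac{30\,000 \times 29\,999}{2} \approx 4.5 \times 10^{8},
\qquad
C_{\text{tree}} \approx 30\,000 \times 14.9 \approx 4.5 \times 10^{5}
\]

A fixed B divides the quadratic term by a constant but leaves it quadratic, whereas a balanced tree changes the growth rate itself.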
> I suppose it would be possible to use the new fancy resizable hash
> tables but previous work on sysctl and sysfs suggests that we don't look
> up these entries sufficiently to require a hash table. We just need a
> data structure that doesn't fall over at scale, and the rbtrees seem to
> do that very nicely.
Ok, I will have a look at it.
Thank you,
Nicolas