From: Stephen Hemminger <stephen@networkplumber.org>
To: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Cc: netdev@vger.kernel.org, roopa@cumulusnetworks.com,
bridge@lists.linux-foundation.org, davem@davemloft.net,
srn@prgmr.com
Subject: Re: [Bridge] [PATCH net-next] net: bridge: use rhashtable for fdbs
Date: Tue, 12 Dec 2017 10:07:13 -0800
Message-ID: <20171212100713.6c24c9c3@xeon-e3>
In-Reply-To: <1513087370-4791-1-git-send-email-nikolay@cumulusnetworks.com>
On Tue, 12 Dec 2017 16:02:50 +0200
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
> Before this patch the bridge used a fixed 256 element hash table which
> was fine for small use cases (in my tests it starts to degrade
> above 1000 entries), but it wasn't enough for medium or large
> scale deployments. Modern setups have thousands of participants in a
> single bridge, even only enabling vlans and adding a few thousand vlan
> entries will cause a few thousand fdbs to be automatically inserted per
> participating port. So we need to scale the fdb table considerably to
> cope with modern workloads, and this patch converts it to use a
> rhashtable for its operations thus improving the bridge scalability.
> Tests show the following results (10 runs each), at up to 1000 entries
> rhashtable is ~3% slower, at 2000 rhashtable is 30% faster, at 3000 it
> is 2 times faster and at 30000 it is 50 times faster.
> Obviously this happens because of the properties of the two constructs
> and is expected, rhashtable keeps pretty much a constant time even with
> 10000000 entries (tested), while the fixed hash table struggles
> considerably even above 10000.
> As a side effect this also reduces the net_bridge struct size from 3248
> bytes to 1344 bytes. Also note that the key struct is 8 bytes.
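
To put rough numbers on the scaling claim quoted above: with a fixed 256-bucket chained table, the average chain length grows linearly with the number of entries. The sketch below is only back-of-the-envelope arithmetic and assumes a uniform hash; the function name is made up for illustration and it says nothing about the measured timings in the patch description.

```python
# Average chain length for a fixed-size chained hash table.
# Assumption: the hash spreads keys uniformly over the buckets.
FIXED_BUCKETS = 256  # bucket count of the old fdb table, per the patch description


def average_chain_length(entries: int, buckets: int = FIXED_BUCKETS) -> float:
    """Return the mean number of entries per bucket."""
    return entries / buckets


if __name__ == "__main__":
    # Entry counts taken from the test results quoted in the patch description.
    for entries in (1000, 2000, 3000, 30000):
        print(f"{entries:>6} entries -> {average_chain_length(entries):.1f} per bucket")
```

A resizable table keeps this ratio bounded by growing the bucket array, which is consistent with the roughly constant lookup time reported for rhashtable.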
>
> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> ---
Thanks for doing this, it was on my list of things that never get done.
Some downsides:
* size of the FDB entry gets larger.
* you lost the ability to salt the hash (and rekey), which is important
  for resisting DDoS attacks
* being slower for small tables (<10 entries) also matters, and that is a
  common use case for containers.
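
To illustrate the salting and rekeying point in the list above, here is a minimal sketch, not the bridge code. It assumes a 256-bucket chained table keyed by a 6-byte MAC plus a 2-byte VLAN id, and uses keyed BLAKE2b from Python's hashlib as a stand-in for whatever hash the kernel uses. The class and method names are hypothetical.

```python
import hashlib
import os


class SaltedTable:
    """Toy chained hash table whose bucket choice depends on a secret salt."""

    def __init__(self, nbuckets: int = 256):
        self.nbuckets = nbuckets
        self.salt = os.urandom(16)  # secret, so an attacker cannot precompute collisions
        self.buckets = [[] for _ in range(nbuckets)]

    def _index(self, mac: bytes, vlan: int) -> int:
        # Assumed key layout: 6-byte MAC followed by a 2-byte VLAN id.
        key = mac + vlan.to_bytes(2, "big")
        digest = hashlib.blake2b(key, digest_size=8, key=self.salt).digest()
        return int.from_bytes(digest, "big") % self.nbuckets

    def insert(self, mac: bytes, vlan: int, port: str) -> None:
        chain = self.buckets[self._index(mac, vlan)]
        for i, (m, v, _) in enumerate(chain):
            if m == mac and v == vlan:
                chain[i] = (mac, vlan, port)  # update an existing entry
                return
        chain.append((mac, vlan, port))

    def lookup(self, mac: bytes, vlan: int):
        for m, v, port in self.buckets[self._index(mac, vlan)]:
            if m == mac and v == vlan:
                return port
        return None

    def rekey(self) -> None:
        """Pick a fresh salt and redistribute every entry under it."""
        entries = [e for chain in self.buckets for e in chain]
        self.salt = os.urandom(16)
        self.buckets = [[] for _ in range(self.nbuckets)]
        for mac, vlan, port in entries:
            self.insert(mac, vlan, port)


if __name__ == "__main__":
    table = SaltedTable()
    table.insert(bytes.fromhex("020000000001"), 10, "eth0")
    table.rekey()
    print(table.lookup(bytes.fromhex("020000000001"), 10))
```

Without the salt, anyone who knows the hash function can craft MAC addresses that all land in one bucket. Rekeying throws away whatever an attacker has learned about the current bucket layout.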