From: Stephen Hemminger <stephen@networkplumber.org>
To: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Cc: netdev@vger.kernel.org, roopa@cumulusnetworks.com,
	bridge@lists.linux-foundation.org, davem@davemloft.net,
	srn@prgmr.com
Subject: Re: [Bridge] [PATCH net-next] net: bridge: use rhashtable for fdbs
Date: Tue, 12 Dec 2017 10:07:13 -0800
Message-ID: <20171212100713.6c24c9c3@xeon-e3>
In-Reply-To: <1513087370-4791-1-git-send-email-nikolay@cumulusnetworks.com>

On Tue, 12 Dec 2017 16:02:50 +0200
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:

> Before this patch the bridge used a fixed 256 element hash table which
> was fine for small use cases (in my tests it starts to degrade
> above 1000 entries), but it wasn't enough for medium or large
> scale deployments. Modern setups have thousands of participants in a
> single bridge, even only enabling vlans and adding a few thousand vlan
> entries will cause a few thousand fdbs to be automatically inserted per
> participating port. So we need to scale the fdb table considerably to
> cope with modern workloads, and this patch converts it to use a
> rhashtable for its operations thus improving the bridge scalability.
> Tests show the following results (10 runs each), at up to 1000 entries
> rhashtable is ~3% slower, at 2000 rhashtable is 30% faster, at 3000 it
> is 2 times faster and at 30000 it is 50 times faster.
> Obviously this happens because of the properties of the two constructs
> and is expected, rhashtable keeps pretty much a constant time even with
> 10000000 entries (tested), while the fixed hash table struggles
> considerably even above 10000.
> As a side effect this also reduces the net_bridge struct size from 3248
> bytes to 1344 bytes. Also note that the key struct is 8 bytes.
> 
> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> ---

Thanks for doing this, it was on my list of things that never get done.

Some downsides:
 * size of the FDB entry gets larger.
 * you lost the ability to salt the hash (and rekey), which is important
   for defending against DDoS attacks
 * being slower for small tables (<10 entries) also matters, and that is a
   common use case for containers.
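
To make the salting point and the fixed-table numbers concrete, here is a
rough sketch in Python rather than kernel C. It is only an illustration under
stated assumptions: the names bucket_for and average_chain_length are
hypothetical, the 8-byte key is assumed to be a 6-byte MAC plus a 2-byte VLAN
id, and BLAKE2b stands in for whatever keyed hash the bridge actually uses.

```python
import hashlib
import os

NUM_BUCKETS = 256  # fixed table size from the quoted patch description


def bucket_for(mac: bytes, vlan: int, salt: bytes) -> int:
    """Map a (MAC, VLAN) key to a bucket with a keyed (salted) hash.

    Assumption: key = 6-byte MAC + 2-byte VLAN id (8 bytes in total).
    BLAKE2b is a stand-in here, not the hash the kernel uses.
    """
    key = mac + vlan.to_bytes(2, "big")
    digest = hashlib.blake2b(key, digest_size=4, key=salt).digest()
    return int.from_bytes(digest, "big") % NUM_BUCKETS


def average_chain_length(entries: int, buckets: int = NUM_BUCKETS) -> float:
    """Expected entries per bucket if keys spread evenly over the buckets."""
    return entries / buckets


if __name__ == "__main__":
    mac = bytes.fromhex("020000000001")
    old_salt = os.urandom(16)
    new_salt = os.urandom(16)  # "rekeying" means switching to a fresh salt
    print("bucket before rekey:", bucket_for(mac, 10, old_salt))
    print("bucket after rekey: ", bucket_for(mac, 10, new_salt))
    for n in (1000, 3000, 30000):
        print(n, "entries ->", average_chain_length(n), "per bucket")
```

Without knowing the salt, an attacker cannot predict which addresses collide
in one bucket, and rekeying redistributes any chain that did build up. The
second function shows why a fixed 256-bucket table degrades: even with a
perfect spread, every lookup walks a chain that grows linearly with the
number of entries.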




Thread overview: 12+ messages
2017-12-12 14:02 [Bridge] [PATCH net-next] net: bridge: use rhashtable for fdbs Nikolay Aleksandrov
2017-12-12 14:02 ` Nikolay Aleksandrov
2017-12-12 18:02 ` [Bridge] " Stephen Hemminger
2017-12-12 18:02   ` Stephen Hemminger
2017-12-12 18:18   ` [Bridge] " Nikolay Aleksandrov
2017-12-12 18:18     ` Nikolay Aleksandrov
2017-12-12 18:07 ` Stephen Hemminger [this message]
2017-12-12 18:07   ` Stephen Hemminger
2017-12-12 18:16   ` [Bridge] " Nikolay Aleksandrov
2017-12-12 18:16     ` Nikolay Aleksandrov
2017-12-13 20:10 ` [Bridge] " David Miller
2017-12-13 20:10   ` David Miller
