netdev.vger.kernel.org archive mirror
From: Alexander Duyck <alexander.duyck@gmail.com>
To: David Miller <davem@davemloft.net>, alexander.h.duyck@redhat.com
Cc: netdev@vger.kernel.org
Subject: Re: [RFC PATCH 00/29] Phase 2 of fib_trie updates
Date: Tue, 24 Feb 2015 21:12:48 -0800
Message-ID: <54ED59D0.9010401@gmail.com>
In-Reply-To: <20150224.225300.107506639649235664.davem@davemloft.net>

On 02/24/2015 07:53 PM, David Miller wrote:
> From: Alexander Duyck <alexander.h.duyck@redhat.com>
> Date: Tue, 24 Feb 2015 12:47:55 -0800
>
>> This patch series implements the second phase of the fib_trie changes.  I
>> presented on these and the previous changes at Netdev01 and netconf.  The
>> slides for the Netdev01 presentation can be found at
>> https://www.netdev01.org/docs/duyck-fib-trie.pdf.
>>
>> I'm currently debating if I should just submit the entire patch-set as-is
>> or if I should hold off on submitting the last 10 patches as they currently
>> have a potential performance impact in the case of a large number of
>> entries placed in the local table.  Specifically I have seen that removing
>> an interface in the case of 8K local subnets being configured on it
>> resulted in the time for a dummy interface being removed increasing from
>> about .6 seconds to 2.4 seconds.  I am not sure how common of a use-case
>> something like this would be.  I have not seen the same issue if I assign
>> 8K routes to the interface as I believe the fib_table_flush aggregates them
>> all in to one resize action.
>>
>> The entire series reduces the total look-up time by another 20-35% versus
>> what is currently in the 4.0-rc1 kernel.  So for example a set of routing
>> look-ups which took 140ns in the 4.0-rc1 kernel will now only take about
>> 105ns after these patches.
> I did a quick once-over for these changes and conceptually they look
> fine.
>
> Why are sequences of removals so much more costly now?  Is it because
> of the maintenance of the information in the parent when rebalancing?
>
> In any event, I'll say two things:
>
> 1) You should submit these changes in smaller batches anyways.
>    It's easier to review and get small sets of transformations
>    tested as a unit.

Yeah, these will probably be submitted as 3 sets: first the leaf_info
removal, then the key_vector stuff, and finally the RCU rework and
pushing everything up one level so the pointer and key info occupy the
same cache line.

> 2) For the device removal case, we can batch the inet addr removal
>    based route delete operations, and thus mitigate the rebalancing
>    costs.

The problem is that the tnodes are now split over 2 cache lines.  As a
result, in order to resize a node, or replace it with the leaf contained
in the node, you end up having to replace the parent of the node as well.

As it turns out, dropping a subnet from the local trie occurs in two
steps.  The first appears to drop the broadcast addresses and flush
them.  This causes significant overhead, since the kernel has to
reallocate the 8K child tnode each time a subnet/child collapses from a
4 child tnode to just a leaf.  Then it looks like the kernel goes
through and deletes the local addresses that were there for each subnet
one at a time.  This was much cheaper in the old setup, since it was
just a matter of swapping a pointer instead of having to update a
pointer and key information.
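
To make the cost difference concrete, here is a rough userspace sketch of
the two layouts.  It is only a toy model under stated assumptions: the
struct and function names (old_tnode, new_tnode, key_vector_sketch,
old_collapse_cost, new_collapse_cost) are made up for illustration and
are not the definitions from the patches, and plain malloc/free stands in
for the RCU-deferred freeing the kernel would do.  It counts the bytes
written to the parent when each of n children collapses to a leaf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
struct old_tnode {
	size_t nchildren;
	void *child[];			/* parent holds only child pointers */
};

struct key_vector_sketch {
	uint32_t key;
	uint8_t pos, bits, slen;
	void *ptr;
};

struct new_tnode {
	size_t nchildren;
	struct key_vector_sketch kv[];	/* parent holds key info and pointer */
};

/* Old layout: swapping in a leaf is a single pointer store. */
static size_t old_replace_child(struct old_tnode *tn, size_t i, void *leaf)
{
	assert(i < tn->nchildren);
	tn->child[i] = leaf;	/* the kernel would use rcu_assign_pointer() */
	return sizeof(tn->child[i]);
}

/*
 * New layout (as modelled here): key and pointer cannot be updated as one
 * atomic store, so a fresh copy of the whole parent is built instead.
 */
static struct new_tnode *new_replace_child(const struct new_tnode *tn,
					   size_t i,
					   struct key_vector_sketch kv,
					   size_t *bytes_copied)
{
	size_t len = tn->nchildren * sizeof(tn->kv[0]);
	struct new_tnode *copy = malloc(sizeof(*copy) + len);

	assert(i < tn->nchildren);
	if (!copy)
		return NULL;
	copy->nchildren = tn->nchildren;
	memcpy(copy->kv, tn->kv, len);
	copy->kv[i] = kv;
	*bytes_copied += len;
	return copy;
}

/* Bytes written to the parent when all n children collapse, old layout. */
size_t old_collapse_cost(size_t n)
{
	struct old_tnode *tn = calloc(1, sizeof(*tn) + n * sizeof(void *));
	static int leaf;
	size_t total = 0;

	if (!tn)
		return 0;
	tn->nchildren = n;
	for (size_t i = 0; i < n; i++)
		total += old_replace_child(tn, i, &leaf);
	free(tn);
	return total;
}

/* Same sequence of collapses with the parent reallocated every time. */
size_t new_collapse_cost(size_t n)
{
	struct new_tnode *tn =
		calloc(1, sizeof(*tn) + n * sizeof(struct key_vector_sketch));
	static int leaf;
	size_t total = 0;

	if (!tn)
		return 0;
	tn->nchildren = n;
	for (size_t i = 0; i < n; i++) {
		struct key_vector_sketch kv = { .key = (uint32_t)i, .ptr = &leaf };
		struct new_tnode *copy = new_replace_child(tn, i, kv, &total);

		if (!copy)
			break;
		free(tn);	/* the kernel would defer this past a grace period */
		tn = copy;
	}
	free(tn);
	return total;
}
```

In this model old_collapse_cost(n) grows linearly with n, while
new_collapse_cost(n) grows with n * n, which is the shape of the slowdown
described above for the 8K subnet case.  The real code differs in many
details, so the absolute numbers from this sketch should not be read as a
measurement.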

- Alex

