From: Alexander Duyck <alexander.h.duyck@redhat.com>
To: David Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Subject: Re: [RFC PATCH 02/17] fib_trie: Make leaf and tnode more uniform
Date: Mon, 22 Dec 2014 15:38:53 -0800 [thread overview]
Message-ID: <5498AB8D.8070301@redhat.com> (raw)
In-Reply-To: <20141222.165015.1694288362513627826.davem@davemloft.net>
On 12/22/2014 01:50 PM, David Miller wrote:
> From: Alexander Duyck <alexander.h.duyck@redhat.com>
> Date: Mon, 22 Dec 2014 10:55:17 -0800
>
>> My real concern with all of this is the fact that we have to do 2
>> separate memory reads per node, one for the key info and one for the
>> child pointer. I really think we need to get this down to 1 in order
>> to get there, but the overhead is the tricky part for that. What I
>> would look at doing is splitting the tnode into two parts. One would
>> be a key vector (key, pos, bits, seq) paired with a pointer to either
>> a tnode_info or leaf_info, the other would be something like a
>> tnode_info (rcu, parent pointer, full_children, empty_children, key
>> vector array[0]) that provides a means of backtracing and stores the
>> nodes. The problem is it makes insertion/deletion and backtracking
>> more complicated and doubles (64b) or quadruples (32b) the memory
>> needed as such I am still just throwing the idea around and haven't
>> gotten into implementation yet.
> I think calling into this code twice for every non-local FIB lookup
> has costs as well.
Agreed. I remember when you sent me the patch to test out merging the
two tries a few years back that did make a significant difference.
By my math the difference with these patches is probably around 26ns
extra, 113ns vs 87ns, to do the second table lookup. The changes in
adding/checking the suffix length and the fact that we validate the key
is usable as a prefix before reading the leaf_info made the first table
lookup much cheaper than it was before so merging the tables should
provide less of a gain.
What it will come down to is if the cost to backtrack however many bits
it takes to get to the route is greater than the cost to start the
search over in the main trie. In the case of subnets with a short
prefix you may end up finding that the cost to backtrack will be more
expensive for addresses at the end of the address range with a large
number of bits set in the host identifier.
> And yes I agree with you that the memory references matter a lot.
Especially now. Considering the loop for lookup is fairly small the
only thing that is really holding it back is the memory access latency.
That is why I am playing around with the idea of doubling the memory
footprint for the trie just to make it so that the key and the pointer
could co-exist in the same 16B region. I'm hoping it would allow us to
cut total memory access latency in half and double the speed at which we
could process tnodes.
Thanks for the feedback.
- Alex
next prev parent reply other threads:[~2014-12-22 23:38 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-12-22 17:40 [RFC PATCH 00/17] fib_trie: Reduce time spent in fib_table_lookup by 35 to 75% Alexander Duyck
2014-12-22 17:40 ` [RFC PATCH 01/17] fib_trie: Update usage stats to be percpu instead of global variables Alexander Duyck
2014-12-22 19:21 ` Cong Wang
2014-12-22 17:41 ` [RFC PATCH 02/17] fib_trie: Make leaf and tnode more uniform Alexander Duyck
2014-12-22 18:33 ` David Miller
2014-12-22 18:55 ` Alexander Duyck
2014-12-22 21:50 ` David Miller
2014-12-22 23:38 ` Alexander Duyck [this message]
2014-12-22 17:41 ` [RFC PATCH 03/17] fib_trie: Merge tnode_free and leaf_free into node_free Alexander Duyck
2014-12-22 17:41 ` [RFC PATCH 04/17] fib_trie: Merge leaf into tnode Alexander Duyck
2014-12-22 17:41 ` [RFC PATCH 05/17] fib_trie: Optimize fib_table_lookup to avoid wasting time on loops/variables Alexander Duyck
2014-12-22 18:30 ` David Miller
2014-12-22 17:41 ` [RFC PATCH 06/17] fib_trie: Optimize fib_find_node Alexander Duyck
2014-12-22 17:41 ` [RFC PATCH 07/17] fib_trie: Optimize fib_table_insert Alexander Duyck
2014-12-22 17:41 ` [RFC PATCH 08/17] fib_trie: Update meaning of pos to represent unchecked bits Alexander Duyck
2014-12-22 17:41 ` [RFC PATCH 09/17] fib_trie: Use unsigned long for anything dealing with a shift by bits Alexander Duyck
2014-12-22 17:41 ` [RFC PATCH 10/17] fib_trie: Push rcu_read_lock/unlock to callers Alexander Duyck
2014-12-22 17:42 ` [RFC PATCH 11/17] fib_trie: Move resize to after inflate/halve Alexander Duyck
2014-12-22 17:42 ` [RFC PATCH 12/17] fib_trie: Add functions should_inflate and should_halve Alexander Duyck
2014-12-22 17:42 ` [RFC PATCH 13/17] fib_trie: Push assignment of child to parent down into inflate/halve Alexander Duyck
2014-12-22 17:42 ` [RFC PATCH 14/17] fib_trie: Push tnode flushing down to inflate/halve Alexander Duyck
2014-12-22 17:42 ` [RFC PATCH 15/17] fib_trie: inflate/halve nodes in a more RCU friendly way Alexander Duyck
2014-12-22 17:42 ` [RFC PATCH 16/17] fib_trie: Remove checks for index >= tnode_child_length from tnode_get_child Alexander Duyck
2014-12-22 17:42 ` [RFC PATCH 17/17] fib_trie: Add tracking value for suffix length Alexander Duyck
2014-12-22 18:32 ` David Miller
2014-12-22 18:08 ` [RFC PATCH 00/17] fib_trie: Reduce time spent in fib_table_lookup by 35 to 75% Dave Taht
2014-12-22 18:38 ` Alexander Duyck
2014-12-22 18:59 ` Dave Taht
2014-12-22 18:53 ` David Miller
2014-12-22 18:35 ` David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5498AB8D.8070301@redhat.com \
--to=alexander.h.duyck@redhat.com \
--cc=davem@davemloft.net \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).