* [net PATCH] fib_trie: Fix trie balancing issue if new node pushes down existing node
@ 2014-12-11  5:49 Alexander Duyck
  2014-12-12  2:32 ` David Miller
  0 siblings, 1 reply; 7+ messages in thread
From: Alexander Duyck @ 2014-12-11  5:49 UTC (permalink / raw)
  To: netdev; +Cc: davem
This patch addresses an issue with the level compression of the fib_trie.
Specifically in the case of adding a new leaf that triggers a new node to
be added that takes the place of the old node.  The result is a trie where
the 1 child tnode is on one side and one leaf is on the other which gives
you a very deep trie.  Below is the script I used to generate a trie on
dummy0 with a 10.X.X.X family of addresses.
  ip link add type dummy
  ipval=184549374
  bit=2
  for i in `seq 1 23`
  do
    ifconfig dummy0:$bit $ipval/8
    ipval=`expr $ipval - $bit`
    bit=`expr $bit \* 2`
  done
  cat /proc/net/fib_triestat
Running the script before the patch:
	Local:
		Aver depth:     10.82
		Max depth:      23
		Leaves:         29
		Prefixes:       30
		Internal nodes: 27
		  1: 26  2: 1
		Pointers: 56
	Null ptrs: 1
	Total size: 5  kB
After applying the patch and repeating:
	Local:
		Aver depth:     4.72
		Max depth:      9
		Leaves:         29
		Prefixes:       30
		Internal nodes: 12
		  1: 3  2: 2  3: 7
		Pointers: 70
	Null ptrs: 30
	Total size: 4  kB
What this fix does is start the rebalance at the newly created tnode
instead of at the parent tnode.  This way if there is a gap between the
parent and the new node it doesn't prevent the new tnode from being
coalesced with any pre-existing nodes that may have been pushed into one
of the new nodes child branches.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
---
 net/ipv4/fib_trie.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index e9cb258..18bcaf2 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1143,8 +1143,9 @@ static struct list_head *fib_insert_node(struct trie *t, u32 key, int plen)
 			put_child(tp, cindex, (struct rt_trie_node *)tn);
 		} else {
 			rcu_assign_pointer(t->trie, (struct rt_trie_node *)tn);
-			tp = tn;
 		}
+
+		tp = tn;
 	}
 
 	if (tp && tp->pos + tp->bits > 32)
^ permalink raw reply related	[flat|nested] 7+ messages in thread- * Re: [net PATCH] fib_trie: Fix trie balancing issue if new node pushes down existing node
  2014-12-11  5:49 [net PATCH] fib_trie: Fix trie balancing issue if new node pushes down existing node Alexander Duyck
@ 2014-12-12  2:32 ` David Miller
  2014-12-12 15:54   ` David Miller
  2014-12-12 15:55   ` Alexander Duyck
  0 siblings, 2 replies; 7+ messages in thread
From: David Miller @ 2014-12-12  2:32 UTC (permalink / raw)
  To: alexander.h.duyck; +Cc: netdev
From: Alexander Duyck <alexander.h.duyck@redhat.com>
Date: Wed, 10 Dec 2014 21:49:22 -0800
> This patch addresses an issue with the level compression of the fib_trie.
> Specifically in the case of adding a new leaf that triggers a new node to
> be added that takes the place of the old node.  The result is a trie where
> the 1 child tnode is on one side and one leaf is on the other which gives
> you a very deep trie.  Below is the script I used to generate a trie on
> dummy0 with a 10.X.X.X family of addresses.
 ...
> What this fix does is start the rebalance at the newly created tnode
> instead of at the parent tnode.  This way if there is a gap between the
> parent and the new node it doesn't prevent the new tnode from being
> coalesced with any pre-existing nodes that may have been pushed into one
> of the new nodes child branches.
> 
> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
One has to be mindful with this code that what it's doing now might
be intentional.  For example, it might be doing things this way
on purpose in order to minimize rebalancing during route flaps.
Barring anything like that, I think your change is fine.
^ permalink raw reply	[flat|nested] 7+ messages in thread 
- * Re: [net PATCH] fib_trie: Fix trie balancing issue if new node pushes down existing node
  2014-12-12  2:32 ` David Miller
@ 2014-12-12 15:54   ` David Miller
  2014-12-12 16:09     ` Alexander Duyck
  2014-12-12 15:55   ` Alexander Duyck
  1 sibling, 1 reply; 7+ messages in thread
From: David Miller @ 2014-12-12 15:54 UTC (permalink / raw)
  To: alexander.h.duyck; +Cc: netdev
From: David Miller <davem@davemloft.net>
Date: Thu, 11 Dec 2014 21:32:16 -0500 (EST)
> From: Alexander Duyck <alexander.h.duyck@redhat.com>
> Date: Wed, 10 Dec 2014 21:49:22 -0800
> 
>> This patch addresses an issue with the level compression of the fib_trie.
>> Specifically in the case of adding a new leaf that triggers a new node to
>> be added that takes the place of the old node.  The result is a trie where
>> the 1 child tnode is on one side and one leaf is on the other which gives
>> you a very deep trie.  Below is the script I used to generate a trie on
>> dummy0 with a 10.X.X.X family of addresses.
>  ...
>> What this fix does is start the rebalance at the newly created tnode
>> instead of at the parent tnode.  This way if there is a gap between the
>> parent and the new node it doesn't prevent the new tnode from being
>> coalesced with any pre-existing nodes that may have been pushed into one
>> of the new nodes child branches.
>> 
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
> 
> One has to be mindful with this code that what it's doing now might
> be intentional.  For example, it might be doing things this way
> on purpose in order to minimize rebalancing during route flaps.
> 
> Barring anything like that, I think your change is fine.
Alex, in case it isn't clear, I'm hoping that you might have some
thoughts on this aspect of your changes. :)
^ permalink raw reply	[flat|nested] 7+ messages in thread 
- * Re: [net PATCH] fib_trie: Fix trie balancing issue if new node pushes down existing node
  2014-12-12 15:54   ` David Miller
@ 2014-12-12 16:09     ` Alexander Duyck
  2014-12-12 17:05       ` David Laight
  0 siblings, 1 reply; 7+ messages in thread
From: Alexander Duyck @ 2014-12-12 16:09 UTC (permalink / raw)
  To: David Miller, alexander.h.duyck; +Cc: netdev
On 12/12/2014 07:54 AM, David Miller wrote:
> From: David Miller <davem@davemloft.net>
> Date: Thu, 11 Dec 2014 21:32:16 -0500 (EST)
>
>> From: Alexander Duyck <alexander.h.duyck@redhat.com>
>> Date: Wed, 10 Dec 2014 21:49:22 -0800
>>
>>> This patch addresses an issue with the level compression of the fib_trie.
>>> Specifically in the case of adding a new leaf that triggers a new node to
>>> be added that takes the place of the old node.  The result is a trie where
>>> the 1 child tnode is on one side and one leaf is on the other which gives
>>> you a very deep trie.  Below is the script I used to generate a trie on
>>> dummy0 with a 10.X.X.X family of addresses.
>>  ...
>>> What this fix does is start the rebalance at the newly created tnode
>>> instead of at the parent tnode.  This way if there is a gap between the
>>> parent and the new node it doesn't prevent the new tnode from being
>>> coalesced with any pre-existing nodes that may have been pushed into one
>>> of the new nodes child branches.
>>>
>>> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
>> One has to be mindful with this code that what it's doing now might
>> be intentional.  For example, it might be doing things this way
>> on purpose in order to minimize rebalancing during route flaps.
>>
>> Barring anything like that, I think your change is fine.
> Alex, in case it isn't clear, I'm hoping that you might have some
> thoughts on this aspect of your changes. :)
Route flaps should at most trigger an inflate without a deflate due to
the way the resize logic works.  If this occurs often it would be useful
since then there would already be space in the tree for the next time
around.
- Alex
^ permalink raw reply	[flat|nested] 7+ messages in thread 
- * RE: [net PATCH] fib_trie: Fix trie balancing issue if new node pushes down existing node
  2014-12-12 16:09     ` Alexander Duyck
@ 2014-12-12 17:05       ` David Laight
  0 siblings, 0 replies; 7+ messages in thread
From: David Laight @ 2014-12-12 17:05 UTC (permalink / raw)
  To: 'Alexander Duyck', David Miller,
	alexander.h.duyck@redhat.com
  Cc: netdev@vger.kernel.org
From: Alexander Duyck
...
> >> One has to be mindful with this code that what it's doing now might
> >> be intentional.  For example, it might be doing things this way
> >> on purpose in order to minimize rebalancing during route flaps.
> >>
> >> Barring anything like that, I think your change is fine.
> > Alex, in case it isn't clear, I'm hoping that you might have some
> > thoughts on this aspect of your changes. :)
> 
> Route flaps should at most trigger an inflate without a deflate due to
> the way the resize logic works.  If this occurs often it would be useful
> since then there would already be space in the tree for the next time
> around.
One way to deal with 'flapping' is to leave deleted items in the tree
(marked so they won't be used).
Deleted entries do need to be garbage collected at some point.
Doing so during insert is more than adequate, and possibly only for
nodes that are traversed while processing the insert itself
(and maybe only is the ratio of valid to invalid entries is low).
This might also mean that you don't need to actively run any timeouts.
Timed out entries just stay put - marked invalid by their timestamp.
	David
^ permalink raw reply	[flat|nested] 7+ messages in thread 
 
 
- * Re: [net PATCH] fib_trie: Fix trie balancing issue if new node pushes down existing node
  2014-12-12  2:32 ` David Miller
  2014-12-12 15:54   ` David Miller
@ 2014-12-12 15:55   ` Alexander Duyck
  2014-12-12 16:00     ` David Miller
  1 sibling, 1 reply; 7+ messages in thread
From: Alexander Duyck @ 2014-12-12 15:55 UTC (permalink / raw)
  To: David Miller, alexander.h.duyck; +Cc: netdev
On 12/11/2014 06:32 PM, David Miller wrote:
> From: Alexander Duyck <alexander.h.duyck@redhat.com>
> Date: Wed, 10 Dec 2014 21:49:22 -0800
>
>> This patch addresses an issue with the level compression of the fib_trie.
>> Specifically in the case of adding a new leaf that triggers a new node to
>> be added that takes the place of the old node.  The result is a trie where
>> the 1 child tnode is on one side and one leaf is on the other which gives
>> you a very deep trie.  Below is the script I used to generate a trie on
>> dummy0 with a 10.X.X.X family of addresses.
>  ...
>> What this fix does is start the rebalance at the newly created tnode
>> instead of at the parent tnode.  This way if there is a gap between the
>> parent and the new node it doesn't prevent the new tnode from being
>> coalesced with any pre-existing nodes that may have been pushed into one
>> of the new nodes child branches.
>>
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
> One has to be mindful with this code that what it's doing now might
> be intentional.  For example, it might be doing things this way
> on purpose in order to minimize rebalancing during route flaps.
>
> Barring anything like that, I think your change is fine.
I'm fairly certain that this isn't intentional.  If we replace a NULL
pointer in an existing tnode then we rebalance starting at that tnode,
it is only when there is no room in the trie and we have to add a new
tnode that the issue occurs where we rebalance at the parent and not the
tnode that the leaf was added to.
- Alex
^ permalink raw reply	[flat|nested] 7+ messages in thread 
- * Re: [net PATCH] fib_trie: Fix trie balancing issue if new node pushes down existing node
  2014-12-12 15:55   ` Alexander Duyck
@ 2014-12-12 16:00     ` David Miller
  0 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2014-12-12 16:00 UTC (permalink / raw)
  To: alexander.duyck; +Cc: alexander.h.duyck, netdev
From: Alexander Duyck <alexander.duyck@gmail.com>
Date: Fri, 12 Dec 2014 07:55:02 -0800
> On 12/11/2014 06:32 PM, David Miller wrote:
>> From: Alexander Duyck <alexander.h.duyck@redhat.com>
>> Date: Wed, 10 Dec 2014 21:49:22 -0800
>>
>>> This patch addresses an issue with the level compression of the fib_trie.
>>> Specifically in the case of adding a new leaf that triggers a new node to
>>> be added that takes the place of the old node.  The result is a trie where
>>> the 1 child tnode is on one side and one leaf is on the other which gives
>>> you a very deep trie.  Below is the script I used to generate a trie on
>>> dummy0 with a 10.X.X.X family of addresses.
>>  ...
>>> What this fix does is start the rebalance at the newly created tnode
>>> instead of at the parent tnode.  This way if there is a gap between the
>>> parent and the new node it doesn't prevent the new tnode from being
>>> coalesced with any pre-existing nodes that may have been pushed into one
>>> of the new nodes child branches.
>>>
>>> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
>> One has to be mindful with this code that what it's doing now might
>> be intentional.  For example, it might be doing things this way
>> on purpose in order to minimize rebalancing during route flaps.
>>
>> Barring anything like that, I think your change is fine.
> 
> I'm fairly certain that this isn't intentional.  If we replace a NULL
> pointer in an existing tnode then we rebalance starting at that tnode,
> it is only when there is no room in the trie and we have to add a new
> tnode that the issue occurs where we rebalance at the parent and not the
> tnode that the leaf was added to.
Ok, thanks for taking the time to explain this, I'm now convinced :)
Applied, thanks again.
^ permalink raw reply	[flat|nested] 7+ messages in thread 
 
 
end of thread, other threads:[~2014-12-12 17:06 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-12-11  5:49 [net PATCH] fib_trie: Fix trie balancing issue if new node pushes down existing node Alexander Duyck
2014-12-12  2:32 ` David Miller
2014-12-12 15:54   ` David Miller
2014-12-12 16:09     ` Alexander Duyck
2014-12-12 17:05       ` David Laight
2014-12-12 15:55   ` Alexander Duyck
2014-12-12 16:00     ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).