netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 00/17] fib_trie: Reduce time spent in fib_table_lookup by 35 to 75%
@ 2014-12-22 17:40 Alexander Duyck
  2014-12-22 17:40 ` [RFC PATCH 01/17] fib_trie: Update usage stats to be percpu instead of global variables Alexander Duyck
                   ` (18 more replies)
  0 siblings, 19 replies; 30+ messages in thread
From: Alexander Duyck @ 2014-12-22 17:40 UTC (permalink / raw)
  To: netdev

These patches are meant to address several performance issues I have seen 
in the fib_trie implementation, and fib_table_lookup specifically.  With 
these changes in place I have seen a reduction of up to 35 to 75% for the 
total time spent in fib_table_lookup depending on the type of search being 
performed.

On a VM running in my Corei7-4930K system with a trie of maximum depth of 7 
this resulted in a reduction of over 370ns per packet in the total time to 
process packets received from an ixgbe interface and route them to a dummy 
interface.  This represents a failed lookup in the local trie followed by 
a successful search in the main trie.

				Baseline	Refactor
  ixgbe->dummy routing		1.20Mpps	2.21Mpps
  ------------------------------------------------------------
  processing time per packet		835ns		453ns
  fib_table_lookup		50.1%	418ns	25.0%	113ns
  check_leaf.isra.9		 7.9%	 66ns	   --	 --
  ixgbe_clean_rx_irq		 5.3%	 44ns	 9.8%	 44ns
  ip_route_input_noref		 2.9%	 25ns	 4.6%	 21ns
  pvclock_clocksource_read	 2.6%	 21ns	 4.6%	 21ns
  ip_rcv			 2.6%	 22ns	 4.0%	 18ns

In the simple case of receiving a frame and dropping it before it can reach 
the socket layer I saw a reduction of 40ns per packet.  This represents a 
trip through the local trie with the correct leaf found with no need for 
any backtracing.

				Baseline	Refactor
  ixgbe->local receive		2.65Mpps	2.96Mpps
  ------------------------------------------------------------
  processing time per packet		377ns		337ns
  fib_table_lookup		25.1%	 95ns	25.8%	 87ns
  ixgbe_clean_rx_irq		 8.7%	 33ns	 9.0%	 30ns
  check_leaf.isra.9		 7.2%	 27ns	   --	 --
  ip_rcv			 5.7%	 21ns	 6.5%	 22ns

These changes have resulted in several functions being inlined such as 
check_leaf and fib_find_node, but due to the code simplification the 
overall size of the code has been reduced.

   text	   data	    bss	    dec	    hex	filename
  16932	    376	     16	  17324	   43ac	net/ipv4/fib_trie.o - before
  15259	    376	      8	  15643	   3d1b	net/ipv4/fib_trie.o - after

---

Alexander Duyck (17):
      fib_trie: Update usage stats to be percpu instead of global variables
      fib_trie: Make leaf and tnode more uniform
      fib_trie: Merge tnode_free and leaf_free into node_free
      fib_trie: Merge leaf into tnode
      fib_trie: Optimize fib_table_lookup to avoid wasting time on loops/variables
      fib_trie: Optimize fib_find_node
      fib_trie: Optimize fib_table_insert
      fib_trie: Update meaning of pos to represent unchecked bits
      fib_trie: Use unsigned long for anything dealing with a shift by bits
      fib_trie: Push rcu_read_lock/unlock to callers
      fib_trie: Move resize to after inflate/halve
      fib_trie: Add functions should_inflate and should_halve
      fib_trie: Push assignment of child to parent down into inflate/halve
      fib_trie: Push tnode flushing down to inflate/halve
      fib_trie: inflate/halve nodes in a more RCU friendly way
      fib_trie: Remove checks for index >= tnode_child_length from tnode_get_child
      fib_trie: Add tracking value for suffix length


 include/net/ip_fib.h    |   50 +
 net/ipv4/fib_frontend.c |   29 -
 net/ipv4/fib_rules.c    |   22 -
 net/ipv4/fib_trie.c     | 1915 ++++++++++++++++++++++-------------------------
 4 files changed, 945 insertions(+), 1071 deletions(-)

--

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2014-12-22 23:38 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-12-22 17:40 [RFC PATCH 00/17] fib_trie: Reduce time spent in fib_table_lookup by 35 to 75% Alexander Duyck
2014-12-22 17:40 ` [RFC PATCH 01/17] fib_trie: Update usage stats to be percpu instead of global variables Alexander Duyck
2014-12-22 19:21   ` Cong Wang
2014-12-22 17:41 ` [RFC PATCH 02/17] fib_trie: Make leaf and tnode more uniform Alexander Duyck
2014-12-22 18:33   ` David Miller
2014-12-22 18:55     ` Alexander Duyck
2014-12-22 21:50       ` David Miller
2014-12-22 23:38         ` Alexander Duyck
2014-12-22 17:41 ` [RFC PATCH 03/17] fib_trie: Merge tnode_free and leaf_free into node_free Alexander Duyck
2014-12-22 17:41 ` [RFC PATCH 04/17] fib_trie: Merge leaf into tnode Alexander Duyck
2014-12-22 17:41 ` [RFC PATCH 05/17] fib_trie: Optimize fib_table_lookup to avoid wasting time on loops/variables Alexander Duyck
2014-12-22 18:30   ` David Miller
2014-12-22 17:41 ` [RFC PATCH 06/17] fib_trie: Optimize fib_find_node Alexander Duyck
2014-12-22 17:41 ` [RFC PATCH 07/17] fib_trie: Optimize fib_table_insert Alexander Duyck
2014-12-22 17:41 ` [RFC PATCH 08/17] fib_trie: Update meaning of pos to represent unchecked bits Alexander Duyck
2014-12-22 17:41 ` [RFC PATCH 09/17] fib_trie: Use unsigned long for anything dealing with a shift by bits Alexander Duyck
2014-12-22 17:41 ` [RFC PATCH 10/17] fib_trie: Push rcu_read_lock/unlock to callers Alexander Duyck
2014-12-22 17:42 ` [RFC PATCH 11/17] fib_trie: Move resize to after inflate/halve Alexander Duyck
2014-12-22 17:42 ` [RFC PATCH 12/17] fib_trie: Add functions should_inflate and should_halve Alexander Duyck
2014-12-22 17:42 ` [RFC PATCH 13/17] fib_trie: Push assignment of child to parent down into inflate/halve Alexander Duyck
2014-12-22 17:42 ` [RFC PATCH 14/17] fib_trie: Push tnode flushing down to inflate/halve Alexander Duyck
2014-12-22 17:42 ` [RFC PATCH 15/17] fib_trie: inflate/halve nodes in a more RCU friendly way Alexander Duyck
2014-12-22 17:42 ` [RFC PATCH 16/17] fib_trie: Remove checks for index >= tnode_child_length from tnode_get_child Alexander Duyck
2014-12-22 17:42 ` [RFC PATCH 17/17] fib_trie: Add tracking value for suffix length Alexander Duyck
2014-12-22 18:32   ` David Miller
2014-12-22 18:08 ` [RFC PATCH 00/17] fib_trie: Reduce time spent in fib_table_lookup by 35 to 75% Dave Taht
2014-12-22 18:38   ` Alexander Duyck
2014-12-22 18:59     ` Dave Taht
2014-12-22 18:53   ` David Miller
2014-12-22 18:35 ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).