netdev.vger.kernel.org archive mirror
* [PATCH net-next v5 00/11] ipv6: Only create RTF_CACHE route after encountering pmtu exception
@ 2015-05-23  3:55 Martin KaFai Lau
From: Martin KaFai Lau @ 2015-05-23  3:55 UTC
  To: netdev
  Cc: David Miller, Hannes Frederic Sowa, Julian Anastasov,
	Steffen Klassert, Kernel Team

v4 -> v5:
- Patch 1 is new. It cleans up ipv6_select_ident() and ip6_fragment().

- Further simplify the newly added rt6_get_pcpu_route().  If there is a
  'prev' after cmpxchg, return prev instead of the newly created percpu
  clone.
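
The "return prev after cmpxchg" pattern can be sketched in user space as
follows. This is only an illustration under stated assumptions: it uses C11
atomics in place of the kernel's cmpxchg(), and struct toy_rt and
toy_get_pcpu_route() are hypothetical names, not the code in the patch.
Freeing the losing clone is also an assumption of this sketch.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a route entry; not the kernel's struct rt6_info. */
struct toy_rt {
	int mtu;
};

/*
 * Return the clone published in *slot.  If the slot is empty, try to
 * install a fresh copy of 'parent' with a compare-and-swap.  If another
 * thread installed one first, drop our copy and return the existing
 * one ('prev') instead of the newly created clone.
 */
struct toy_rt *toy_get_pcpu_route(struct toy_rt *_Atomic *slot,
				  const struct toy_rt *parent)
{
	struct toy_rt *cur = atomic_load(slot);
	struct toy_rt *prev = NULL;
	struct toy_rt *clone;

	if (cur)
		return cur;

	clone = malloc(sizeof(*clone));
	if (!clone)
		return NULL;
	*clone = *parent;

	/* On failure, 'prev' is overwritten with the value found in *slot. */
	if (!atomic_compare_exchange_strong(slot, &prev, clone)) {
		free(clone);	/* assumption: the loser's clone is discarded */
		return prev;
	}
	return clone;
}
```

Calling toy_get_pcpu_route() twice on the same slot returns the same pointer
both times, so every caller ends up sharing one clone per slot.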

v3 -> v4:
- Patch 8 is new. It keeps track of the DST_NOCACHE routes in a list to handle
  the iface down/unregister event.

- Remove rcu from the newly added rt6i_pcpu variable.  It is not needed
  because the variable is already protected by the existing reader/writer lock.

- Thanks to 'Julian Anastasov <ja@ssi.bg>' for testing the FLOWI_FLAG_KNOWN_NH
  patches.

v2 -> v3:
- Patches 5 to 7 are new.  They take care of cases where the daddr in
  skb is not the one used to do the route look-up.  There are also
  related changes to rt6_nexthop() since v2, which are in patch 2/9.
  Thanks to 'Julian Anastasov <ja@ssi.bg>' for pointing it out.

- Fix a few problems in __ip6_rt_update_pmtu(), such as setting the expiry
  and mtu before inserting into the tree, and not calling dst_destroy() after
  a tree insertion failure.  Also update the rt6i_pmtu in fib6_add_rt2node().
  Thanks to 'Steffen Klassert <steffen.klassert@secunet.com>' for pointing
  it out.

- Merge ip6_pmtu_rt_cache_alloc() into ip6_rt_cache_alloc().

v1 -> v2:
- Move the /128 route bug fixes to another series (accepted).
- Create a function for checking (rt6i_flags & (RTF_NONEXTHOP | RTF_GATEWAY)).
- Avoid shuffling the skb network_header.  Instead, change the function
  signature to take iph instead of skb.

- Many thanks to 'Hannes Frederic Sowa <hannes@stressinduktion.org>' for
  reviewing v1 and v2 and giving advice.

--Martin

~~~ start: v1 compose message (with the out-dated parts removed) ~~~

This series avoids creating a RTF_CACHE route whenever we consult
the fib6 tree with a new destination.  Instead, it only creates a RTF_CACHE
route when we see a pmtu exception.
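
The idea can be sketched with a toy model: lookups hand back one shared route
for every destination, and a per-destination entry is created only when a
smaller path MTU is reported. This is a simplified illustration, not the
kernel implementation; struct toy_fib, toy_lookup() and toy_update_pmtu() are
hypothetical names, and a fixed-size array stands in for the fib6 tree.

```c
#include <stdint.h>
#include <string.h>

#define TOY_MAX_EXCEPTIONS 8	/* arbitrary size for this sketch */

struct toy_route {
	uint32_t mtu;
};

/* One per-destination entry, created only after a pmtu exception. */
struct toy_exception {
	uint8_t daddr[16];
	struct toy_route rt;
	int in_use;
};

struct toy_fib {
	struct toy_route shared;	/* e.g. a /64 via gateway route */
	struct toy_exception exc[TOY_MAX_EXCEPTIONS];
};

static struct toy_exception *toy_find_exception(struct toy_fib *fib,
						const uint8_t daddr[16])
{
	for (int i = 0; i < TOY_MAX_EXCEPTIONS; i++) {
		struct toy_exception *e = &fib->exc[i];

		if (e->in_use && memcmp(e->daddr, daddr, 16) == 0)
			return e;
	}
	return NULL;
}

/* A lookup never allocates: it returns the exception if any, else the shared route. */
const struct toy_route *toy_lookup(struct toy_fib *fib, const uint8_t daddr[16])
{
	struct toy_exception *e = toy_find_exception(fib, daddr);

	return e ? &e->rt : &fib->shared;
}

/* Record a smaller path MTU for daddr.  Returns 0 on success, -1 if the table is full. */
int toy_update_pmtu(struct toy_fib *fib, const uint8_t daddr[16], uint32_t mtu)
{
	struct toy_exception *e = toy_find_exception(fib, daddr);

	if (e) {
		if (mtu < e->rt.mtu)
			e->rt.mtu = mtu;
		return 0;
	}
	if (mtu >= fib->shared.mtu)
		return 0;	/* not an exception, nothing to create */

	for (int i = 0; i < TOY_MAX_EXCEPTIONS; i++) {
		e = &fib->exc[i];
		if (!e->in_use) {
			memcpy(e->daddr, daddr, 16);
			e->rt = fib->shared;	/* clone, then override the mtu */
			e->rt.mtu = mtu;
			e->in_use = 1;
			return 0;
		}
	}
	return -1;
}
```

In this model, the number of per-destination entries grows with the number of
pmtu exceptions rather than with the number of destinations looked up.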

Out of all ipv6 RTF_CACHE routes that are created, the percentage that has a
different mtu is very small. In one of our end-user facing proxy servers,
only 1k out of 80k RTF_CACHE routes (about 1.25%) have a smaller MTU.  For
our DC traffic, there is no mtu exception.

A large fib6 tree causes problems: 'ip -6 r show' takes a long time, and
gc may kick in too often.  Also, when a service has restarted and a lot
of new TCP connection requests come in, it puts pressure on the tree by
inserting a lot of RTF_CACHE routes in a short time, which currently requires
a write lock.

The first few patches are prep work to remove the assumption that the
returned rt is always RTF_CACHE.

The patch 'ipv6: Only create RTF_CACHE routes after encountering pmtu exception'
does the lazy RTF_CACHE route creation.

The following patches add a percpu rt to compensate for the performance loss
after doing the RTF_CACHE lazy creation.

Here are some numbers from the udpflood test.  The udpflood has been
slightly modified to have a time limit instead of a count limit.

A /64 via gateway route is used for the test. Each udpflood uses 10000 dst
addresses.  The dst addresses of different udpflood processes do not overlap
with each other.

# of udpflood        # of trans (patched)        # of trans (upstream)

1                    16M                          15M
10                   61M                          61M
20                   65M                          62M
40                   88M                          83M
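
To read the table as relative differences, the small helper below computes
the percentage gain of patched over upstream for each row. The row values are
copied from the table above (in millions of transactions); struct
udpflood_row and gain_percent() are hypothetical names used only for this
sketch.

```c
#include <stddef.h>

/* One table row: process count and transactions in millions. */
struct udpflood_row {
	int procs;
	double patched_m;
	double upstream_m;
};

/* Values taken from the table in this message. */
const struct udpflood_row udpflood_rows[] = {
	{  1, 16.0, 15.0 },
	{ 10, 61.0, 61.0 },
	{ 20, 65.0, 62.0 },
	{ 40, 88.0, 83.0 },
};

const size_t udpflood_row_count =
	sizeof(udpflood_rows) / sizeof(udpflood_rows[0]);

/* Relative gain of patched over upstream, in percent. */
double gain_percent(double patched, double upstream)
{
	return (patched - upstream) / upstream * 100.0;
}
```

Applied to the four rows, this gives roughly 6.7%, 0%, 4.8% and 6.0%.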

~~~ end: v1 compose message ~~~



Thread overview: 23+ messages
2015-05-23  3:55 [PATCH net-next v5 00/11] ipv6: Only create RTF_CACHE route after encountering pmtu exception Martin KaFai Lau
2015-05-23  3:55 ` [PATCH net-next v5 01/11] ipv6: Clean up ipv6_select_ident() and ip6_fragment() Martin KaFai Lau
2015-05-23  3:55 ` [PATCH net-next v5 02/11] ipv6: Remove external dependency on rt6i_dst and rt6i_src Martin KaFai Lau
2015-05-23  3:55 ` [PATCH net-next v5 03/11] ipv6: Remove external dependency on rt6i_gateway and RTF_ANYCAST Martin KaFai Lau
2015-05-23  3:55 ` [PATCH net-next v5 04/11] ipv6: Combine rt6_alloc_cow and rt6_alloc_clone Martin KaFai Lau
2015-05-23  3:56 ` [PATCH net-next v5 05/11] ipv6: Only create RTF_CACHE routes after encountering pmtu exception Martin KaFai Lau
2015-05-23  3:56 ` [PATCH net-next v5 06/11] ipv6: Add rt6_get_cookie() function Martin KaFai Lau
2015-05-23  3:56 ` [PATCH net-next v5 07/11] ipv6: Set FLOWI_FLAG_KNOWN_NH at flowi6_flags Martin KaFai Lau
2015-05-23  3:56 ` [PATCH net-next v5 08/11] ipv6: Create RTF_CACHE clone when FLOWI_FLAG_KNOWN_NH is set Martin KaFai Lau
2015-05-23  3:56 ` [PATCH net-next v5 09/11] ipv6: Keep track of DST_NOCACHE routes in case of iface down/unregister Martin KaFai Lau
2015-05-23  3:56 ` [PATCH net-next v5 10/11] ipv6: Break up ip6_rt_copy() Martin KaFai Lau
2015-05-23  3:56 ` [PATCH net-next v5 11/11] ipv6: Create percpu rt6_info Martin KaFai Lau
2015-05-25 17:34 ` [PATCH net-next v5 00/11] ipv6: Only create RTF_CACHE route after encountering pmtu exception David Miller
2015-05-26 21:20   ` Hannes Frederic Sowa
2015-05-26 21:34     ` Martin KaFai Lau
2015-07-29  9:25 ` Alexander Holler
2015-07-30 11:57   ` Alexander Holler
2015-08-15  7:48     ` Alexander Holler
2015-08-17  9:43       ` Alexander Holler
2015-08-28  7:36         ` Martin KaFai Lau
2015-08-28  9:27           ` Alexander Holler
2015-08-28  9:34             ` Alexander Holler
2015-08-28 18:27           ` David Miller
