From: David Miller <davem@davemloft.net>
To: hideaki.yoshifuji@miraclelinux.com
Cc: eric.dumazet@gmail.com, weiwan@google.com,
	netdev@vger.kernel.org, edumazet@google.com, kafai@fb.com,
	yoshfuji@linux-ipv6.org
Subject: Re: [PATCH net-next 00/16] ipv6: replace rwlock with rcu and spinlock in fib6 table
Date: Sat, 07 Oct 2017 21:28:31 +0100 (WEST)
Message-ID: <20171007.212831.1578627314815022241.davem@davemloft.net>
In-Reply-To: <CAPA1RqBSKi9ra_UBur6L9HHExZzk9BxSrwSKn2xtvAC5K54N+A@mail.gmail.com>

From: 吉藤英明 <hideaki.yoshifuji@miraclelinux.com>
Date: Sat, 7 Oct 2017 18:25:13 +0900

> Hi,
> 
> 2017-10-07 8:49 GMT+09:00 Eric Dumazet <eric.dumazet@gmail.com>:
>> On Fri, 2017-10-06 at 12:05 -0700, Wei Wang wrote:
>>> From: Wei Wang <weiwan@google.com>
>>>
>>> Currently, fib6 table is protected by rwlock. During route lookup,
>>> reader lock is taken and during route insertion, deletion or
>>> modification, writer lock is taken. This is a very inefficient
>>> implementation because the fastpath always has to do the operation
>>> to grab the reader lock.
>>> According to my latest syn flood test on an iota ivybridge machine
>>> with 2 10G mlx nics bonded together, each with 8 rx queues on 2 NUMA
>>> nodes, and with the upstream net-next kernel:
>>> ipv4 stack can handle around 4.2Mpps
>>> ipv6 stack can handle around 1.3Mpps
>>>
>>> In order to close the gap of the performance number between ipv4
>>> and ipv6 stack, this patch series tries to get rid of the usage of
>>> the rwlock and replace it with rcu and spinlock protection. This will
>>> greatly speed up the fastpath performance as it only needs to hold
>>> rcu which is much less expensive than grabbing the reader lock. It
>>> also makes ipv6 fib implementation more consistent with ipv4.
 ...
>> Awesome work Wei.
>>
>> For the whole series :
>>
>> Reviewed-by: Eric Dumazet <edumazet@google.com>
> 
> It looks ok to me.
> Reviewed-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

I have some reservations about these changes, rt6_info gets bigger,
etc.

And even with the amazing developers that helped review and
audit these changes already, I can guarantee there are some
bugs in here just like there were bugs in the ipv4 routing
cache removal I did :-)

But those don't block integration, for sure.

So series applied, thanks a lot for doing this!

I think there is some code that doesn't use proper RCU accessors
for rt6i_exception_bucket.  For example there are some assignments
of it to NULL that should use RCU_ASSIGN_FOO() or similar.  Please
take a look and fix those up.
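
To illustrate the kind of accessor discipline meant here, below is a
minimal userspace sketch, not kernel code. It assumes C11 atomics as a
stand-in for the kernel's real RCU helpers (rcu_assign_pointer(),
RCU_INIT_POINTER() and rcu_dereference()). The struct and function
names are hypothetical and only loosely modeled on
rt6i_exception_bucket. The sketch does not model grace periods or
freeing, which real RCU users must also handle.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's definitions. */
struct exception_bucket {
	int depth;
};

struct route {
	/* In the kernel this would be an __rcu annotated pointer. */
	_Atomic(struct exception_bucket *) bucket;
};

/*
 * Publish a bucket. The release store makes the bucket's contents
 * visible before the pointer itself, which is the ordering that
 * rcu_assign_pointer() provides in the kernel.
 */
static void publish_bucket(struct route *rt, struct exception_bucket *b)
{
	atomic_store_explicit(&rt->bucket, b, memory_order_release);
}

/*
 * Clear the pointer. Storing NULL publishes no pointee, so no ordering
 * is needed; the kernel offers RCU_INIT_POINTER() for this case. The
 * store still goes through an accessor instead of a plain assignment.
 */
static void clear_bucket(struct route *rt)
{
	atomic_store_explicit(&rt->bucket, NULL, memory_order_relaxed);
}

/*
 * Reader side. Acquire is used here as a conservative stand-in for the
 * dependency ordering that rcu_dereference() relies on.
 */
static struct exception_bucket *read_bucket(struct route *rt)
{
	return atomic_load_explicit(&rt->bucket, memory_order_acquire);
}

/* Small self-check of the three accessors; returns 0 on success. */
static int bucket_demo(void)
{
	struct exception_bucket b = { .depth = 3 };
	struct route rt;

	atomic_init(&rt.bucket, NULL);
	assert(read_bucket(&rt) == NULL);

	publish_bucket(&rt, &b);
	assert(read_bucket(&rt) == &b);
	assert(read_bucket(&rt)->depth == 3);

	clear_bucket(&rt);
	assert(read_bucket(&rt) == NULL);
	return 0;
}
```

The point of the sketch is only that every store to the shared pointer,
including a store of NULL, goes through a named accessor, so the
intended ordering is explicit and checkable.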

Thanks!
