From: Mark Zealey <netdev@markandruth.co.uk>
To: netdev@vger.kernel.org
Subject: Re: UDP multi-core performance on a single socket and SO_REUSEPORT
Date: Fri, 04 Jan 2013 18:50:59 +0000
Message-ID: <50E72493.4050406@markandruth.co.uk>
In-Reply-To: <50DD6DF1.7080304@markandruth.co.uk>

I have now written two small test scripts, which can be found at
http://mark.zealey.org/uploads/. One launches 16 threads listening on
a single UDP socket; the other needs to be run as:

for i in `seq 16`; do ./udp_test_client & done

On my test server (32 cores, stock 3.7.1 kernel), about 90% of the time
is spent in the kernel waiting on spinlocks. Perf output:

     44.95%  udp_test_server  [kernel.kallsyms]   [k] _raw_spin_lock_bh
             |
             --- _raw_spin_lock_bh
                |
                |--100.00%-- lock_sock_fast
                |          skb_free_datagram_locked
                |          udp_recvmsg
                |          inet_recvmsg
                |          sock_recvmsg
                |          __sys_recvmsg
                |          sys_recvmsg
                |          system_call_fastpath
                |          0x7fd8c4702a2d
                |          start_thread
                 --0.00%-- [...]

     43.48%  udp_test_client  [kernel.kallsyms]   [k] _raw_spin_lock
             |
             --- _raw_spin_lock
                |
                |--99.80%-- udp_queue_rcv_skb
                |          __udp4_lib_rcv
                |          udp_rcv
                |          ip_local_deliver_finish
                |          ip_local_deliver
                |          ip_rcv_finish
                |          ip_rcv
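
The contention above arises when many threads call recvmsg() on one
shared socket. A minimal sketch of that layout follows, in Python for
brevity. It is not the udp_test_server from the URL above: the names
worker and start_shared_server are hypothetical, the echo reply stands
in for real work, and it only illustrates the socket layout, not the
performance figures.

```python
import socket
import threading

def worker(sock, stop, counts, idx):
    """Receive datagrams on the shared socket and echo them back."""
    while not stop.is_set():
        try:
            data, addr = sock.recvfrom(2048)
        except socket.timeout:
            continue  # wake up periodically to check the stop flag
        counts[idx] += 1
        sock.sendto(data, addr)

def start_shared_server(n_threads=16, host="127.0.0.1"):
    """Bind ONE UDP socket and start n_threads threads that all read from it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, 0))   # port 0: let the kernel pick a free port
    sock.settimeout(0.2)   # so that workers can notice the stop flag
    stop = threading.Event()
    counts = [0] * n_threads  # datagrams handled per thread
    threads = [
        threading.Thread(target=worker, args=(sock, stop, counts, i), daemon=True)
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    return sock, stop, counts, threads
```

Every thread here blocks on the same socket, so each received datagram
passes through that one socket's receive queue and lock, which is the
pattern the perf traces point at.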

Thanks,

Mark

On 28/12/12 10:01, Mark Zealey wrote:
> I appreciate that this question has come up a number of times over the 
> years, most recently as far as I can see in this thread: 
> http://markmail.org/message/hcc7zn5ln5wktypv . I'm going to explain my 
> problem and present some performance numbers to back this up.
>
> The problem: I'm doing some research on scaling a DNS server
> (powerdns) to work well on multi-core boxes (in this case testing with
> 2x E5-2650 processors, i.e. Linux sees 32 cores).
>
> My powerdns configuration uses a shared socket with one thread for 
> each core in the box listening on that socket using poll()/recvmsg(). 
> I've modified powerdns so in my tests it is doing the absolute minimum 
> of work to answer packets (all queries are for the same record, it 
> keeps the response in memory and just changes a few fields before 
> calling sendmsg()). I'm binding to a single 10.xxx address and using 
> this for all local and remote tests.
>
> The numbers below are generated using 16 parallel queryperf instances
> on localhost (it doesn't really matter whether the queries come from
> remote hosts or from localhost; the numbers don't change much).
>
> - Using the stock CentOS 6.3 kernel, powerdns performs at around
>   120kqps (using at most about 12 CPUs).
> - Using the 3.7.1 kernel (from elrepo), this increases to 200-240kqps,
>   maxing out all CPUs in the box (soft interrupt CPU time is at 40%,
>   about 8x higher than on the CentOS 6.3 kernel, and system CPU time
>   is at 50%; powerdns itself only uses 10% of the CPU time).
> - Using the stock CentOS 6.3 kernel with the Google SO_REUSEPORT patch
>   from 2010 (modified slightly so it applies), I see 500-600kqps from
>   remote hosts, or 1Mqps when doing localhost queries. powerdns doesn't
>   go past using 8 CPUs; the limit it is hitting then appears to be
>   some lock in sendmsg().
>
> I've not been able to get the 2010 SO_REUSEPORT patch working on the
> 3.7.1 kernel, but I suspect it would perform even better there, as
> sendmsg() should have been significantly improved.
>
> Now, I don't believe that SO_REUSEPORT is needed in the kernel in this
> case; however, the numbers above clearly show that the current UDP
> implementation of recvmsg() on a single socket across multiple cores
> on kernel 3.7.1 still suffers from heavy lock contention. A perf
> report on 3.7.1 (using 16 local queryperf instances) shows:
>
>     68.34%  pdns_server  [kernel.kallsyms]    [k] _raw_spin_lock_bh
>             |
>             --- 0x7fa472023a2d
>                 system_call_fastpath
>                 sys_recvmsg
>                 __sys_recvmsg
>                 sock_recvmsg
>                 inet_recvmsg
>                 udp_recvmsg
>                 skb_free_datagram_locked
>                |
>                |--100.00%-- lock_sock_fast
>                |          _raw_spin_lock_bh
>                 --0.00%-- [...]
>
>      3.10%  pdns_server  [kernel.kallsyms]    [k] _raw_spin_lock_irqsave
>             |
>             --- 0x7fa472023a2d
>                 system_call_fastpath
>                 sys_recvmsg
>                 __sys_recvmsg
>                 sock_recvmsg
>                 inet_recvmsg
>                 udp_recvmsg
>                |
>                |--99.69%-- __skb_recv_datagram
>                |          |
>                |          |--77.68%-- _raw_spin_lock_irqsave
>                |          |
>                |          |--14.56%-- prepare_to_wait_exclusive
>                |          |          _raw_spin_lock_irqsave
>                |          |
>                |           --7.76%-- finish_wait
>                |                     _raw_spin_lock_irqsave
>                 --0.31%-- [...]
>                ...
>
> Any advice or patches welcome... :-)
>
> Mark
>
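
For comparison with the shared-socket setup in the quoted message,
SO_REUSEPORT lets each thread own a separate socket bound to the same
address and port, so receiving threads need not contend on a single
socket's lock. A minimal sketch, assuming a kernel and Python build
that expose SO_REUSEPORT; open_reuseport_sockets is a hypothetical
helper, not part of the Google patch or of powerdns:

```python
import socket

def open_reuseport_sockets(n, host="127.0.0.1", port=0):
    """Open n UDP sockets that are all bound to the same host and port."""
    socks = []
    for _ in range(n):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        # The option must be set on every socket before bind(),
        # otherwise the second and later binds fail with EADDRINUSE.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        s.bind((host, port))
        # If port was 0, reuse the port the kernel picked for the first socket.
        port = s.getsockname()[1]
        socks.append(s)
    return socks
```

Each worker thread would then call recvfrom() on its own socket; on
Linux, the kernel picks one socket in the group for each incoming
unicast datagram.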
