From: Eric Dumazet
Subject: Re: UDP regression with packets rates < 10k per sec
Date: Wed, 09 Sep 2009 19:06:10 +0200
Message-ID: <4AA7E082.90807@gmail.com>
References: <4AA6E039.4000907@gmail.com> <4AA7C512.6040100@gmail.com>
To: Christoph Lameter
Cc: netdev@vger.kernel.org

Christoph Lameter wrote:
> On Wed, 9 Sep 2009, Eric Dumazet wrote:
>
>> Christoph Lameter wrote:
>>> On Wed, 9 Sep 2009, Eric Dumazet wrote:
>>>
>>>> In order to reproduce this here, could you tell me if you use
>>>>
>>>> Producer linux-2.6.22 -> Receiver 2.6.22
>>>> Producer linux-2.6.31 -> Receiver 2.6.31
>>> I use the above setup.
>> Then frames are sent on the wire but not received
>>
>> (they are received via the mc loop, internal stack magic)
>
> We are talking about two machines running 2.6.22 or 2.6.31. There is no
> magic mc loop between the two machines. -L was not used.
>
>> # ./mcast -L -n1 -r 10000
>> WARNING: Multiple active ethernet devices. Using local address 192.168.0.1
>> Receiver: Listening to control channel 239.0.192.1
>> Receiver: Subscribing to 1 MC addresses 239.0.192-254.2-254 offset 0 origin 192.168.0.1
>> Sender: Sending 10000 msgs/ch/sec on 1 channels. Probe interval=0.001-1 sec.
>>
>> TotalMsg   Lost SeqErr TXDrop  Msg/Sec  KB/Sec  Min/us  Avg/us  Max/us  StdDv
>>   100000      0      0      0    10000  3000.0    7.84    8.89   10.51   0.66
>
> These are loopback latencies... Don't use -L
>
>> # uname -a
>> Linux erd 2.6.30.5 #2 SMP Mon Sep 7 17:15:43 CEST 2009 i686 i686 i386 GNU/Linux
>>
>> I tried an old kernel on the same hardware:
>>
>> # ./mcast -L -n1 -r 10000
>> WARNING: Multiple active ethernet devices. Using local address 55.225.18.6
>> Receiver: Listening to control channel 239.0.192.1
>> Receiver: Subscribing to 1 MC addresses 239.0.192-254.2-254 offset 0 origin 55.225.18.6
>> Sender: Sending 10000 msgs/ch/sec on 1 channels. Probe interval=0.001-1 sec.
>>
>> TotalMsg   Lost SeqErr TXDrop  Msg/Sec  KB/Sec  Min/us  Avg/us  Max/us  StdDv
>>    99999      0      0      0     9998     0.0    9.00    9.95   14.50   1.56
>>
>> Linux erd 2.6.9-55.ELsmp #1 SMP Fri Apr 20 17:03:35 EDT 2007 i686 i686 i386 GNU/Linux
>>
>> So my numbers seem much better than yours...
>
> These are loopback numbers. Why is 2.6.9 so high? The regression shows at
> less than 10k.
>
> Could you use real NICs? With multiple TX queues and all the other cool
> stuff? And run at lower packet rates?

Well, on your mono-flow test, multiqueue won't help anyway.

> My loopback numbers also show the same trends.
>
> 2.6.22:
>
> mcast -Ln1
> TotalMsg   Lost SeqErr TXDrop  Msg/Sec  KB/Sec  Min/us  Avg/us  Max/us  StdDv
>      101      0      0      0       10     3.0    5.47    5.74    7.00   0.43
>
> mcast -Ln1 -r10000
> TotalMsg   Lost SeqErr TXDrop  Msg/Sec  KB/Sec  Min/us  Avg/us  Max/us  StdDv
>   100000      0      0      0    10000  3000.0    5.97    6.11    6.40   0.13
>
>
> 2.6.31-rc9
>
> mcast -Ln1
> TotalMsg   Lost SeqErr TXDrop  Msg/Sec  KB/Sec  Min/us  Avg/us  Max/us  StdDv
>      100      0      0      0       10     3.0   13.26   13.45   13.56   0.09
>
> mcast -Ln1 -r10000
> TotalMsg   Lost SeqErr TXDrop  Msg/Sec  KB/Sec  Min/us  Avg/us  Max/us  StdDv
>   100000      0      0      0    10000  3000.0    5.70    5.82    5.91   0.07
>
>
> So 2.6.22 is better at 10 msgs per second. 2.6.31 is slightly better at
> 10k.

Unfortunately there is too much noise in this test:

# uname -a
Linux nag001 2.6.26-2-amd64 #1 SMP Fri Mar 27 04:02:59 UTC 2009 x86_64 GNU/Linux

# ./mcast -Ln1 -C
WARNING: Multiple active ethernet devices. Using local address 55.225.18.110
Receiver: Listening to control channel 239.0.192.1
Receiver: Subscribing to 1 MC addresses 239.0.192-254.2-254 offset 0 origin 55.225.18.110
Sender: Sending 10 msgs/ch/sec on 1 channels. Probe interval=0.001-1 sec.

TotalMsg   Lost SeqErr TXDrop  Msg/Sec  KB/Sec  Min/us  Avg/us  Max/us  StdDv
     100      0      0      0       10     0.0    8.78   10.56   18.20   2.62
     100      0      0      0       10     3.0    8.60    9.33    9.88   0.45
     101      0      0      0       10     3.0    8.42    9.47   11.53   0.85
     100      0      0      0       10     3.0    8.37   10.19   14.85   1.74
     101      0      0      0       10     3.0    8.86   11.23   18.00   3.11
     100      0      0      0       10     3.0    8.43    9.72   11.48   0.82
     101      0      0      0       10     3.0    9.02   29.78  210.25  60.16
     100      0      0      0       10     3.0    8.47   10.27   11.69   0.99
     100      0      0      0       10     3.0    9.21   10.16   12.15   0.84
     101      0      0      0       10     3.0    8.46   10.65   17.88   2.64
     101      0      0      0       10     3.0    8.65   10.00   11.32   0.86
     100      0      0      0       10     3.0    8.98    9.71   10.77   0.58
     100      0      0      0       10     3.0    8.38   11.73   19.06   3.74

Also, I believe you hit a scheduler artefact at these very low rates, since the
two tasks are considered coupled. Check commit
6f3d09291b4982991680b61763b2541e53e2a95f:

    sched, net: socket wakeups are sync

    'sync' wakeups are a hint towards the scheduler that (certain) networking
    related wakeups likely create coupling between tasks.

    Signed-off-by: Ingo Molnar

diff --git a/net/core/sock.c b/net/core/sock.c
index 09cb3a7..2654c14 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1621,7 +1621,7 @@ static void sock_def_readable(struct sock *sk, int len)
 {
         read_lock(&sk->sk_callback_lock);
         if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
-                wake_up_interruptible(sk->sk_sleep);
+                wake_up_interruptible_sync(sk->sk_sleep);
         sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
         read_unlock(&sk->sk_callback_lock);
 }
@@ -1635,7 +1635,7 @@ static void sock_def_write_space(struct sock *sk)
          */
         if ((atomic_read(&sk->sk_wmem_alloc) << 1) <= sk->sk_sndbuf) {
                 if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
-                        wake_up_interruptible(sk->sk_sleep);
+                        wake_up_interruptible_sync(sk->sk_sleep);

                 /* Should agree with poll, otherwise some programs break */
                 if (sock_writeable(sk))
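
As a point of reference, a blocking multicast receiver along the following
lines is what ends up sleeping on the socket wait queue that
sock_def_readable() wakes; it is only a sketch, the group address, port and
buffer size are illustrative assumptions (the mcast tool's own values are not
shown in this thread), and error handling is omitted.

/*
 * Minimal sketch of a blocking multicast receiver; illustrative only.
 * The group 239.0.192.2 and port 4321 are assumptions, not values taken
 * from the mcast benchmark.
 */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        struct sockaddr_in addr;
        struct ip_mreq mreq;
        char buf[1500];

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(4321);                       /* assumed port */
        bind(fd, (struct sockaddr *)&addr, sizeof(addr));

        mreq.imr_multiaddr.s_addr = inet_addr("239.0.192.2"); /* assumed group */
        mreq.imr_interface.s_addr = htonl(INADDR_ANY);
        setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));

        for (;;) {
                /*
                 * Each recvfrom() sleeps until sock_def_readable() wakes the
                 * task; whether that wakeup is 'sync' or not decides how the
                 * scheduler treats the sender/receiver pair, which is what
                 * the low-rate latency numbers above are sensitive to.
                 */
                ssize_t n = recvfrom(fd, buf, sizeof(buf), 0, NULL, NULL);
                if (n < 0)
                        break;
        }
        close(fd);
        return 0;
}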