From: Eric Dumazet <dada1@cosmosbay.com>
To: kchang@athenacr.com
Cc: David Miller <davem@davemloft.net>,
	netdev@vger.kernel.org, cl@linux-foundation.org,
	Brian Bloniarz <bmb@athenacr.com>
Subject: Re: Multicast packet loss
Date: Sat, 07 Mar 2009 08:46:52 +0100
Message-ID: <49B2266C.9050701@cosmosbay.com>
In-Reply-To: <49AE3DA9.2020103@cosmosbay.com>

Eric Dumazet wrote:
> David Miller wrote:
>> From: Eric Dumazet <dada1@cosmosbay.com>
>> Date: Sat, 28 Feb 2009 09:51:11 +0100
>>
>>> David, this is a preliminary work, not meant for inclusion as is,
>>> comments are welcome.
>>>
>>> [PATCH] net: sk_forward_alloc becomes an atomic_t
>>>
>>> Commit 95766fff6b9a78d11fc2d3812dd035381690b55d
>>> (UDP: Add memory accounting) introduced a regression for high rate UDP flows,
>>> because of extra lock_sock() in udp_recvmsg()
>>>
>>> In order to reduce need for lock_sock() in UDP receive path, we might need
>>> to declare sk_forward_alloc as an atomic_t.
>>>
>>> udp_recvmsg() can avoid a lock_sock()/release_sock() pair.
>>>
>>> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
>> This adds new overhead for TCP which has to hold the socket
>> lock for other reasons in these paths.
>>
>> I don't get how an atomic_t operation is cheaper than a
>> lock_sock/release_sock.  Is it the case that in many
>> executions of these paths only atomic_read()'s are necessary?
>>
>> I actually think this scheme is racy.  There is a reason we
>> have to hold the socket lock when doing memory scheduling.
>> Two threads can get in there and say "hey I have enough space
>> already" even though only enough space is allocated for one
>> of their requests.
>>
>> What did I miss? :)
>>
> 
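The race David describes can be illustrated outside the kernel. The following is only a user-space sketch using C11 atomics; `budget_t`, `reserve_racy()` and `reserve_cas()` are hypothetical names and this is not the kernel's memory scheduling code. It shows why making the counter atomic is not enough when the check and the update remain two separate steps, and what a combined check-and-update could look like.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pre-allocated byte budget (not kernel code). */
typedef struct {
    atomic_int avail;
} budget_t;

/*
 * Racy: the check and the subtraction are two separate atomic operations.
 * Two threads can both pass the check before either one subtracts, so
 * together they consume more than was available.
 */
bool reserve_racy(budget_t *b, int size)
{
    assert(b != NULL && size >= 0);
    if (atomic_load(&b->avail) < size)
        return false;
    atomic_fetch_sub(&b->avail, size);
    return true;
}

/*
 * Check and update as a single step: the subtraction only takes effect
 * if the budget still holds the value that passed the check.
 */
bool reserve_cas(budget_t *b, int size)
{
    assert(b != NULL && size >= 0);
    int old = atomic_load(&b->avail);
    while (old >= size) {
        if (atomic_compare_exchange_weak(&b->avail, &old, old - size))
            return true;
        /* On failure, old now holds the current value; re-check and retry. */
    }
    return false;
}
```

With a budget of 100, two concurrent `reserve_racy(&b, 60)` calls can both succeed and leave the counter at -20, while at most one `reserve_cas(&b, 60)` call can succeed. The compare-and-swap loop still costs an atomic read-modify-write on every call, which is the overhead David points out for paths that already hold the socket lock.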
> I believe you are right, and in fact I was about to post a "don't look at this patch"
> note, since it doesn't help multicast reception at all. I redid the tests more carefully
> and got nothing but noise.
> 
> We have a cache line ping-pong mess here, and it needs more thought.
> 
> I rewrote Kenny's program to use non-blocking sockets.
> 
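> Receivers are doing:
> 
>         int delay = 50;                 /* back-off sleep, in microseconds */
> 
>         fcntl(s, F_SETFL, O_NDELAY);    /* make the socket non-blocking */
>         while (1)
>         {
>             struct sockaddr_in from;
>             socklen_t fromlen = sizeof(from);
>             res = recvfrom(s, buf, 1000, 0, (struct sockaddr *)&from, &fromlen);
>             if (res == -1) {
>                 /* nothing queued (any error is treated the same): sleep a bit longer */
>                 delay++;
>                 usleep(delay);
>                 continue;
>             }
>             /* got a packet: shorten the sleep, but not below 40 us */
>             if (delay > 40)
>                 delay--;
>             ++npackets;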
> 
> With this little user-space change and 8 receivers on my dual quad-core machine, ksoftirqd
> takes only 8% of one CPU and there are no drops at all (instead of 100% CPU and 30% drops).
> 
> So this is definitely a problem of scheduler cache line ping-pongs mixing with network
> stack cache line ping-pongs.
> 
> We could reorder fields so that fewer cache lines are touched by the softirq processing.
> I tried this but still got packet drops.
> 
> 
> 

I have more questions:

What is the maximum latency you can afford on the delivery of the packet(s)?

Are the user apps using real-time scheduling?

I had an idea: keep the CPU that handles NIC interrupts doing nothing but delivering
packets to socket queues, without touching the scheduler. It would do fast queueing
and wake up a workqueue (on another CPU) to perform the scheduler work. But that means
some extra latency (on the order of 2 or 3 us, I guess).

We could enter this mode automatically if the NIC rx handler *sees* that more than
N packets are waiting in the NIC queue. With moderate or light traffic, no
extra latency would be added. This would mean some changes in the NIC driver.

Hmm, then again, if the NIC rx handler is run from ksoftirqd, we already know
we are in a stress situation, so maybe no driver changes are necessary:
just test whether we are running in ksoftirqd...
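
The control flow of this idea can be sketched as a toy user-space model. Nothing here is kernel code or a real kernel API: `struct rx_model`, `rx_deliver()`, `rx_flush_deferred()` and `MAX_DEFERRED` are hypothetical names, and the `in_ksoftirqd` flag simply stands in for whatever test the rx path would use to detect the stress situation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEFERRED 64  /* hypothetical capacity of the deferred list */

struct rx_model {
    int direct_wakeups;          /* wakeups performed in the rx path itself */
    int deferred[MAX_DEFERRED];  /* sockets whose wakeup was postponed */
    size_t ndeferred;
};

/*
 * Model of delivering one packet to socket sock_id. Under light load the
 * receiver is woken immediately. Under stress the rx path only records
 * that a wakeup is owed and leaves the scheduler work to someone else.
 */
void rx_deliver(struct rx_model *m, int sock_id, bool in_ksoftirqd)
{
    assert(m != NULL);
    if (in_ksoftirqd && m->ndeferred < MAX_DEFERRED) {
        m->deferred[m->ndeferred++] = sock_id;
        return;
    }
    m->direct_wakeups++;  /* light load, or deferred list full */
}

/*
 * Model of the work item running on another CPU: perform all postponed
 * wakeups in one batch and return how many there were.
 */
size_t rx_flush_deferred(struct rx_model *m)
{
    size_t n;

    assert(m != NULL);
    n = m->ndeferred;
    m->ndeferred = 0;
    return n;
}
```

The extra latency mentioned above corresponds to the time between `rx_deliver()` recording a wakeup and `rx_flush_deferred()` running on the other CPU.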


