From: Eric Dumazet <dada1@cosmosbay.com>
To: David Miller <davem@davemloft.net>
Cc: kchang@athenacr.com, netdev@vger.kernel.org, cl@linux-foundation.org
Subject: Re: Multicast packet loss
Date: Wed, 04 Mar 2009 09:36:57 +0100
Message-ID: <49AE3DA9.2020103@cosmosbay.com>
In-Reply-To: <20090304.001646.100690134.davem@davemloft.net>
David Miller wrote:
> From: Eric Dumazet <dada1@cosmosbay.com>
> Date: Sat, 28 Feb 2009 09:51:11 +0100
>
>> David, this is a preliminary work, not meant for inclusion as is,
>> comments are welcome.
>>
>> [PATCH] net: sk_forward_alloc becomes an atomic_t
>>
>> Commit 95766fff6b9a78d11fc2d3812dd035381690b55d
>> (UDP: Add memory accounting) introduced a regression for high rate UDP flows,
>> because of extra lock_sock() in udp_recvmsg()
>>
>> In order to reduce need for lock_sock() in UDP receive path, we might need
>> to declare sk_forward_alloc as an atomic_t.
>>
>> udp_recvmsg() can avoid a lock_sock()/release_sock() pair.
>>
>> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
>
> This adds new overhead for TCP which has to hold the socket
> lock for other reasons in these paths.
>
> I don't get how an atomic_t operation is cheaper than a
> lock_sock/release_sock. Is it the case that in many
> executions of these paths only atomic_read()'s are necessary?
>
> I actually think this scheme is racy. There is a reason we
> have to hold the socket lock when doing memory scheduling.
> Two threads can get in there and say "hey I have enough space
> already" even though only enough space is allocated for one
> of their requests.
>
> What did I miss? :)
>
I believe you are right. In fact, I was about to post a "don't look at this patch"
note, since it doesn't help multicast reception at all: I redid the tests more
carefully and got nothing but noise.
We have a cache line ping-pong mess here, and it needs more thinking.
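To make the race David describes concrete, here is a minimal userspace sketch
in C11. It is not kernel code and not the patch under discussion: forward_alloc,
reserve_racy() and reserve_cas() are hypothetical names invented for this
illustration. The first function separates the "do I have enough space?" check
from the subtraction, so two threads can both pass the check; the second fuses
them into one compare-and-swap retry loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-socket pre-allocated byte budget. */
static atomic_int forward_alloc;

/* Racy: the check and the subtraction are two separate atomic steps,
 * so two threads can both pass the check before either one subtracts. */
static bool reserve_racy(int size)
{
	assert(size > 0);
	if (atomic_load(&forward_alloc) < size)
		return false;
	atomic_fetch_sub(&forward_alloc, size);
	return true;
}

/* Check and update fused into one compare-and-swap retry loop. */
static bool reserve_cas(int size)
{
	int cur = atomic_load(&forward_alloc);

	assert(size > 0);
	while (cur >= size) {
		/* On failure, cur is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&forward_alloc, &cur, cur - size))
			return true;
	}
	return false;
}
```

Even the second form is still an atomic read-modify-write, which needs
exclusive ownership of the cache line, so it would not by itself remove the
cache line bouncing mentioned above.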
I rewrote Kenny's program to use non-blocking sockets.
Receivers are doing:
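	int delay = 50;		/* sleep between empty polls, in microseconds */

	fcntl(s, F_SETFL, O_NDELAY);	/* non-blocking socket */
	while (1) {
		struct sockaddr_in from;
		socklen_t fromlen = sizeof(from);

		res = recvfrom(s, buf, 1000, 0, (struct sockaddr *)&from, &fromlen);
		if (res == -1) {
			/* nothing queued: back off a little longer */
			delay++;
			usleep(delay);
			continue;
		}
		/* got a packet: poll slightly sooner, but never below 40 us */
		if (delay > 40)
			delay--;
		++npackets;
	}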
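For readers who want to try the receive loop above, here is a self-contained
sketch of the same adaptive back-off. The helper names next_delay() and
drain_socket(), the packet-count stop condition and the errno checks are
assumptions added for this illustration; they are not part of the original
test program.

```c
#define _DEFAULT_SOURCE		/* for usleep() under strict -std=c99/c11 */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Same policy as the loop above: grow the sleep by 1 us after an empty
 * poll, shrink it by 1 us (with a floor of 40 us) after a packet. */
static int next_delay(int delay, int got_packet)
{
	if (!got_packet)
		return delay + 1;
	return delay > 40 ? delay - 1 : delay;
}

/* Hypothetical helper: poll socket s until `want` datagrams have arrived.
 * Returns the number received, or -1 on a real error. */
static long drain_socket(int s, long want)
{
	char buf[1000];
	int delay = 50;
	long npackets = 0;
	int flags = fcntl(s, F_GETFL, 0);

	assert(s >= 0);
	if (flags == -1 || fcntl(s, F_SETFL, flags | O_NONBLOCK) == -1)
		return -1;

	while (npackets < want) {
		ssize_t res = recvfrom(s, buf, sizeof(buf), 0, NULL, NULL);

		if (res == -1) {
			if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
				return -1;	/* real error, not "queue empty" */
			delay = next_delay(delay, 0);
			usleep(delay);
			continue;
		}
		delay = next_delay(delay, 1);
		npackets++;
	}
	return npackets;
}
```

The sketch uses O_NONBLOCK, which on Linux has the same value as the O_NDELAY
flag used above, and it preserves the other file status flags instead of
overwriting them.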
With this little user-space change and 8 receivers on my dual quad-core machine,
softirqd only takes 8% of one CPU, with no drops at all (instead of 100% CPU and
30% drops).
So this is definitely a problem of scheduler cache line ping-pongs mixing with
network stack cache line ping-pongs.
We could reorder fields so that fewer cache lines are touched by the softirq
processing; I tried this but still got packet drops.
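As an illustration of what such a field reordering means, here is a small
sketch. struct demo_sock, its field names and the 64-byte line size are all
hypothetical; this is not the real struct sock layout, nor the reordering that
was tested. The idea is only that fields written by the softirq side share one
cache line, while fields written by the receiving thread start on another.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64	/* assumed line size; typical on x86, not universal */

/* Hypothetical layout, for illustration only. */
struct demo_sock {
	/* written by the softirq (producer) side */
	int   rmem_alloc;
	int   backlog_len;
	void *queue_tail;

	/* written by the receiving thread (consumer) side */
	_Alignas(CACHE_LINE) int forward_alloc;
	void *queue_head;
};

/* Index of the cache line that a given byte offset falls in. */
static size_t line_of(size_t offset)
{
	return offset / CACHE_LINE;
}

/* Compile-time check that the two groups do not share a line. */
static_assert(offsetof(struct demo_sock, queue_tail) < CACHE_LINE,
	      "producer fields must fit in the first cache line");
static_assert(offsetof(struct demo_sock, forward_alloc) % CACHE_LINE == 0,
	      "consumer fields must start on a cache line boundary");
```

A layout like this limits how many lines each side dirties, but as noted above,
reordering alone did not make the drops go away in these tests.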