From: Gregory Haskins <gregory.haskins@gmail.com>
To: David Miller <davem@davemloft.net>
Cc: herbert@gondor.apana.org.au, pmullaney@novell.com,
chuck.lever@oracle.com, netdev@vger.kernel.org,
Gregory Haskins <ghaskins@novell.com>
Subject: Re: Killing sk->sk_callback_lock
Date: Tue, 17 Jun 2008 07:42:44 -0400 [thread overview]
Message-ID: <4857A334.5020501@gmail.com> (raw)
In-Reply-To: <20080616.215632.119969915.davem@davemloft.net>
[-- Attachment #1: Type: text/plain, Size: 2594 bytes --]
>>> On Tue, Jun 17, 2008 at 12:56 AM, in message
<20080616.215632.119969915.davem@davemloft.net>, David Miller
<davem@davemloft.net> wrote:
> From: "Gregory Haskins" <ghaskins@novell.com>
> Date: Mon, 16 Jun 2008 22:01:13 -0600
>
>> This seemed odd to us, so we investigated further to see if an
>> improvement was lurking or whether this was expected. We traced
>> back the source of each wakeup to be coming from 1) the wmem/nospace
>> code, and 2) from the rx-wakeup code from the softirq. First the
>> softirq would process the tx-completions which would wake_up() the
>> wait-queue for NOSPACE signaling. Since the client was waiting for
>> a packet on the same wait-queue, this was where the first wakeup
>> came from. Then later the softirq finally pushed an actual packet
>> to the queue, and the client was once again re-awoken via the same
>> overloaded wait-queue. This time it would successfully find a
>> packet and return to userspace.
>>
>> Since the client does not care about wmem/nospace in the UDP rx
>> path, yet the two events share a single wait-queue, the first wakeup
>> was completely wasted. It just causes extra scheduling activity
>> that does not help in any way (and is quite expensive in the
>> grand-scheme of things). Based on this lead, Pat devised a solution
>> which eliminates the extra wake-up() when there are no clients
>> waiting for that particular NOSPACE event. With his patch applied,
>> we observed two things:
>
> Why is the application checking for receive packets even on the
> write-space wakeup?
>
> poll/select/epoll should be giving the correct event indication,
> therefore the application would know to not check for receive
> packets when a write-wakeup event occurs.
>
> Yes the wakeup is spurious and we should avoid it. But this
> application is also buggy.
The application is blocked inside a system call (I forget which one
right now..probably recv()). So the wakeup is not against a
poll/select. Rather, the kernel is in
net/core/datagram.c::wait_for_packet() (blocked on skb->sk_sleep).
Since both the wmem code and the rx code use skb->sk_sleep to wake up
waiters, the wmem processing inadvertently kicks the client to go
through __skb_recv_datagram() one more time. And since there aren't yet
any packets in skb->sk_receive_queue, the client loops and once again
calls wait_for_packet().
So long story short: This is entirely a kernel-space issue (unless you
believe the usage of that system-call itself is a bug?)
HTH
Regards,
-Greg
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 250 bytes --]
next prev parent reply other threads:[~2008-06-17 11:43 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <484F43A3020000760003F543@lucius.provo.novell.com>
2008-06-17 1:38 ` Killing sk->sk_callback_lock (was Re: [PATCH] net/core/sock.c remove extra wakeup) Patrick Mullaney
2008-06-17 1:53 ` Killing sk->sk_callback_lock David Miller
2008-06-17 4:01 ` Gregory Haskins
2008-06-17 4:09 ` David Miller
2008-06-17 4:20 ` Gregory Haskins
2008-06-17 4:30 ` Gregory Haskins
2008-06-17 4:22 ` Fwd: " Gregory Haskins
2008-06-17 4:56 ` David Miller
2008-06-17 11:42 ` Gregory Haskins [this message]
2008-06-17 13:38 ` Patrick Mullaney
2008-06-17 21:40 ` David Miller
2008-06-17 22:15 ` Patrick Mullaney
2008-06-17 23:24 ` Herbert Xu
2008-06-18 7:36 ` Remi Denis-Courmont
2008-06-17 2:56 ` Killing sk->sk_callback_lock (was Re: [PATCH] net/core/sock.c remove extra wakeup) Herbert Xu
2008-06-17 3:09 ` Eric Dumazet
2008-06-17 23:33 ` Herbert Xu
[not found] <4857F579020000B30004963C@lucius.provo.novell.com>
2008-06-18 16:49 ` Patrick Mullaney
2008-06-18 21:56 ` Killing sk->sk_callback_lock David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4857A334.5020501@gmail.com \
--to=gregory.haskins@gmail.com \
--cc=chuck.lever@oracle.com \
--cc=davem@davemloft.net \
--cc=ghaskins@novell.com \
--cc=herbert@gondor.apana.org.au \
--cc=netdev@vger.kernel.org \
--cc=pmullaney@novell.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.