>>> On Tue, Jun 17, 2008 at 12:56 AM, in message <20080616.215632.119969915.davem@davemloft.net>, David Miller wrote:
> From: "Gregory Haskins"
> Date: Mon, 16 Jun 2008 22:01:13 -0600
>
>> This seemed odd to us, so we investigated further to see if an
>> improvement was lurking or whether this was expected. We traced
>> back the source of each wakeup to be coming from 1) the wmem/nospace
>> code, and 2) from the rx-wakeup code from the softirq. First the
>> softirq would process the tx-completions which would wake_up() the
>> wait-queue for NOSPACE signaling. Since the client was waiting for
>> a packet on the same wait-queue, this was where the first wakeup
>> came from. Then later the softirq finally pushed an actual packet
>> to the queue, and the client was once again re-awoken via the same
>> overloaded wait-queue. This time it would successfully find a
>> packet and return to userspace.
>>
>> Since the client does not care about wmem/nospace in the UDP rx
>> path, yet the two events share a single wait-queue, the first wakeup
>> was completely wasted. It just causes extra scheduling activity
>> that does not help in any way (and is quite expensive in the
>> grand-scheme of things). Based on this lead, Pat devised a solution
>> which eliminates the extra wake-up() when there are no clients
>> waiting for that particular NOSPACE event. With his patch applied,
>> we observed two things:
>
> Why is the application checking for receive packets even on the
> write-space wakeup?
>
> poll/select/epoll should be giving the correct event indication,
> therefore the application would know to not check for receive
> packets when a write-wakeup event occurs.
>
> Yes the wakeup is spurious and we should avoid it. But this
> application is also buggy.

The application is blocked inside a system call (I forget which one
right now..probably recv()). So the wakeup is not against a poll/select.
Rather, the kernel is in net/core/datagram.c::wait_for_packet() (blocked
on sk->sk_sleep). Since both the wmem code and the rx code use
sk->sk_sleep to wake up waiters, the wmem processing inadvertently kicks
the client to go through __skb_recv_datagram() one more time. And since
there aren't yet any packets in sk->sk_receive_queue, the client loops
and once again calls wait_for_packet().

So long story short: This is entirely a kernel-space issue (unless you
believe the usage of that system-call itself is a bug?)

HTH

Regards,
-Greg
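
To make the sequence described above concrete, here is a toy, single-threaded
user-space model in C. It is only a sketch, not kernel code: every name in it
(toy_sock, toy_deliver, toy_recv_once, toy_count_wakeups, filter_nospace) is
hypothetical, and the filter_nospace flag merely stands in for "skip the
wakeup when nobody is waiting for NOSPACE"; it does not claim to reflect how
the actual patch does that. The model assumes the only sleeper is a reader
blocked in a receive call.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; all names are hypothetical, none are kernel APIs. */
enum toy_event { TOY_WRITE_SPACE, TOY_RX_PACKET };

struct toy_sock {
    int  rx_queue_len;     /* stands in for the socket receive queue */
    bool reader_sleeping;  /* a reader is parked on the shared wait-queue */
    int  reader_wakeups;   /* times the reader was woken */
};

/* "Softirq" side: deliver one event and wake the shared wait-queue. */
static void toy_deliver(struct toy_sock *sk, enum toy_event ev,
                        bool filter_nospace)
{
    if (ev == TOY_RX_PACKET)
        sk->rx_queue_len++;

    /* Assumption: the only sleeper is a reader, so nobody waits for
     * NOSPACE and a filtered write-space event wakes no one. */
    if (ev == TOY_WRITE_SPACE && filter_nospace)
        return;

    if (sk->reader_sleeping) {
        sk->reader_sleeping = false;
        sk->reader_wakeups++;
    }
}

/* Reader side: one pass of the receive loop. Returns true if a packet
 * was dequeued; otherwise the reader goes back to sleep. */
static bool toy_recv_once(struct toy_sock *sk)
{
    if (sk->rx_queue_len > 0) {
        sk->rx_queue_len--;
        return true;
    }
    sk->reader_sleeping = true;
    return false;
}

/* Replay the sequence from the thread: tx-completion first, then the
 * actual packet. Returns how many times the reader was woken. */
int toy_count_wakeups(bool filter_nospace)
{
    struct toy_sock sk = { 0, false, 0 };
    const enum toy_event seq[] = { TOY_WRITE_SPACE, TOY_RX_PACKET };
    size_t i;

    toy_recv_once(&sk);  /* queue is empty, so the reader sleeps */

    for (i = 0; i < sizeof(seq) / sizeof(seq[0]); i++) {
        toy_deliver(&sk, seq[i], filter_nospace);
        /* If woken, the reader retries; stop once it has a packet. */
        if (!sk.reader_sleeping && toy_recv_once(&sk))
            break;
    }
    return sk.reader_wakeups;
}
```

In this model, toy_count_wakeups(false) returns 2 (one wasted wakeup from the
write-space event plus the useful one), while toy_count_wakeups(true) returns
1, which matches the behaviour described in the thread.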