Fwd: Re: Killing sk->sk_callback_lock

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Gregory Haskins" <ghaskins@novell.com>
To: <davem@davemloft.net>, "Patrick Mullaney" <PMullaney@novell.com>
Cc: "Herbert Xu" <herbert@gondor.apana.org.au>,
	<chuck.lever@oracle.com>, <netdev@vger.kernel.org>
Subject: Fwd: Re: Killing sk->sk_callback_lock
Date: Mon, 16 Jun 2008 22:22:35 -0600	[thread overview]
Message-ID: <485703CB.BA47.005A.0@novell.com> (raw)
In-Reply-To: 4856FEC9.BA47.005A.0@novell.com

Oddly enough, forwarding an email does seem to linewrap..so perhaps this one is easier to read:

Regards,
-Greg

>>> On Tue, Jun 17, 2008 at 12:01 AM, in message <4856FEC9.BA47.005A.0@novell.com>,
Gregory Haskins wrote: 
> >>> On Mon, Jun 16, 2008 at  9:53 PM, in message
> <20080616.185328.85842051.davem@davemloft.net>, David Miller
> <davem@davemloft.net> wrote: 
>> From: "Patrick Mullaney" <pmullaney@novell.com>
>> Date: Mon, 16 Jun 2008 19:38:23 -0600
>> 
>>> The overhead I was trying to address was scheduler overhead.
>> 
>> Neither Herbert nor I are convinced of this yet, and you have
>> to show us why you think this is the problem and not (in
>> our opinion) the more likely sk_callback_lock overhead.
> 
> Please bear with us.  It is not our intent to be annoying, but we are 
> perhaps doing a poor job of describing the actual nature of the issue we are 
> seeing.
> 
> To be clear on how we got to this point: We are tasked with improving the 
> performance of our particular kernel configuration.  We observed that our 
> configuration was comparatively poor at UDP performance, so we started 
> investigating why.  We instrumented the kernel using various tools (lockdep, 
> oprofile, logdev, etc), and observed two wakeups for every packet that was 
> received while running a multi-threaded netperf UDP throughput benchmark on 
> some 8-core boxes.
> 
> A common pattern would emerge during analysis of the instrumentation data: 
> An arbitrary client thread would block on the wait-queue waiting for a UDP 
> packet.  It would of course initially block because there were no packets 
> available.  It would then wake up, check the queue, see that there are still 
> no packets available, and go back to sleep.  It would then wake up again a 
> short time later, find a packet, and return to userspace.
> 
> This seemed odd to us, so we investigated further to see if an improvement 
> was lurking or whether this was expected.  We traced back the source of each 
> wakeup to be coming from 1) the wmem/nospace code, and 2) from the rx-wakeup 
> code from the softirq.  First the softirq would process the tx-completions 
> which would wake_up() the wait-queue for NOSPACE signaling.  Since the client 
> was waiting for a packet on the same wait-queue, this was where the first 
> wakeup came from.  Then later the softirq finally pushed an actual packet to 
> the queue, and the client was once again re-awoken via the same overloaded 
> wait-queue.  This time it would successfully find a packet and return to 
> userspace.
> 
> Since the client does not care about wmem/nospace in the UDP rx path, yet 
> the two events share a single wait-queue, the first wakeup was completely 
> wasted.  It just causes extra scheduling activity that does not help in any 
> way (and is quite expensive in the grand-scheme of things).  Based on this 
> lead, Pat devised a solution which eliminates the extra wake-up() when there 
> are no clients waiting for that particular NOSPACE event.  With his patch 
> applied, we observed two things:
> 
> 1) We now had 1 wake-up per packet, instead of 2 (decreasing context 
> switching rates by ~50%)
> 2) overall UDP throughput performance increased by ~25%
> 
> This was true even without the presence of our instrumentation, so I don't 
> think we can chalk up the "double-wake" analysis as an anomaly caused by the 
> presence of the instrumentation itself.  Based on that, it would at least 
> appear that the odd behavior w.r.t. the phantom wakeup does indeed hinder 
> performance.  This is not to say that the locking issues you highlight are 
> not also an issue.  But note that we have no evidence to suggest this 
> particular phenomenon it is related in any way to the locking (in fact, 
> sk_callback_lock was not showing up at all on the lockdep radar for this 
> particular configuration, indicating a low contention rate).
> 
> So by all means, if there are improvements to the locking that can be made, 
> thats great!  But fixing the locking will not likely address this scheduler 
> overhead that Pat refers to IMO.  They would appear to be orthogonal issues.  
> I will keep an open mind, but the root-cause seems to be either the tendency 
> of the stack code to overload the wait-queue, or the fact that UDP sockets do 
> not dynamically manage the NOSPACE state flags.  From my perspective, I am 
> not married to any one particular solution as long as the fundamental 
> "phantom wakeup" problem is addressed.
> 
> HTH
> 
> Regards,
> -Greg
> 
>

next prev parent reply	other threads:[~2008-06-17  4:22 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <484F43A3020000760003F543@lucius.provo.novell.com>
2008-06-17  1:38 ` Killing sk->sk_callback_lock (was Re: [PATCH] net/core/sock.c remove extra wakeup) Patrick Mullaney
2008-06-17  1:53   ` Killing sk->sk_callback_lock David Miller
2008-06-17  4:01     ` Gregory Haskins
2008-06-17  4:09       ` David Miller
2008-06-17  4:20         ` Gregory Haskins
2008-06-17  4:30           ` Gregory Haskins
2008-06-17  4:22       ` Gregory Haskins [this message]
2008-06-17  4:56       ` David Miller
2008-06-17 11:42         ` Gregory Haskins
2008-06-17 13:38     ` Patrick Mullaney
2008-06-17 21:40       ` David Miller
2008-06-17 22:15         ` Patrick Mullaney
2008-06-17 23:24           ` Herbert Xu
2008-06-18  7:36         ` Remi Denis-Courmont
2008-06-17  2:56   ` Killing sk->sk_callback_lock (was Re: [PATCH] net/core/sock.c remove extra wakeup) Herbert Xu
2008-06-17  3:09     ` Eric Dumazet
2008-06-17 23:33   ` Herbert Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=485703CB.BA47.005A.0@novell.com \
    --to=ghaskins@novell.com \
    --cc=PMullaney@novell.com \
    --cc=chuck.lever@oracle.com \
    --cc=davem@davemloft.net \
    --cc=herbert@gondor.apana.org.au \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.