linux-kernel.vger.kernel.org archive mirror
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Rik van Riel <riel@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>, "H. Peter Anvin" <hpa@zytor.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	aquini@redhat.com, Andrew Morton <akpm@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Michel Lespinasse <walken@google.com>,
	linux-tip-commits@vger.kernel.org,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line
Date: Wed, 13 Feb 2013 17:21:08 -0800	[thread overview]
Message-ID: <CA+55aFws80EEJQskqiP+a5J-HOGu=M1Fe=uCs50ifedVxPxT1Q@mail.gmail.com> (raw)
In-Reply-To: <511C24A6.8020409@redhat.com>

On Wed, Feb 13, 2013 at 3:41 PM, Rik van Riel <riel@redhat.com> wrote:
>
> I have an example of the second case. It is a test case
> from a customer issue, where an application is contending on
> semaphores, doing semaphore lock and unlock operations. The
> test case simply has N threads, trying to lock and unlock the
> same semaphore.
>
> The attached graph (which I sent out with the intro email to
> my patches) shows how reducing the memory accesses from the
> spinlock wait path prevents the large performance degradation
> seen with the vanilla kernel. This is on a 24 CPU system with
> 4 6-core AMD CPUs.
>
> The "prop-N" series are with a fixed delay proportional back-off.
> You can see that a small value of N does not help much for large
> numbers of cpus, and a large value hurts with a small number of
> CPUs. The automatic tuning appears to be quite robust.

Ok, good, so there are some numbers. I didn't see any in the commit
messages anywhere, and since the threads I've looked at are from
tip-bot, I never saw the intro email.

That said, it's interesting that this happens with the semaphore path.
We've had other cases where the spinlocks inside the *sleeping* locks have
caused problems, and I wonder if we should look at that path in
particular.

> If we have only a few CPUs contending on the lock, the delays
> will be short.

Yes. I'm more worried about the overhead, especially on I$ (and to a
lesser degree on D$ when loading hashed delay values etc). I don't
believe it would ever loop very long, it's the other overhead I'd be
worried about.

From looking at profiles of the kernel loads I've cared about (ie
largely VFS code), the I$ footprint seems to be a big deal, and
function entry (and the instruction *after* a call instruction)
actually tend to be hotspots. Which is why I care about things like
function prologues for leaf functions etc.
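
[For context, the fast-path/slow-path split that the thread's subject line
refers to can be sketched like this. This is a userspace sketch with invented
names, not the actual tip commit or the kernel's arch_spinlock_t.]

```c
#include <stdatomic.h>

/* Illustrative ticket lock with the contended wait moved out of line. */
struct tspin {
    atomic_uint next;   /* ticket dispenser */
    atomic_uint owner;  /* ticket currently being served */
};

/* Out-of-line slow path: only reached when the lock is contended. */
static void __attribute__((noinline))
tspin_wait(struct tspin *l, unsigned ticket)
{
    while (atomic_load_explicit(&l->owner, memory_order_acquire) != ticket)
        ;  /* the kernel would cpu_relax() here */
}

static inline void tspin_lock(struct tspin *l)
{
    unsigned ticket = atomic_fetch_add_explicit(&l->next, 1,
                                                memory_order_relaxed);
    /* Uncontended case: the call is never executed, so the hot path
     * stays small in the I$ while the cold wait loop lives elsewhere. */
    if (atomic_load_explicit(&l->owner, memory_order_acquire) != ticket)
        tspin_wait(l, ticket);
}

static inline void tspin_unlock(struct tspin *l)
{
    atomic_fetch_add_explicit(&l->owner, 1, memory_order_release);
}
```

Note that even the conditional call costs something: the acquiring function now
contains a call, which is exactly the leaf-to-non-leaf conversion being
discussed.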

> Furthermore, the CPU at the head of the queue
> will run the old spinlock code with just cpu_relax() and checking
> the lock each iteration.

That's not AT ALL TRUE.

Look at the code you wrote. It does all the spinlock delay etc crap
unconditionally. Only the loop itself is conditional.

IOW, exactly all the overhead that I worry about. The function call,
the pointless turning of leaf functions into non-leaf functions, the
loading (and storing) of delay information etc etc.

The non-leaf-function thing is done even if you never hit the
slow-path, and affects the non-contention path. And the delay
information thing is done even if there is only one waiter on the
spinlock.

Did I miss anything?
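
[The structure being objected to can be paraphrased roughly as below. This is a
sketch, not Rik's actual patch: it assumes the wait path hashes the lock
address and loads/stores a per-lock delay value unconditionally, with names and
the hash invented for illustration and the auto-tuning itself omitted.]

```c
#include <stdatomic.h>
#include <stddef.h>

#define DELAY_HASH_SIZE 32

/* Per-hashed-lock-address delay values, as in the 4/5 patch (sketch). */
static unsigned delay_table[DELAY_HASH_SIZE];

struct tspin { atomic_uint next, owner; };

static size_t delay_hash(const void *lock)
{
    return ((size_t)lock >> 4) % DELAY_HASH_SIZE;  /* illustrative hash */
}

/* The point of the criticism: the hash, the delay load, and the delay
 * store all run on every contended acquire, even when this CPU is next
 * in line and a bare cpu_relax() loop would do. Only the spin itself
 * is conditional. */
static void tspin_wait_backoff(struct tspin *l, unsigned ticket)
{
    size_t slot = delay_hash(l);
    unsigned delay = delay_table[slot];            /* unconditional D$ load */

    while (atomic_load_explicit(&l->owner, memory_order_acquire) != ticket) {
        for (unsigned i = 0; i < delay; i++)
            ;                                      /* proportional back-off */
    }
    delay_table[slot] = delay;                     /* unconditional store
                                                      (tuning omitted here) */
}
```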

> Eric got a 45% increase in network throughput, and I saw a factor 4x
> or so improvement with the semaphore test. I realize these are not
> "real workloads", and I will give you numbers with those once I have
> gathered some, on different systems.

Good. This is what I want to see.

> Are there significant cases where "perf -g" is not easily available,
> or harmful to tracking down the performance issue?

Yes. There are lots of machines where you cannot get call chain
information with CPU event buffers (pebs). And without the CPU event
buffers, you cannot get good profile data at all.

Now, on other machines you get the call chain even with pebs because
you can get the whole

> The cause of that was identified (with pause loop exiting, the host
> effectively does the back-off for us), and the problem is avoided
> by limiting the maximum back-off value to something small on
> virtual guests.

And what if the hardware does something equivalent even when not
virtualized (ie power optimizations I already mentioned)?  That whole
maximum back-off limit seems to be just for known virtualization
issues. This is the kind of thing that makes me worry..
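
[The virtualization cap Rik describes can be sketched as below: with pause-loop
exiting, the host already backs the guest off, so the tuned delay is clamped to
a small ceiling. The constant and the flag are illustrative; the kernel would
use a hypervisor/CPU-feature check and the patch's own limit.]

```c
/* Illustrative ceiling, not the patch's actual value. */
#define SPIN_DELAY_CAP_VIRT 16

/* Stands in for a hypervisor detection check done at boot. */
static int on_virtual_guest;

static unsigned clamp_spin_delay(unsigned tuned_delay)
{
    if (on_virtual_guest && tuned_delay > SPIN_DELAY_CAP_VIRT)
        return SPIN_DELAY_CAP_VIRT;
    return tuned_delay;
}
```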

                   Linus
