From: Waiman Long <waiman.long@hp.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>,
Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>,
Boris Ostrovsky <boris.ostrovsky@oracle.com>,
virtualization@lists.linux-foundation.org,
Andi Kleen <andi@firstfloor.org>,
"H. Peter Anvin" <hpa@zytor.com>,
Michel Lespinasse <walken@google.com>,
Alok Kataria <akataria@vmware.com>,
linux-arch@vger.kernel.org, x86@kernel.org,
Ingo Molnar <mingo@redhat.com>,
Scott J Norton <scott.norton@hp.com>,
xen-devel@lists.xenproject.org,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Alexander Fyodorov <halcy@yandex.ru>,
Rik van Riel <riel@redhat.com>, Arnd Bergmann <arnd@arndb.de>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
Daniel J Blueman <daniel@numascale.com>,
Oleg Nesterov <oleg@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>,
Chris Wright <chrisw@sous-sol.org>,
George Spelvin <linux@horizon.com>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH v5 3/8] qspinlock, x86: Add x86 specific optimization for 2 contending tasks
Date: Tue, 04 Mar 2014 10:27:03 -0500 [thread overview]
Message-ID: <5315F0C7.8090909@hp.com> (raw)
In-Reply-To: <20140303174305.GK9987@twins.programming.kicks-ass.net>
On 03/03/2014 12:43 PM, Peter Zijlstra wrote:
> Hi,
>
> Here are some numbers for my version -- also attached is the test code.
> I found that booting big machines is tediously slow so I lifted the
> whole lot to userspace.
>
> I measure the cycles spent in arch_spin_lock() + arch_spin_unlock().
>
> The machines used are a 4 node (2 socket) AMD Interlagos, and a 2 node
> (2 socket) Intel Westmere-EP.
>
> AMD (ticket)                  AMD (qspinlock + pending + opt)
>
> Local:                        Local:
>
>  1:    324.425530              1:    324.102142
>  2:  17141.324050              2:    620.185930
>  3:  52212.232343              3:  25242.574661
>  4:  93136.458314              4:  47982.037866
>  6: 167967.455965              6:  95345.011864
>  8: 245402.534869              8: 142412.451438
>
> 2 - nodes:                    2 - nodes:
>
>  2:  12763.640956              2:   1879.460823
>  4:  94423.027123              4:  48278.719130
>  6: 167903.698361              6:  96747.767310
>  8: 257243.508294              8: 144672.846317
>
> 4 - nodes:                    4 - nodes:
>
>  4:  82408.853603              4:  49820.323075
>  8: 260492.952355              8: 143538.264724
> 16: 630099.031148             16: 337796.553795
>
>
>
> Intel (ticket)                Intel (qspinlock + pending + opt)
>
> Local:                        Local:
>
>  1:     19.002249              1:     29.002844
>  2:   5093.275530              2:   1282.209519
>  3:  22300.859761              3:  22127.477388
>  4:  44929.922325              4:  44493.881832
>  6:  86338.755247              6:  86360.083940
>
> 2 - nodes:                    2 - nodes:
>
>  2:   1509.193824              2:   1209.090219
>  4:  48154.495998              4:  48547.242379
>  8: 137946.787244              8: 141381.498125
>
> ---
>
> There are a few curious facts I found (assuming my test code is sane).
>
> - Intel seems to be an order of magnitude faster on uncontended LOCKed
> ops compared to AMD
>
> - On Intel the uncontended qspinlock fast path (cmpxchg) seems slower
> than the uncontended ticket xadd -- although both are plenty fast
> when compared to AMD.
>
> - In general, replacing cmpxchg loops with unconditional atomic ops
> doesn't seem to matter a whole lot when the thing is contended.
>
> Below is the (rather messy) qspinlock slow path code (the only thing
> that really differs between our versions).
>
> I'll try and slot your version in tomorrow.
>
> ---
>
It is curious to see that the qspinlock code offers a big benefit on AMD
machines, but not so much on Intel. Anyway, I am working on a revised
version of the patch that includes some of your comments. I will also
try to see if I can get an AMD machine to run tests on.
-Longman
Thread overview: 60+ messages
2014-02-26 15:14 [PATCH v5 0/8] qspinlock: a 4-byte queue spinlock with PV support Waiman Long
2014-02-26 15:14 ` [PATCH v5 1/8] qspinlock: Introducing a 4-byte queue spinlock implementation Waiman Long
2014-02-26 16:22 ` Peter Zijlstra
2014-02-27 20:25 ` Waiman Long
2014-02-26 16:24 ` Peter Zijlstra
2014-02-27 20:25 ` Waiman Long
2014-02-26 15:14 ` [PATCH v5 2/8] qspinlock, x86: Enable x86-64 to use queue spinlock Waiman Long
2014-02-26 15:14 ` [PATCH v5 3/8] qspinlock, x86: Add x86 specific optimization for 2 contending tasks Waiman Long
2014-02-26 16:20 ` Peter Zijlstra
2014-02-27 20:42 ` Waiman Long
2014-02-28 9:29 ` Peter Zijlstra
2014-02-28 16:25 ` Linus Torvalds
2014-02-28 17:37 ` Peter Zijlstra
2014-02-28 16:38 ` Waiman Long
2014-02-28 17:56 ` Peter Zijlstra
2014-03-03 17:43 ` Peter Zijlstra
2014-03-04 15:27 ` Waiman Long [this message]
2014-03-04 16:58 ` Peter Zijlstra
2014-03-04 18:09 ` Peter Zijlstra
2014-03-04 17:48 ` Waiman Long
2014-03-04 22:40 ` Peter Zijlstra
2014-03-05 20:59 ` Peter Zijlstra
2014-02-26 15:14 ` [PATCH RFC v5 4/8] pvqspinlock, x86: Allow unfair spinlock in a real PV environment Waiman Long
2014-02-26 17:07 ` Konrad Rzeszutek Wilk
2014-02-28 17:06 ` Waiman Long
2014-03-03 10:55 ` Paolo Bonzini
2014-03-04 15:15 ` Waiman Long
2014-03-04 15:23 ` Paolo Bonzini
2014-03-04 15:39 ` David Vrabel
2014-03-04 17:50 ` Raghavendra K T
2014-02-27 12:28 ` David Vrabel
2014-02-27 19:40 ` Waiman Long
2014-02-26 15:14 ` [PATCH RFC v5 5/8] pvqspinlock, x86: Enable unfair queue spinlock in a KVM guest Waiman Long
2014-02-26 17:08 ` Konrad Rzeszutek Wilk
2014-02-28 17:08 ` Waiman Long
2014-02-27 9:41 ` Paolo Bonzini
2014-02-27 19:05 ` Waiman Long
2014-02-27 10:40 ` Raghavendra K T
2014-02-27 19:12 ` Waiman Long
2014-02-26 15:14 ` [PATCH RFC v5 6/8] pvqspinlock, x86: Rename paravirt_ticketlocks_enabled Waiman Long
2014-02-26 15:14 ` [PATCH RFC v5 7/8] pvqspinlock, x86: Add qspinlock para-virtualization support Waiman Long
2014-02-26 17:54 ` Konrad Rzeszutek Wilk
2014-02-27 12:11 ` David Vrabel
2014-02-27 13:11 ` Paolo Bonzini
2014-02-27 14:18 ` David Vrabel
2014-02-27 14:45 ` Paolo Bonzini
2014-02-27 15:22 ` Raghavendra K T
2014-02-27 15:50 ` Paolo Bonzini
2014-03-03 11:06 ` [Xen-devel] " David Vrabel
2014-02-27 20:50 ` Waiman Long
2014-02-27 19:42 ` Waiman Long
2014-02-26 15:14 ` [PATCH RFC v5 8/8] pvqspinlock, x86: Enable KVM to use qspinlock's PV support Waiman Long
2014-02-27 9:31 ` Paolo Bonzini
2014-02-27 18:36 ` Waiman Long
2014-02-26 17:00 ` [PATCH v5 0/8] qspinlock: a 4-byte queue spinlock with " Konrad Rzeszutek Wilk
2014-02-28 16:56 ` Waiman Long
2014-02-26 22:26 ` Paul E. McKenney
-- strict thread matches above, loose matches on Subject: below --
2014-02-27 4:32 Waiman Long
2014-02-27 4:32 ` [PATCH v5 3/8] qspinlock, x86: Add x86 specific optimization for 2 contending tasks Waiman Long
2014-03-02 13:16 ` Oleg Nesterov
2014-03-04 14:54 ` Waiman Long