linux-arch.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Waiman.Long@hp.com, linux-arch@vger.kernel.org, riel@redhat.com,
	gleb@redhat.com, kvm@vger.kernel.org, boris.ostrovsky@oracle.com,
	scott.norton@hp.com, raghavendra.kt@linux.vnet.ibm.com,
	paolo.bonzini@gmail.com, linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	Peter Zijlstra <peterz@infradead.org>,
	chegu_vinod@hp.com, david.vrabel@citrix.com, oleg@redhat.com,
	xen-devel@lists.xenproject.org, tglx@linutronix.de,
	paulmck@linux.vnet.ibm.com, torvalds@linux-foundation.org,
	mingo@kernel.org
Subject: Re: [PATCH 10/11] qspinlock: Paravirt support
Date: Fri, 20 Jun 2014 09:46:08 -0400	[thread overview]
Message-ID: <20140620134608.GA11545@laptop.dumpdata.com> (raw)
In-Reply-To: <20140615130154.213923590@chello.nl>

On Sun, Jun 15, 2014 at 02:47:07PM +0200, Peter Zijlstra wrote:
> Add minimal paravirt support.
> 
> The code aims for minimal impact on the native case.

Woot!
> 
> On the lock side we add one jump label (asm_goto) and 4 paravirt
> callee saved calls that default to NOPs. The only effects are the
> extra NOPs and some pointless MOVs to accomodate the calling
> convention. No register spills happen because of this (x86_64).
> 
> On the unlock side we have one paravirt callee saved call, which
> defaults to the actual unlock sequence: "movb $0, (%rdi)" and a NOP.
> 
> The actual paravirt code comes in 3 parts;
> 
>  - init_node; this initializes the extra data members required for PV
>    state. PV state data is kept 1 cacheline ahead of the regular data.
> 
>  - link_and_wait_node/kick_node; these are paired with the regular MCS
>    queueing and are placed resp. before/after the paired MCS ops.
> 
>  - wait_head/queue_unlock; the interesting part here is finding the
>    head node to kick.
> 
> Tracking the head is done in two parts, firstly the pv_wait_head will
> store its cpu number in whichever node is pointed to by the tail part
> of the lock word. Secondly, pv_link_and_wait_node() will propagate the
> existing head from the old to the new tail node.

I dug in the code and I have some comments about it, but before
I post them I was wondering if you have any plans to run any performance
tests against the PV ticketlock with normal and over-committed scenarios?

Looking at this with a pen and paper I see that compared to
PV ticketlock for the CPUs that are contending on the queue (so they
go to pv_wait_head_and_link, then progress to pv_wait_head), they
go sleep twice and get woken up twice. In PV ticketlock the
contending CPUs would only go to sleep once and woken up once it
was their turn.

That of course is the worst case scenario - where the CPU
that has the lock is taking forever to do its job and the
host is quite overcommitted.

Thanks!

WARNING: multiple messages have this Message-ID (diff)
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Waiman.Long@hp.com, tglx@linutronix.de, mingo@kernel.org,
	linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	xen-devel@lists.xenproject.org, kvm@vger.kernel.org,
	paolo.bonzini@gmail.com, boris.ostrovsky@oracle.com,
	paulmck@linux.vnet.ibm.com, riel@redhat.com,
	torvalds@linux-foundation.org, raghavendra.kt@linux.vnet.ibm.com,
	david.vrabel@citrix.com, oleg@redhat.com, gleb@redhat.com,
	scott.norton@hp.com, chegu_vinod@hp.com,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH 10/11] qspinlock: Paravirt support
Date: Fri, 20 Jun 2014 09:46:08 -0400	[thread overview]
Message-ID: <20140620134608.GA11545@laptop.dumpdata.com> (raw)
Message-ID: <20140620134608.4rL0irEz7LhmmWKItc0RVQS7yFnWS3PBk1n1anGkubs@z> (raw)
In-Reply-To: <20140615130154.213923590@chello.nl>

On Sun, Jun 15, 2014 at 02:47:07PM +0200, Peter Zijlstra wrote:
> Add minimal paravirt support.
> 
> The code aims for minimal impact on the native case.

Woot!
> 
> On the lock side we add one jump label (asm_goto) and 4 paravirt
> callee saved calls that default to NOPs. The only effects are the
> extra NOPs and some pointless MOVs to accomodate the calling
> convention. No register spills happen because of this (x86_64).
> 
> On the unlock side we have one paravirt callee saved call, which
> defaults to the actual unlock sequence: "movb $0, (%rdi)" and a NOP.
> 
> The actual paravirt code comes in 3 parts;
> 
>  - init_node; this initializes the extra data members required for PV
>    state. PV state data is kept 1 cacheline ahead of the regular data.
> 
>  - link_and_wait_node/kick_node; these are paired with the regular MCS
>    queueing and are placed resp. before/after the paired MCS ops.
> 
>  - wait_head/queue_unlock; the interesting part here is finding the
>    head node to kick.
> 
> Tracking the head is done in two parts, firstly the pv_wait_head will
> store its cpu number in whichever node is pointed to by the tail part
> of the lock word. Secondly, pv_link_and_wait_node() will propagate the
> existing head from the old to the new tail node.

I dug in the code and I have some comments about it, but before
I post them I was wondering if you have any plans to run any performance
tests against the PV ticketlock with normal and over-committed scenarios?

Looking at this with a pen and paper I see that compared to
PV ticketlock for the CPUs that are contending on the queue (so they
go to pv_wait_head_and_link, then progress to pv_wait_head), they
go sleep twice and get woken up twice. In PV ticketlock the
contending CPUs would only go to sleep once and woken up once it
was their turn.

That of course is the worst case scenario - where the CPU
that has the lock is taking forever to do its job and the
host is quite overcommitted.

Thanks!

  parent reply	other threads:[~2014-06-20 13:46 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-15 12:46 [PATCH 00/11] qspinlock with paravirt support Peter Zijlstra
2014-06-15 12:46 ` Peter Zijlstra
2014-06-15 12:46 ` [PATCH 01/11] qspinlock: A simple generic 4-byte queue spinlock Peter Zijlstra
2014-06-15 12:46   ` Peter Zijlstra
2014-06-16 20:49   ` Konrad Rzeszutek Wilk
2014-06-16 20:49     ` Konrad Rzeszutek Wilk
2014-06-17 20:03     ` Konrad Rzeszutek Wilk
2014-06-17 20:03       ` Konrad Rzeszutek Wilk
2014-06-23 16:12       ` Peter Zijlstra
2014-06-23 16:12         ` Peter Zijlstra
2014-06-23 16:20         ` Konrad Rzeszutek Wilk
2014-06-23 16:20           ` Konrad Rzeszutek Wilk
2014-06-23 15:56     ` Peter Zijlstra
2014-06-23 16:16       ` Konrad Rzeszutek Wilk
2014-06-23 16:16         ` Konrad Rzeszutek Wilk
2014-06-17 20:05   ` Konrad Rzeszutek Wilk
2014-06-17 20:05     ` Konrad Rzeszutek Wilk
2014-06-23 16:26     ` Peter Zijlstra
2014-06-23 16:26       ` Peter Zijlstra
2014-06-23 16:45       ` Konrad Rzeszutek Wilk
2014-06-23 16:45         ` Konrad Rzeszutek Wilk
2014-06-15 12:46 ` [PATCH 02/11] qspinlock, x86: Enable x86-64 to use " Peter Zijlstra
2014-06-15 12:46   ` Peter Zijlstra
2014-06-15 12:47 ` [PATCH 03/11] qspinlock: Add pending bit Peter Zijlstra
2014-06-15 12:47   ` Peter Zijlstra
2014-06-17 20:36   ` Konrad Rzeszutek Wilk
2014-06-17 20:36     ` Konrad Rzeszutek Wilk
2014-06-17 20:51     ` Waiman Long
2014-06-17 20:51       ` Waiman Long
2014-06-17 21:07       ` Konrad Rzeszutek Wilk
2014-06-17 21:07         ` Konrad Rzeszutek Wilk
2014-06-17 21:10         ` Konrad Rzeszutek Wilk
2014-06-17 21:10           ` Konrad Rzeszutek Wilk
2014-06-17 22:25           ` Waiman Long
2014-06-17 22:25             ` Waiman Long
2014-06-24  8:24         ` Peter Zijlstra
2014-06-24  8:24           ` Peter Zijlstra
2014-06-18 11:29     ` Paolo Bonzini
2014-06-18 11:29       ` Paolo Bonzini
2014-06-18 13:36       ` Konrad Rzeszutek Wilk
2014-06-18 13:36         ` Konrad Rzeszutek Wilk
2014-06-23 16:35     ` Peter Zijlstra
2014-06-23 16:35       ` Peter Zijlstra
2014-06-15 12:47 ` [PATCH 04/11] qspinlock: Extract out the exchange of tail code word Peter Zijlstra
2014-06-17 20:55   ` Konrad Rzeszutek Wilk
2014-06-17 20:55     ` Konrad Rzeszutek Wilk
2014-06-18 11:37     ` Paolo Bonzini
2014-06-18 11:37       ` Paolo Bonzini
2014-06-18 13:50       ` Konrad Rzeszutek Wilk
2014-06-18 13:50         ` Konrad Rzeszutek Wilk
2014-06-18 15:46         ` Waiman Long
2014-06-18 15:46           ` Waiman Long
2014-06-18 15:49           ` Paolo Bonzini
2014-06-18 15:49             ` Paolo Bonzini
2014-06-18 16:02           ` Konrad Rzeszutek Wilk
2014-06-18 16:02             ` Konrad Rzeszutek Wilk
2014-06-24 10:47       ` Peter Zijlstra
2014-06-24 10:47         ` Peter Zijlstra
2014-06-15 12:47 ` [PATCH 05/11] qspinlock: Optimize for smaller NR_CPUS Peter Zijlstra
2014-06-15 12:47   ` Peter Zijlstra
2014-06-18 11:39   ` Paolo Bonzini
2014-06-18 11:39     ` Paolo Bonzini
2014-07-07 14:35     ` Peter Zijlstra
2014-07-07 14:35       ` Peter Zijlstra
2014-07-07 15:08       ` Paolo Bonzini
2014-07-07 15:08         ` Paolo Bonzini
2014-07-07 15:35         ` Peter Zijlstra
2014-07-07 15:35           ` Peter Zijlstra
2014-07-07 16:10           ` Paolo Bonzini
2014-07-07 16:10             ` Paolo Bonzini
2014-06-18 15:57   ` Konrad Rzeszutek Wilk
2014-06-18 15:57     ` Konrad Rzeszutek Wilk
2014-07-07 14:33     ` Peter Zijlstra
2014-07-07 14:33       ` Peter Zijlstra
2014-06-15 12:47 ` [PATCH 06/11] qspinlock: Optimize pending bit Peter Zijlstra
2014-06-15 12:47   ` Peter Zijlstra
2014-06-18 11:42   ` Paolo Bonzini
2014-06-18 11:42     ` Paolo Bonzini
2014-06-15 12:47 ` [PATCH 07/11] qspinlock: Use a simple write to grab the lock, if applicable Peter Zijlstra
2014-06-15 12:47   ` Peter Zijlstra
2014-06-18 16:36   ` Konrad Rzeszutek Wilk
2014-06-18 16:36     ` Konrad Rzeszutek Wilk
2014-07-07 14:51     ` Peter Zijlstra
2014-07-07 14:51       ` Peter Zijlstra
2014-06-15 12:47 ` [PATCH 08/11] qspinlock: Revert to test-and-set on hypervisors Peter Zijlstra
2014-06-15 12:47   ` Peter Zijlstra
2014-06-16 21:57   ` Waiman Long
2014-06-18 16:40   ` Konrad Rzeszutek Wilk
2014-06-18 16:40     ` Konrad Rzeszutek Wilk
2014-06-15 12:47 ` [PATCH 09/11] pvqspinlock, x86: Rename paravirt_ticketlocks_enabled Peter Zijlstra
2014-06-15 12:47   ` Peter Zijlstra
2014-06-18 16:43   ` Konrad Rzeszutek Wilk
2014-06-18 16:43     ` Konrad Rzeszutek Wilk
2014-06-15 12:47 ` [PATCH 10/11] qspinlock: Paravirt support Peter Zijlstra
2014-06-15 12:47   ` Peter Zijlstra
2014-06-16 22:08   ` Waiman Long
2014-06-18 12:03     ` Paolo Bonzini
2014-06-18 12:03       ` Paolo Bonzini
2014-06-18 15:26       ` Waiman Long
2014-06-18 15:26         ` Waiman Long
2014-07-07 15:20       ` Peter Zijlstra
2014-07-07 15:20         ` Peter Zijlstra
2014-07-07 15:20     ` Peter Zijlstra
2014-07-07 15:20       ` Peter Zijlstra
2014-06-17  0:53   ` Waiman Long
2014-06-17  0:53     ` Waiman Long
2014-06-18 12:04   ` Paolo Bonzini
2014-06-18 12:04     ` Paolo Bonzini
2014-06-20 13:46   ` Konrad Rzeszutek Wilk [this message]
2014-06-20 13:46     ` Konrad Rzeszutek Wilk
2014-07-07 15:27     ` Peter Zijlstra
2014-07-15 14:23       ` Konrad Rzeszutek Wilk
2014-07-15 14:23         ` Konrad Rzeszutek Wilk
2014-06-15 12:47 ` [PATCH 11/11] qspinlock, kvm: Add paravirt support Peter Zijlstra
2014-06-22 16:36   ` Raghavendra K T
2014-06-22 16:36     ` Raghavendra K T
2014-07-07 15:23     ` Peter Zijlstra
2014-07-07 15:23       ` Peter Zijlstra
2014-06-16 20:52 ` [PATCH 00/11] qspinlock with " Konrad Rzeszutek Wilk
2014-06-16 20:52   ` Konrad Rzeszutek Wilk

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140620134608.GA11545@laptop.dumpdata.com \
    --to=konrad.wilk@oracle.com \
    --cc=Waiman.Long@hp.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=boris.ostrovsky@oracle.com \
    --cc=chegu_vinod@hp.com \
    --cc=david.vrabel@citrix.com \
    --cc=gleb@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=paolo.bonzini@gmail.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=raghavendra.kt@linux.vnet.ibm.com \
    --cc=riel@redhat.com \
    --cc=scott.norton@hp.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).