From: Waiman Long <waiman.long@hp.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Vrabel <david.vrabel@citrix.com>,
	Jeremy Fitzhardinge <jeremy@goop.org>,
	Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>,
	kvm@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>,
	virtualization@lists.linux-foundation.org,
	Andi Kleen <andi@firstfloor.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Michel Lespinasse <walken@google.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-arch@vger.kernel.org, Gleb Natapov <gleb@redhat.com>,
	x86@kernel.org, Ingo Molnar <mingo@redhat.com>,
	xen-devel@lists.xenproject.org,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Rik van Riel <riel@redhat.com>, Arnd Bergmann <arnd@arndb.de>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Scott J Norton <scott.norton@hp.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Chris Wright <chrisw@sous-sol.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Alok Kataria <akataria@vmware.com>,
	Aswin Chandramouleeswaran <aswin@hp.com>,
	Chegu Vinod <chegu_vinod@hp.com>
Subject: Re: [PATCH RFC v6 09/11] pvqspinlock, x86: Add qspinlock para-virtualization support
Date: Thu, 13 Mar 2014 15:49:16 -0400
Message-ID: <53220BBC.4040608@hp.com>
In-Reply-To: <5321B959.5050305@redhat.com>

On 03/13/2014 09:57 AM, Paolo Bonzini wrote:
> Il 13/03/2014 12:21, David Vrabel ha scritto:
>> On 12/03/14 18:54, Waiman Long wrote:
>>> This patch adds para-virtualization support to the queue spinlock in
>>> the same way as was done in the PV ticket lock code. In essence, the
>>> lock waiters will spin a specified number of times (QSPIN_THRESHOLD
>>> = 2^14) and then halt themselves. The queue head waiter will spin
>>> 2*QSPIN_THRESHOLD times before halting itself. When it has spun
>>> QSPIN_THRESHOLD times, the queue head will assume that the lock
>>> holder may be scheduled out and attempt to kick the lock holder's
>>> CPU if it has the CPU number on hand.
>>
>> I don't really understand the reasoning for kicking the lock holder.
>
> I agree.  If the lock holder isn't running, there's probably a good 
> reason for that and going to sleep will not necessarily convince the 
> scheduler to give more CPU to the lock holder.  I think there are two 
> choices:
>
> 1) use yield_to to donate part of the waiter's quantum to the lock 
> holder?    For this we probably need a new, separate hypercall 
> interface.  For KVM it would be the same as hlt in the guest but with 
> an additional yield_to in the host.
>
> 2) do nothing, just go to sleep.
>
> Could you get (or do you have) numbers for (2)?

I will take out the lock holder kick portion from the patch. I will also 
try to collect more test data.
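
On choice (1), the host side could be a small variation of the existing
hlt handling. A minimal sketch, assuming a new (hypothetical) hypercall
number KVM_HC_HALT_AND_YIELD whose first argument is the lock holder's
vCPU id, and reusing the existing kvm_get_vcpu(), kvm_vcpu_yield_to()
and kvm_vcpu_block() helpers:

        /*
         * Hypothetical case in the kvm_emulate_hypercall() dispatch;
         * the hypercall number and argument layout are assumptions,
         * not part of the posted patch.
         */
        case KVM_HC_HALT_AND_YIELD: {
                struct kvm_vcpu *holder = kvm_get_vcpu(vcpu->kvm, a0);

                /* Donate part of the waiter's quantum to the holder. */
                if (holder && holder != vcpu)
                        kvm_vcpu_yield_to(holder);

                /* Then behave like hlt in the guest. */
                kvm_vcpu_block(vcpu);
                ret = 0;
                break;
        }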

>
> More important, I think a barrier is missing:
>
>     Lock holder ---------------------------------------
>
>     // queue_spin_unlock
>     barrier();
>     ACCESS_ONCE(qlock->lock) = 0;
>     barrier();
>

This is not the unlock code that is used when PV spinlocks are enabled. 
The right unlock code is:

         if (static_key_false(&paravirt_spinlocks_enabled)) {
                 /*
                  * Need to atomically clear the lock byte to avoid
                  * racing with queue head waiter trying to set
                  * _QSPINLOCK_LOCKED_SLOWPATH.
                  */
                 if (likely(cmpxchg(&qlock->lock, _QSPINLOCK_LOCKED, 0)
                                 == _QSPINLOCK_LOCKED))
                         return;
                 else
                         queue_spin_unlock_slowpath(lock);
         } else {
                 __queue_spin_unlock(lock);
         }
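
The queue_spin_unlock_slowpath() call is what pairs with the
pv_kick_node step in your pseudocode below. A rough sketch of its
intent (not the exact patch code; pv_get_queue_head() is a hypothetical
lookup helper, and the qlock cast follows the usage above):

        static noinline void
        queue_spin_unlock_slowpath(struct qspinlock *lock)
        {
                union arch_qspinlock *qlock = (union arch_qspinlock *)lock;

                /*
                 * Sketch only: the lock byte was changed to
                 * _QSPINLOCK_LOCKED_SLOWPATH by a queue head that is
                 * halting, so release the lock and then kick that
                 * queue head awake.
                 */
                ACCESS_ONCE(qlock->lock) = 0;
                pv_kick_node(pv_get_queue_head(lock)); /* hypothetical */
        }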

>     // pv_kick_node:
>     if (pv->cpustate != PV_CPU_HALTED)
>         return;
>     ACCESS_ONCE(pv->cpustate) = PV_CPU_KICKED;
>     __queue_kick_cpu(pv->mycpu, PV_KICK_QUEUE_HEAD);
>
>         Waiter -------------------------------------------
>
>         // pv_head_spin_check
>         ACCESS_ONCE(pv->cpustate) = PV_CPU_HALTED;
>         lockval = cmpxchg(&qlock->lock,
>                   _QSPINLOCK_LOCKED,
>                   _QSPINLOCK_LOCKED_SLOWPATH);
>         if (lockval == 0) {
>             /*
>              * Can exit now as the lock is free
>              */
>             ACCESS_ONCE(pv->cpustate) = PV_CPU_ACTIVE;
>             *count = 0;
>             return;
>         }
>         __queue_hibernate();
>
> Nothing protects from writing qlock->lock before pv->cpustate is read, 
> leading to this:
>
>     Lock holder            Waiter
>     ---------------------------------------------------------------
>     read pv->cpustate
>         (it is PV_CPU_ACTIVE)
>                     pv->cpustate = PV_CPU_HALTED
>                     lockval = cmpxchg(...)
>                     hibernate()
>     qlock->lock = 0
>     if (pv->cpustate != PV_CPU_HALTED)
>         return;
>

The lock holder will read cpustate only if the lock byte has been 
changed to _QSPINLOCK_LOCKED_SLOWPATH, so the setting of the lock byte 
synchronizes the two threads. The only case I am not certain about is 
when the waiter is trying to go to sleep while, at the same time, the 
lock holder is trying to kick it. Will there be a missed wakeup because 
of this timing issue?
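
To make the synchronization argument concrete, here is the intended
ordering, assuming x86 semantics where cmpxchg acts as a full memory
barrier on both sides:

    Waiter (pv_head_spin_check)      Lock holder (unlock path)
    ------------------------------------------------------------------
    pv->cpustate = PV_CPU_HALTED
    cmpxchg(&qlock->lock,
            _QSPINLOCK_LOCKED,
            _QSPINLOCK_LOCKED_SLOWPATH)
                                     cmpxchg(&qlock->lock,
                                             _QSPINLOCK_LOCKED, 0)
                                       fails: sees LOCKED_SLOWPATH
                                     queue_spin_unlock_slowpath()
                                       read pv->cpustate
                                       -> must observe PV_CPU_HALTED

Whether a kick delivered in the window between the waiter's cmpxchg and
__queue_hibernate() is preserved depends on the halt/kick primitives: a
sleep primitive that discards an already-delivered kick would indeed
allow a missed wakeup there.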

-Longman
