From: Waiman Long <waiman.long@hp.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Vrabel <david.vrabel@citrix.com>, Jeremy Fitzhardinge <jeremy@goop.org>,
	Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>, kvm@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>, virtualization@lists.linux-foundation.org,
	Andi Kleen <andi@firstfloor.org>, "H. Peter Anvin" <hpa@zytor.com>,
	Michel Lespinasse <walken@google.com>, Thomas Gleixner <tglx@linutronix.de>,
	linux-arch@vger.kernel.org, Gleb Natapov <gleb@redhat.com>, x86@kernel.org,
	Ingo Molnar <mingo@redhat.com>, xen-devel@lists.xenproject.org,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>, Rik van Riel <riel@redhat.com>,
	Arnd Bergmann <arnd@arndb.de>, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Scott J Norton <scott.norton@hp.com>, Steven Rostedt <rostedt@goodmis.org>,
	Chris Wright <chrisw@sous-sol.org>, Oleg Nesterov <oleg@redhat.com>,
	Alok Kataria <akataria@vmware.com>, Aswin Chandramouleeswaran <aswin@hp.com>,
	Chegu Vinod <chegu_vinod@hp.com>
Subject: Re: [PATCH RFC v6 09/11] pvqspinlock, x86: Add qspinlock para-virtualization support
Date: Thu, 13 Mar 2014 15:49:16 -0400
Message-ID: <53220BBC.4040608@hp.com>
In-Reply-To: <5321B959.5050305@redhat.com>

On 03/13/2014 09:57 AM, Paolo Bonzini wrote:
> On 13/03/2014 12:21, David Vrabel wrote:
>> On 12/03/14 18:54, Waiman Long wrote:
>>> This patch adds para-virtualization support to the queue spinlock in
>>> the same way as was done in the PV ticket lock code. In essence, the
>>> lock waiters will spin a specified number of times (QSPIN_THRESHOLD
>>> = 2^14) and then halt themselves. The queue head waiter will spin
>>> 2*QSPIN_THRESHOLD times before halting itself. When it has spun
>>> QSPIN_THRESHOLD times, the queue head will assume that the lock
>>> holder may be scheduled out and attempt to kick the lock holder CPU
>>> if it has the CPU number on hand.
>>
>> I don't really understand the reasoning for kicking the lock holder.
>
> I agree. If the lock holder isn't running, there's probably a good
> reason for that and going to sleep will not necessarily convince the
> scheduler to give more CPU to the lock holder. I think there are two
> choices:
>
> 1) use yield_to to donate part of the waiter's quantum to the lock
> holder? For this we probably need a new, separate hypercall
> interface. For KVM it would be the same as hlt in the guest but with
> an additional yield_to in the host.
>
> 2) do nothing, just go to sleep.
>
> Could you get (or do you have) numbers for (2)?

I will take out the lock holder kick portion from the patch. I will
also try to collect more test data.

> More important, I think a barrier is missing:
>
> Lock holder ---------------------------------------
>
> 	// queue_spin_unlock
> 	barrier();
> 	ACCESS_ONCE(qlock->lock) = 0;
> 	barrier();

This is not the unlock code that is used when PV spinlock is enabled.
The right unlock code is

	if (static_key_false(&paravirt_spinlocks_enabled)) {
		/*
		 * Need to atomically clear the lock byte to avoid racing
		 * with queue head waiter trying to set
		 * _QSPINLOCK_LOCKED_SLOWPATH.
		 */
		if (likely(cmpxchg(&qlock->lock, _QSPINLOCK_LOCKED, 0)
				== _QSPINLOCK_LOCKED))
			return;
		else
			queue_spin_unlock_slowpath(lock);
	} else {
		__queue_spin_unlock(lock);
	}
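Just to spell out what the slowpath branch above does: when the cmpxchg
fails, the queue head has already replaced _QSPINLOCK_LOCKED with
_QSPINLOCK_LOCKED_SLOWPATH and has halted (or is about to), so the
unlocker has to release the lock and then wake it. Roughly like this --
a simplified sketch, not a verbatim excerpt from the series; the helper
that locates the queue head node is illustrative:

	static void queue_spin_unlock_slowpath(struct qspinlock *lock)
	{
		struct qlock *qlock = (struct qlock *)lock;

		/*
		 * The fast-path cmpxchg failed, so the queue head wrote
		 * _QSPINLOCK_LOCKED_SLOWPATH into the lock byte before
		 * halting.  Release the lock for real, then wake it up.
		 */
		ACCESS_ONCE(qlock->lock) = 0;
		smp_mb();	/* conservatively order release before kick */

		/* pv_queue_head() is an illustrative placeholder for
		 * however the unlocker finds the halted head node. */
		pv_kick_node(pv_queue_head(lock));
	}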
> 	// pv_kick_node:
> 	if (pv->cpustate != PV_CPU_HALTED)
> 		return;
> 	ACCESS_ONCE(pv->cpustate) = PV_CPU_KICKED;
> 	__queue_kick_cpu(pv->mycpu, PV_KICK_QUEUE_HEAD);
>
> Waiter -------------------------------------------
>
> 	// pv_head_spin_check
> 	ACCESS_ONCE(pv->cpustate) = PV_CPU_HALTED;
> 	lockval = cmpxchg(&qlock->lock,
> 			  _QSPINLOCK_LOCKED,
> 			  _QSPINLOCK_LOCKED_SLOWPATH);
> 	if (lockval == 0) {
> 		/*
> 		 * Can exit now as the lock is free
> 		 */
> 		ACCESS_ONCE(pv->cpustate) = PV_CPU_ACTIVE;
> 		*count = 0;
> 		return;
> 	}
> 	__queue_hibernate();
>
> Nothing prevents qlock->lock from being written before pv->cpustate
> is read, leading to this:
>
> 	Lock holder			Waiter
> 	---------------------------------------------------------------
> 	read pv->cpustate
> 	(it is PV_CPU_ACTIVE)
> 					pv->cpustate = PV_CPU_HALTED
> 					lockval = cmpxchg(...)
> 					hibernate()
> 	qlock->lock = 0
> 	if (pv->cpustate != PV_CPU_HALTED)
> 		return;

The lock holder will read cpustate only if the lock byte has been
changed to _QSPINLOCK_LOCKED_SLOWPATH, so the setting of the lock byte
synchronizes the two threads. The only thing that I am not certain
about is the case where the waiter is trying to go to sleep while, at
the same time, the lock holder is trying to kick it. Will there be a
missed wakeup because of this timing issue?

-Longman
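P.S. To make that timing question concrete, here is how I picture the
waiter side with an explicit re-check before halting. This is a
simplified sketch of one possible shape for the code, not an excerpt
from the v6 series; the re-check at the end is my own illustration:

	static void pv_head_spin_check(struct pv_qnode *pv,
				       struct qlock *qlock,
				       unsigned int *count)
	{
		int lockval;

		ACCESS_ONCE(pv->cpustate) = PV_CPU_HALTED;
		/*
		 * cmpxchg is a full memory barrier, so the HALTED store
		 * above is visible to the lock holder before it can
		 * observe _QSPINLOCK_LOCKED_SLOWPATH in the lock byte.
		 */
		lockval = cmpxchg(&qlock->lock, _QSPINLOCK_LOCKED,
				  _QSPINLOCK_LOCKED_SLOWPATH);
		if (lockval == 0) {
			/* The lock was freed; no need to sleep. */
			ACCESS_ONCE(pv->cpustate) = PV_CPU_ACTIVE;
			*count = 0;
			return;
		}
		/*
		 * A kick may already have raced in between the cmpxchg
		 * and this point.  Re-checking narrows the window but
		 * cannot close it: a kick can still land between this
		 * test and the halt, so the halt primitive itself must
		 * not sleep when a kick (e.g. a pending interrupt) has
		 * already been posted.
		 */
		if (ACCESS_ONCE(pv->cpustate) == PV_CPU_KICKED)
			return;
		__queue_hibernate();
	}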