From mboxrd@z Thu Jan 1 00:00:00 1970 From: Peter Zijlstra Subject: Re: [PATCH RESEND v4] locking/pvqspinlock: Fix double hash race Date: Tue, 9 Aug 2016 12:49:02 +0200 Message-ID: <20160809104902.GH30192@twins.programming.kicks-ass.net> References: <1470735467-4370-1-git-send-email-wanpeng.li@hotmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Wanpeng Li , Ingo Molnar , Waiman Long , Davidlohr Bueso To: Wanpeng Li Return-path: Content-Disposition: inline In-Reply-To: <1470735467-4370-1-git-send-email-wanpeng.li@hotmail.com> Sender: linux-kernel-owner@vger.kernel.org List-Id: kvm.vger.kernel.org On Tue, Aug 09, 2016 at 05:37:47PM +0800, Wanpeng Li wrote: > From: Wanpeng Li > > When the lock holder vCPU is racing with the queue head vCPU: > > lock holder vCPU queue head vCPU > ===================== ================== > > node->locked = 1; > READ_ONCE(node->locked) > ... pv_wait_head_or_lock(): > SPIN_THRESHOLD loop; > pv_hash(); > lock->locked = _Q_SLOW_VAL; > node->state = vcpu_hashed; > pv_kick_node(): > cmpxchg(node->state, > vcpu_halted, vcpu_hashed); > lock->locked = _Q_SLOW_VAL; > pv_hash(); So here the example is 'wrong' in that it doesn't illustrate the fail case, namely having vcpu_halted win while we're hashed. > +++ b/kernel/locking/qspinlock_paravirt.h > @@ -450,7 +450,28 @@ pv_wait_head_or_lock(struct qspinlock *lock, struct mcs_spinlock *node) > goto gotlock; > } > } > - WRITE_ONCE(pn->state, vcpu_halted); > + /* > + * lock holder vCPU queue head vCPU > + * ---------------- --------------- > + * node->locked = 1; > + * READ_ONCE(node->locked) > + * ... pv_wait_head_or_lock(): > + * SPIN_THRESHOLD loop; > + * pv_hash(); > + * lock->locked = _Q_SLOW_VAL; > + * node->state = vcpu_hashed; > + * pv_kick_node(): > + * cmpxchg(node->state, > + * vcpu_halted, vcpu_hashed); > + * lock->locked = _Q_SLOW_VAL; > + * pv_hash(); > + * > + * With preemption at the right moment, it is possible that both the > + * lock holder and queue head vCPUs can be racing to set node->state. > + * Making sure the state is never set to vcpu_halted will prevent this > + * racing from happening. > + */ > + WRITE_ONCE(pn->state, vcpu_hashed); > qstat_inc(qstat_pv_wait_head, true); > qstat_inc(qstat_pv_wait_again, waitcnt); > pv_wait(&l->locked, _Q_SLOW_VAL); And I completely fail to see the point of this comment, still. Yes, if we would have used vcpu_halted, there would be a problem, but we don't so there isn't. What does this comment tell us about the current code? In any case, I have an older version (possibly v1) queued up. That fixes the bug.