Re: [PATCH tip/locking/core v9 5/6] locking/pvqspinlock: Allow 1 lock stealing attempt

From: Waiman Long <waiman.long@hpe.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	Scott J Norton <scott.norton@hpe.com>,
	Douglas Hatch <doug.hatch@hpe.com>,
	Davidlohr Bueso <dave@stgolabs.net>
Subject: Re: [PATCH tip/locking/core v9 5/6] locking/pvqspinlock: Allow 1 lock stealing attempt
Date: Fri, 06 Nov 2015 12:47:49 -0500
Message-ID: <563CE7C5.9020508@hpe.com>
In-Reply-To: <20151106145005.GU17308@twins.programming.kicks-ass.net>

On 11/06/2015 09:50 AM, Peter Zijlstra wrote:
> On Fri, Oct 30, 2015 at 07:26:36PM -0400, Waiman Long wrote:
>
>> @@ -431,35 +432,44 @@ queue:
>>   	 * sequentiality; this is because the set_locked() function below
>>   	 * does not imply a full barrier.
>>   	 *
>> +	 * The PV pv_wait_head_lock function, if active, will acquire the lock
>> +	 * and return a non-zero value. So we have to skip the
>> +	 * smp_load_acquire() call. As the next PV queue head hasn't been
>> +	 * designated yet, there is no way for the locked value to become
>> +	 * _Q_SLOW_VAL. So both the redundant set_locked() and the
>> +	 * atomic_cmpxchg_relaxed() calls will be safe. The cost of the
>> +	 * redundant set_locked() call below should be negligible, too.
>> +	 *
>> +	 * If PV isn't active, 0 will be returned instead.
>>   	 */
>> -	pv_wait_head(lock, node);
>> -	while ((val = smp_load_acquire(&lock->val.counter)) & _Q_LOCKED_PENDING_MASK)
>> -		cpu_relax();
>> +	val = pv_wait_head_lock(lock, node);
>> +	if (!val) {
>> +		while ((val = smp_load_acquire(&lock->val.counter))
>> +				& _Q_LOCKED_PENDING_MASK)
>> +			cpu_relax();
>> +		/*
>> +		 * Claim the lock now:
>> +		 *
>> +		 * 0,0 ->  0,1
>> +		 */
>> +		set_locked(lock);
>> +		val |= _Q_LOCKED_VAL;
>> +	}
>>
>>   	/*
>>   	 * If the next pointer is defined, we are not tail anymore.
>> -	 * In this case, claim the spinlock & release the MCS lock.
>>   	 */
>> -	if (next) {
>> -		set_locked(lock);
>> +	if (next)
>>   		goto mcs_unlock;
>> -	}
>>
>>   	/*
>> -	 * claim the lock:
>> -	 *
>> -	 * n,0,0 ->  0,0,1 : lock, uncontended
>> -	 * *,0,0 ->  *,0,1 : lock, contended
>> -	 *
>>   	 * If the queue head is the only one in the queue (lock value == tail),
>> -	 * clear the tail code and grab the lock. Otherwise, we only need
>> -	 * to grab the lock.
>> +	 * we have to clear the tail code.
>>   	 */
>>   	for (;;) {
>> -		if (val != tail) {
>> -			set_locked(lock);
>> +		if ((val & _Q_TAIL_MASK) != tail)
>>   			break;
>> -		}
>> +
>>   		/*
>>   		 * The smp_load_acquire() call above has provided the necessary
>>   		 * acquire semantics required for locking. At most two
> *urgh*, last time we had:
>
> +	if (pv_wait_head_or_steal())
> +		goto stolen;
> 	while ((val = smp_load_acquire(&lock->val.counter)) & _Q_LOCKED_PENDING_MASK)
> 		cpu_relax();
>
> 	...
>
> +stolen:
> 	while (!(next = READ_ONCE(node->next)))
> 		cpu_relax();
>
> 	...
>
> Now you completely overhaul the native code.. what happened?

I want to reuse as much of the existing native code as possible instead
of duplicating it in the PV function. The only difference now is that
the PV function will acquire the lock itself. Semantically, I don't want
to call that lock acquisition lock stealing, as the queue head is
entitled to get the lock next. I can rename
pv_queued_spin_trylock_unfair() to pv_queued_spin_steal_lock() to
emphasize that this is the routine where lock stealing happens.

>> -static void pv_wait_head(struct qspinlock *lock, struct mcs_spinlock *node)
>> +static u32 pv_wait_head_lock(struct qspinlock *lock, struct mcs_spinlock *node)
>>   {
>>   	struct pv_node *pn = (struct pv_node *)node;
>>   	struct __qspinlock *l = (void *)lock;
>> @@ -276,11 +330,24 @@ static void pv_wait_head(struct qspinlock *lock, struct mcs_spinlock *node)
>>   		lp = (struct qspinlock **)1;
>>
>>   	for (;; waitcnt++) {
>> +		/*
>> +		 * Set the pending bit in the active lock spinning loop to
>> +		 * disable lock stealing. However, the pending bit check in
>> +		 * pv_queued_spin_trylock_unfair() and the setting/clearing
>> +		 * of pending bit here aren't memory barriers. So a cmpxchg()
>> +		 * is used to acquire the lock to be sure.
>> +		 */
>> +		set_pending(lock);
> OK, so we mark ourselves 'pending' such that a new lock() will not steal
> and is forced to queue behind us.

Yes, this ensures that lock starvation will not happen.
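
To make the interaction concrete, here is a rough userspace model of
the check, written with C11 atomics rather than kernel primitives. The
model_* names, the Q_* bit layout and the whole-word compare-exchange
are assumptions made for illustration only; the patch itself works on
the locked byte with cmpxchg().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout for this model: bits 0-7 locked byte, bit 8 pending. */
#define Q_LOCKED_VAL          (1U << 0)
#define Q_LOCKED_MASK         0xffU
#define Q_PENDING_VAL         (1U << 8)
#define Q_LOCKED_PENDING_MASK (Q_LOCKED_MASK | Q_PENDING_VAL)

/* Hypothetical stand-in for struct qspinlock. */
struct model_lock {
	_Atomic uint32_t val;
};

/* Queue head announces that it is actively spinning for the lock. */
static void model_set_pending(struct model_lock *lock)
{
	atomic_fetch_or_explicit(&lock->val, Q_PENDING_VAL,
				 memory_order_relaxed);
}

static void model_clear_pending(struct model_lock *lock)
{
	atomic_fetch_and_explicit(&lock->val, ~Q_PENDING_VAL,
				  memory_order_relaxed);
}

/*
 * One stealing attempt by a newly arriving locker: give up at once if
 * the lock is held or the queue head has set pending, otherwise try a
 * single compare-exchange to take the lock.
 */
static bool model_steal_lock(struct model_lock *lock)
{
	uint32_t old = atomic_load_explicit(&lock->val, memory_order_relaxed);

	if (old & Q_LOCKED_PENDING_MASK)
		return false;

	return atomic_compare_exchange_strong_explicit(&lock->val, &old,
			old | Q_LOCKED_VAL,
			memory_order_acquire, memory_order_relaxed);
}

/* Usage: stealing fails while pending is set, succeeds once cleared. */
static void model_example(void)
{
	struct model_lock lock;

	atomic_init(&lock.val, 0);
	model_set_pending(&lock);
	assert(!model_steal_lock(&lock));
	model_clear_pending(&lock);
	assert(model_steal_lock(&lock));
}
```

In this model a locker that sees pending simply fails the steal and
would have to queue, which is the behaviour described above.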

>
>>   		for (loop = SPIN_THRESHOLD; loop; loop--) {
>> -			if (!READ_ONCE(l->locked))
>> -				return;
>> +			if (!READ_ONCE(l->locked) &&
>> +			   (cmpxchg(&l->locked, 0, _Q_LOCKED_VAL) == 0)) {
>> +				clear_pending(lock);
>> +				goto gotlock;
> Would not: cmpxchg(&l->locked_pending, _Q_PENDING_VAL, _Q_LOCKED_VAL),
> make sense to avoid the clear_pending() call?

I can combine cmpxchg() and clear_pending() into a new helper function,
as its implementation will differ depending on NR_CPUS.
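
As a rough idea of what such a helper could look like, here is a
userspace sketch with C11 atomics, not kernel code. The function names,
the two model structs and the Q_* constants are all assumptions for
illustration. The first variant assumes pending has a byte of its own,
so locked+pending can be swapped as one 16-bit unit as you suggest; the
second assumes pending is a single bit sharing the word with the tail,
so a compare-exchange loop on the whole word is needed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define Q_LOCKED_VAL  (1U << 0)
#define Q_LOCKED_MASK 0xffU
#define Q_PENDING_VAL (1U << 8)

/* Assumed "small NR_CPUS" layout: locked and pending form a 16-bit unit. */
struct model_lock_small {
	_Atomic uint16_t locked_pending;
	_Atomic uint16_t tail;
};

/* pending set + unlocked -> pending clear + locked, in one operation. */
static bool trylock_clear_pending_small(struct model_lock_small *l)
{
	uint16_t expected = Q_PENDING_VAL;

	return atomic_compare_exchange_strong_explicit(&l->locked_pending,
			&expected, Q_LOCKED_VAL,
			memory_order_acquire, memory_order_relaxed);
}

/* Assumed "large NR_CPUS" layout: pending bit and tail share one word. */
struct model_lock_large {
	_Atomic uint32_t val;
};

static bool trylock_clear_pending_large(struct model_lock_large *l)
{
	uint32_t old = atomic_load_explicit(&l->val, memory_order_relaxed);

	for (;;) {
		uint32_t new;

		if (old & Q_LOCKED_MASK)
			return false;	/* lock is held, nothing to take */

		/* Keep the tail bits, drop pending, set locked. */
		new = (old & ~Q_PENDING_VAL) | Q_LOCKED_VAL;
		if (atomic_compare_exchange_weak_explicit(&l->val, &old, new,
				memory_order_acquire, memory_order_relaxed))
			return true;
		/* 'old' was reloaded by the failed exchange; retry. */
	}
}

/* Usage: the queue head holds pending and the lock has just been freed. */
static void helper_example(void)
{
	struct model_lock_small s;

	atomic_init(&s.locked_pending, Q_PENDING_VAL);
	atomic_init(&s.tail, 0);
	assert(trylock_clear_pending_small(&s));
}
```

Either way the caller no longer needs a separate clear_pending() after
a successful acquisition.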

Cheers,
Longman
