public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Waiman Long <waiman.long@hpe.com>
To: Boqun Feng <boqun.feng@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>, <linux-kernel@vger.kernel.org>,
	Pan Xinhui <xinhui@linux.vnet.ibm.com>,
	Scott J Norton <scott.norton@hpe.com>,
	Douglas Hatch <doug.hatch@hpe.com>
Subject: Re: [RESEND PATCH v3] locking/pvqspinlock: Add lock holder CPU argument to pv_wait()
Date: Thu, 28 Apr 2016 16:50:25 -0400	[thread overview]
Message-ID: <57227791.2020908@hpe.com> (raw)
In-Reply-To: <20160428133213.GC25392@insomnia>

On 04/28/2016 09:32 AM, Boqun Feng wrote:
> Hi Waiman,
>
> On Thu, Apr 21, 2016 at 11:02:05AM -0400, Waiman Long wrote:
>> Pan Xinhui was asking for a lock holder cpu argument in pv_wait()
>> to help the porting of pvqspinlock to PPC. The new argument will can
>> help hypervisor expediate the execution of the critical section by
>> the lock holder, if its vCPU isn't running, so that it can release
>> the lock sooner.
>>
>> The pv_wait() function is now extended to have a third argument
>> that holds the vCPU number of the lock holder, if this is
>> known. Architecture specific hypervisor code can then make use of
>> that information to make better decision of which vCPU should be
>> running next.
>>
>> This patch also adds a new field in the pv_node structure to hold
>> the vCPU number of the previous queue entry. For the queue head,
>> the prev_cpu entry is likely to be the that of the lock holder,
>> though lock stealing may render this information inaccurate.
>>
>> In pv_wait_head_or_lock(), pv_wait() will now be called with that
>> vCPU number as it is likely to be the lock holder.
>>
>> In pv_wait_node(), the newly added pv_lookup_hash() function will
>> be called to look up the queue head and pass in the lock holder vCPU
>> number stored there, if found.
>>
>> Suggested-by:  Pan Xinhui<xinhui@linux.vnet.ibm.com>
>> Signed-off-by: Waiman Long<Waiman.Long@hpe.com>
>> ---
>>
>>   v2->v3:
>>    - Rephrase the changelog and some of the comments to make the
>>      intention of this patch more clear.
>>    - Make pv_lookup_hash() architecture dependent to eliminate its cost
>>      if an architecture doesn't need it. Also make the number of
>>      cachelines searched in pv_lookup_hash() to between 1-4.
>>    - [RESEND] Fix typo error.
>>
>>   v1->v2:
>>    - In pv_wait_node(), lookup the hash table for the queue head and pass
>>      the lock holder cpu number to pv_wait().
>>
> [snip]
>
>> @@ -282,7 +328,8 @@ static void pv_init_node(struct mcs_spinlock *node)
>>    * pv_kick_node() is used to set _Q_SLOW_VAL and fill in hash table on its
>>    * behalf.
>>    */
>> -static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
>> +static void pv_wait_node(struct qspinlock *lock, struct mcs_spinlock *node,
>> +			 struct mcs_spinlock *prev)
>>   {
>>   	struct pv_node *pn = (struct pv_node *)node;
>>   	struct pv_node *pp = (struct pv_node *)prev;
>> @@ -290,6 +337,8 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
>>   	int loop;
>>   	bool wait_early;
>>
>> +	pn->prev_cpu = pp->cpu;	/* Save vCPU # of previous queue entry */
>> +
>>   	/* waitcnt processing will be compiled out if !QUEUED_LOCK_STAT */
>>   	for (;; waitcnt++) {
>>   		for (wait_early = false, loop = SPIN_THRESHOLD; loop; loop--) {
>> @@ -314,10 +363,23 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
>>   		smp_store_mb(pn->state, vcpu_halted);
>>
>>   		if (!READ_ONCE(node->locked)) {
>> +			struct pv_node *ph;
>> +
>>   			qstat_inc(qstat_pv_wait_node, true);
>>   			qstat_inc(qstat_pv_wait_again, waitcnt);
>>   			qstat_inc(qstat_pv_wait_early, wait_early);
>> -			pv_wait(&pn->state, vcpu_halted);
>> +
>> +			/*
>> +			 * If the current queue head is in the hash table,
>> +			 * the prev_cpu field of its pv_node may contain the
>> +			 * vCPU # of the lock holder. However, lock stealing
>> +			 * may make that information inaccurate. Anyway, we
>> +			 * look up the hash table to try to get the lock
>> +			 * holder vCPU number.
>> +			 */
>> +			ph = pv_lookup_hash(lock);
> I am thinking, could we save the hash lookup here if wait_early == true?
> Because in that case, the previous node is likely to get the lock soon,
> it makes sense we yield to that node rather than the holder, so may we
> can do something like:
>
> 			if (wait_early)
> 				pv_wait(&pn->state, vcpu_halted, pn->prev_cpu);
> 			else {
> 				ph = pv_lookup_hash(lock);
> 				pv_wait(&pn->state, vcpu_halted,
> 					ph ? ph->prev_cpu : -1);
> 			}
>
> Thoughts?
>
> Regards,
> Boqun
>

One reason why I don't want to just wake up the previous node vCPU is 
because I am concerned that a vCPU may go into a suspend-wakeup loop. So 
I will wait until we get experimental evidence that this really help 
performance before we commit to this kind of change.

Cheers,
Longman

      reply	other threads:[~2016-04-28 20:50 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-21 15:02 [RESEND PATCH v3] locking/pvqspinlock: Add lock holder CPU argument to pv_wait() Waiman Long
2016-04-28 13:32 ` Boqun Feng
2016-04-28 20:50   ` Waiman Long [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=57227791.2020908@hpe.com \
    --to=waiman.long@hpe.com \
    --cc=boqun.feng@gmail.com \
    --cc=doug.hatch@hpe.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=scott.norton@hpe.com \
    --cc=xinhui@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox