From: Boqun Feng <boqun.feng@gmail.com>
To: Waiman Long <Waiman.Long@hpe.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	linux-kernel@vger.kernel.org,
	Pan Xinhui <xinhui@linux.vnet.ibm.com>,
	Scott J Norton <scott.norton@hpe.com>,
	Douglas Hatch <doug.hatch@hpe.com>
Subject: Re: [RESEND PATCH v3] locking/pvqspinlock: Add lock holder CPU argument to pv_wait()
Date: Thu, 28 Apr 2016 21:32:13 +0800	[thread overview]
Message-ID: <20160428133213.GC25392@insomnia> (raw)
In-Reply-To: <1461250925-39599-1-git-send-email-Waiman.Long@hpe.com>

Hi Waiman,

On Thu, Apr 21, 2016 at 11:02:05AM -0400, Waiman Long wrote:
> Pan Xinhui was asking for a lock holder CPU argument in pv_wait()
> to help the porting of pvqspinlock to PPC. The new argument can
> help the hypervisor expedite the execution of the critical section
> by the lock holder, if its vCPU isn't running, so that it can
> release the lock sooner.
> 
> The pv_wait() function is now extended to have a third argument
> that holds the vCPU number of the lock holder, if this is
> known. Architecture-specific hypervisor code can then make use of
> that information to make a better decision about which vCPU should
> run next.
> 
> This patch also adds a new field in the pv_node structure to hold
> the vCPU number of the previous queue entry. For the queue head,
> the prev_cpu entry is likely to be that of the lock holder,
> though lock stealing may render this information inaccurate.
> 
> In pv_wait_head_or_lock(), pv_wait() will now be called with that
> vCPU number as it is likely to be the lock holder.
> 
> In pv_wait_node(), the newly added pv_lookup_hash() function will
> be called to look up the queue head and pass in the lock holder vCPU
> number stored there, if found.
> 
> Suggested-by: Pan Xinhui <xinhui@linux.vnet.ibm.com>
> Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
> ---
> 
>  v2->v3:
>   - Rephrase the changelog and some of the comments to make the
>     intention of this patch more clear.
>   - Make pv_lookup_hash() architecture dependent to eliminate its cost
>     if an architecture doesn't need it. Also limit the number of
>     cachelines searched in pv_lookup_hash() to between 1 and 4.
>   - [RESEND] Fix a typo.
> 
>  v1->v2:
>   - In pv_wait_node(), look up the hash table for the queue head and pass
>     the lock holder CPU number to pv_wait().
> 

[snip]

> @@ -282,7 +328,8 @@ static void pv_init_node(struct mcs_spinlock *node)
>   * pv_kick_node() is used to set _Q_SLOW_VAL and fill in hash table on its
>   * behalf.
>   */
> -static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
> +static void pv_wait_node(struct qspinlock *lock, struct mcs_spinlock *node,
> +			 struct mcs_spinlock *prev)
>  {
>  	struct pv_node *pn = (struct pv_node *)node;
>  	struct pv_node *pp = (struct pv_node *)prev;
> @@ -290,6 +337,8 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
>  	int loop;
>  	bool wait_early;
>  
> +	pn->prev_cpu = pp->cpu;	/* Save vCPU # of previous queue entry */
> +
>  	/* waitcnt processing will be compiled out if !QUEUED_LOCK_STAT */
>  	for (;; waitcnt++) {
>  		for (wait_early = false, loop = SPIN_THRESHOLD; loop; loop--) {
> @@ -314,10 +363,23 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
>  		smp_store_mb(pn->state, vcpu_halted);
>  
>  		if (!READ_ONCE(node->locked)) {
> +			struct pv_node *ph;
> +
>  			qstat_inc(qstat_pv_wait_node, true);
>  			qstat_inc(qstat_pv_wait_again, waitcnt);
>  			qstat_inc(qstat_pv_wait_early, wait_early);
> -			pv_wait(&pn->state, vcpu_halted);
> +
> +			/*
> +			 * If the current queue head is in the hash table,
> +			 * the prev_cpu field of its pv_node may contain the
> +			 * vCPU # of the lock holder. However, lock stealing
> +			 * may make that information inaccurate. Anyway, we
> +			 * look up the hash table to try to get the lock
> +			 * holder vCPU number.
> +			 */
> +			ph = pv_lookup_hash(lock);

I am thinking, could we skip the hash lookup here if wait_early == true?
In that case the previous node is likely to get the lock soon, so it
makes more sense to yield to that node rather than to the holder. Maybe
we can do something like:

			if (wait_early)
				pv_wait(&pn->state, vcpu_halted, pn->prev_cpu);
			else {
				ph = pv_lookup_hash(lock);
				pv_wait(&pn->state, vcpu_halted,
					ph ? ph->prev_cpu : -1);
			}

Thoughts?

Regards,
Boqun

> +			pv_wait(&pn->state, vcpu_halted,
> +				ph ? ph->prev_cpu : -1);
>  		}
>  
>  		/*

[snip]
