Date: Thu, 28 Apr 2016 21:32:13 +0800
From: Boqun Feng
To: Waiman Long
Cc: Peter Zijlstra, Ingo Molnar, linux-kernel@vger.kernel.org,
    Pan Xinhui, Scott J Norton, Douglas Hatch
Subject: Re: [RESEND PATCH v3] locking/pvqspinlock: Add lock holder CPU argument to pv_wait()
Message-ID: <20160428133213.GC25392@insomnia>
In-Reply-To: <1461250925-39599-1-git-send-email-Waiman.Long@hpe.com>

Hi Waiman,

On Thu, Apr 21, 2016 at 11:02:05AM -0400, Waiman Long wrote:
> Pan Xinhui was asking for a lock holder CPU argument in pv_wait()
> to help the porting of pvqspinlock to PPC. The new argument can help
> the hypervisor expedite the execution of the critical section by the
> lock holder, if its vCPU isn't running, so that it can release the
> lock sooner.
>
> The pv_wait() function is now extended to have a third argument
> that holds the vCPU number of the lock holder, if this is known.
> Architecture-specific hypervisor code can then make use of that
> information to make a better decision about which vCPU should run
> next.
>
> This patch also adds a new field in the pv_node structure to hold
> the vCPU number of the previous queue entry. For the queue head,
> the prev_cpu entry is likely to be that of the lock holder, though
> lock stealing may render this information inaccurate.
>
> In pv_wait_head_or_lock(), pv_wait() will now be called with that
> vCPU number as it is likely to be the lock holder.
>
> In pv_wait_node(), the newly added pv_lookup_hash() function will
> be called to look up the queue head and pass in the lock holder
> vCPU number stored there, if found.
>
> Suggested-by: Pan Xinhui
> Signed-off-by: Waiman Long
> ---
>
> v2->v3:
>  - Rephrase the changelog and some of the comments to make the
>    intention of this patch clearer.
>  - Make pv_lookup_hash() architecture dependent to eliminate its cost
>    if an architecture doesn't need it. Also limit the number of
>    cachelines searched in pv_lookup_hash() to between 1 and 4.
>  - [RESEND] Fix typo.
>
> v1->v2:
>  - In pv_wait_node(), look up the hash table for the queue head and
>    pass the lock holder CPU number to pv_wait().
>

[snip]

> @@ -282,7 +328,8 @@ static void pv_init_node(struct mcs_spinlock *node)
>   * pv_kick_node() is used to set _Q_SLOW_VAL and fill in hash table on its
>   * behalf.
>   */
> -static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
> +static void pv_wait_node(struct qspinlock *lock, struct mcs_spinlock *node,
> +			 struct mcs_spinlock *prev)
>  {
>  	struct pv_node *pn = (struct pv_node *)node;
>  	struct pv_node *pp = (struct pv_node *)prev;
> @@ -290,6 +337,8 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
>  	int loop;
>  	bool wait_early;
>
> +	pn->prev_cpu = pp->cpu;	/* Save vCPU # of previous queue entry */
> +
>  	/* waitcnt processing will be compiled out if !QUEUED_LOCK_STAT */
>  	for (;; waitcnt++) {
>  		for (wait_early = false, loop = SPIN_THRESHOLD; loop; loop--) {
> @@ -314,10 +363,23 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
>  		smp_store_mb(pn->state, vcpu_halted);
>
>  		if (!READ_ONCE(node->locked)) {
> +			struct pv_node *ph;
> +
>  			qstat_inc(qstat_pv_wait_node, true);
>  			qstat_inc(qstat_pv_wait_again, waitcnt);
>  			qstat_inc(qstat_pv_wait_early, wait_early);
> -			pv_wait(&pn->state, vcpu_halted);
> +
> +			/*
> +			 * If the current queue head is in the hash table,
> +			 * the prev_cpu field of its pv_node may contain the
> +			 * vCPU # of the lock holder. However, lock stealing
> +			 * may make that information inaccurate. Anyway, we
> +			 * look up the hash table to try to get the lock
> +			 * holder vCPU number.
> +			 */
> +			ph = pv_lookup_hash(lock);

I am wondering: could we save the hash lookup here if wait_early == true?
In that case the previous node is likely to get the lock soon, so it
makes sense to yield to that node rather than to the holder. Maybe we
can do something like:

	if (wait_early)
		pv_wait(&pn->state, vcpu_halted, pn->prev_cpu);
	else {
		ph = pv_lookup_hash(lock);
		pv_wait(&pn->state, vcpu_halted, ph ? ph->prev_cpu : -1);
	}

Thoughts?

Regards,
Boqun

> +			pv_wait(&pn->state, vcpu_halted,
> +				ph ?
> +				ph->prev_cpu : -1);
>  		}
>
>  		/*

[snip]