From: Ingo Molnar <mingo@kernel.org>
To: Waiman Long <longman@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
linux-kernel@vger.kernel.org,
Pan Xinhui <xinhui@linux.vnet.ibm.com>,
Boqun Feng <boqun.feng@gmail.com>,
Andrea Parri <parri.andrea@gmail.com>
Subject: Re: [PATCH v4] locking/pvqspinlock: Relax cmpxchg's to improve performance on some archs
Date: Tue, 21 Feb 2017 09:20:35 +0100 [thread overview]
Message-ID: <20170221082034.GA10137@gmail.com> (raw)
In-Reply-To: <1487607451-22688-1-git-send-email-longman@redhat.com>
* Waiman Long <longman@redhat.com> wrote:
> All the locking related cmpxchg's in the following functions are
> replaced with the _acquire variants:
> - pv_queued_spin_steal_lock()
> - trylock_clear_pending()
>
> This change should help performance on architectures that use LL/SC.
>
> On a 2-core 16-thread Power8 system with pvqspinlock explicitly
> enabled, the performance of a locking microbenchmark with and without
> this patch on a 4.10-rc8 kernel with Xinhui's PPC qspinlock patch
> were as follows:
>
> # of thread w/o patch with patch % Change
> ----------- --------- ---------- --------
> 4 4053.3 Mop/s 4223.7 Mop/s +4.2%
> 8 3310.4 Mop/s 3406.0 Mop/s +2.9%
> 12 2576.4 Mop/s 2674.6 Mop/s +3.8%
>
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>
> v3->v4:
> - Update the comment in pv_kick_node() to mention that the code
> may not work in some archs.
>
> v2->v3:
> - Reduce scope by relaxing cmpxchg's in fast path only.
>
> v1->v2:
> - Add comments in changelog and code for the rationale of the change.
>
> kernel/locking/qspinlock_paravirt.h | 19 +++++++++++++------
> 1 file changed, 13 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
> index e6b2f7a..93e271d 100644
> --- a/kernel/locking/qspinlock_paravirt.h
> +++ b/kernel/locking/qspinlock_paravirt.h
> @@ -72,7 +72,7 @@ static inline bool pv_queued_spin_steal_lock(struct qspinlock *lock)
> struct __qspinlock *l = (void *)lock;
>
> if (!(atomic_read(&lock->val) & _Q_LOCKED_PENDING_MASK) &&
> - (cmpxchg(&l->locked, 0, _Q_LOCKED_VAL) == 0)) {
> + (cmpxchg_acquire(&l->locked, 0, _Q_LOCKED_VAL) == 0)) {
> qstat_inc(qstat_pv_lock_stealing, true);
> return true;
> }
> @@ -101,16 +101,16 @@ static __always_inline void clear_pending(struct qspinlock *lock)
>
> /*
> * The pending bit check in pv_queued_spin_steal_lock() isn't a memory
> - * barrier. Therefore, an atomic cmpxchg() is used to acquire the lock
> - * just to be sure that it will get it.
> + * barrier. Therefore, an atomic cmpxchg_acquire() is used to acquire the
> + * lock just to be sure that it will get it.
> */
> static __always_inline int trylock_clear_pending(struct qspinlock *lock)
> {
> struct __qspinlock *l = (void *)lock;
>
> return !READ_ONCE(l->locked) &&
> - (cmpxchg(&l->locked_pending, _Q_PENDING_VAL, _Q_LOCKED_VAL)
> - == _Q_PENDING_VAL);
> + (cmpxchg_acquire(&l->locked_pending, _Q_PENDING_VAL,
> + _Q_LOCKED_VAL) == _Q_PENDING_VAL);
> }
> #else /* _Q_PENDING_BITS == 8 */
> static __always_inline void set_pending(struct qspinlock *lock)
> @@ -138,7 +138,7 @@ static __always_inline int trylock_clear_pending(struct qspinlock *lock)
> */
> old = val;
> new = (val & ~_Q_PENDING_MASK) | _Q_LOCKED_VAL;
> - val = atomic_cmpxchg(&lock->val, old, new);
> + val = atomic_cmpxchg_acquire(&lock->val, old, new);
>
> if (val == old)
> return 1;
> @@ -361,6 +361,13 @@ static void pv_kick_node(struct qspinlock *lock, struct mcs_spinlock *node)
> * observe its next->locked value and advance itself.
> *
> * Matches with smp_store_mb() and cmpxchg() in pv_wait_node()
> + *
> + * The write to next->locked in arch_mcs_spin_unlock_contended()
> + * must be ordered before the read of pn->state in the cmpxchg()
> + * below for the code to work correctly. However, this is not
> + * guaranteed in all architectures when the cmpxchg() fails.
> + * Both x86 and PPC can provide that guarantee, but probably
> + * not in some other architectures.
ob- spelling pedantry:
s/not guaranteed in all architectures
/not guaranteed on all architectures
s/not in some other architectures
/not on some other architectures
Also, "the cmpxchg() fails" should either be "when the cmpxchg() call fails" or
"when cmpxchg() fails".
Plus the last sentence reads a bit funny.
Something like this would work for me:
* The write to next->locked in arch_mcs_spin_unlock_contended()
* must be ordered before the read of pn->state in the cmpxchg()
* below for the code to work correctly. However, this is not
* guaranteed on all architectures when cmpxchg() fails.
* Both x86 and PPC can provide that guarantee, but other
* architectures not necessarily.
(or so)
Thanks,
Ingo
prev parent reply other threads:[~2017-02-21 8:20 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-02-20 16:17 [PATCH v4] locking/pvqspinlock: Relax cmpxchg's to improve performance on some archs Waiman Long
2017-02-21 8:20 ` Ingo Molnar [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170221082034.GA10137@gmail.com \
--to=mingo@kernel.org \
--cc=boqun.feng@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=longman@redhat.com \
--cc=mingo@redhat.com \
--cc=parri.andrea@gmail.com \
--cc=peterz@infradead.org \
--cc=xinhui@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.