From: paulmck@linux.vnet.ibm.com (Paul E. McKenney)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 02/10] locking/qspinlock: Remove unbounded cmpxchg loop from locking slowpath
Date: Fri, 6 Apr 2018 14:09:53 -0700 [thread overview]
Message-ID: <20180406210953.GA24165@linux.vnet.ibm.com> (raw)
In-Reply-To: <dc5f5e43-a60a-05f2-16fb-46960c40459e@redhat.com>
On Fri, Apr 06, 2018 at 04:50:19PM -0400, Waiman Long wrote:
> On 04/05/2018 12:58 PM, Will Deacon wrote:
> > The qspinlock locking slowpath utilises a "pending" bit as a simple form
> > of an embedded test-and-set lock that can avoid the overhead of explicit
> > queuing in cases where the lock is held but uncontended. This bit is
> > managed using a cmpxchg loop which tries to transition the uncontended
> > lock word from (0,0,0) -> (0,0,1) or (0,0,1) -> (0,1,1).
> >
> > Unfortunately, the cmpxchg loop is unbounded and lockers can be starved
> > indefinitely if the lock word is seen to oscillate between unlocked
> > (0,0,0) and locked (0,0,1). This could happen if concurrent lockers are
> > able to take the lock in the cmpxchg loop without queuing and pass it
> > around amongst themselves.
> >
> > This patch fixes the problem by unconditionally setting _Q_PENDING_VAL
> > using atomic_fetch_or, and then inspecting the old value to see whether
> > we need to spin on the current lock owner, or whether we now effectively
> > hold the lock. The tricky scenario is when concurrent lockers end up
> > queuing on the lock and the lock becomes available, causing us to see
> > a lockword of (n,0,0). With pending now set, simply queuing could lead
> > to deadlock as the head of the queue may not have observed the pending
> > flag being cleared. Conversely, if the head of the queue did observe
> > pending being cleared, then it could transition the lock from (n,0,0) ->
> > (0,0,1) meaning that any attempt to "undo" our setting of the pending
> > bit could race with a concurrent locker trying to set it.
> >
> > We handle this race by preserving the pending bit when taking the lock
> > after reaching the head of the queue and leaving the tail entry intact
> > if we saw pending set, because we know that the tail is going to be
> > updated shortly.
> >
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Signed-off-by: Will Deacon <will.deacon@arm.com>
> > ---
>
> The pending bit was added to the qspinlock design to counter performance
> degradation compared with ticket lock for workloads with light
> spinlock contention. I run my spinlock stress test on a Intel Skylake
> server running the vanilla 4.16 kernel vs a patched kernel with this
> patchset. The locking rates with different number of locking threads
> were as follows:
>
> # of threads 4.16 kernel patched 4.16 kernel
> ------------ ----------- -------------------
> 1 7,417 kop/s 7,408 kop/s
> 2 5,755 kop/s 4,486 kop/s
> 3 4,214 kop/s 4,169 kop/s
> 4 4,396 kop/s 4,383 kop/s
>
> The 2 contending threads case is the one that exercise the pending bit
> code path the most. So it is obvious that this is the one that is most
> impacted by this patchset. The differences in the other cases are mostly
> noise or maybe just a little bit on the 3 contending threads case.
>
> I am not against this patch, but we certainly need to find out a way to
> bring the performance number up closer to what it is before applying
> the patch.
It would indeed be good to not be in the position of having to trade off
forward-progress guarantees against performance, but that does appear to
be where we are at the moment.
Thanx, Paul
next prev parent reply other threads:[~2018-04-06 21:09 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-04-05 16:58 [PATCH 00/10] kernel/locking: qspinlock improvements Will Deacon
2018-04-05 16:58 ` [PATCH 01/10] locking/qspinlock: Don't spin on pending->locked transition in slowpath Will Deacon
2018-04-05 16:58 ` [PATCH 02/10] locking/qspinlock: Remove unbounded cmpxchg loop from locking slowpath Will Deacon
2018-04-05 17:07 ` Peter Zijlstra
2018-04-06 15:08 ` Will Deacon
2018-04-05 17:13 ` Peter Zijlstra
2018-04-05 21:16 ` Waiman Long
2018-04-06 15:08 ` Will Deacon
2018-04-06 20:50 ` Waiman Long
2018-04-06 21:09 ` Paul E. McKenney [this message]
2018-04-07 8:47 ` Peter Zijlstra
2018-04-07 23:37 ` Paul E. McKenney
2018-04-09 10:58 ` Will Deacon
2018-04-07 9:07 ` Peter Zijlstra
2018-04-09 10:58 ` Will Deacon
2018-04-09 14:54 ` Will Deacon
2018-04-09 15:54 ` Peter Zijlstra
2018-04-09 17:19 ` Will Deacon
2018-04-10 9:35 ` Peter Zijlstra
2018-09-20 16:08 ` Peter Zijlstra
2018-09-20 16:22 ` Peter Zijlstra
2018-04-09 19:33 ` Waiman Long
2018-04-09 17:55 ` Waiman Long
2018-04-10 13:49 ` Sasha Levin
2018-04-05 16:59 ` [PATCH 03/10] locking/qspinlock: Kill cmpxchg loop when claiming lock from head of queue Will Deacon
2018-04-05 17:19 ` Peter Zijlstra
2018-04-06 10:54 ` Will Deacon
2018-04-05 16:59 ` [PATCH 04/10] locking/qspinlock: Use atomic_cond_read_acquire Will Deacon
2018-04-05 16:59 ` [PATCH 05/10] locking/mcs: Use smp_cond_load_acquire() in mcs spin loop Will Deacon
2018-04-05 16:59 ` [PATCH 06/10] barriers: Introduce smp_cond_load_relaxed and atomic_cond_read_relaxed Will Deacon
2018-04-05 17:22 ` Peter Zijlstra
2018-04-06 10:55 ` Will Deacon
2018-04-05 16:59 ` [PATCH 07/10] locking/qspinlock: Use smp_cond_load_relaxed to wait for next node Will Deacon
2018-04-05 16:59 ` [PATCH 08/10] locking/qspinlock: Merge struct __qspinlock into struct qspinlock Will Deacon
2018-04-07 5:23 ` Boqun Feng
2018-04-05 16:59 ` [PATCH 09/10] locking/qspinlock: Make queued_spin_unlock use smp_store_release Will Deacon
2018-04-05 16:59 ` [PATCH 10/10] locking/qspinlock: Elide back-to-back RELEASE operations with smp_wmb() Will Deacon
2018-04-05 17:28 ` Peter Zijlstra
2018-04-06 11:34 ` Will Deacon
2018-04-06 13:05 ` Andrea Parri
2018-04-06 15:27 ` Will Deacon
2018-04-06 15:49 ` Andrea Parri
2018-04-07 5:47 ` Boqun Feng
2018-04-09 10:47 ` Will Deacon
2018-04-06 13:22 ` [PATCH 00/10] kernel/locking: qspinlock improvements Andrea Parri
2018-04-11 10:20 ` Catalin Marinas
2018-04-11 15:39 ` Andrea Parri
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180406210953.GA24165@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).