From: Paolo Bonzini <pbonzini@redhat.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Waiman.Long@hp.com, linux-arch@vger.kernel.org, riel@redhat.com,
	gleb@redhat.com, kvm@vger.kernel.org, boris.ostrovsky@oracle.com,
	scott.norton@hp.com, raghavendra.kt@linux.vnet.ibm.com,
	paolo.bonzini@gmail.com, linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	Peter Zijlstra <peterz@infradead.org>, chegu_vinod@hp.com,
	david.vrabel@citrix.com, oleg@redhat.com,
	xen-devel@lists.xenproject.org, tglx@linutronix.de,
	paulmck@linux.vnet.ibm.com, torvalds@linux-foundation.org,
	mingo@kernel.org
Subject: Re: [PATCH 04/11] qspinlock: Extract out the exchange of tail code word
Date: Wed, 18 Jun 2014 13:37:45 +0200
Message-ID: <53A17A09.6010007@redhat.com>
In-Reply-To: <20140617205525.GB29634@laptop.dumpdata.com>

On 17/06/2014 22:55, Konrad Rzeszutek Wilk wrote:
> On Sun, Jun 15, 2014 at 02:47:01PM +0200, Peter Zijlstra wrote:
>> From: Waiman Long <Waiman.Long@hp.com>
>>
>> This patch extracts the logic for the exchange of new and previous tail
>> code words into a new xchg_tail() function which can be optimized in a
>> later patch.
>
> And also adds a third try on acquiring the lock. That I think should
> be a separate patch.

It doesn't really add a new try, the old code is:

-	for (;;) {
-		new = _Q_LOCKED_VAL;
-		if (val)
-			new = tail | (val & _Q_LOCKED_PENDING_MASK);
-
-		old = atomic_cmpxchg(&lock->val, val, new);
-		if (old == val)
-			break;
-
-		val = old;
-	}

 	/*
-	 * we won the trylock; forget about queueing.
 	 */
-	if (new == _Q_LOCKED_VAL)
-		goto release;

The trylock happens if the "if (val)" hits the else branch.

What the patch does is change it from attempting two transitions with a
single cmpxchg:

-	 * 0,0,0 -> 0,0,1 ; trylock
-	 * p,y,x -> n,y,x ; prev = xchg(lock, node)

to first doing the trylock, then the xchg.  If the trylock passes and
the xchg returns prev=0,0,0, the next step of the algorithm goes to the
locked/uncontended state:

+	/*
+	 * claim the lock:
+	 *
+	 * n,0 -> 0,1 : lock, uncontended

Similar to your suggestion for patch 3, it's expected that the xchg will
*not* return prev=0,0,0 after a failed trylock.

However, I *do* agree with you that it's simpler to just squash this
patch into 01/11.

Paolo

> And instead of saying 'later patch' you should spell out the name
> of the patch. Especially as this might not be obvious from somebody
> doing git bisection.
>
>>
>> Signed-off-by: Waiman Long <Waiman.Long@hp.com>
>> Signed-off-by: Peter Zijlstra <peterz@infradead.org>
>> ---
>>  include/asm-generic/qspinlock_types.h |    2 +
>>  kernel/locking/qspinlock.c            |   58 +++++++++++++++++++++-------------
>>  2 files changed, 38 insertions(+), 22 deletions(-)
>>
>> --- a/include/asm-generic/qspinlock_types.h
>> +++ b/include/asm-generic/qspinlock_types.h
>> @@ -61,6 +61,8 @@ typedef struct qspinlock {
>>  #define _Q_TAIL_CPU_BITS	(32 - _Q_TAIL_CPU_OFFSET)
>>  #define _Q_TAIL_CPU_MASK	_Q_SET_MASK(TAIL_CPU)
>>
>> +#define _Q_TAIL_MASK		(_Q_TAIL_IDX_MASK | _Q_TAIL_CPU_MASK)
>> +
>>  #define _Q_LOCKED_VAL		(1U << _Q_LOCKED_OFFSET)
>>  #define _Q_PENDING_VAL		(1U << _Q_PENDING_OFFSET)
>>
>> --- a/kernel/locking/qspinlock.c
>> +++ b/kernel/locking/qspinlock.c
>> @@ -86,6 +86,31 @@ static inline struct mcs_spinlock *decod
>>  #define _Q_LOCKED_PENDING_MASK (_Q_LOCKED_MASK | _Q_PENDING_MASK)
>>
>>  /**
>> + * xchg_tail - Put in the new queue tail code word & retrieve previous one
>> + * @lock : Pointer to queue spinlock structure
>> + * @tail : The new queue tail code word
>> + * Return: The previous queue tail code word
>> + *
>> + * xchg(lock, tail)
>> + *
>> + * p,*,* -> n,*,* ; prev = xchg(lock, node)
>> + */
>> +static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
>> +{
>> +	u32 old, new, val = atomic_read(&lock->val);
>> +
>> +	for (;;) {
>> +		new = (val & _Q_LOCKED_PENDING_MASK) | tail;
>> +		old = atomic_cmpxchg(&lock->val, val, new);
>> +		if (old == val)
>> +			break;
>> +
>> +		val = old;
>> +	}
>> +	return old;
>> +}
>> +
>> +/**
>>   * queue_spin_lock_slowpath - acquire the queue spinlock
>>   * @lock: Pointer to queue spinlock structure
>>   * @val: Current value of the queue spinlock 32-bit word
>> @@ -182,36 +207,25 @@ void queue_spin_lock_slowpath(struct qsp
>>  	node->next = NULL;
>>
>>  	/*
>> -	 * we already touched the queueing cacheline; don't bother with pending
>> -	 * stuff.
>> -	 *
>> -	 * trylock || xchg(lock, node)
>> -	 *
>> -	 * 0,0,0 -> 0,0,1 ; trylock
>> -	 * p,y,x -> n,y,x ; prev = xchg(lock, node)
>> +	 * We touched a (possibly) cold cacheline in the per-cpu queue node;
>> +	 * attempt the trylock once more in the hope someone let go while we
>> +	 * weren't watching.
>>  	 */
>> -	for (;;) {
>> -		new = _Q_LOCKED_VAL;
>> -		if (val)
>> -			new = tail | (val & _Q_LOCKED_PENDING_MASK);
>> -
>> -		old = atomic_cmpxchg(&lock->val, val, new);
>> -		if (old == val)
>> -			break;
>> -
>> -		val = old;
>> -	}
>> +	if (queue_spin_trylock(lock))
>> +		goto release;
>
> So now are three of them? One in queue_spin_lock, then at the start
> of this function when checking for the pending bit, and the once more
> here. And that is because the local cache line might be cold for the
> 'mcs_index' struct?
>
> That all seems to be a bit of experimental. But then we are already
> in the slowpath so we could as well do:
>
> 	for (i = 0; i < 10; i++)
> 		if (queue_spin_trylock(lock))
> 			goto release;
>
> And would have the same effect.
>
>
>>
>>  	/*
>> -	 * we won the trylock; forget about queueing.
>> +	 * we already touched the queueing cacheline; don't bother with pending
>> +	 * stuff.
>
> I guess we could also just erase the pending bit if we wanted to. The
> optimistic spinning will still hit go to the queue label as lock->val will
> have the tail value.
>
>> +	 *
>> +	 * p,*,* -> n,*,*
>>  	 */
>> -	if (new == _Q_LOCKED_VAL)
>> -		goto release;
>> +	old = xchg_tail(lock, tail);
>>
>>  	/*
>>  	 * if there was a previous node; link it and wait.
>>  	 */
>> -	if (old & ~_Q_LOCKED_PENDING_MASK) {
>> +	if (old & _Q_TAIL_MASK) {
>>  		prev = decode_tail(old);
>>  		ACCESS_ONCE(prev->next) = node;
>>
>>
>>
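For readers who want to experiment with the two shapes discussed above, here is a minimal userspace sketch using C11 atomics. It is not the kernel code: the bit layout (bit 0 locked, bit 1 pending, remaining bits tail), the struct qlock type, and all q_* names are assumptions made for illustration. q_combined() mirrors the old single cmpxchg loop that performs either transition; q_trylock_then_queue() mirrors the patched order of a trylock followed by an exchange of the tail.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: bit 0 = locked, bit 1 = pending, bits 2..31 = tail. */
#define Q_LOCKED_VAL          1u
#define Q_PENDING_VAL         2u
#define Q_LOCKED_PENDING_MASK (Q_LOCKED_VAL | Q_PENDING_VAL)
#define Q_TAIL_MASK           (~Q_LOCKED_PENDING_MASK)

struct qlock { _Atomic uint32_t val; };

/* 0,0,0 -> 0,0,1: succeeds only when the whole word is zero. */
static bool q_trylock(struct qlock *lock)
{
	uint32_t expected = 0;

	return atomic_compare_exchange_strong(&lock->val, &expected,
					      Q_LOCKED_VAL);
}

/* p,*,* -> n,*,*: install a new tail, keep locked/pending, return old word. */
static uint32_t q_xchg_tail(struct qlock *lock, uint32_t tail)
{
	uint32_t val = atomic_load(&lock->val);
	uint32_t new;

	do {
		new = (val & Q_LOCKED_PENDING_MASK) | tail;
		/* On failure, val is refreshed with the current lock word. */
	} while (!atomic_compare_exchange_weak(&lock->val, &val, new));

	return val;
}

/* Old shape: one cmpxchg loop that either trylocks or swaps the tail. */
static bool q_combined(struct qlock *lock, uint32_t tail, uint32_t *prev)
{
	uint32_t val = atomic_load(&lock->val);
	uint32_t new;

	do {
		new = val ? (tail | (val & Q_LOCKED_PENDING_MASK))
			  : Q_LOCKED_VAL;
	} while (!atomic_compare_exchange_weak(&lock->val, &val, new));

	if (new == Q_LOCKED_VAL)
		return true;	/* won the trylock; no queueing needed */
	*prev = val;
	return false;
}

/* Patched shape: try the lock first, then unconditionally publish the tail. */
static bool q_trylock_then_queue(struct qlock *lock, uint32_t tail,
				 uint32_t *prev)
{
	if (q_trylock(lock))
		return true;
	*prev = q_xchg_tail(lock, tail);
	return false;
}
```

Both helpers return true when the lock was taken directly and otherwise report the previous lock word through prev. The difference is that in the second form the lock can be released between the failed trylock and the tail exchange, in which case prev has neither a tail nor the locked bit set, which is the prev=0,0,0 case mentioned in the reply.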