linux-arch.vger.kernel.org archive mirror
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman.Long@hp.com, linux-arch@vger.kernel.org, riel@redhat.com,
	gleb@redhat.com, kvm@vger.kernel.org, boris.ostrovsky@oracle.com,
	scott.norton@hp.com, raghavendra.kt@linux.vnet.ibm.com,
	paolo.bonzini@gmail.com, linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org, chegu_vinod@hp.com,
	david.vrabel@citrix.com, oleg@redhat.com,
	xen-devel@lists.xenproject.org, tglx@linutronix.de,
	paulmck@linux.vnet.ibm.com, torvalds@linux-foundation.org,
	mingo@kernel.org
Subject: Re: [PATCH 01/11] qspinlock: A simple generic 4-byte queue spinlock
Date: Mon, 23 Jun 2014 12:20:20 -0400
Message-ID: <20140623162020.GA9426@laptop.dumpdata.com>
In-Reply-To: <20140623161200.GG19860@laptop.programming.kicks-ass.net>

On Mon, Jun 23, 2014 at 06:12:00PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 17, 2014 at 04:03:29PM -0400, Konrad Rzeszutek Wilk wrote:
> > > > +			new = tail | (val & _Q_LOCKED_MASK);
> > > > +
> > > > +		old = atomic_cmpxchg(&lock->val, val, new);
> > > > +		if (old == val)
> > > > +			break;
> > > > +
> > > > +		val = old;
> > > > +	}
> > > > +
> > > > +	/*
> > > > +	 * we won the trylock; forget about queueing.
> > > > +	 */
> > > > +	if (new == _Q_LOCKED_VAL)
> > > > +		goto release;
> > > > +
> > > > +	/*
> > > > +	 * if there was a previous node; link it and wait.
> > > > +	 */
> > > > +	if (old & ~_Q_LOCKED_MASK) {
> > > > +		prev = decode_tail(old);
> > > > +		ACCESS_ONCE(prev->next) = node;
> > > > +
> > > > +		arch_mcs_spin_lock_contended(&node->locked);
> > 
> > Could you add a comment here:
> > 
> > /* We are spinning forever until the previous node updates locked - which
> > it does once it has updated lock->val with our tail number. */
> 
> That's incorrect -- or at least, I understand that to be incorrect. The
> previous node will not have changed the tail to point to us. You always
> change the tail to point to yourself, seeing how you add yourself to the
> tail.
> 
> Is the existing comment any better if I s/wait./wait for it to release
> us./ ?

Yes!
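
A minimal user-space sketch of the enqueue step quoted above may help here.
It is not the patch itself: C11 atomics stand in for atomic_cmpxchg(), the
name publish_tail() is hypothetical, the 8-bit Q_LOCKED_MASK is an
assumption, and the two lines that pick the trylock value come from the part
of the loop that is not quoted in this mail, so they are assumed as well.
The point it illustrates is the one Peter makes: each waiter writes its own
tail into the lock word, and the old value tells it who its predecessor is.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define Q_LOCKED_VAL  1u     /* mirrors _Q_LOCKED_VAL */
#define Q_LOCKED_MASK 0xffu  /* assumption: locked state lives in the low byte */

/*
 * Hypothetical helper: publish our own tail in the lock word.
 * Returns the previous lock word; *new_out receives the value we stored.
 */
static uint32_t publish_tail(_Atomic uint32_t *lock, uint32_t tail,
                             uint32_t *new_out)
{
    uint32_t val = atomic_load(lock);
    uint32_t new;

    /* The tail encoding must not overlap the locked byte. */
    assert((tail & Q_LOCKED_MASK) == 0);

    for (;;) {
        new = Q_LOCKED_VAL;                      /* assumed: trylock if free */
        if (val)
            new = tail | (val & Q_LOCKED_MASK);  /* we become the tail */

        /* On failure, val is refreshed with the current lock word. */
        if (atomic_compare_exchange_strong(lock, &val, new))
            break;
    }

    *new_out = new;
    return val;
}
```

If the returned value has any bits set outside Q_LOCKED_MASK, there was a
previous node: the caller links itself behind it and waits for that node to
release it. The previous node never touches the tail on our behalf.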
> 
> > > > +	/*
> > > > +	 * claim the lock:
> > > > +	 *
> > > > +	 * n,0 -> 0,1 : lock, uncontended
> > > > +	 * *,0 -> *,1 : lock, contended
> > > > +	 */
> > > > +	for (;;) {
> > > > +		new = _Q_LOCKED_VAL;
> > > > +		if (val != tail)
> > > > +			new |= val;
> > > 
> > ..snip..
> > > 
> > > Could you help a bit in explaining it in English please?
> > 
> > After looking at the assembler code I finally figured out how
> > we can get here. And the 'contended' part threw me off. Somehow
> > I imagined there are two or more CPUs stampeding here and
> > trying to update the lock->val. But in reality the other CPUs
> > are stuck in the arch_mcs_spin_lock_contended spinning on their
> > local value.
> 
> Well, the lock as a whole is contended (there's >1 waiters), and the
> point of MCS style locks is to make sure they're not actually pounding
> on the same cacheline. So the whole thing is consistent.
> 
> > Perhaps you could add this comment.
> > 
> > /* Once queue_spin_unlock is called (which _subtracts_ _Q_LOCKED_VAL from
> > the lock->val and still preserving the tail data), the winner gets to
> > claim the ticket. 
> 
> There's no tickets :/

s/ticket/be first in line/ ?
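
A small sketch of the unlock behaviour described in the proposed comment,
under the same assumptions as the thread (C11 atomics instead of the kernel
atomic API, hypothetical function name, tail kept above the low byte).
Subtracting Q_LOCKED_VAL clears the locked state and leaves the tail bits
untouched, so whoever is first in line can still be found.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define Q_LOCKED_VAL  1u     /* mirrors _Q_LOCKED_VAL */
#define Q_LOCKED_MASK 0xffu  /* assumption: locked state lives in the low byte */

/* Hypothetical stand-in for queue_spin_unlock(): drop the lock, keep the tail. */
static void queue_unlock_sketch(_Atomic uint32_t *lock)
{
    /* Only a holder may unlock; the locked value must currently be set. */
    assert((atomic_load(lock) & Q_LOCKED_MASK) == Q_LOCKED_VAL);

    /* Release ordering so the critical section is visible before the unlock. */
    atomic_fetch_sub_explicit(lock, Q_LOCKED_VAL, memory_order_release);
}
```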

> 
> > Since we still need the other CPUs to continue and
> > preserve the strict ordering in which they setup node->next, we:
> >  1) update lock->val to the tail value (so tail CPU and its index) with
> >     _Q_LOCKED_VAL.
> 
> We don't, we preserve the tail value, unless we're the tail, in which
> case we clear the tail.
> 
> >  2) Once we are done, we poke the other CPU (the one that linked to
> >     us) by writing to node->locked (below) so they can make progress and
> >     loop on lock->val changing from _Q_LOCKED_MASK to zero.
> 
> _If_ there was another cpu, ie. the tail didn't point to us.

<nods>
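
To pin down the corrected description, here is a rough user-space sketch of
the claim step, again with C11 atomics in place of atomic_cmpxchg(). The
function name claim_lock() and the 8-bit Q_LOCKED_MASK are assumptions, and
the sketch expects the caller to have already waited for the locked byte to
clear. It keeps the tail unless we are the tail, in which case the tail is
cleared, and it reports whether a successor needs to be released.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define Q_LOCKED_VAL  1u     /* mirrors _Q_LOCKED_VAL */
#define Q_LOCKED_MASK 0xffu  /* assumption: locked state lives in the low byte */

/*
 * Hypothetical helper: claim the lock as head of the queue.
 *
 *   n,0 -> 0,1 : we were the tail, so the tail is cleared
 *   *,0 -> *,1 : someone queued behind us, so the tail is preserved
 *
 * Returns true if another CPU is queued behind us and must be released.
 */
static bool claim_lock(_Atomic uint32_t *lock, uint32_t tail)
{
    uint32_t val = atomic_load(lock);
    uint32_t new;

    do {
        /* Precondition of this sketch: the previous owner has unlocked. */
        assert((val & Q_LOCKED_MASK) == 0);

        new = Q_LOCKED_VAL;
        if (val != tail)
            new |= val;  /* preserve the other CPU's tail */

        /* On failure, val is refreshed and the new value is recomputed. */
    } while (!atomic_compare_exchange_strong(lock, &val, new));

    return new != Q_LOCKED_VAL;
}
```

Only when this returns true is there a node->next to wait for and a
node->locked to write, which matches the "_If_ there was another cpu" remark.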
> 
> ---
> 
> I don't do well with natural language comments like that; they tend to
> confuse me more than anything.
> 
