Re: [RFC][PATCH 3/3] locking/qspinlock: Optimize for x86

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Andrea Parri <andrea.parri@amarulasolutions.com>
To: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	mingo@kernel.org, linux-kernel@vger.kernel.org,
	longman@redhat.com, tglx@linutronix.de
Subject: Re: [RFC][PATCH 3/3] locking/qspinlock: Optimize for x86
Date: Tue, 2 Oct 2018 15:44:42 +0200	[thread overview]
Message-ID: <20181002134442.GA10754@andrea> (raw)
In-Reply-To: <20181002132208.GF16422@arm.com>

On Tue, Oct 02, 2018 at 02:22:09PM +0100, Will Deacon wrote:
> On Tue, Oct 02, 2018 at 02:31:52PM +0200, Andrea Parri wrote:
> > > consider this scenario with your patch:
> > > 
> > > 1. CPU0 sees a locked val, and is about to do your xchg_relaxed() to set
> > >    pending.
> > > 
> > > 2. CPU1 comes in and sets pending, spins on locked
> > > 
> > > 3. CPU2 sees a pending and locked val, and is about to enter the head of
> > >    the waitqueue (i.e. it's right before xchg_tail()).
> > > 
> > > 4. The locked holder unlock()s, CPU1 takes the lock() and then unlock()s
> > >    it, so pending and locked are now 0.
> > > 
> > > 5. CPU0 sets pending and reads back zeroes for the other fields
> > > 
> > > 6. CPU0 clears pending and sets locked -- it now has the lock
> > > 
> > > 7. CPU2 updates tail, sees it's at the head of the waitqueue and spins
> > >    for locked and pending to go clear. However, it reads a stale value
> > >    from step (4) and attempts the atomic_try_cmpxchg() to take the lock.
> > > 
> > > 8. CPU2 will fail the cmpxchg(), but then go ahead and set locked. At this
> > >    point we're hosed, because both CPU2 and CPU0 have the lock.
> > 
> > Thanks for pointing this out.   I am wondering: can't we have a similar
> > scenario with the current code (i.e., w/o these patches): what prevents
> > the scenario reported below, following Peter's diagram, from happening?
> 
> The xchg_tail() in step (7) reads from the fetch_or_acquire() in step (5),
> so I don't think we can see a stale value in the subsequent (overlapping)
> acquire load.

I see, thanks for the clarification.

  Andrea


> 
> Will
> 
> >   CPU0		CPU1		CPU2		CPU3
> > 
> > 0)						lock
> > 						  trylock -> (0,0,1)
> > 1)lock
> >     trylock /* fail */
> > 
> > 2)		lock
> > 		  trylock /* fail */
> > 		  fetch_or_acquire -> (0,1,1)
> > 		  wait-locked
> > 
> > 3)				lock
> > 				  trylock /* fail */
> > 				  goto queue
> > 
> > 4)						unlock -> (0,1,0)
> > 		  clr_pnd_set_lck -> (0,0,1)
> > 		  unlock -> (0,0,0)
> > 
> > 5)  fetch_or_acquire -> (0,1,0)
> > 6)  clr_pnd_set_lck -> (0,0,1)
> > 7)				  xchg_tail -> (n,0,1)
> > 				  load_acquire <- (n,0,0) (from-4)
> > 8)				  cmpxchg /* fail */
> > 				  set_locked()

next prev parent reply	other threads:[~2018-10-02 13:44 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-26 11:01 [RFC][PATCH 0/3] locking/qspinlock: Improve determinism for x86 Peter Zijlstra
2018-09-26 11:01 ` [RFC][PATCH 1/3] locking/qspinlock: Re-order code Peter Zijlstra
2018-10-01 17:17   ` Will Deacon
2018-09-26 11:01 ` [RFC][PATCH 2/3] locking/qspinlock: Rework some comments Peter Zijlstra
2018-10-01 17:17   ` Will Deacon
2018-10-01 19:10     ` Peter Zijlstra
2018-10-02 13:20       ` Will Deacon
2018-10-02 13:43         ` Peter Zijlstra
2018-09-26 11:01 ` [RFC][PATCH 3/3] locking/qspinlock: Optimize for x86 Peter Zijlstra
2018-09-26 16:30   ` Waiman Long
2018-09-26 17:54     ` Peter Zijlstra
2018-09-27  7:29       ` Peter Zijlstra
2018-09-26 20:52   ` Andrea Parri
2018-09-27  7:17     ` Peter Zijlstra
2018-09-27  7:47       ` Andrea Parri
2018-09-27  7:59         ` Peter Zijlstra
2018-09-27  8:13           ` Andrea Parri
2018-09-27  8:57             ` Peter Zijlstra
2018-09-27 12:16   ` David Laight
2018-10-01 17:17   ` Will Deacon
2018-10-01 20:00     ` Peter Zijlstra
2018-10-02 13:19       ` Will Deacon
2018-10-02 14:14         ` Peter Zijlstra
2018-10-02 12:31     ` Andrea Parri
2018-10-02 13:22       ` Will Deacon
2018-10-02 13:44         ` Andrea Parri [this message]
2018-09-26 15:01 ` [RFC][PATCH 0/3] locking/qspinlock: Improve determinism " Sebastian Andrzej Siewior
2018-09-26 15:08   ` Thomas Gleixner
2018-09-26 15:38     ` Sebastian Andrzej Siewior
2018-09-26 16:20 ` Waiman Long
2018-09-26 17:51   ` Peter Zijlstra
2018-09-26 23:21     ` Waiman Long

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181002134442.GA10754@andrea \
    --to=andrea.parri@amarulasolutions.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.