From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Michael Ellerman <mpe@ellerman.id.au>,
linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
Boqun Feng <boqun.feng@gmail.com>,
Anton Blanchard <anton@samba.org>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Paul Mackerras <paulus@samba.org>,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v2] barriers: introduce smp_mb__release_acquire and update documentation
Date: Thu, 8 Oct 2015 15:17:16 -0700 [thread overview]
Message-ID: <20151008221716.GF3910@linux.vnet.ibm.com> (raw)
In-Reply-To: <20151008125937.GH16807@arm.com>
On Thu, Oct 08, 2015 at 01:59:38PM +0100, Will Deacon wrote:
> On Thu, Oct 08, 2015 at 01:16:38PM +0200, Peter Zijlstra wrote:
> > On Thu, Oct 08, 2015 at 02:50:36PM +1100, Michael Ellerman wrote:
> > > On Wed, 2015-10-07 at 08:25 -0700, Paul E. McKenney wrote:
> >
> > > > Currently, we do need smp_mb__after_unlock_lock() to be after the
> > > > acquisition on PPC -- putting it between the unlock and the lock
> > > > of course doesn't cut it for the cross-thread unlock/lock case.
> >
> > This ^, that makes me think I don't understand
> > smp_mb__after_unlock_lock.
> >
> > How is:
> >
> > UNLOCK x
> > smp_mb__after_unlock_lock()
> > LOCK y
> >
> > a problem? That's still a full barrier.
>
> I thought Paul was talking about something like this case:
>
> CPU A CPU B CPU C
> foo = 1
> UNLOCK x
> LOCK x
> (RELEASE) bar = 1
> ACQUIRE bar = 1
> READ_ONCE foo = 0
More like this:
CPU A CPU B CPU C
WRITE_ONCE(foo, 1);
UNLOCK x
LOCK x
r1 = READ_ONCE(bar);
WRITE_ONCE(bar, 1);
smp_mb();
r2 = READ_ONCE(foo);
This can result in r1==0 && r2==0.
> but this looks the same as ISA2+lwsyncs/ISA2+lwsync+ctrlisync+lwsync,
> which are both forbidden on PPC, so now I'm also confused.
>
> The different-lock, same thread case is more straight-forward, I think.
Indeed it is:
CPU A CPU B
WRITE_ONCE(foo, 1);
UNLOCK x
LOCK x
r1 = READ_ONCE(bar);
WRITE_ONCE(bar, 1);
smp_mb();
r2 = READ_ONCE(foo);
This also can result in r1==0 && r2==0.
> > > > I am with Peter -- we do need the benchmark results for PPC.
> > >
> > > Urgh, sorry guys. I have been slowly doing some benchmarks, but time is not
> > > plentiful at the moment.
> > >
> > > If we do a straight lwsync -> sync conversion for unlock it looks like that
> > > will cost us ~4.2% on Anton's standard context switch benchmark.
>
> Thanks Michael!
>
> > And that does not seem to agree with Paul's smp_mb__after_unlock_lock()
> > usage and would not be sufficient for the same (as of yet unexplained)
> > reason.
> >
> > Why does it matter which of the LOCK or UNLOCK gets promoted to full
> > barrier on PPC in order to become RCsc?
>
> I think we need a PPC litmus test illustrating the inter-thread, same
> lock failure case when smp_mb__after_unlock_lock is not present so that
> we can reason about this properly. Paul?
Please see above. ;-)
The corresponding litmus tests are below.
Thanx, Paul
------------------------------------------------------------------------
PPC lock-2thread-WR-barrier.litmus
""
(*
* Does 3.0 Linux-kernel Power lock-unlock provide local
* barrier that orders prior stores against subsequent loads,
* if the unlock and lock happen on different threads?
* This version uses lwsync instead of isync.
*)
(* 23-July-2013: ppcmem says "Sometimes" *)
{
l=1;
0:r1=1; 0:r4=x; 0:r10=0; 0:r12=l;
1:r1=1; 1:r3=42; 1:r4=x; 1:r5=y; 1:r10=0; 1:r11=0; 1:r12=l;
2:r1=1; 2:r4=x; 2:r5=y;
}
P0 | P1 | P2;
stw r1,0(r4) | lwarx r11,r10,r12 | stw r1,0(r5) ;
lwsync | cmpwi r11,0 | lwsync ;
stw r10,0(r12) | bne Fail1 | lwz r7,0(r4) ;
| stwcx. r1,r10,r12 | ;
| bne Fail1 | ;
| isync | ;
| lwz r3,0(r5) | ;
| Fail1: | ;
exists
(1:r3=0 /\ 2:r7=0)
------------------------------------------------------------------------
PPC lock-1thread-WR-barrier.litmus
""
(*
* Does 3.0 Linux-kernel Power lock-unlock provide local
* barrier that orders prior stores against subsequent loads,
* if the unlock and lock happen in the same thread?
* This version uses lwsync instead of isync.
*)
(* 8-Oct-2015: ppcmem says "Sometimes" *)
{
l=1;
0:r1=1; 0:r3=42; 0:r4=x; 0:r5=y; 0:r10=0; 0:r11=0; 0:r12=l;
1:r1=1; 1:r4=x; 1:r5=y;
}
P0 | P1 ;
stw r1,0(r4) | stw r1,0(r5) ;
lwsync | lwsync ;
stw r10,0(r12) | lwz r7,0(r4) ;
lwarx r11,r10,r12 | ;
cmpwi r11,0 | ;
bne Fail1 | ;
stwcx. r1,r10,r12 | ;
bne Fail1 | ;
isync | ;
lwz r3,0(r5) | ;
Fail1: | ;
exists
(0:r3=0 /\ 1:r7=0)
next prev parent reply other threads:[~2015-10-08 22:17 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-10-07 10:59 [PATCH v2] barriers: introduce smp_mb__release_acquire and update documentation Will Deacon
2015-10-07 11:19 ` Peter Zijlstra
2015-10-07 13:23 ` Will Deacon
2015-10-07 14:53 ` Peter Zijlstra
2015-10-07 15:25 ` Paul E. McKenney
2015-10-08 3:50 ` Michael Ellerman
2015-10-08 11:16 ` Peter Zijlstra
2015-10-08 12:59 ` Will Deacon
2015-10-08 22:17 ` Paul E. McKenney [this message]
2015-10-09 9:51 ` Will Deacon
2015-10-09 11:25 ` Peter Zijlstra
2015-10-09 17:44 ` Paul E. McKenney
2015-10-09 17:44 ` Paul E. McKenney
2015-10-09 17:43 ` Paul E. McKenney
2015-10-09 18:33 ` Will Deacon
2015-10-12 23:30 ` Paul E. McKenney
2015-10-20 14:20 ` Boqun Feng
2015-10-08 21:44 ` Paul E. McKenney
2015-10-09 7:29 ` Peter Zijlstra
2015-10-09 8:31 ` Peter Zijlstra
2015-10-09 9:40 ` Will Deacon
2015-10-09 11:02 ` Peter Zijlstra
2015-10-09 12:41 ` Will Deacon
2015-10-09 11:12 ` Peter Zijlstra
2015-10-09 12:51 ` Will Deacon
2015-10-09 13:06 ` Peter Zijlstra
2015-10-09 11:13 ` Peter Zijlstra
2015-10-09 17:21 ` Paul E. McKenney
2015-10-19 1:17 ` Boqun Feng
2015-10-19 10:23 ` Peter Zijlstra
2015-10-20 7:35 ` Boqun Feng
2015-10-20 23:34 ` Paul E. McKenney
2015-10-21 8:24 ` Peter Zijlstra
2015-10-21 19:29 ` Paul E. McKenney
2015-10-21 19:36 ` Peter Zijlstra
2015-10-21 19:56 ` Paul E. McKenney
2015-10-21 16:04 ` David Laight
2015-10-21 16:04 ` David Laight
2015-10-21 16:04 ` David Laight
2015-10-21 19:34 ` Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151008221716.GF3910@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=anton@samba.org \
--cc=benh@kernel.crashing.org \
--cc=boqun.feng@gmail.com \
--cc=linux-arch@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=mpe@ellerman.id.au \
--cc=paulus@samba.org \
--cc=peterz@infradead.org \
--cc=will.deacon@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.