From: Andrea Parri <andrea.parri@amarulasolutions.com>
To: Will Deacon <will.deacon@arm.com>
Cc: Alan Stern <stern@rowland.harvard.edu>,
LKMM Maintainers -- Akira Yokosawa <akiyks@gmail.com>,
Boqun Feng <boqun.feng@gmail.com>,
David Howells <dhowells@redhat.com>,
Jade Alglave <j.alglave@ucl.ac.uk>,
Luc Maranget <luc.maranget@inria.fr>,
Nicholas Piggin <npiggin@gmail.com>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Peter Zijlstra <peterz@infradead.org>,
Kernel development list <linux-kernel@vger.kernel.org>,
dlustig@nvidia.com
Subject: Re: [PATCH 2/2] tools/memory-model: Add write ordering by release-acquire and by locks
Date: Thu, 5 Jul 2018 16:00:29 +0200 [thread overview]
Message-ID: <20180705140029.GA5346@andrea> (raw)
In-Reply-To: <20180704121103.GB26941@arm.com>
On Wed, Jul 04, 2018 at 01:11:04PM +0100, Will Deacon wrote:
> Hi Alan,
>
> On Tue, Jul 03, 2018 at 01:28:17PM -0400, Alan Stern wrote:
> > On Mon, 25 Jun 2018, Andrea Parri wrote:
> >
> > > On Fri, Jun 22, 2018 at 07:30:08PM +0100, Will Deacon wrote:
> > > > > > I think the second example would preclude us using LDAPR for load-acquire,
> > >
> > > > I don't think it's a moot point. We want new architectures to implement
> > > > acquire/release efficiently, and it's not unlikely that they will have
> > > > acquire loads that are similar in semantics to LDAPR. This patch prevents
> > > > them from doing so,
> > >
> > > By this same argument, you should not be a "big fan" of rfi-rel-acq in ppo ;)
> > > consider, e.g., the two litmus tests below: what am I missing?
> >
> > This is an excellent point, which seems to have gotten lost in the
> > shuffle. I'd like to see your comments.
>
> Yeah, sorry. Loads going on at the moment. You could ask herd instead of me
> though ;)
>
> > In essence, if you're using release-acquire instructions that only
> > provide RCpc consistency, does store-release followed by load-acquire
> > of the same address provide read-read ordering? In theory it doesn't
> > have to, because if the value from the store-release is forwarded to
> > the load-acquire then:
> >
> > LOAD A
> > STORE-RELEASE X, v
> > LOAD-ACQUIRE X
> > LOAD B
> >
> > could be executed by the CPU in the order:
> >
> > LOAD-ACQUIRE X
> > LOAD B
> > LOAD A
> > STORE-RELEASE X, v
> >
> > thereby accessing A and B out of program order without violating the
> > requirements on the release or the acquire.
> >
> > Of course PPC doesn't allow this, but should we rule it out entirely?
>
> This would be allowed if LOAD-ACQUIRE was implemented using LDAPR on Arm.
> I don't think we should be ruling out architectures using RCpc
> acquire/release primitives, because doing so just feels like an artifact of
> most architectures building these out of fences today.
>
> It's funny really, because from an Arm-perspective I don't plan to stray
> outside of RCsc, but I feel like other weak architectures aren't being
> well represented here. If we just care about x86, Arm and Power (and assume
> that Power doesn't plan to implement RCpc acquire/release instructions)
> then we're good to tighten things up. But I fear that RISC-V should probably
> be more engaged (adding Daniel) and who knows about MIPS or these other
> random architectures popping up on linux-arch.
>
> > > C MP+fencewmbonceonce+pooncerelease-rfireleaseacquire-poacquireonce
> > >
> > > {}
> > >
> > > P0(int *x, int *y)
> > > {
> > > WRITE_ONCE(*x, 1);
> > > smp_wmb();
> > > WRITE_ONCE(*y, 1);
> > > }
> > >
> > > P1(int *x, int *y, int *z)
> > > {
> > > r0 = READ_ONCE(*y);
> > > smp_store_release(z, 1);
> > > r1 = smp_load_acquire(z);
> > > r2 = READ_ONCE(*x);
> > > }
> > >
> > > exists (1:r0=1 /\ 1:r1=1 /\ 1:r2=0)
> > >
> > >
> > > AArch64 MP+dmb.st+popl-rfilq-poqp
> > > "DMB.STdWW Rfe PodRWPL RfiLQ PodRRQP Fre"
> > > Generator=diyone7 (version 7.49+02(dev))
> > > Prefetch=0:x=F,0:y=W,1:y=F,1:x=T
> > > Com=Rf Fr
> > > Orig=DMB.STdWW Rfe PodRWPL RfiLQ PodRRQP Fre
> > > {
> > > 0:X1=x; 0:X3=y;
> > > 1:X1=y; 1:X3=z; 1:X6=x;
> > > }
> > > P0 | P1 ;
> > > MOV W0,#1 | LDR W0,[X1] ;
> > > STR W0,[X1] | MOV W2,#1 ;
> > > DMB ST | STLR W2,[X3] ;
> > > MOV W2,#1 | LDAPR W4,[X3] ;
> > > STR W2,[X3] | LDR W5,[X6] ;
> > > exists
> > > (1:X0=1 /\ 1:X4=1 /\ 1:X5=0)
>
> (you can also run this yourself, since 'Q' is supported in the .cat file
> I contributed to herdtools7)
>
> Test MP+dmb.sy+popl-rfilq-poqp Allowed
> States 4
> 1:X0=0; 1:X4=1; 1:X5=0;
> 1:X0=0; 1:X4=1; 1:X5=1;
> 1:X0=1; 1:X4=1; 1:X5=0;
> 1:X0=1; 1:X4=1; 1:X5=1;
> Ok
> Witnesses
> Positive: 1 Negative: 3
> Condition exists (1:X0=1 /\ 1:X4=1 /\ 1:X5=0)
> Observation MP+dmb.sy+popl-rfilq-poqp Sometimes 1 3
> Time MP+dmb.sy+popl-rfilq-poqp 0.01
> Hash=61858b7b59a6310d869f99cd05718f96
>
> > There's also read-write ordering, in the form of the LB pattern:
> >
> > P0(int *x, int *y, int *z)
> > {
> > r0 = READ_ONCE(*x);
> > smp_store_release(z, 1);
> > r1 = smp_load_acquire(z);
> > WRITE_ONCE(*y, 1);
> > }
> >
> > P1(int *x, int *y)
> > {
> > r2 = READ_ONCE(*y);
> > smp_mp();
> > WRITE_ONCE(*x, 1);
> > }
> >
> > exists (0:r0=1 /\ 1:r2=1)
>
> The access types are irrelevant to the acquire/release primitives, so yes
> that's also allowed.
>
> > Would this be allowed if smp_load_acquire() was implemented with LDAPR?
> > If the answer is yes then we will have to remove the rfi-rel-acq and
> > rel-rf-acq-po relations from the memory model entirely.
>
> I don't understand what you mean by "rfi-rel-acq-po", and I assume you mean
> rel-rfi-acq-po for the other? Sounds like I'm confused here.
[Your reply about 1/2 suggests that you've figured this out now, IAC ...]
"rfi-rel-acq" (as Alan wrote, and as I wrote before my question above...)
is defined and currently used in linux-kernel.cat (and, FWIW, it has been
so since when we upstreamed LKMM).
My point is that, as exemplified by the two tests I reported above, this
relation already prevents you from implementing your acquire with LDAPR;
so my/our question was not "can you run herd7 for me?" but rather "do you
think that changes are needed to the .cat file?".
This question goes back _at least_ to:
http://lkml.kernel.org/r/1519301990-11766-1-git-send-email-parri.andrea@gmail.com
(see, in part., the "IMPORTANT" note at the bottom of the commit message)
and that discussion later resulted in:
0123f4d76ca63b ("riscv/spinlock: Strengthen implementations with fences")
5ce6c1f3535fa8 ("riscv/atomic: Strengthen implementations with fences")
(the latest _draft_ of the RISC-V specification, as pointed out by Daniel,
https://github.com/riscv/riscv-isa-manual, Appendix A.5
includes our "Linux mapping", although the currently-recommended mapping
differs and involves a "fence.tso [+ any acquire, including RCpc .aq]").
My understanding is that your answer to this question is "Yes", but I am
not sure about the "How/Which changes?"; of course, an answer his question
_in the form_ of PATCHes would be appreciated! (but please also consider
that I'll be offline for most of the time until next Monday.)
Andrea
>
> Will
next prev parent reply other threads:[~2018-07-05 14:00 UTC|newest]
Thread overview: 55+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-06-21 17:27 [PATCH 2/2] tools/memory-model: Add write ordering by release-acquire and by locks Alan Stern
2018-06-21 18:04 ` Peter Zijlstra
2018-06-22 3:34 ` Paul E. McKenney
2018-06-22 8:08 ` Peter Zijlstra
2018-06-22 8:09 ` Will Deacon
2018-06-22 9:55 ` Will Deacon
2018-06-22 10:31 ` Peter Zijlstra
2018-06-22 10:38 ` Will Deacon
2018-06-22 11:25 ` Andrea Parri
2018-06-22 16:40 ` Paul E. McKenney
2018-06-22 18:09 ` Alan Stern
2018-06-22 18:30 ` Will Deacon
2018-06-22 19:11 ` Alan Stern
2018-06-22 20:53 ` Paul E. McKenney
2018-07-04 11:53 ` Will Deacon
2018-06-25 8:19 ` Andrea Parri
2018-07-03 17:28 ` Alan Stern
2018-07-04 11:28 ` Paul E. McKenney
2018-07-04 12:13 ` Will Deacon
2018-07-05 14:23 ` Alan Stern
2018-07-05 15:31 ` Paul E. McKenney
2018-07-04 12:11 ` Will Deacon
2018-07-05 14:00 ` Andrea Parri [this message]
2018-07-05 14:44 ` Will Deacon
2018-07-05 15:16 ` Daniel Lustig
2018-07-05 15:35 ` Daniel Lustig
2018-07-05 14:21 ` Alan Stern
2018-07-05 14:46 ` Will Deacon
2018-07-05 14:57 ` Alan Stern
2018-07-05 15:15 ` Will Deacon
2018-07-05 15:09 ` Andrea Parri
2018-07-06 20:37 ` Alan Stern
2018-07-06 21:10 ` Paul E. McKenney
2018-07-09 16:52 ` Will Deacon
2018-07-09 17:29 ` Daniel Lustig
2018-07-09 19:18 ` Alan Stern
2018-07-05 15:31 ` Paul E. McKenney
2018-07-05 15:39 ` Andrea Parri
2018-07-05 16:58 ` Paul E. McKenney
2018-07-05 17:06 ` Andrea Parri
2018-07-05 15:44 ` Daniel Lustig
2018-07-05 16:22 ` Will Deacon
2018-07-05 16:56 ` Paul E. McKenney
2018-07-05 18:12 ` Daniel Lustig
2018-07-05 18:38 ` Andrea Parri
2018-07-05 18:44 ` Andrea Parri
2018-07-05 23:32 ` Paul E. McKenney
2018-07-05 23:31 ` Paul E. McKenney
2018-07-06 9:25 ` Will Deacon
2018-07-06 14:14 ` Paul E. McKenney
2018-06-25 7:32 ` Peter Zijlstra
2018-06-25 8:29 ` Andrea Parri
2018-06-25 9:06 ` Peter Zijlstra
2018-06-22 9:06 ` Andrea Parri
2018-06-22 19:23 ` Alan Stern
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180705140029.GA5346@andrea \
--to=andrea.parri@amarulasolutions.com \
--cc=akiyks@gmail.com \
--cc=boqun.feng@gmail.com \
--cc=dhowells@redhat.com \
--cc=dlustig@nvidia.com \
--cc=j.alglave@ucl.ac.uk \
--cc=linux-kernel@vger.kernel.org \
--cc=luc.maranget@inria.fr \
--cc=npiggin@gmail.com \
--cc=paulmck@linux.vnet.ibm.com \
--cc=peterz@infradead.org \
--cc=stern@rowland.harvard.edu \
--cc=will.deacon@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.