From: Boqun Feng <boqun.feng@gmail.com>
To: Dan Lustig <dlustig@nvidia.com>
Cc: Will Deacon <will@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Alan Stern <stern@rowland.harvard.edu>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Peter Anvin <hpa@zytor.com>,
	Andrea Parri <parri.andrea@gmail.com>,
	Ingo Molnar <mingo@kernel.org>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Vince Weaver <vincent.weaver@maine.edu>,
	Thomas Gleixner <tglx@linutronix.de>,
	Jiri Olsa <jolsa@redhat.com>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Stephane Eranian <eranian@google.com>,
	linux-tip-commits@vger.kernel.org, palmer@dabbelt.com,
	paul.walmsley@sifive.com, mpe@ellerman.id.au
Subject: Re: [tip:locking/core] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire
Date: Fri, 10 Sep 2021 13:37:48 +0800
Message-ID: <YTrvLHB6lpol79ka@boqun-archlinux>
In-Reply-To: <YTqgSmX57l2hCMk0@boqun-archlinux>

On Fri, Sep 10, 2021 at 08:01:14AM +0800, Boqun Feng wrote:
> On Thu, Sep 09, 2021 at 01:03:18PM -0400, Dan Lustig wrote:
> > On 9/9/2021 9:35 AM, Will Deacon wrote:
> > > [+Palmer, PaulW, Daniel and Michael]
> > > 
> > > On Thu, Sep 09, 2021 at 09:25:30AM +0200, Peter Zijlstra wrote:
> > >> On Wed, Sep 08, 2021 at 09:08:33AM -0700, Linus Torvalds wrote:
> > >>
> > >>> So if this is purely a RISC-V thing,
> > >>
> > >> Just to clarify, I think the current RISC-V thing is stronger than
> > >> PowerPC, but maybe not as strong as say ARM64, but RISC-V memory
> > >> ordering is still somewhat hazy to me.
> > >>
> > >> Specifically, the sequence:
> > >>
> > >> 	/* critical section s */
> > >> 	WRITE_ONCE(x, 1);
> > >> 	FENCE RW, W
> > >> 	WRITE_ONCE(s.lock, 0);		/* store S */
> > >> 	AMOSWAP %0, 1, r.lock		/* store R */
> > >> 	FENCE R, RW
> > >> 	WRITE_ONCE(y, 1);
> > >> 	/* critical section r */
> > >>
> > >> fully separates section s from section r, as in RW->RW ordering
> > >> (possibly not as strong as smp_mb() though), while on PowerPC it would
> > >> only impose TSO ordering between sections.
> > >>
> > >> The AMOSWAP is a RmW and as such matches the W from the RW->W fence,
> > >> similarly it matches the R from the R->RW fence, yielding an:
> > >>
> > >> 	RW->  W
> > >> 	    RmW
> > >> 	    R  ->RW
> > >>
> > >> ordering. It's the stores S and R that can be re-ordered, but not the
> > >> sections themselves (same on PowerPC and many others).
> > >>
> > >> Clarification from a RISC-V enabled person would be appreciated.
> > 
> > To first order, RISC-V's memory model is very similar to ARMv8's.  It
> > is "other-multi-copy-atomic", unlike Power, and respects dependencies.
> > It also has AMOs and LR/SC with optional RCsc acquire or release
> > semantics.  There's no need to worry about RISC-V somehow pushing the
> > boundaries of weak memory ordering in new ways.
> > 
> > The tricky part is that unlike ARMv8, RISC-V doesn't have load-acquire
> > or store-release opcodes at all.  Only AMOs and LR/SC have acquire or
> > release options.  That means that while certain operations like swap
> > can be implemented with native RCsc semantics, others like store-release
> > have to fall back on fences and plain writes.
> > 
> > That's where the complexity came up last time this was discussed, at
> > least as it relates to RISC-V: how to make sure the combination of RCsc
> > atomics and plain operations+fences gives the semantics everyone is
> > asking for here.  And to be clear there, I'm not asking for LKMM to
> > weaken anything about critical section ordering just for RISC-V's sake.
> > TSO/RCsc ordering between critical sections is a perfectly reasonable
> > model in my opinion.  I just want to make sure RISC-V gets it right
> > given whatever the decision is.
> > 
> > >>> then I think it's entirely reasonable to
> > >>>
> > >>>         spin_unlock(&r);
> > >>>         spin_lock(&s);
> > >>>
> > >>> cannot be reordered.
> > >>
> > >> I'm obviously completely in favour of that :-)
> > > 
> > > I don't think we should require the accesses to the actual lockwords to
> > > be ordered here, as it becomes pretty onerous for relaxed LL/SC
> > > architectures where you'd end up with an extra barrier either after the
> > > unlock() or before the lock() operation. However, I remain absolutely in
> > > favour of strengthening the ordering of the _critical sections_ guarded by
> > > the locks to be RCsc.
> > 
> > I agree with Will here.  If the AMOSWAP above is actually implemented with
> > a RISC-V AMO, then the two critical sections will be separated as if RW,RW,
> > as Peter described.  If instead it's implemented using LR/SC, then RISC-V
> 
> Just out of curiosity, in the following code, can the store S and load L
> be reordered?
> 
> 	WRITE_ONCE(x, 1); // store S
> 	FENCE RW, W
>  	WRITE_ONCE(s.lock, 0); // unlock(s)
>  	AMOSWAP %0, 1, s.lock  // lock(s)
> 	FENCE R, RW
> 	r1 = READ_ONCE(y); // load L
> 
> I think they can, because neither "FENCE RW, W" nor "FENCE R, RW" orders
> them. Note that the reordering is allowed in LKMM, because unlock-lock
> only needs to be as strong as RCtso.
> 
> Moreover, how about the outcome of the following case:
> 
> 	{ 
> 	r1, r2 are registers (variables) on each CPU, X, Y are memory
> 	locations, and initialized as 0
> 	}
> 
> 	CPU 0
> 	=====
> 	AMOSWAP r1, 1, X
> 	FENCE R, RW
> 	r2 = READ_ONCE(Y);
> 
> 	CPU 1
> 	=====
> 	WRITE_ONCE(Y, 1);
> 	FENCE RW, RW
> 	r2 = READ_ONCE(X);
> 
> can we observe the result where r2 on CPU0 is 0 while r2 on CPU1 is 1?
> 

As Andrea reminded me, what I meant to ask here is:

can we observe the result where r2 on CPU0 is 0 while r2 on CPU1 is 0?
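
To make that outcome concrete, here is a minimal sketch in plain C. It
is not part of the original test, and the struct and function names are
made up for illustration. It enumerates the six sequentially consistent
interleavings of the two CPUs, treating the AMOSWAP as one indivisible
step and both fences as no-ops, and reports whether a given pair of r2
values can occur. Under that assumption the 0/0 pair never occurs, so
seeing it on hardware would require some access to be reordered against
program order, for example CPU0's load of Y taking effect before the
store half of its AMOSWAP.

```c
#include <assert.h>
#include <stdbool.h>

/* Final value of r2 on each CPU after one run (hypothetical helper type). */
struct sb_outcome {
	int cpu0_r2;
	int cpu1_r2;
};

/*
 * Execute one interleaving under sequential consistency.
 * Bit i of schedule set: step i runs CPU0's next operation;
 * bit clear: step i runs CPU1's next operation.
 * The fences are no-ops in this model, so they are not represented.
 */
static struct sb_outcome run_interleaving(unsigned int schedule)
{
	int X = 0, Y = 0;
	int pc0 = 0, pc1 = 0;
	struct sb_outcome o = { -1, -1 };

	for (int step = 0; step < 4; step++) {
		if (schedule & (1u << step)) {
			if (pc0++ == 0)
				X = 1;		/* AMOSWAP r1, 1, X (r1 is unused) */
			else
				o.cpu0_r2 = Y;	/* r2 = READ_ONCE(Y) */
		} else {
			if (pc1++ == 0)
				Y = 1;		/* WRITE_ONCE(Y, 1) */
			else
				o.cpu1_r2 = X;	/* r2 = READ_ONCE(X) */
		}
	}
	/* Each CPU must have executed exactly its two operations. */
	assert(pc0 == 2 && pc1 == 2);
	return o;
}

/* Number of set bits in the low four bits of v. */
static int count_bits4(unsigned int v)
{
	int n = 0;

	for (int i = 0; i < 4; i++)
		n += (v >> i) & 1u;
	return n;
}

/* True if some SC interleaving ends with the given r2 values. */
bool sc_allows(int cpu0_r2, int cpu1_r2)
{
	for (unsigned int s = 0; s < 16; s++) {
		if (count_bits4(s) != 2)
			continue;	/* not a valid two-ops-per-CPU schedule */
		struct sb_outcome o = run_interleaving(s);
		if (o.cpu0_r2 == cpu0_r2 && o.cpu1_r2 == cpu1_r2)
			return true;
	}
	return false;
}
```

Calling sc_allows(0, 0) returns false, while the other three
combinations of 0 and 1 are reachable. This says nothing about what
RVWMO or LKMM permit; it only shows that the 0/0 outcome is the one
that needs a non-SC execution.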

Regards,
Boqun

> Regards,
> Boqun
> 
> > gives only TSO (R->R, R->W, W->W), because the two pieces of the AMO are
> > split, and that breaks the chain.  Getting full RW->RW between the critical
> > sections would therefore require an extra fence.  Also, the accesses to the
> > lockwords themselves would not be ordered without an extra fence.
> > 
> > > Last time this came up, I think the RISC-V folks were generally happy to
> > > implement whatever was necessary for Linux [1]. The thing that was stopping
> > > us was Power (see CONFIG_ARCH_WEAK_RELEASE_ACQUIRE), wasn't it? I think
> > > Michael saw quite a bit of variety in the impact on benchmarks [2] across
> > > different machines. So the question is whether newer Power machines are less
> > > affected to the degree that we could consider making this change again.
> > 
> > Yes, as I said above, RISC-V will implement what is needed to make this work.
> > 
> > Dan
> > 
> > > Will
> > > 
> > > [1] https://lore.kernel.org/lkml/11b27d32-4a8a-3f84-0f25-723095ef1076@nvidia.com/
> > > [2] https://lore.kernel.org/lkml/87tvp3xonl.fsf@concordia.ellerman.id.au/
