Re: [PATCH 2/6] rcu: Remove superfluous full memory barrier upon first EQS snapshot

From: Andrea Parri <parri.andrea@gmail.com>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: Valentin Schneider <vschneid@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	Boqun Feng <boqun.feng@gmail.com>,
	Joel Fernandes <joel@joelfernandes.org>,
	Neeraj Upadhyay <neeraj.upadhyay@amd.com>,
	Uladzislau Rezki <urezki@gmail.com>,
	Zqiang <qiang.zhang1211@gmail.com>, rcu <rcu@vger.kernel.org>
Subject: Re: [PATCH 2/6] rcu: Remove superfluous full memory barrier upon first EQS snapshot
Date: Fri, 17 May 2024 18:27:12 +0200
Message-ID: <ZkeFYJ1saaMWPUON@andrea>
In-Reply-To: <ZkdCG28qNha2vUSo@localhost.localdomain>

> Z6.0+pooncelock+poonceLock+pombonce.litmus shows an example of
> how full ordering is subtly incomplete without smp_mb__after_spinlock().
> 
> But still, smp_mb__after_unlock_lock() is supposed to be weaker than
> smp_mb__after_spinlock() and yet I'm failing to produce a litmus test
> that is successful with the latter and fails with the former.

smp_mb__after_unlock_lock() is a nop without a matching unlock-lock;
smp_mb__after_spinlock() not quite...

C after_spinlock__vs__after_unlock_lock

{ }

P0(int *x, int *y, spinlock_t *s)
{
	int r0;

	WRITE_ONCE(*x, 1);
	spin_lock(s);
	smp_mb__after_spinlock();
	r0 = READ_ONCE(*y);
	spin_unlock(s);
}

P1(int *x, int *y)
{
	int r1;

	WRITE_ONCE(*y, 1);
	smp_mb();
	r1 = READ_ONCE(*x);
}

exists (0:r0=0 /\ 1:r1=0)
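
For contrast, here is a sketch of the same test with
smp_mb__after_unlock_lock() in place of smp_mb__after_spinlock().  The
test name is made up for illustration and I have not run this variant
through herd7.  Since P0 has no unlock preceding its lock, the barrier
should provide no ordering here, so I'd expect the store-buffering
outcome to be allowed, unlike in the test above.

```litmus
C after_unlock_lock__without_unlock_lock

(*
 * Hypothetical variant of the test above: only the barrier in P0 differs.
 * No unlock precedes P0's spin_lock(), so smp_mb__after_unlock_lock()
 * has no unlock-lock pair to act upon.
 *)

{ }

P0(int *x, int *y, spinlock_t *s)
{
	int r0;

	WRITE_ONCE(*x, 1);
	spin_lock(s);
	smp_mb__after_unlock_lock();
	r0 = READ_ONCE(*y);
	spin_unlock(s);
}

P1(int *x, int *y)
{
	int r1;

	WRITE_ONCE(*y, 1);
	smp_mb();
	r1 = READ_ONCE(*x);
}

exists (0:r0=0 /\ 1:r1=0)
```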


> For example, and assuming smp_mb__after_unlock_lock() is expected to be
> chained across locking, here is a litmus test inspired by
> Z6.0+pooncelock+poonceLock+pombonce.litmus that never observes the condition
> even though I would expect it should, as opposed to using
> smp_mb__after_spinlock():
> 
> C smp_mb__after_unlock_lock
> 
> {}
> 
> P0(int *w, int *x, spinlock_t *mylock)
> {
> 	spin_lock(mylock);
> 	WRITE_ONCE(*w, 1);
> 	WRITE_ONCE(*x, 1);
> 	spin_unlock(mylock);
> }
> 
> P1(int *x, int *y, spinlock_t *mylock)
> {
> 	int r0;
> 
> 	spin_lock(mylock);
> 	smp_mb__after_unlock_lock();
> 	r0 = READ_ONCE(*x);
> 	WRITE_ONCE(*y, 1);
> 	spin_unlock(mylock);
> }
> 
> P2(int *y, int *z, spinlock_t *mylock)
> {
> 	int r0;
> 
> 	spin_lock(mylock);
> 	r0 = READ_ONCE(*y);
> 	WRITE_ONCE(*z, 1);
> 	spin_unlock(mylock);
> }
> 
> P3(int *w, int *z)
> {
> 	int r1;
> 
> 	WRITE_ONCE(*z, 2);
> 	smp_mb();
> 	r1 = READ_ONCE(*w);
> }
> 
> exists (1:r0=1 /\ 2:r0=1 /\ z=2 /\ 3:r1=0)

Here's an informal argument to explain this outcome.  It is not the only
one possible according to the LKMM, but it is the first that came to my
mind.  And it is longer than I wished.  TL;DR:  Full barriers are strong,
really strong.

Note that full memory barriers share the following "strong-fence property":

  A ->full-barrier B

implies

  (SFP) A propagates (aka, is visible) to _every_ CPU before B executes

(cf. tools/memory-model/Documentation/explanation.txt for details about
the concepts of "propagation" and "execution").

For example, in the snippet above,

  P0:WRITE_ONCE(*w, 1) ->full-barrier P1:spin_unlock(mylock)

since

  P0:spin_unlock(mylock) ->reads-from P1:spin_lock(mylock) ->program-order P1:smp_mb__after_unlock_lock()

acts as a full memory barrier.  (1:r0=1 and 2:r0=1 together determine
the so-called critical-sections' order (CSO).)
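
As a smaller illustration of a matching unlock-lock, here is a sketch
where the unlock and the lock sit on the same CPU, linked by program
order rather than by reads-from.  Again the test name is made up and I
have not run it through herd7.  If I read the model right,
smp_mb__after_unlock_lock() then orders P0's earlier store against its
later load like a full barrier, so the store-buffering outcome should be
forbidden.

```litmus
C after_unlock_lock__same_cpu_unlock_lock

(*
 * Hypothetical sketch: P0's first critical section ends with an unlock
 * that is followed, in program order, by a lock of the same spinlock
 * and then by smp_mb__after_unlock_lock().
 *)

{ }

P0(int *x, int *y, spinlock_t *s)
{
	int r0;

	spin_lock(s);
	WRITE_ONCE(*x, 1);
	spin_unlock(s);
	spin_lock(s);
	smp_mb__after_unlock_lock();
	r0 = READ_ONCE(*y);
	spin_unlock(s);
}

P1(int *x, int *y)
{
	int r1;

	WRITE_ONCE(*y, 1);
	smp_mb();
	r1 = READ_ONCE(*x);
}

exists (0:r0=0 /\ 1:r1=0)
```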

Now suppose, by contradiction, that the exists clause of your test is
satisfied.  Then:

  1) P0:WRITE_ONCE(*w, 1) propagates to P3 before P1:spin_unlock(mylock) executes   (SFP)

  2) P1:spin_unlock(mylock) executes before P2:spin_lock(mylock) executes   (CSO)

  3) P2:spin_lock(mylock) executes before P2:WRITE_ONCE(*z, 1) executes  (P2:spin_lock() is an ACQUIRE op)

  4) P2:WRITE_ONCE(*z, 1) executes before P2:WRITE_ONCE(*z, 1) propagates to P3  (intuitively, a store is visible to the local CPU before being visible to a remote CPU)

  5) P2:WRITE_ONCE(*z, 1) propagates to P3 before P3:WRITE_ONCE(*z, 2) executes   (z=2)

  6) P3:WRITE_ONCE(*z, 2) executes before P3:WRITE_ONCE(*z, 2) propagates to P0    (a store is visible to the local CPU before being visible to a remote CPU)

  7) P3:WRITE_ONCE(*z, 2) propagates to P0 before P3:READ_ONCE(*w) executes   (SFP)

  8) P3:READ_ONCE(*w) executes before P0:WRITE_ONCE(*w, 1) propagates to P3   (3:r1=0)

Put together, (1-8) give:

  P0:WRITE_ONCE(*w, 1) propagates to P3 before P0:WRITE_ONCE(*w, 1) propagates to P3

which is absurd.

  Andrea
