From: "Paul E. McKenney" <paulmck@linux.ibm.com>
To: Akira Yokosawa <akiyks@gmail.com>
Cc: Andrea Parri <andrea.parri@amarulasolutions.com>,
Alan Stern <stern@rowland.harvard.edu>,
Boqun Feng <boqun.feng@gmail.com>,
Daniel Lustig <dlustig@nvidia.com>,
David Howells <dhowells@redhat.com>,
Jade Alglave <j.alglave@ucl.ac.uk>,
Luc Maranget <luc.maranget@inria.fr>,
Nicholas Piggin <npiggin@gmail.com>,
Peter Zijlstra <peterz@infradead.org>,
Will Deacon <will.deacon@arm.com>,
Daniel Kroening <kroening@cs.ox.ac.uk>,
Kernel development list <linux-kernel@vger.kernel.org>
Subject: Re: Adding plain accesses and detecting data races in the LKMM
Date: Fri, 19 Apr 2019 11:06:41 -0700 [thread overview]
Message-ID: <20190419180641.GD14111@linux.ibm.com> (raw)
In-Reply-To: <2827195a-f203-b9cd-444d-cf6425cef06f@gmail.com>
On Sat, Apr 20, 2019 at 12:06:58AM +0900, Akira Yokosawa wrote:
> Hi Paul,
>
> Please find inline comments below.
>
> On Fri, 19 Apr 2019 05:47:20 -0700, Paul E. McKenney wrote:
> > On Fri, Apr 19, 2019 at 02:53:02AM +0200, Andrea Parri wrote:
> >>> Are you saying that on x86, atomic_inc() acts as a full memory barrier
> >>> but not as a compiler barrier, and vice versa for
> >>> smp_mb__after_atomic()? Or that neither atomic_inc() nor
> >>> smp_mb__after_atomic() implements a full memory barrier?
> >>
> >> I'd say the former; AFAICT, these boil down to:
> >>
> >> https://elixir.bootlin.com/linux/v5.1-rc5/source/arch/x86/include/asm/atomic.h#L95
> >> https://elixir.bootlin.com/linux/v5.1-rc5/source/arch/x86/include/asm/barrier.h#L84
> >
> > OK, how about the following?
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > commit 19d166dadc4e1bba4b248fb46d32ca4f2d10896b
> > Author: Paul E. McKenney <paulmck@linux.ibm.com>
> > Date: Fri Apr 19 05:20:30 2019 -0700
> >
> > tools/memory-model: Make smp_mb__{before,after}_atomic() match x86
> >
> > Read-modify-write atomic operations that do not return values need not
> > provide any ordering guarantees, and this means that both the compiler
> > and the CPU are free to reorder accesses across things like atomic_inc()
> > and atomic_dec(). The stronger systems such as x86 allow the compiler
> > to do the reordering, but prevent the CPU from so doing, and these
> > systems implement smp_mb__{before,after}_atomic() as compiler barriers.
> > The weaker systems such as Power allow both the compiler and the CPU
> > to reorder accesses across things like atomic_inc() and atomic_dec(),
> > and implement smp_mb__{before,after}_atomic() as full memory barriers.
> >
> > This means that smp_mb__before_atomic() only orders the atomic operation
> > itself with accesses preceding the smp_mb__before_atomic(), and does
> > not necessarily provide any ordering whatsoever against accesses
> > folowing the atomic operation. Similarly, smp_mb__after_atomic()
>
> s/folowing/following/
Good eyes, fixed!
> > only orders the atomic operation itself with accesses following the
> > smp_mb__after_atomic(), and does not necessarily provide any ordering
> > whatsoever against accesses preceding the atomic operation. Full ordering
> > therefore requires both an smp_mb__before_atomic() before the atomic
> > operation and an smp_mb__after_atomic() after the atomic operation.
> >
> > Therefore, linux-kernel.cat's current model of Before-atomic
> > and After-atomic is too strong, as it guarantees ordering of
> > accesses on the other side of the atomic operation from the
> > smp_mb__{before,after}_atomic(). This commit therefore weakens
> > the guarantee to match the semantics called out above.
> >
> > Reported-by: Andrea Parri <andrea.parri@amarulasolutions.com>
> > Suggested-by: Alan Stern <stern@rowland.harvard.edu>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
> >
> > diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
> > index 169d938c0b53..e5b97c3e8e39 100644
> > --- a/Documentation/memory-barriers.txt
> > +++ b/Documentation/memory-barriers.txt
> > @@ -1888,7 +1888,37 @@ There are some more advanced barrier functions:
> > atomic_dec(&obj->ref_count);
> >
> > This makes sure that the death mark on the object is perceived to be set
> > - *before* the reference counter is decremented.
> > + *before* the reference counter is decremented. However, please note
> > + that smp_mb__before_atomic()'s ordering guarantee does not necessarily
> > + extend beyond the atomic operation. For example:
> > +
> > + obj->dead = 1;
> > + smp_mb__before_atomic();
> > + atomic_dec(&obj->ref_count);
> > + r1 = a;
> > +
> > + Here the store to obj->dead is not guaranteed to be ordered with
> > + with the load from a. This reordering can happen on x86 as follows:
>
> s/with//
Fixed fixed. ;-)
> And I beg you to avoid using the single letter variable "a".
> It's confusing.
Good point! I changed it to "x".
> > + (1) The compiler can reorder the load from a to precede the
> > + atomic_dec(), (2) Because x86 smp_mb__before_atomic() is only a
> > + compiler barrier, the CPU can reorder the preceding store to
> > + obj->dead with the later load from a.
> > +
> > + This could be avoided by using READ_ONCE(), which would prevent the
> > + compiler from reordering due to both atomic_dec() and READ_ONCE()
> > + being volatile accesses, and is usually preferable for loads from
> > + shared variables. However, weakly ordered CPUs would still be
> > + free to reorder the atomic_dec() with the load from a, so a more
> > + readable option is to also use smp_mb__after_atomic() as follows:
>
> The point here is not just "readability", but also the portability of the
> code, isn't it?
As Andrea noted, in this particular case, the guarantee that the
store to obj->dead precedes the load from x is portable. Either the
smp_mb__before_atomic() or the atomic_dec() must provide the ordering.
However, you are right that there is some non-portability. But this
non-portability involves the order of the atomic_dec() and the store to x.
So what I did was ...
> Thanks, Akira
>
> > +
> > + WRITE_ONCE(obj->dead, 1);
> > + smp_mb__before_atomic();
> > + atomic_dec(&obj->ref_count);
> > + smp_mb__after_atomic();
> > + r1 = READ_ONCE(a);
> > +
> > + This orders all three accesses against each other, and also makes
> > + the intent quite clear.
... change the above paragraph to read as follows:
In addition, the example without the smp_mb__after_atomic() does
not necessarily order the atomic_dec() with the load from x.
In contrast, the example with both smp_mb__before_atomic() and
smp_mb__after_atomic() orders all three accesses against each other,
and also makes the intent quite clear.
Does that help?
Thanx, Paul
> > See Documentation/atomic_{t,bitops}.txt for more information.
> >
> > diff --git a/tools/memory-model/linux-kernel.cat b/tools/memory-model/linux-kernel.cat
> > index 8dcb37835b61..b6866f93abb8 100644
> > --- a/tools/memory-model/linux-kernel.cat
> > +++ b/tools/memory-model/linux-kernel.cat
> > @@ -28,8 +28,8 @@ include "lock.cat"
> > let rmb = [R \ Noreturn] ; fencerel(Rmb) ; [R \ Noreturn]
> > let wmb = [W] ; fencerel(Wmb) ; [W]
> > let mb = ([M] ; fencerel(Mb) ; [M]) |
> > - ([M] ; fencerel(Before-atomic) ; [RMW] ; po? ; [M]) |
> > - ([M] ; po? ; [RMW] ; fencerel(After-atomic) ; [M]) |
> > + ([M] ; fencerel(Before-atomic) ; [RMW]) |
> > + ([RMW] ; fencerel(After-atomic) ; [M]) |
> > ([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M]) |
> > ([M] ; po ; [UL] ; (co | po) ; [LKW] ;
> > fencerel(After-unlock-lock) ; [M])
> >
>
next prev parent reply other threads:[~2019-04-19 19:10 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-03-19 19:38 Adding plain accesses and detecting data races in the LKMM Alan Stern
2019-04-02 14:42 ` Andrea Parri
2019-04-02 18:06 ` Alan Stern
2019-04-06 0:49 ` Andrea Parri
2019-04-06 16:03 ` Alan Stern
2019-04-08 5:51 ` Andrea Parri
2019-04-08 14:18 ` Alan Stern
2019-04-09 1:36 ` Andrea Parri
2019-04-09 15:01 ` Paul E. McKenney
2019-04-13 21:39 ` Andrea Parri
2019-04-15 13:35 ` Paul E. McKenney
2019-04-15 13:50 ` Alan Stern
2019-04-15 13:53 ` Paul E. McKenney
2019-04-18 12:54 ` Andrea Parri
2019-04-18 17:44 ` Alan Stern
2019-04-18 18:39 ` Paul E. McKenney
2019-04-18 20:19 ` Alan Stern
2019-04-19 0:53 ` Andrea Parri
2019-04-19 12:47 ` Paul E. McKenney
2019-04-19 14:34 ` Alan Stern
2019-04-19 17:17 ` Paul E. McKenney
2019-04-19 15:06 ` Akira Yokosawa
2019-04-19 16:37 ` Andrea Parri
2019-04-19 18:06 ` Paul E. McKenney [this message]
2019-04-20 14:50 ` Akira Yokosawa
2019-04-21 19:38 ` Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190419180641.GD14111@linux.ibm.com \
--to=paulmck@linux.ibm.com \
--cc=akiyks@gmail.com \
--cc=andrea.parri@amarulasolutions.com \
--cc=boqun.feng@gmail.com \
--cc=dhowells@redhat.com \
--cc=dlustig@nvidia.com \
--cc=j.alglave@ucl.ac.uk \
--cc=kroening@cs.ox.ac.uk \
--cc=linux-kernel@vger.kernel.org \
--cc=luc.maranget@inria.fr \
--cc=npiggin@gmail.com \
--cc=peterz@infradead.org \
--cc=stern@rowland.harvard.edu \
--cc=will.deacon@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.