From: Alan Stern <stern@rowland.harvard.edu>
To: Jonas Oberhauser <jonas.oberhauser@huawei.com>
Cc: "paulmck@kernel.org" <paulmck@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
"parri.andrea" <parri.andrea@gmail.com>, will <will@kernel.org>,
"boqun.feng" <boqun.feng@gmail.com>, npiggin <npiggin@gmail.com>,
dhowells <dhowells@redhat.com>, "j.alglave" <j.alglave@ucl.ac.uk>,
"luc.maranget" <luc.maranget@inria.fr>, akiyks <akiyks@gmail.com>,
dlustig <dlustig@nvidia.com>, joel <joel@joelfernandes.org>,
urezki <urezki@gmail.com>,
quic_neeraju <quic_neeraju@quicinc.com>,
frederic <frederic@kernel.org>,
Kernel development list <linux-kernel@vger.kernel.org>
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)
Date: Thu, 19 Jan 2023 11:41:01 -0500
Message-ID: <Y8lynRI35cFeuqb5@rowland.harvard.edu>
In-Reply-To: <a2b89243-ddd7-c114-541f-0aff7806d217@huaweicloud.com>
On Thu, Jan 19, 2023 at 12:22:50PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/19/2023 3:28 AM, Alan Stern wrote:
> > > This is a permanent error; I've given up. Sorry it didn't
> > > work out.
>
> [It seems the e-mail still reached me through the mailing list]
[For everyone else, Jonas is referring to the fact that the last two
emails I sent to his huaweicloud.com address could not be delivered, so
I copied them off-list to his huawei.com address.]
> > > I consider that a hack though and don't like it.
> > It _is_ a bit of a hack, but not a huge one. srcu_read_lock() really
> > is a lot like a load, in that it returns a value obtained by reading
> > something from memory (along with some other operations, though, so it
> > isn't a simple straightforward read -- perhaps more like an
> > atomic_inc_return_relaxed).
> The issue I have with this is that it might create accidental ordering. How
> does it behave when you throw fences in the mix?
I think this isn't going to be a problem. Certainly any real
implementation of srcu_read_lock() is going to involve some actual load
operations, so any unintentional ordering caused by fences will also
apply to real executions. Likewise for srcu_read_unlock() and store
operations.
> It really does not work like an increment at all, I think srcu_read_lock()
> only reads the currently active index, but the index is changed by
> srcu_sync. But even that is an implementation detail of sorts. I think the
> best way to think of it would be for srcu_read_lock to just return an
> arbitrary value.
I think I'll stick to it always returning the initial value. Paul said
that would be okay.
> The user can not rely on any kind of "accidental" rfe edges between these
> events for ordering.
>
> Perhaps if you flag any use of these values in address or control
> dependencies, as well as any event which depends on more than one of these
> values, you could prove that it's impossible to constrain the behavior
> through these rfe (and/or co) edges because you can anyways never inspect the
> value returned by the operation (except to pass it into srcu_unlock).
>
> Or you might be able to explicitly eliminate the events everywhere, just
> like you have done for carry-dep in your patch.
On second thought, I'll make it impossible to read from the
srcu_read_unlock events by removing them from the rf (and rfi/rfe)
relation. Then it won't be necessary to change carry-dep or anything
else.
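The effect of that erasure can be sketched on a toy execution. This is
only an illustration in Python, not herd or cat syntax; the event names
and the erase_unlock_rf() helper are made up for the example.

```python
# Toy model: events are (name, kind) pairs; rf is a set of (write, read) edges.
# All event names and kinds here are hypothetical, chosen only for illustration.
UNLOCK0 = ("unlock0", "srcu-unlock")  # modeled as a store
LOCK1 = ("lock1", "srcu-lock")        # modeled as a load
STORE_X = ("Wx", "store")
LOAD_X = ("Rx", "load")

rf = {
    (UNLOCK0, LOCK1),    # the "accidental" edge: a lock reading from an unlock
    (STORE_X, LOAD_X),   # an ordinary rf edge, which must be left alone
}

def erase_unlock_rf(rf_edges):
    """Drop every rf edge whose source is an srcu-unlock event."""
    return {(w, r) for (w, r) in rf_edges if w[1] != "srcu-unlock"}

rf_erased = erase_unlock_rf(rf)
```

Since rfi and rfe are both restrictions of rf, filtering rf this way
removes the corresponding rfi and rfe edges as well.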
> But it looks so brittle.
Maybe not so much after this change?
> > srcu_read_unlock() is somewhat less like a store, but it does have one
> > notable similarity: It takes an input value and therefore can be the
> > target of a data dependency. The biggest difference is that an
> > srcu_read_unlock() can't really be read-from. It would be nice if herd
> > had an event type that behaved this way.
> Or if you could declare your own : )
> Obviously, you will have accidental rf edges going from srcu_read_unlock()
> to srcu_read_lock() if you model them this way.
Not when those edges are erased.
> > Also, herd doesn't have any way of saying that the value passed to a
> > store is an unmodified copy of the value obtained by a load. In our
> > case that doesn't matter much -- nobody should be writing litmus tests
> > in which the value returned by srcu_read_lock() is incremented and then
> > decremented again before being passed to srcu_read_unlock()!
>
> It would be nice if herd allowed declaring structs that can be used for such
> purposes.
> (anyways, I am not sure if Luc is still following everything in this deeply
> nested thread that started somewhere completely different. But maybe if Paul
> or you open a feature request, let me know so that I can give my 2ct)
I thought you were against adding into herd features that were specific
to the Linux kernel?
> > > Note that if your ordering relies on actually using gp twice in a row, then
> > > these must come from strong-fence, but you should be able to just take the
> > > shortcut by merging them into a single gp.
> > > po;rcu-gp;po;rcu-gp;po <= gp <= strong-fence <= hb & strong-order
> > I don't know what you mean by this. The example above does rely on
> > having two synchronize_rcu() calls; with only one it would be allowed.
>
> I mean that if you have a cycle that is formed by having two adjacent actual
> `gp` edges, like .... ; gp;gp ; .... with gp= po ; rcu-gp ; po?,
> (not like your example, where the cycle uses two *rcu*-gp but no gp edges)
Don't forget that I had in mind a version of the model where rcu-gp did
not exist.
> and assume we define gp' = po ; rcu-gp ; po and hb' and pb' to use gp'
> instead of gp,
> then there are two cases for how that cycle came to be, either 1) as
> ... ; hb;hb ; ....
> but then you can refactor as
> ... ; po;rcu-gp;po;rcu-gp;po ; ...
> ... ; po;rcu-gp; po ; ...
> ... ; gp' ; ...
> ... ; hb' ; ...
> which again creates a cycle, or 2) as
> ... ; pb ; hb ; ...
> coming from
> ... ; prop ; gp ; gp ; ....
> which you can similarly refactor as
> ... ; prop ; po;rcu-gp;po ; ....
> ... ; prop ; gp' ; ....
> and again get a cycle with
> ... ; pb' ; ....
> Therefore, gp = po;rcu-gp;po should be equivalent.
The point is that in P1, we have Write ->(gp;gp) Read, but we do not
have Write ->(gp';gp') Read. Only Write ->gp' Read. So if you're using
gp' instead of gp, you'll analyze the litmus test as if it had only one
grace period but two critical sections, getting a wrong answer.
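The difference can be checked mechanically on a small model. The sketch
below assumes P1 is just a write, two back-to-back synchronize_rcu()
calls, and a read (the event names W, S1, S2, R are placeholders), takes
rcu-gp to be the identity on the two grace-period events, and represents
relations as Python sets of pairs.

```python
# Hypothetical events of P1 in program order: a write, two back-to-back
# grace periods, and a read. This shape is an assumption for illustration.
EVENTS = ["W", "S1", "S2", "R"]
GRACE_PERIODS = {"S1", "S2"}

# po relates each event to every later event on the same CPU.
po = {(a, b) for i, a in enumerate(EVENTS) for b in EVENTS[i + 1:]}
identity = {(e, e) for e in EVENTS}
rcu_gp = {(s, s) for s in GRACE_PERIODS}  # identity on grace-period events

def seq(r1, r2):
    """Relational composition r1 ; r2."""
    return {(a, d) for (a, b) in r1 for (c, d) in r2 if b == c}

gp = seq(seq(po, rcu_gp), po | identity)  # po ; rcu-gp ; po?
gp_prime = seq(seq(po, rcu_gp), po)       # po ; rcu-gp ; po

print(("W", "R") in seq(gp, gp))              # True
print(("W", "R") in seq(gp_prime, gp_prime))  # False
print(("W", "R") in gp_prime)                 # True
```

With gp, the first edge can stop at S1 itself (thanks to the po?) and
the second edge picks up S2. With gp', every edge must end strictly
after its grace period, so nothing is left for a second gp' to start
from.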
Here's a totally different way of thinking about these things, which may
prove enlightening. These thoughts originally occurred to me years ago,
and I had forgotten about them until last night.
If G is a grace period, let's write t1(G) for the time when G starts and
t2(G) for the time when G ends.
Likewise, if C is a read-side critical section, let's write t2(C) for
the time when C starts (or the lock executes if you prefer) and t1(C)
for the time when C ends (or the unlock executes). This terminology
reflects the "backward" role that critical sections play in the memory
model.
Now we can characterize rcu-order and rcu-link in operational terms.
Let A and B each be either a grace period or a read-side critical
section. Then:
A ->rcu-order B means t1(A) < t2(B), and
A ->rcu-link B means t2(A) <= t1(B).
(Of course, we always have t1(G) < t2(G) for any grace period G, and
t2(C) < t1(C) for any read-side critical section C.)
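As a rough illustration of these definitions, here is a small Python
sketch. It is not part of the memory model: the Span class, the
wall-clock timestamps, and the function names are all made up for the
example, and it simply encodes the two inequalities above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Span:
    """A grace period or read-side critical section with start/end times."""
    start: float
    end: float
    is_grace_period: bool

def t1(x: Span) -> float:
    # Grace period: t1 is the start. Critical section: t1 is the end.
    return x.start if x.is_grace_period else x.end

def t2(x: Span) -> float:
    # Grace period: t2 is the end. Critical section: t2 is the start.
    return x.end if x.is_grace_period else x.start

def rcu_order(a: Span, b: Span) -> bool:
    return t1(a) < t2(b)

def rcu_link(a: Span, b: Span) -> bool:
    return t2(a) <= t1(b)

# Hypothetical timings: C starts before G starts and ends before G ends.
G = Span(start=3.0, end=8.0, is_grace_period=True)
C = Span(start=1.0, end=5.0, is_grace_period=False)

print(rcu_link(C, G))   # compares C's start (1.0) with G's start (3.0)
print(rcu_order(C, G))  # compares C's end (5.0) with G's end (8.0)
```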
This explains quite a lot. For example, we can justify including
C ->rcu-link G
into rcu-order as follows. From C ->rcu-link G we get that t2(C) <=
t1(G), in other words, C starts when or before G starts. Then the
Fundamental Law of RCU says that C must end before G ends, since
otherwise C would span all of G. Thus t1(C) < t2(G), which is C
->rcu-order G.
The case of G ->rcu-link C is similar.
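Spelled out in the same notation, the similar case presumably runs as
follows. This is a reconstruction of the argument rather than text from
the thread, and the LaTeX assumes the amsmath package.

```latex
\begin{align*}
G \xrightarrow{\text{rcu-link}} C
  &\;\Longrightarrow\; t_2(G) \le t_1(C)
  && \text{($G$ ends when or before $C$ ends)} \\
  &\;\Longrightarrow\; t_1(G) < t_2(C)
  && \text{(Fundamental Law: otherwise $C$ would span all of $G$)} \\
  &\;\Longrightarrow\; G \xrightarrow{\text{rcu-order}} C .
\end{align*}
```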
This also explains why rcu-link can be extended by appending (rcu-order
; rcu-link)*. From X ->rcu-order Y ->rcu-link Z we get that t1(X) <
t2(Y) <= t1(Z) and thus t1(X) <= t1(Z). So if
A ->rcu-link B ->(rcu-order ; rcu-link)* C
then t2(A) <= t1(B) <= t1(C), which justifies A ->rcu-link C.
The same sort of argument shows that rcu-order should be extendable by
appending (rcu-link ; rcu-order)* -- but not (rcu-order ; rcu-link)*.
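For one round of the extension, the inequalities chain up like this (a
worked instance using only the two definitions above; the LaTeX assumes
the amsmath package):

```latex
\begin{align*}
A \xrightarrow{\text{rcu-order}} B
  \xrightarrow{\text{rcu-link}} C
  \xrightarrow{\text{rcu-order}} D
  &\;\Longrightarrow\; t_1(A) < t_2(B) \le t_1(C) < t_2(D) \\
  &\;\Longrightarrow\; t_1(A) < t_2(D) \\
  &\;\Longrightarrow\; A \xrightarrow{\text{rcu-order}} D .
\end{align*}
```

By contrast, two rcu-order steps in a row only give t1(A) < t2(B) and
t1(B) < t2(C), and nothing in the definitions connects t2(B) to t1(B)
in the direction needed to chain them.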
This also justifies why a lone gp belongs in rcu-order: G ->rcu-order G
holds because t1(G) < t2(G). But for critical sections we have t2(C) <
t1(C) and so C ->rcu-order C does not hold.
Assuming ordinary memory accesses occur in a single instant, you see why
it makes sense to consider (po ; rcu-order ; po) an ordering. But when
you're comparing grace periods or critical sections to each other,
things get a little ambiguous. Should G1 be considered to come before
G2 when t1(G1) < t1(G2), when t2(G1) < t2(G2), or when t2(G1) < t1(G2)?
Springing for (po ; rcu-order ; po?) amounts to choosing the second
alternative.
Alan