From: "Paul E. McKenney" <paulmck@kernel.org>
To: Eric Biggers <ebiggers@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>,
tytso@mit.edu, linux-kernel@vger.kernel.org,
linux-crypto@vger.kernel.org, stable@vger.kernel.org,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH] random: use correct memory barriers for crng_node_pool
Date: Tue, 22 Sep 2020 11:31:00 -0700 [thread overview]
Message-ID: <20200922183100.GZ29330@paulmck-ThinkPad-P72> (raw)
In-Reply-To: <20200921235243.GA32959@sol.localdomain>
On Mon, Sep 21, 2020 at 04:52:43PM -0700, Eric Biggers wrote:
> On Mon, Sep 21, 2020 at 04:26:39PM -0700, Paul E. McKenney wrote:
> > On Tue, Sep 22, 2020 at 08:11:04AM +1000, Herbert Xu wrote:
> > > On Mon, Sep 21, 2020 at 08:27:14AM -0700, Paul E. McKenney wrote:
> > > > On Mon, Sep 21, 2020 at 06:19:39PM +1000, Herbert Xu wrote:
> > > > > On Thu, Sep 17, 2020 at 09:58:02AM -0700, Eric Biggers wrote:
> > > > > >
> > > > > > smp_load_acquire() is obviously correct, whereas READ_ONCE() is an optimization
> > > > > > that is difficult to tell whether it's correct or not. For trivial data
> > > > > > structures it's "easy" to tell. But whenever there is a->b where b is an
> > > > > > internal implementation detail of another kernel subsystem, the use of which
> > > > > > could involve accesses to global or static data (for example, spin_lock()
> > > > > > accessing lockdep stuff), a control dependency can slip in.
> > > > >
> > > > > If we're going to follow this line of reasoning, surely you should
> > > > > be converting the RCU derference first and foremost, no?
> > >
> > > ...
> > >
> > > > And to Eric's point, it is also true that when you have pointers to
> > > > static data, and when the compiler can guess this, you do need something
> > > > like smp_load_acquire(). But this is a problem only when you are (1)
> > > > using feedback-driven compiler optimization or (2) when you compare the
> > > > pointer to the address of the static data.
> > >
> > > Let me restate what I think Eric is saying. He is concerned about
> > > the case where a->b and b is some opaque object that may in turn
> > > dereference a global data structure unconnected to a. The case
> > > in question here is crng_node_pool in drivers/char/random.c which
> > > in turn contains a spin lock.
> >
> > As long as the compiler generates code that reaches that global via
> > pointer a, everything will work fine. Which it will, unless the guy
> > writing the code makes the mistake of introducing a comparison between the
> > pointer to be dereferenced and the address of the global data structure.
> >
> > So this is OK:
> >
> > p = rcu_dereference(a);
> > do_something(p->b);
> >
> > This is not OK:
> >
> > p = rcu_dereference(a);
> > if (p == &some_global_variable)
> > we_really_should_not_have_done_that_comparison();
> > do_something(p->b);
>
> If you call some function that's an internal implementation detail of some other
> kernel subsystem, how do you know it doesn't do that?
If the only globals I insert into my linked data structure are local to my
compilation unit, then the internal implementation details of some other
kernel subsystem in some other translation unit cannot do that comparison.
> Also, it's not just the p == &global_variable case. Consider:
>
> struct a { struct b *b; };
> struct b { ... };
>
> Thread 1:
>
> /* one-time initialized data shared by all instances of b */
> static struct c *c;
>
> void init_b(struct a *a)
> {
> if (!c)
> c = alloc_c();
>
> smp_store_release(&a->b, kzalloc(sizeof(struct b)));
> }
>
> Thread 2:
>
> void use_b_if_present(struct a *a)
> {
> struct b *b = READ_ONCE(a->b);
>
> if (b) {
> c->... # crashes because c still appears to be NULL
> }
> }
>
>
> So when the *first* "b" is allocated, the global data "c" is initialized. Then
> when using a "b", we expect to be able to access "c". But there's no
> data dependency from "b" to "c"; it's a control dependency only.
> So smp_load_acquire() is needed, not READ_ONCE().
>
> And it can be an internal implementation detail of "b"'s subsystem whether it
> happens to use global data "c".
Given that "c" is static, these two subsystems must be in the same
translation unit. So I don't see how this qualifies as being internal to
"b"'s subsystem.
Besides which, control dependencies should be used only by LKMM experts
at this point. Yes, we are trying to get the compiler people to give us
a way to tell the compiler about dependencies that we need to preserve,
but in the meantime, you beed to be really careful how you use them,
and you need to make sure that your external API can be used without
creating traps like the one you are driving at.
> This sort of thing is why people objected to the READ_ONCE() optimization during
> the discussion at
> https://lkml.kernel.org/linux-fsdevel/20200717044427.68747-1-ebiggers@kernel.org/T/#u.
> Most kernel developers aren't experts in the LKMM, and they want something
> that's guaranteed to be correct without having to to think really hard about it
> and make assumptions about the internal implementation details of other
> subsystems, how compilers have implemented the C standard, and so on.
And smp_load_acquire()is provided for that reason. Its name was
even based on the nomenclature used in the C standard and elsewhere.
And again, control dependencies are for LKMM experts, as they are very
tricky to get right.
But in the LKMM documentation, you are likely to find LKMM experts who
want to optimize all the way, particularly in cases like the one-time
init pattern where all the data is often local. And the best basis for
READ_ONCE() in one-time init is not a control dependency, but rather
ordering of accesses to a single variable from a single task combined
with locking, both of which are quite robust and much easier to use,
especially in comparison to control dependencies.
My goal for LKMM is not that each and every developer have a full
understanding of every nook and cranny of that model, but instead that
people can find the primitives supporting the desired point in the
performance/simplicity tradoff space. And yes, I have more writing
to do to make more progress towards that goal.
Thanx, Paul
next prev parent reply other threads:[~2020-09-22 18:31 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-16 23:30 [PATCH] random: use correct memory barriers for crng_node_pool Eric Biggers
2020-09-17 7:26 ` Herbert Xu
2020-09-17 16:58 ` Eric Biggers
2020-09-21 8:19 ` Herbert Xu
2020-09-21 15:27 ` Paul E. McKenney
2020-09-21 22:11 ` Herbert Xu
2020-09-21 23:26 ` Paul E. McKenney
2020-09-21 23:51 ` Herbert Xu
2020-09-22 18:42 ` Paul E. McKenney
2020-09-22 18:59 ` Eric Biggers
2020-09-22 20:31 ` Paul E. McKenney
2020-09-21 23:52 ` Eric Biggers
2020-09-22 18:31 ` Paul E. McKenney [this message]
2020-09-22 19:09 ` Eric Biggers
2020-09-22 20:56 ` Paul E. McKenney
2020-09-22 21:55 ` Eric Biggers
2020-09-25 0:59 ` Paul E. McKenney
2020-09-25 2:09 ` Eric Biggers
2020-09-25 3:31 ` Paul E. McKenney
2020-10-02 3:07 ` Eric Biggers
2020-10-08 18:31 ` Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200922183100.GZ29330@paulmck-ThinkPad-P72 \
--to=paulmck@kernel.org \
--cc=ebiggers@kernel.org \
--cc=herbert@gondor.apana.org.au \
--cc=linux-crypto@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=stable@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox