All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.ibm.com>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Biggers <ebiggers@kernel.org>,
	"Tobin C. Harding" <me@tobin.cc>,
	linux-fsdevel@vger.kernel.org
Subject: Re: dcache locking question
Date: Sat, 16 Mar 2019 17:50:05 -0700	[thread overview]
Message-ID: <20190317005005.GY4102@linux.ibm.com> (raw)
In-Reply-To: <20190317001840.GF2217@ZenIV.linux.org.uk>

On Sun, Mar 17, 2019 at 12:18:40AM +0000, Al Viro wrote:
> On Sat, Mar 16, 2019 at 03:31:28PM -0700, Paul E. McKenney wrote:
> > > Paul, could you comment on that one?  The function in question is
> > > this:
> > > static struct dentry *__lock_parent(struct dentry *dentry)
> > > {
> > >         struct dentry *parent;
> > >         rcu_read_lock();
> > >         spin_unlock(&dentry->d_lock);
> > > again:
> > >         parent = READ_ONCE(dentry->d_parent);
> > >         spin_lock(&parent->d_lock);
> > >         /*
> > >          * We can't blindly lock dentry until we are sure
> > >          * that we won't violate the locking order.
> > >          * Any changes of dentry->d_parent must have
> > >          * been done with parent->d_lock held, so
> > >          * spin_lock() above is enough of a barrier
> > >          * for checking if it's still our child.
> > >          */
> > >         if (unlikely(parent != dentry->d_parent)) {
> > 
> > We are talking about the above access to dentry->d_parent, correct?
> > 
> > Given that dentry->d_parent isn't updated except when holding
> > dentry->d_lock, and given that locks do what they are supposed to,
> > both in terms of memory ordering and mutual exclusion, the value of
> > dentry->d_parent must be constant throughout the current critical section.
> > It therefore doesn't matter what strange code the compiler generates,
> > it will get the same value regardless.
> > 
> > So, no, there is absolutely no point in doing this:
> > 
> > 	if (unlikely(parent != READ_ONCE(dentry->d_parent))) {
> > 
> > Or did I miss the point of this discussion?
> 
> dentry->d_parent is *NOT* guaranteed to be stable here - we are not
> holding dentry->d_lock.  So it can change; what it can't do is to
> change to or from the pointer we have in 'parent'.

Color me blind to "dentry" vs. "parent".  :-/

> IOW, the value of dentry->d_parent isn't stable; the result of
> comparison, OTOH, is.
> 
> I also think that READ_ONCE would be useless here - reordering
> can't happen due to compiler barrier in spin_lock() and memory
> barriers should be provided by ACQUIRE/RELEASE on parent->d_lock.
> 
> We are observing a result of some store.  If we fetch the value
> equal to 'parent', that store had been under parent->d_lock *AND*
> any subsequent store changing the value of dentry->d_parent away
> from 'parent' would also have been under parent->d_lock.  So
> either it hasn't happened yet, or we would've observed its
> results, since we are holding parent->lock ourselves.
> 
> If we fetch the value not equal to 'parent', any subsequent store
> making dentry->d_parent equal to parent would've been done under
> parent->d_lock, so again, it either has not happened yet, or we
> would've observed its results.
> 
> FWIW, the whole point here is that we can't grab the lock sufficient
> to stabilize dentry->d_parent here - not until we'd verified that
> dentry still has the same parent.  We could play with trylock, but
> AFAICS the trick above is sufficient (and gets fewer retries).

I can come up with sequences of values that could mimic the value of
'parent', but that assumes load tearing.  I -have- seen stores of constant
values be torn, but not stores of runtime-variable values and not loads.
Still, such tearing is permitted, and including the READ_ONCE() is making
it easier for things like thread sanitizers.  In addition, the READ_ONCE()
makes it clear that the value being loaded is unstable, which can be
useful documentation.

And yes, I am a bit paranoid about this sort of thing.  Something about
talking to a lot of compiler writers and standards committee members...

But at the end of the day, you are the maintainer of this code.

> What I'm not certain about is whether we need READ_ONCE() on the
> other fetch in there.  Suppose that gets replaced with plain
> assignment; what problems could that cause?  It obviously can't
> get reordered past the spin_lock() - the address of spinlock
> is derived from the value we fetch.  And in case of retry...
> wouldn't spin_unlock(parent->d_lock) we do there supply a compiler
> barrier?

Plus all callers to a spin_trylock() before invoking this function,
which does prevent the compiler from repeating the read (no reason to
in this case) or fusing with some other read.  Of course, I still like
the documentation and sanitizer-tooling benefits.

> I realize that READ_ONCE() gives smp_read_barrier_depends(), but
> do we need it here?

We need smp_read_barrier_depends() if we dereference the loaded pointer
using some operation that did not imply memory ordering, which does
happen, for example, in ecryptfs_symlink().  But before the caller has
a chance to do that, there is that spin_lock_nested(), which provides
the needed ordering.

							Thanx, Paul


  reply	other threads:[~2019-03-17  0:49 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-14 22:56 dcache locking question Tobin C. Harding
2019-03-14 23:09 ` Matthew Wilcox
2019-03-15  1:38   ` Tobin C. Harding
2019-03-14 23:19 ` Tobin C. Harding
2019-03-15  1:50   ` Al Viro
2019-03-15 17:38     ` Eric Biggers
2019-03-15 18:54       ` Al Viro
2019-03-16 22:31         ` Paul E. McKenney
2019-03-17  0:18           ` Al Viro
2019-03-17  0:50             ` Paul E. McKenney [this message]
2019-03-17  2:20               ` James Bottomley
2019-03-17  3:06                 ` Al Viro
2019-03-17  4:23                   ` James Bottomley
2019-03-18  0:35                     ` Paul E. McKenney
2019-03-18 16:26                       ` James Bottomley
2019-03-18 17:11                         ` Paul E. McKenney
2019-03-19 15:45                           ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190317005005.GY4102@linux.ibm.com \
    --to=paulmck@linux.ibm.com \
    --cc=ebiggers@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=me@tobin.cc \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.