linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Josh Triplett <josh@joshtriplett.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	paulmck@linux.vnet.ibm.com
Subject: Re: [PATCH 17/17] RCU'd vfsmounts
Date: Thu, 3 Oct 2013 16:28:27 -0700	[thread overview]
Message-ID: <20131003232826.GA6604@jtriplet-mobl1> (raw)
In-Reply-To: <CA+55aFybjMDwdfV2uoGL=8U4c1tytjMwk4sU187e6ha2xQ=3+Q@mail.gmail.com>

On Thu, Oct 03, 2013 at 01:52:45PM -0700, Linus Torvalds wrote:
> On Thu, Oct 3, 2013 at 1:41 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > The problem is this:
> > A = 1, B = 1
> > CPU1:
> > A = 0
> > <full barrier>
> > synchronize_rcu()
> > read B
> >
> > CPU2:
> > rcu_read_lock()
> > B = 0
> > read A
> >
> > Are we guaranteed that we won't get both of them seeing ones, in situation
> > when that rcu_read_lock() comes too late to be noticed by synchronize_rcu()?
> 
> Yeah, I think we should be guaranteed that, because the
> synchronize_rcu() will guarantee that all other CPU's go through an
> idle period. So the "read A" on CPU2 cannot possibly see a 1 _unless_
> it happens so early that synchronize_rcu() definitely sees it (ie it's
> a "preexisting reader" by definition), in which case synchronize_rcu()
> will be waiting for a subsequent idle period, in which case the B=0 on
> CPU2 is not only guaranteed to happen but also be visible out, so the
> "read B" on CPU1 will see 0. And that's true even if CPU2 doesn't have
> an explicit memory barrier, because the "RCU idle" state implies that
> it has gone through a barrier.

I think the reasoning in one direction is actually quite a bit less
obvious than that.

rcu_read_unlock() does *not* necessarily imply a memory barrier (so the
B=0 can actually move logically outside the rcu_read_unlock()), but
synchronize_rcu() *does* imply (and enforce) that a memory barrier has
occurred on all CPUs as part of quiescence.  However, likewise,
rcu_read_lock() doesn't imply anything in particular about writes; it
does enforce either that reads can't leak earlier or that if they do a
synchronize_rcu() will still wait for them, but I don't think the safety
interaction between a *write* in the RCU reader and a *read* in the RCU
writer necessarily follows from that enforcement.

(Also, to the best of my knowledge, you don't even need a barrier on
CPU1; synchronize_rcu() should imply one.)

If synchronize_rcu() on CPU1 sees rcu_read_lock() on CPU2, then
synchronize_rcu() will wait for CPU2's read-side critical section and a
memory barrier before reading B, so CPU1 will see B==0.

The harder direction: If synchronize_rcu() on CPU1 does not see
rcu_read_lock() on CPU2, then it won't necessarily wait for anything,
and since rcu_read_lock() itself does not imply any CPU write barriers,
it's not at all obvious that rcu_read_lock() prevents B=0 from occurring
before CPU1's read of B.

In short, the interaction between RCU's ordering guarantees and CPU
memory barriers in the presence of writes on the read side and reads on
the write side does not seem sufficiently clear to support the portable
use of the above pattern without an smp_wmb() on CPU2 between
rcu_read_lock() and B=0.  I think it might happen to work with the
current implementations of RCU (with which synchronize_rcu() won't
actually notice a quiescent state and return until after either
the rcu_read_unlock() or a preemption point), but by the strict semantic
guarantees of the RCU primitives I think you could write a legitimate
RCU implementation that would break the above code.

That said, I believe this pattern *will* work with every existing
implementation of RCU.  Thus, I'd suggest documenting it as a warning to
prospective RCU optimizers to avoid breaking the above pattern.

- Josh Triplett

  parent reply	other threads:[~2013-10-03 23:28 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-10-03  6:20 [PATCH 17/17] RCU'd vfsmounts Al Viro
     [not found] ` <CA+55aFzeDP6J4ekdn4-85yoXzX3xmEp_qc3npvqepJM+MFn=6Q@mail.gmail.com>
     [not found]   ` <20131003105130.GE13318@ZenIV.linux.org.uk>
     [not found]     ` <CA+55aFzh+n_2fs=aWcT_5gnLC_pWSHqQPJeQ+fg=+Xw7ib9=dQ@mail.gmail.com>
     [not found]       ` <20131003174439.GG13318@ZenIV.linux.org.uk>
2013-10-03 19:06         ` Linus Torvalds
2013-10-03 19:43           ` Al Viro
2013-10-03 20:19             ` Linus Torvalds
2013-10-03 20:41               ` Al Viro
2013-10-03 20:52                 ` Linus Torvalds
2013-10-03 21:14                   ` Al Viro
2013-10-04  2:53                     ` Al Viro
2013-10-04  8:37                       ` Christoph Hellwig
2013-10-04 12:58                         ` Al Viro
2013-10-04 14:00                           ` Christoph Hellwig
2013-10-03 23:28                   ` Josh Triplett [this message]
2013-10-03 23:51                     ` Linus Torvalds
2013-10-04  0:41                       ` Josh Triplett
2013-10-04  0:45                         ` Linus Torvalds
2013-10-04  6:41                           ` Ingo Molnar
2013-10-04  5:29                     ` Paul E. McKenney
2013-10-04  6:03                       ` Josh Triplett
2013-10-04  6:15                         ` Paul E. McKenney
2013-10-04  7:04                           ` Josh Triplett

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131003232826.GA6604@jtriplet-mobl1 \
    --to=josh@joshtriplett.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).