From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Pranith Kumar <bobby.prani@gmail.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
mingo@kernel.org, laijs@cn.fujitsu.com, dipankar@in.ibm.com,
Andrew Morton <akpm@linux-foundation.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
peterz@infradead.org, rostedt@goodmis.org, dhowells@redhat.com,
edumazet@google.com, darren@dvhart.com, fweisbec@gmail.com,
Oleg Nesterov <oleg@redhat.com>,
sbw@mit.edu
Subject: Re: [PATCH tip/core/rcu 2/3] documentation: Record rcu_dereference() value mishandling
Date: Tue, 29 Apr 2014 08:37:52 -0700 [thread overview]
Message-ID: <20140429153752.GE8754@linux.vnet.ibm.com> (raw)
In-Reply-To: <CAJhHMCATytuGFbq3Sgy8en-D39LA0iBvysE9=Q7j41QKs2+GfA@mail.gmail.com>
On Tue, Apr 29, 2014 at 01:42:13AM -0400, Pranith Kumar wrote:
> Minor nits below:
>
> Other than that Acked-by: Pranith Kumar <bobby.prani@gmail.com>
>
> On Tue, Apr 29, 2014 at 1:04 AM, Andev <debiandev@gmail.com> wrote:
> > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> >
> > Recent LKML discussings (see http://lwn.net/Articles/586838/ and
> > http://lwn.net/Articles/588300/ for the LWN writeups) brought out
> > some ways of misusing the return value from rcu_dereference() that
> > are not necessarily completely intuitive. This commit therefore
> > documents what can and cannot safely be done with these values.
> >
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> snip
> > +
> > + o The pointer is never dereferenced after being compared.
> > + Since there are no subsequent dereferences, the compiler
> > + cannot use anything it learned from the comparison
> > + to reorder the non-existent subsequent dereferences.
> > + This sort of comparison occurs frequently when scanning
> > + RCU-protected circular linked lists.
> > +
> > + o The comparison is against a pointer pointer that
>
> duplicate pointer, remove one
Good catch, fixed!
> > + references memory that was initialized "a long time ago."
> > + The reason this is safe is that even if misordering
> > + occurs, the misordering will not affect the accesses
> > + that follow the comparison. So exactly how long ago is
> > + "a long time ago"? Here are some possibilities:
> snip
> > + o All of the accesses following the comparison are stores,
> > + so that a control dependency preserves the needed ordering.
> > + That said, it is easy to get control dependencies wrong.
> > + Please see the "CONTROL DEPENDENCIES" section of
> > + Documentation/memory-barriers.txt for more details.
> > +
> > + o The pointers compared not-equal -and- the compiler does
>
> add in "are " - The pointers compared are not-equal...
Actually, "compared" is a verb here. But that use is a bit obscure, so
taking your suggestion as a bug report. I changed it to read:
The pointers are not equal -and- the compiler does not have
enough information to deduce the value of the pointer.
Fair enough?
> > + not have enough information to deduce the value of the
> > + pointer. Note that the volatile cast in rcu_dereference()
> > + will normally prevent the compiler from knowing too much.
> > +
> > +o Disable any value-speculation optimizations that your compiler
> > + might provide, especially if you are making use of feedback-based
> > + optimizations that take data collected from prior runs. Such
> > + value-speculation optimizations reorder operations by design.
> > +
> > + There is one exception to this rule: Value-speculation
> > + optimizations that leverage the branch-prediction hardware are
> > + safe on strongly ordered systems (such as x86), but not on weakly
> > + ordered systems (such as ARM or Power). Choose your compiler
> > + command-line options wisely!
> > +
> > +
> > +EXAMPLE OF AMPLIFIED RCU-USAGE BUG
> > +
> > +Because updaters can run concurrently with RCU readers, RCU readers can
> > +see stale and/or inconsistent values. If RCU readers need fresh or
> > +consistent values, which they sometimes do, they need to take proper
> > +precautions. To see this, consider the following code fragment:
> > +
> > + struct foo {
> > + int a;
> > + int b;
> > + int c;
> > + };
> > + struct foo *gp1;
> > + struct foo *gp2;
> > +
> > + void updater(void)
> > + {
> > + struct foo *p;
> > +
> > + p = kmalloc(...);
> > + if (p == NULL)
> > + deal_with_it();
> > + p->a = 42; /* Each field in its own cache line. */
> > + p->b = 43;
> > + p->c = 44;
> > + rcu_assign_pointer(gp1, p);
> > + p->b = 143;
> > + p->c = 144;
> > + rcu_assign_pointer(gp2, p);
> > + }
> > +
> > + void reader(void)
> > + {
> > + struct foo *p;
> > + struct foo *q;
> > + int r1, r2;
> > +
> > + p = rcu_dereference(gp2);
> > + r1 = p->b; /* Guaranteed to get 143. */
> > + q = rcu_dereference(gp1);
> > + if (p == q) {
> > + /* The compiler decides that q->c is same as p->c. */
> > + r2 = p->c; /* Could get 44 on weakly order system. */
> > + }
> > + }
> > +
> > +You might be surprised that the outcome (r1 == 143 && r2 == 44) is possible,
> > +but you should not be. After all, the updater might have been invoked
> > +a second time between the time reader() loaded into "r1" and the time
> > +that it loaded into "r2". The fact that this same result can occur due
> > +to some reordering from the compiler and CPUs is beside the point.
> > +
> > +But suppose that the reader needs a consistent view?
> > +
> > +Then one approach is to use locking, for example, as follows:
> > +
> > + struct foo {
> > + int a;
> > + int b;
> > + int c;
> > + spinlock_t lock;
> > + };
> > + struct foo *gp1;
> > + struct foo *gp2;
> > +
> > + void updater(void)
> > + {
> > + struct foo *p;
> > +
> > + p = kmalloc(...);
> > + if (p == NULL)
> > + deal_with_it();
> > + spin_lock(&p->lock);
> > + p->a = 42; /* Each field in its own cache line. */
> > + p->b = 43;
> > + p->c = 44;
> > + spin_unlock(&p->lock);
> > + rcu_assign_pointer(gp1, p);
> > + spin_lock(&p->lock);
> > + p->b = 143;
> > + p->c = 144;
> > + spin_unlock(&p->lock);
> > + rcu_assign_pointer(gp2, p);
> > + }
> > +
> > + void reader(void)
> > + {
> > + struct foo *p;
> > + struct foo *q;
> > + int r1, r2;
> > +
> > + p = rcu_dereference(gp2);
> > + spin_lock(&p->lock);
> > + r1 = p->b; /* Guaranteed to get 143. */
> > + q = rcu_dereference(gp1);
> > + if (p == q) {
> > + /* The compiler decides that q->c is same as p->c. */
> > + r2 = p->c; /* Could get 44 on weakly order system. */
> > + }
> > + spin_unlock(&p->lock);
> > + }
>
> shouldn't the comment here reflect that r2 can never get 44 and only
> can get 144 once you use a lock?
Indeed it should, good catch, fixed!
I also need to check the load from gp2 for NULL, fixed that too. (If
gp2 is non-NULL, the later load from gp1 is guaranteed to be non-NULL.)
Thanx, Paul
next prev parent reply other threads:[~2014-04-29 15:38 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CAJUuVQ7owF11j+Z0dGXkgVUQqUXQw4m1Vu4USHb7xWgAiSKHLw@mail.gmail.com>
2014-04-29 5:42 ` [PATCH tip/core/rcu 2/3] documentation: Record rcu_dereference() value mishandling Pranith Kumar
2014-04-29 15:37 ` Paul E. McKenney [this message]
2014-04-28 23:30 [PATCH tip/core/rcu 0/3] Documentation changes for 3.16 Paul E. McKenney
2014-04-28 23:30 ` [PATCH tip/core/rcu 1/3] documentation: Update sysfs path for rcu_cpu_stall_suppress Paul E. McKenney
2014-04-28 23:30 ` [PATCH tip/core/rcu 2/3] documentation: Record rcu_dereference() value mishandling Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140429153752.GE8754@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=akpm@linux-foundation.org \
--cc=bobby.prani@gmail.com \
--cc=darren@dvhart.com \
--cc=dhowells@redhat.com \
--cc=dipankar@in.ibm.com \
--cc=edumazet@google.com \
--cc=fweisbec@gmail.com \
--cc=josh@joshtriplett.org \
--cc=laijs@cn.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=mingo@kernel.org \
--cc=niv@us.ibm.com \
--cc=oleg@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=sbw@mit.edu \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox