public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	stable <stable@vger.kernel.org>
Subject: Re: [PATCH 1/4] rcu: Allow for page faults in NMI handlers
Date: Sun, 24 Sep 2017 17:34:56 -0700	[thread overview]
Message-ID: <20170925003456.GA13412@linux.vnet.ibm.com> (raw)
In-Reply-To: <20170925002653.GL3521@linux.vnet.ibm.com>

On Sun, Sep 24, 2017 at 05:26:53PM -0700, Paul E. McKenney wrote:
> On Sun, Sep 24, 2017 at 05:12:13PM -0700, Linus Torvalds wrote:
> > On Sun, Sep 24, 2017 at 5:03 PM, Paul E. McKenney
> > <paulmck@linux.vnet.ibm.com> wrote:
> > >
> > > Mostly just paranoia on my part.  I would be happy to remove it if
> > > you prefer.  Or you or Steve can do so if that is more convenient.
> > 
> > I really don't think it's warranted. The values are *stable*. There's
> > no subtle lack of locking, or some optimistic access to a value that
> > can change.
> > 
> > The compiler can generate code to read the value fifteen billion
> > times, and it will always get the same value.
> > 
> > Yes, maybe in between the different accesses, an NMI will happen, and
> > the value will be incremented, but then as the NMI exits, it will
> > decrement again, so the code that got interrupted will not actually
> > see the change.
> > 
> > So the READ_ONCE() isn't "paranoia". It's just confusing.
> > 
> > > And yes, consistency would dictate that the uses in rcu_nmi_enter()
> > > and rcu_nmi_exit() should be _ONCE(), particularly the stores to
> > > ->dynticks_nmi_nesting.
> > 
> > NO.
> > 
> > That would be just more of that confusion.
> > 
> > That value is STABLE. It's stable even within an NMI handler. The NMI
> > code can read it, modify it, write it back, do a little dance, all
> > without having to care. There's no "_ONCE()" about it - not for the
> > readers, not for the writers, not for _anybody_.
> > 
> > So adding even more READ/WRITE_ONCE() accesses wouldn't be
> > "consistent", it would just be insanity.
> > 
> > Now, if an NMI happens and the value would be different on entry than
> > it is on exit, that would be something else. Then it really wouldn't
> > be stable wrt random users. But that would also be a major bug in the
> > NMI handler, as far as I can tell.
> > 
> > So the reason I'm objecting to that READ_ONCE() is that it isn't
> > "paranoia", it's "voodoo programming". And we don't do voodoo
> > programming.
> 
> I already agreed that the READ_ONCE() can be removed.

And for whatever it is worth, here is the updated patch.

							Thanx, Paul

------------------------------------------------------------------------

commit 3e2baa988b9c13095995c46c51e0e32c0b6a7d43
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Fri Sep 22 13:14:42 2017 -0700

    rcu: Allow for page faults in NMI handlers
    
    A number of architecture invoke rcu_irq_enter() on exception entry in
    order to allow RCU read-side critical sections in the exception handler
    when the exception is from an idle or nohz_full CPU.  This works, at
    least unless the exception happens in an NMI handler.  In that case,
    rcu_nmi_enter() would already have exited the extended quiescent state,
    which would mean that rcu_irq_enter() would (incorrectly) cause RCU
    to think that it is again in an extended quiescent state.  This will
    in turn result in lockdep splats in response to later RCU read-side
    critical sections.
    
    This commit therefore causes rcu_irq_enter() and rcu_irq_exit() to
    take no action if there is an rcu_nmi_enter() in effect, thus avoiding
    the unscheduled return to RCU quiescent state.  This in turn should
    make the kernel safe for on-demand RCU voyeurism.
    
    Reported-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    [ paulmck: Remove READ_ONCE() per Linux Torvalds feedback. ]

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index db5eb8c3f7af..e4fe06d42385 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -891,6 +891,11 @@ void rcu_irq_exit(void)
 
 	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_irq_exit() invoked with irqs enabled!!!");
 	rdtp = this_cpu_ptr(&rcu_dynticks);
+
+	/* Page faults can happen in NMI handlers, so check... */
+	if (rdtp->dynticks_nmi_nesting)
+		return;
+
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
 		     rdtp->dynticks_nesting < 1);
 	if (rdtp->dynticks_nesting <= 1) {
@@ -1036,6 +1041,11 @@ void rcu_irq_enter(void)
 
 	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_irq_enter() invoked with irqs enabled!!!");
 	rdtp = this_cpu_ptr(&rcu_dynticks);
+
+	/* Page faults can happen in NMI handlers, so check... */
+	if (rdtp->dynticks_nmi_nesting)
+		return;
+
 	oldval = rdtp->dynticks_nesting;
 	rdtp->dynticks_nesting++;
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&

  reply	other threads:[~2017-09-25  0:35 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-23 20:56 [PATCH 0/4] [GIT PULL] tracing/rcu: Fix save_stack_trace() called when RCU is not watching Steven Rostedt
2017-09-23 20:56 ` [PATCH 1/4] rcu: Allow for page faults in NMI handlers Steven Rostedt
2017-09-24 19:42   ` Linus Torvalds
2017-09-25  0:03     ` Paul E. McKenney
2017-09-25  0:12       ` Linus Torvalds
2017-09-25  0:26         ` Paul E. McKenney
2017-09-25  0:34           ` Paul E. McKenney [this message]
2017-09-25  4:41             ` Steven Rostedt
2017-09-25  4:56               ` Paul E. McKenney
2017-09-26  3:19                 ` Paul E. McKenney
2017-09-23 20:56 ` [PATCH 2/4] extable: Consolidate *kernel_text_address() functions Steven Rostedt
2017-09-23 20:56 ` [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address() Steven Rostedt
2017-09-23 20:56 ` [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer Steven Rostedt
  -- strict thread matches above, loose matches on Subject: below --
2017-09-22 22:15 [PATCH 0/4] rcu/tracing/extable: Fix stack dump when RCU is not watching Steven Rostedt
2017-09-22 22:15 ` [PATCH 1/4] rcu: Allow for page faults in NMI handlers Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170925003456.GA13412@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox