From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>, Ingo Molnar <mingo@elte.hu>,
	linux-kernel@vger.kernel.org
Subject: Re: [GIT PULL rcu/next] rcu commits for 2.6.40
Date: Tue, 17 May 2011 15:21:45 -0700
Message-ID: <20110517222145.GF3818@linux.vnet.ibm.com>
In-Reply-To: <20110517124341.GA1776@nowhere>

On Tue, May 17, 2011 at 02:43:48PM +0200, Frederic Weisbecker wrote:
> On Tue, May 17, 2011 at 12:53:49AM -0700, Paul E. McKenney wrote:
> > On Tue, May 17, 2011 at 04:40:03AM +0200, Frederic Weisbecker wrote:
> > > On Mon, May 16, 2011 at 02:24:49PM -0700, Paul E. McKenney wrote:
> > > > On Mon, May 16, 2011 at 02:23:29PM +0200, Ingo Molnar wrote:
> > > > > 
> > > > > * Ingo Molnar <mingo@elte.hu> wrote:
> > > > > 
> > > > > > > In the meantime, would you be willing to try out the patch at 
> > > > > > > https://lkml.org/lkml/2011/5/14/89?  This patch helped out Yinghai in 
> > > > > > > several configurations.
> > > > > > 
> > > > > > Wasn't this the one i tested - or is it a new iteration?
> > > > > > 
> > > > > > I'll try it in any case.
> > > > > 
> > > > > oh, this was a new iteration, mea culpa!
> > > > > 
> > > > > And yes, it solves all problems for me as well. Mind pushing it as a fix? :-)
> > > > 
> > > > ;-)
> > > > 
> > > > Unfortunately, the only reason I can see that it works is (1) there
> > > > is some obscure bug in my code or (2) someone somewhere is failing to
> > > > call irq_exit() on some interrupt-exit path.  Much as I might be tempted
> > > > to paper this one over, I believe that we do need to find whatever the
> > > > underlying bug is.
> > > > 
> > > > Oh, yes, there is option (3) as well: maybe if an interrupt deschedules
> > > > a process, the final irq_exit() is omitted in favor of rcu_enter_nohz()?
> > > > But I couldn't see any evidence of this in my admittedly cursory scan
> > > > of the x86 interrupt-handling code.
> > > > 
> > > > So until I learn differently, I am assuming that each and every
> > > > irq_enter() has a matching call to irq_exit(), and that rcu_enter_nohz()
> > > > is called after the final irq_exit() of a given burst of interrupts.
> > > > 
> > > > If my assumptions are mistaken, please do let me know!
> > > 
> > > So it would be nice to have a trace of the calls to rcu_irq_*() / rcu_*_nohz()
> > > before the unpairing happened.
> > > 
> > > I have tried to reproduce it but couldn't trigger anything.
> > > 
> > > So it would be nice if Yinghai can test the patch below, since he was able
> > > to trigger the warning.
> > > 
> > > This is essentially Paul's patch but with stacktrace of the calls recorded.
> > > Then the whole trace is dumped on the console when one of the WARN_ON_ONCE()
> > > sanity checks triggers. Beware that the trace will be dumped every time a
> > > WARN_ON_ONCE() condition is true, so the first dump is enough and you can
> > > ignore the rest.
> > > 
> > > This requires CONFIG_TRACING. It may be a good idea to boot with the
> > > "ftrace=nop" parameter, so that ftrace will set up a long enough buffer
> > > to have an interesting trace.
> > 
> > Very cool, thank you!!!  I was going to do something like this next,
> > but given my lack of familiarity with tracing, your patch looks much
> > nicer than mine would have been.
> > 
> > It applies fine on top of tip/core/rcu and builds OK.  I cannot reproduce
> > the problem, either, so I am hoping that either Yinghai or Ingo can
> > run this, and hopefully doing so will provide some enlightenment.
> > 
> > I have pushed this as:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git diag.2011.05.16b
> > 
> > I also #ifdefed out the bodies of rcu_nmi_enter() and rcu_nmi_exit()
> > to match the earlier patches.
> > 
> > > PS: the first check in rcu_nmi_enter() doesn't seem to make sense.
> > 
> > Here is what it is doing:
> > 
> > o	rdtp->dynticks_nmi_nesting == 0:
> > 
> > 	Is this the first-level NMI?  In theory this should always
> > 	be true, but I don't trust NMIs to mask each other.  I have seen
> > 	many systems where NMIs could interrupt other NMIs.
> > 
> > 	The idea is that if we already recorded one level of NMI, we
> > 	had better record them all so we can figure out when we exit
> > 	the last level of NMI handler.
> > 
> > o	(atomic_read(&rdtp->dynticks) & 0x1):
> > 
> > 	Did the NMI interrupt a non-dyntick code segment?  If we did,
> > 	then there is no need to tell RCU anything -- RCU is already
> > 	paying attention to this CPU anyway due to the fact that the
> > 	interrupted code segment was not in dyntick mode.
> 
> In fact I was rather referring to your last added check:
> 
> 	if (rdtp->dynticks_nmi_nesting == 0 &&
> -	    (atomic_read(&rdtp->dynticks) & 0x1))
> +	    (atomic_read(&rdtp->dynticks) & 0x1)) {
> +		WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
>  		return;
> +	}

Yes, it is a bit redundant, but if some other CPU is modifying
rdtp->dynticks between the two reads, this WARN_ON_ONCE() might catch it.
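
For anyone following along, here is a minimal userspace sketch of the check
quoted above. It is not the kernel code: struct dynticks_model and
nmi_enter_skips() are hypothetical names made up for illustration, and C11
atomics stand in for the kernel's atomic_t. The point is only that the inner
test re-reads the counter, so it can fire only if the value changed between
the two reads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU dyntick state (not the kernel struct). */
struct dynticks_model {
	atomic_int dynticks;		/* odd: CPU is not in dyntick-idle mode */
	int dynticks_nmi_nesting;	/* NMI levels recorded so far */
};

/*
 * Model of the early-return test at the top of rcu_nmi_enter() as quoted
 * in this thread.  Returns true when the early return would be taken.
 * *warned is set if the second read of ->dynticks disagrees with the
 * first, which is the case the added WARN_ON_ONCE() is meant to catch.
 */
static bool nmi_enter_skips(struct dynticks_model *d, bool *warned)
{
	assert(d != NULL && warned != NULL);

	if (d->dynticks_nmi_nesting == 0 &&
	    (atomic_load(&d->dynticks) & 0x1)) {
		/* Second, independent read of the same counter. */
		if (!(atomic_load(&d->dynticks) & 0x1))
			*warned = true;	/* someone else changed it in between */
		return true;
	}
	return false;
}
```

In a single-threaded run with ->dynticks odd and no nesting recorded,
nmi_enter_skips() returns true and never sets *warned.  With ->dynticks even,
or with a nesting level already recorded, it returns false.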

> > Again, thank you for adding the tracing!
> 
> No problem, I hope it will work as I couldn't test it myself.

Me too!

							Thanx, Paul
