All of lore.kernel.org
 help / color / mirror / Atom feed
From: Fengguang Wu <fengguang.wu@intel.com>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: WARNING: at kernel/rcutorture.c:1243 rcu_torture_printk
Date: Wed, 12 Jun 2013 22:14:57 +0800	[thread overview]
Message-ID: <20130612141457.GA25556@localhost> (raw)
In-Reply-To: <20130612134835.GA25093@linux.vnet.ibm.com>

On Wed, Jun 12, 2013 at 06:48:35AM -0700, Paul E. McKenney wrote:
> On Tue, Jun 11, 2013 at 04:37:25PM -0700, Paul E. McKenney wrote:
> > On Tue, Jun 11, 2013 at 04:03:22PM -0700, Paul E. McKenney wrote:
> > > On Tue, Jun 11, 2013 at 10:14:55AM +0800, Fengguang Wu wrote:
> > > > Paul,
> > > > 
> > > > On Mon, Jun 10, 2013 at 07:51:10AM -0700, Paul E. McKenney wrote:
> > > > > On Mon, Jun 10, 2013 at 03:47:28PM +0800, Fengguang Wu wrote:
> > > > > > Greetings,
> > > > > > 
> > > > > > I got the below dmesg and the first bad commit is
> > > > > > 
> > > > > > commit 911af505ef407c2511106c224dd640f882f0f590
> > > > > > Author: Paul E. McKenney <paul.mckenney@linaro.org>
> > > > > > Date:   Mon Feb 11 10:23:27 2013 -0800
> > > > > > 
> > > > > >     rcu: Provide compile-time control for no-CBs CPUs
> > > > > >     
> > > > > >     Currently, the only way to specify no-CBs CPUs is via the rcu_nocbs
> > > > > >     kernel command-line parameter.  This is inconvenient in some cases,
> > > > > >     particularly for randconfig testing, so this commit adds a new set of
> > > > > >     kernel configuration parameters.  CONFIG_RCU_NOCB_CPU_NONE (the default)
> > > > > >     retains the old behavior, CONFIG_RCU_NOCB_CPU_ZERO offloads callback
> > > > > >     processing from CPU 0 (along with any other CPUs specified by the
> > > > > >     rcu_nocbs boot-time parameter), and CONFIG_RCU_NOCB_CPU_ALL offloads
> > > > > >     callback processing from all CPUs.
> > > > > >     
> > > > > >     Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> > > > > >     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > > > > 
> > > > > > However I guess it's a wrong bisect, the commit should be unmasking an
> > > > > > old bug because it merely provides more kconfig options.
> > > > > > 
> > > > > > This warning happened only once in 74 boots of the kernel.
> > > > > > 
> > > > > > I also attached the 2nd dmesg which shows boot hang for the same
> > > > > > kernel.
> > > > > > 
> > > > > > Note that there are also 2 kernel hangs in the 74 boots.  It should
> > > > > > not be an RCU problem, however for completeness, I also attach a dmesg
> > > > > > for your reference when considering this RCU warning.
> > > > > 
> > > > > Either way, that is a nasty warning!  You get this on recent kernels as
> > > > > well, I take it?
> > > > 
> > > > Yes, the warning is still in recent kernels, indicated from the last
> > > > lines of the bisect log.
> > > > 
> > > > 911af505ef407c2511106c224dd640f882f0f590 is bad and its parent
> > > > 34ed62461ae4970695974afb9a60ac3df0086830 is good:
> > > > > > git bisect  bad 911af505ef407c2511106c224dd640f882f0f590  #    65  2013-06-07 16:14:41  rcu: Provide compile-time control for no-CBs CPUs
> > > > > > git bisect good 34ed62461ae4970695974afb9a60ac3df0086830  #   900  2013-06-08 04:57:34  rcu: Remove restrictions on no-CBs CPUs
> > > > 
> > > > After bisect, the script tries to confirm that
> > > > 34ed62461ae4970695974afb9a60ac3df0086830 is really good by boot
> > > > testing it 2700 more times:
> > > > 
> > > > > > git bisect good 34ed62461ae4970695974afb9a60ac3df0086830  #  2700  2013-06-10 14:33:56  rcu: Remove restrictions on no-CBs CPUs
> > > > 
> > > > And continue to find out whether the linus/master and
> > > > linux-next/master are good/bad (here the results are both bad):
> > > > 
> > > > > > git bisect  bad f43e7a34255a387930bc2bb826ec17c052ce975e  # 14:34      0  Merge branch 'for-next'
> > > > > > git bisect  bad c04efed734409f5a44715b54a6ca1b54b0ccf215  # 14:41     11  Add linux-next specific files for 20130607
> > > 
> > > Just to let you know that I am putting some time into this...
> > > 
> > > Well, I got a panic, but it turned out to be the kernel being unable to
> > > locate init.  Why your .config should have this effect, I have no idea --
> > > I don't see any Kconfig parameters that look to me like they should be
> > > specifying where init lives.  Any hints?

I run randconfig kernels and a smallish ubuntu-core initrd as rootfs.
This must be rare setup and may well trigger weird problems. ;)

> > > Left to myself, my next step will be to try just the RCU-related Kconfig
> > > parameters.
> > 
> > Which gets rid of the "unable to locate init" panic.  One strange thing
> > in your .config:
> > 
> > CONFIG_HZ_PERIODIC=y
> > # CONFIG_NO_HZ_IDLE is not set
> > # CONFIG_NO_HZ_FULL is not set
> > CONFIG_NO_HZ=y
> > 
> > Seems harmless, though.
> > 
> > I will do repeated runs with this config fragment to see if I can reproduce
> > your warning:
> > 
> > 	CONFIG_TREE_PREEMPT_RCU=y
> > 	CONFIG_PREEMPT_RCU=y
> > 	CONFIG_RCU_STALL_COMMON=y
> > 	CONFIG_RCU_USER_QS=n
> > 	CONFIG_RCU_FANOUT=64
> > 	CONFIG_RCU_FANOUT_LEAF=16
> > 	CONFIG_RCU_FANOUT_EXACT=n
> > 	CONFIG_RCU_BOOST=y
> > 	CONFIG_RCU_BOOST_PRIO=1
> > 	CONFIG_RCU_BOOST_DELAY=500
> > 	CONFIG_RCU_NOCB_CPU=y
> > 	CONFIG_RCU_NOCB_CPU_NONE=n
> > 	CONFIG_RCU_NOCB_CPU_ZERO=y
> > 	CONFIG_RCU_NOCB_CPU_ALL=n
> > 	CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
> > 	CONFIG_SPARSE_RCU_POINTER=n
> > 	CONFIG_RCU_TORTURE_TEST=y
> > 	CONFIG_RCU_TORTURE_TEST_RUNNABLE=y
> > 	CONFIG_RCU_CPU_STALL_TIMEOUT=21
> > 	CONFIG_RCU_CPU_STALL_VERBOSE=y
> > 	CONFIG_RCU_CPU_STALL_INFO=y
> > 	CONFIG_RCU_TRACE=n
> > 	CONFIG_NR_CPUS=8
> > 	CONFIG_HZ_PERIODIC=y
> > 	CONFIG_NO_HZ_IDLE=n
> > 	CONFIG_NO_HZ_FULL=n
> > 	CONFIG_NO_HZ=y
> > 	CONFIG_PREEMPT_NONE=n
> > 	CONFIG_PREEMPT_VOLUNTARY=n
> > 	CONFIG_PREEMPT=y
> > 	CONFIG_PREEMPT_COUNT=y
> 
> And my system completed 200 boots with two minutes of rcutorture without
> failure.  Any thoughts on what I should try next?

Thanks for looking into this and testing it out!
I just kicked off a new bisect and hope to collect more information
about it.

Thanks,
Fengguang

  reply	other threads:[~2013-06-12 14:15 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-10  7:47 WARNING: at kernel/rcutorture.c:1243 rcu_torture_printk Fengguang Wu
2013-06-10 14:51 ` Paul E. McKenney
2013-06-11  2:14   ` Fengguang Wu
2013-06-11 23:03     ` Paul E. McKenney
2013-06-11 23:37       ` Paul E. McKenney
2013-06-12 13:48         ` Paul E. McKenney
2013-06-12 14:14           ` Fengguang Wu [this message]
2013-06-12 16:40             ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130612141457.GA25556@localhost \
    --to=fengguang.wu@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.