From: Fengguang Wu <fengguang.wu@intel.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Dave Jones <davej@redhat.com>,
LKML <linux-kernel@vger.kernel.org>,
Lai Jiangshan <laijs@cn.fujitsu.com>
Subject: Re: rcu-torture boot hang
Date: Thu, 14 Jun 2012 13:35:47 +0800
Message-ID: <20120614053547.GA10745@localhost>
In-Reply-To: <20120614025504.GE30024@home.goodmis.org>

Hi Steven,

On Wed, Jun 13, 2012 at 10:55:04PM -0400, Steven Rostedt wrote:
> Just a note. Please use my goodmis email and not my Red Hat email. My
> Red Hat email is not checked as often, and I don't usually tag emails
> there as "reply to". I only author with that email to give credit to the
> one that pays me to do the work (and I don't mean just for the LWN
> stats).

OK, got it!

> On Wed, Jun 13, 2012 at 08:49:00PM +0800, Fengguang Wu wrote:
> > Hi Steven,
> >
> > On Tue, Jun 12, 2012 at 08:11:22PM +0800, Fengguang Wu wrote:
> > > Hi Paul,
> > >
> > > In kernel boot tests with the attached config, I find the 3.5-rc2+
> > > kernels all hang here:
> > >
> > > [ 6.522546] rcu-torture:--- Start of test: nreaders=2
> > > nfakewriters=4 stat_interval=0 verbose=0 test_no_idle_hz=0
> > > shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_holdoff=0
> > > fqs_stutter=3 test_boost=1/0 test_boost_interval=7
> > > test_boost_duration=4 shutdown_secs=0 onoff_interval=0 onoff_holdoff=0
> >
> > It turns out that commit 5963e317b1e9d2a4511503916d8fd664bb8fa8fb
> > ("ftrace/x86: Do not change stacks in DEBUG when calling lockdep") is
> > the root cause of this boot hang. The commit was reported by git bisect,
> > and reverting it on 3.5-rc2 is confirmed to fix the boot hang.
>
> I get the hang with your config too. If I disable both LOCKDEP and
> IRQSOFF_TRACER which turns off TRACE_IRQFLAGS then the lockup goes away.
> This is consistent with your findings, as the patch you found is only
> enabled when TRACE_IRQFLAGS is on.
>
> >
> > [ rcu torture tests ]
> >
> > ...It would hang here before reverting the commit...
> >
> > [ 1.611901] Testing tracer function: PASSED
>
> I did a bit more investigation and found that the problem comes from the
> load_idt() command that is used to avoid resetting the debug stack. When
> the function tracer is enabled it adds breakpoints to all locations that
> it is about to trace in order to convert the nops in the functions into
> calls to the tracer. But if the breakpoint handler is traced, the
> breakpoint in the handler will reset the stack and cause a crash.
>
> Your above observation is correct. Actually I added a printk in my
> testing and found that the 'Testing tracer function: ' does output before
> the lockup. I think printk changed recently so that it no longer prints
> right away if a '\n' is not supplied. The function tracer self test
> prints:
>
> 'Testing tracer function: '
>
> Runs the test, and when it succeeds it prints 'PASSED\n'. Because the
> first part wasn't printed, we never saw that the function tracer was
> being tested when the lockup occurred.
>
> This is a major flaw with the new printk(). It hides what is being
> tested. I need to write a patch to create a fflush() for printk.

Yes, it would definitely help if the printk message was flushed!

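To illustrate the buffering issue described above, here is a minimal
userspace sketch in plain C stdio, not kernel code and not the proposed
printk change. The helper names announce_test() and report_result() are
hypothetical. The idea is that a partial line written without '\n' can sit
in a buffer, so it has to be flushed explicitly before running something
that might hang:

```c
#include <stdio.h>

/*
 * Hypothetical helper: print the name of the test that is about to run
 * and push it out immediately, so it is visible even if the test never
 * returns. Returns 0 on success, -1 on error.
 */
int announce_test(FILE *out, const char *name)
{
	if (fprintf(out, "Testing tracer %s: ", name) < 0)
		return -1;
	/* No '\n' was written, so a buffered stream may still hold the text. */
	return fflush(out) == 0 ? 0 : -1;
}

/* Hypothetical helper: finish the line once the test has completed. */
int report_result(FILE *out, int passed)
{
	return fprintf(out, "%s\n", passed ? "PASSED" : "FAILED") < 0 ? -1 : 0;
}
```

Calling announce_test(stdout, "function") before the test body and
report_result(stdout, ok) after it keeps the single-line output format while
making sure the first half is not lost if the test body locks up.
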
> Anyway, load_idt() shouldn't be traced, but with PARAVIRT_GUEST even
> functions marked always_inline may end up not inlined, which allows things
> like native_load_idt() to be traced, and that causes bad things as we
> want to call load_idt to prevent recursion of the int3 handler.
>
>
> If it works, can you give me your 'Tested-by'?

It worked, thank you very much!

I applied the patch on top of Linus' master (which is confirmed to hang) and
it booted successfully 3 times:

[ 1.450495] Kprobe smoke test started
[ 1.459965] Kprobe smoke test passed successfully
[ 1.461578] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4 stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_holdoff=0 fqs_stutter=3 test_boost=1/0 test_boost_interval=7 test_boost_duration=4 shutdown_secs=0 onoff_interval=0 onoff_holdoff=0
[ 1.689860] Testing tracer function: PASSED
[ 2.047064] Testing dynamic ftrace: PASSED
[ 2.378968] Testing dynamic ftrace ops #1: (1 0 1 1 0) (1 1 2 1 0) (2 1 3 1 3) (2 2 4 1 21) PASSED

Tested-by: Fengguang Wu <wfg@linux.intel.com>

Thanks,
Fengguang

> diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
> index e5834aa..6a6d7ae 100644
> --- a/include/linux/compiler-gcc.h
> +++ b/include/linux/compiler-gcc.h
> @@ -47,9 +47,9 @@
> */
> #if !defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) || \
> !defined(CONFIG_OPTIMIZE_INLINING) || (__GNUC__ < 4)
> -# define inline inline __attribute__((always_inline))
> -# define __inline__ __inline__ __attribute__((always_inline))
> -# define __inline __inline __attribute__((always_inline))
> +# define inline inline __attribute__((always_inline)) notrace
> +# define __inline__ __inline__ __attribute__((always_inline)) notrace
> +# define __inline __inline __attribute__((always_inline)) notrace
> #else
> /* A lot of inline functions can cause havoc with function tracing */
> # define inline inline notrace
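
For readers unfamiliar with notrace, the following is a small userspace
sketch of the same idea as the patch above. It assumes that notrace expands
to GCC's no_instrument_function attribute, and it uses GCC's
-finstrument-functions hooks as a stand-in for the kernel's own tracing
entry point. The function names untraced_helper() and traced_caller() are
hypothetical:

```c
/* Assumed expansion of notrace; everything below is demo code only. */
#define notrace __attribute__((no_instrument_function))

static int hook_calls; /* number of instrumented function entries seen */

/*
 * GCC calls these hooks around every instrumented function when the file
 * is built with -finstrument-functions. The hooks themselves must not be
 * instrumented, or they would recurse.
 */
notrace void __cyg_profile_func_enter(void *fn, void *call_site)
{
	(void)fn;
	(void)call_site;
	hook_calls++;
}

notrace void __cyg_profile_func_exit(void *fn, void *call_site)
{
	(void)fn;
	(void)call_site;
}

/*
 * always_inline plus notrace: if the compiler ever emits this function
 * out of line, the attribute still keeps it out of the tracer.
 */
static inline __attribute__((always_inline)) notrace int untraced_helper(int x)
{
	return x + 1;
}

/* No notrace: instrumented when -finstrument-functions is in effect. */
int traced_caller(int x)
{
	return untraced_helper(x) + 1;
}
```

Built with gcc -finstrument-functions, hook_calls increases on each entry to
traced_caller() but not on account of untraced_helper(); built without that
option, the hooks are simply never called.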