From: Fengguang Wu <fengguang.wu@intel.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Dave Jones <davej@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Lai Jiangshan <laijs@cn.fujitsu.com>
Subject: Re: rcu-torture boot hang
Date: Thu, 14 Jun 2012 13:35:47 +0800
Message-ID: <20120614053547.GA10745@localhost>
In-Reply-To: <20120614025504.GE30024@home.goodmis.org>

Hi Steven,

On Wed, Jun 13, 2012 at 10:55:04PM -0400, Steven Rostedt wrote:
> Just a note. Please use my goodmis email and not my Red Hat email. My
> Red Hat email is not checked as often, and I don't usually tag emails
> there as "reply to". I only author with that email to give credit to the
> one that pays me to do the work (and I don't mean just for the LWN
> stats).

OK, got it!

> On Wed, Jun 13, 2012 at 08:49:00PM +0800, Fengguang Wu wrote:
> > Hi Steven,
> > 
> > On Tue, Jun 12, 2012 at 08:11:22PM +0800, Fengguang Wu wrote:
> > > Hi Paul,
> > > 
> > > In kernel boot tests with the attached config, I find the 3.5-rc2+
> > > kernels all hang here:
> > > 
> > > [    6.522546] rcu-torture:--- Start of test: nreaders=2
> > > nfakewriters=4 stat_interval=0 verbose=0 test_no_idle_hz=0
> > > shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_holdoff=0
> > > fqs_stutter=3 test_boost=1/0 test_boost_interval=7
> > > test_boost_duration=4 shutdown_secs=0 onoff_interval=0 onoff_holdoff=0
> > 
> > It turns out that commit 5963e317b1e9d2a4511503916d8fd664bb8fa8fb
> > ("ftrace/x86: Do not change stacks in DEBUG when calling lockdep") is
> > the root cause of this boot hang. The commit was reported by git bisect,
> > and reverting it on 3.5-rc2 is confirmed to fix the boot hang.
> 
> I get the hang with your config too. If I disable both LOCKDEP and
> IRQSOFF_TRACER which turns off TRACE_IRQFLAGS then the lockup goes away.
> This is consistent with your findings, as the patch you found is only
> enabled when TRACE_IRQFLAGS is on.
> 
> > 
> >         [ rcu torture tests ]
> > 
> > ...It would hang here before reverting the commit...
> > 
> >         [    1.611901] Testing tracer function: PASSED
> 
> I did a bit more investigation and found that the problem comes from the
> load_idt() command that is used to avoid resetting the debug stack. When
> the function tracer is enabled it adds breakpoints to all locations that
> it is about to trace in order to convert the nops in the functions into
> calls to the tracer. But if the breakpoint handler is traced, the
> breakpoint in the handler will reset the stack and cause a crash.
> 
> Your above observation is correct. Actually I added a printk in my
> testing and found that 'Testing tracer function: ' is output before
> the lockup. I think printk changed recently so that it does not print
> right away if a '\n' is not supplied. The function tracer self test
> prints:
> 
>   'Testing tracer function: '
> 
> Runs the test, and when it succeeds it prints 'PASSED\n'. Because the
> first part wasn't printed, we never saw that the function tracer was
> being tested when the lockup occurred.
> 
> This is a major flaw with the new printk(). It hides what is being
> tested. I need to write a patch to create a fflush() for printk.

Yes, it would definitely help if the printk message was flushed!
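
To make the hiding effect concrete, here is a small userspace model of
it. This is only a sketch and not the printk code: struct linebuf,
lb_print() and lb_flush() are names made up for illustration, and the
"console" is just a string buffer. Text without a trailing '\n' stays
pending, so a hang right after the first print would leave nothing
visible unless something flushes explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical line-buffering logger, a model of the behaviour only. */
struct linebuf {
	char pending[256];	/* text held back until a '\n' arrives */
	size_t len;
	char out[1024];		/* stands in for the console */
	size_t out_len;
};

/* Push whatever is pending to the "console", newline or not. */
static void lb_flush(struct linebuf *lb)
{
	size_t room = sizeof(lb->out) - 1 - lb->out_len;
	size_t n = lb->len < room ? lb->len : room;

	memcpy(lb->out + lb->out_len, lb->pending, n);
	lb->out_len += n;
	lb->out[lb->out_len] = '\0';
	lb->len = 0;
}

/* Buffer the text; it only becomes visible once a '\n' is seen. */
static void lb_print(struct linebuf *lb, const char *s)
{
	assert(lb != NULL && s != NULL);
	for (; *s; s++) {
		if (lb->len == sizeof(lb->pending))
			lb_flush(lb);	/* pending buffer full */
		lb->pending[lb->len++] = *s;
		if (*s == '\n')
			lb_flush(lb);
	}
}
```

With a zero-initialized struct linebuf, lb_print(&lb, "Testing tracer
function: ") leaves lb.out empty; the text shows up only after
lb_flush(&lb) or after a later lb_print(&lb, "PASSED\n").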

> Anyway, load_idt() shouldn't be traced, but with PARAVIRT_GUEST even
> functions marked always_inline end up not inlined, which allows things
> like native_load_idt() to be traced. That causes bad things, as we
> want to call load_idt() to prevent recursion of the int3 handler.
> 
> 
> If it works, can you give me your 'Tested-by'.

It worked, thank you very much!

I applied the patch on top of Linus' master (which is confirmed to hang)
and it booted successfully 3 times:

[    1.450495] Kprobe smoke test started
[    1.459965] Kprobe smoke test passed successfully
[    1.461578] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4 stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_holdoff=0 fqs_stutter=3 test_boost=1/0 test_boost_interval=7 test_boost_duration=4 shutdown_secs=0 onoff_interval=0 onoff_holdoff=0
[    1.689860] Testing tracer function: PASSED
[    2.047064] Testing dynamic ftrace: PASSED
[    2.378968] Testing dynamic ftrace ops #1: (1 0 1 1 0) (1 1 2 1 0) (2 1 3 1 3) (2 2 4 1 21) PASSED

Tested-by: Fengguang Wu <wfg@linux.intel.com>

Thanks,
Fengguang

> diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
> index e5834aa..6a6d7ae 100644
> --- a/include/linux/compiler-gcc.h
> +++ b/include/linux/compiler-gcc.h
> @@ -47,9 +47,9 @@
>   */
>  #if !defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) || \
>      !defined(CONFIG_OPTIMIZE_INLINING) || (__GNUC__ < 4)
> -# define inline		inline		__attribute__((always_inline))
> -# define __inline__	__inline__	__attribute__((always_inline))
> -# define __inline	__inline	__attribute__((always_inline))
> +# define inline		inline		__attribute__((always_inline)) notrace
> +# define __inline__	__inline__	__attribute__((always_inline)) notrace
> +# define __inline	__inline	__attribute__((always_inline)) notrace
>  #else
>  /* A lot of inline functions can cause havoc with function tracing */
>  # define inline		inline		notrace
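
For reference, the effect of the notrace added in the diff above can be
tried in userspace. This is a rough sketch under a few assumptions: it
uses GCC's (or clang's) -finstrument-functions hooks as a stand-in for
the kernel's function tracing, the notrace define mirrors what I
believe the kernel uses for GCC, and plain_helper(), quiet_helper(),
traced_call_count() and check_quiet_helper() are hypothetical names.
The point is that a function forced out of line is still invisible to
the tracer once it carries notrace.

```c
#include <assert.h>

/* Assumption: same attribute the kernel's notrace expands to for GCC. */
#define notrace __attribute__((no_instrument_function))

static int traced_calls;

/*
 * Hooks the compiler calls on function entry/exit when the file is
 * built with -finstrument-functions. They must be notrace themselves,
 * otherwise they would recurse into each other.
 */
notrace void __cyg_profile_func_enter(void *fn, void *call_site)
{
	(void)fn;
	(void)call_site;
	traced_calls++;
}

notrace void __cyg_profile_func_exit(void *fn, void *call_site)
{
	(void)fn;
	(void)call_site;
}

/* Out of line and not marked notrace: gets instrumented. */
__attribute__((noinline)) int plain_helper(int x)
{
	return x + 1;
}

/* Also out of line, but notrace keeps the hooks away from it. */
__attribute__((noinline)) notrace int quiet_helper(int x)
{
	return x + 1;
}

notrace int traced_call_count(void)
{
	return traced_calls;
}

/* Self-check: calling quiet_helper() must never bump the counter. */
notrace void check_quiet_helper(void)
{
	int before = traced_calls;
	int r = quiet_helper(41);

	assert(r == 42);
	assert(traced_calls == before);
}
```

Built with -finstrument-functions, a call to plain_helper() bumps the
counter while quiet_helper() does not; built without the flag, the
counter simply stays at zero and check_quiet_helper() still passes.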
