From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC][PATCH] tracing: Have stack tracer force RCU to be watching
Date: Tue, 20 Oct 2015 13:25:28 -0700
Message-ID: <20151020202528.GF5105@linux.vnet.ibm.com>
In-Reply-To: <20151020121031.2f9d72c1@gandalf.local.home>

On Tue, Oct 20, 2015 at 12:10:31PM -0400, Steven Rostedt wrote:
> 
> Paul,
> 
> I've spent a couple of days debugging this, and finally found that my
> stack tracer was calling the stack trace code, which calls
> __module_address() which asserts the below.
> 
> Is just calling rcu_irq_enter() and rcu_irq_exit() safe to do
> everywhere (with interrupts always disabled)? This patch appears to fix
> the bug.

Yep!  Just don't call it from an NMI handler.  And don't call it with
interrupts enabled.  The patch looks to have interrupts always disabled,
and the surrounding code doesn't look like NMI-safe code anyway, so it
should be OK.

							Thanx, Paul
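
To illustrate the constraints above, here is a minimal userspace sketch,
not kernel code.  The struct cpu_model and every model_* function are
hypothetical stand-ins invented for this illustration; the real
rcu_irq_enter() and rcu_irq_exit() are implemented quite differently.
The sketch only models the idea that an enter/exit pair brackets a region
in which RCU is watching, and that the pair expects interrupts to be
disabled and must not be used from NMI context.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of per-CPU state; not the kernel's data structures. */
struct cpu_model {
	bool irqs_disabled;	/* models local_irq_save()/local_irq_restore() */
	bool in_nmi;		/* true while "inside" an NMI handler */
	int watch_nesting;	/* 0 means RCU is not watching this CPU */
};

bool model_rcu_is_watching(const struct cpu_model *cpu)
{
	return cpu->watch_nesting > 0;
}

/* Stand-in for rcu_irq_enter(): asserts the two rules from the reply. */
void model_rcu_irq_enter(struct cpu_model *cpu)
{
	assert(cpu->irqs_disabled);	/* never with interrupts enabled */
	assert(!cpu->in_nmi);		/* never from an NMI handler */
	cpu->watch_nesting++;
}

/* Stand-in for rcu_irq_exit(): must pair with a previous enter. */
void model_rcu_irq_exit(struct cpu_model *cpu)
{
	assert(cpu->irqs_disabled);
	assert(cpu->watch_nesting > 0);
	cpu->watch_nesting--;
}

/*
 * Models the shape of the patched check_stack(): disable interrupts,
 * make RCU watch, do the work that needs RCU, then undo both steps.
 * Returns whether RCU was watching while the work ran.
 */
bool model_check_stack(struct cpu_model *cpu)
{
	bool saved = cpu->irqs_disabled;
	bool watched;

	cpu->irqs_disabled = true;		/* local_irq_save() */
	model_rcu_irq_enter(cpu);

	watched = model_rcu_is_watching(cpu);	/* stand-in for the stack trace */

	model_rcu_irq_exit(cpu);
	cpu->irqs_disabled = saved;		/* local_irq_restore() */
	return watched;
}
```

Calling model_check_stack() on a model CPU whose watch_nesting is 0 (as if
idle or in userspace) reports that RCU was watching inside the bracket and
leaves the nesting count and interrupt state as they were found.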

> Peter,
> 
> I'm going to be sending a second patch that converts that from a
> WARN_ON() to an open coded WARN_ON_ONCE(), because WARN_ON() also calls
> the stack trace code which calls __module_address() and we end up with
> an infinite warning about it. This prevented me from seeing where the
> bug actually was, and crashed the box.
> 
> -- Steve
> 
> 
> 
> From a2d7629048322ae62bff57f34f5f995e25ed234c Mon Sep 17 00:00:00 2001
> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
> Date: Tue, 20 Oct 2015 11:38:08 -0400
> Subject: [PATCH] tracing: Have stack tracer force RCU to be watching
> 
> The stack tracer was triggering the WARN_ON() in module.c:
> 
>  static void module_assert_mutex_or_preempt(void)
>  {
>  #ifdef CONFIG_LOCKDEP
> 	if (unlikely(!debug_locks))
> 		return;
> 
> 	WARN_ON(!rcu_read_lock_sched_held() &&
> 		!lockdep_is_held(&module_mutex));
>  #endif
>  }
> 
> The reason is that the stack tracer traces all function calls, and some of
> those calls happen while exiting or entering user space and idle. Some of
> these functions are called after RCU has already stopped watching, as RCU
> does not watch userspace or idle CPUs.
> 
> If a max stack is hit, then save_stack_trace() is called, which will
> check module addresses and call module_assert_mutex_or_preempt(), and then
> trigger the warning. Sad part is, the warning itself will also do a stack
> trace and trigger the same warning. That probably should be fixed.
> 
> The warning was added by 0be964be0d45 "module: Sanitize RCU usage and
> locking", but this bug has probably been around longer. On its own it is
> unlikely to cause much harm, but the new warning causes the system to
> lock up.
> 
> Cc: stable@vger.kernel.org # 4.2+
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  kernel/trace/trace_stack.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index b746399ab59c..5f29402bff0f 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -88,6 +88,12 @@ check_stack(unsigned long ip, unsigned long *stack)
>  	local_irq_save(flags);
>  	arch_spin_lock(&max_stack_lock);
>  
> +	/*
> +	 * RCU may not be watching, make it see us.
> +	 * The stack trace code uses rcu_sched.
> +	 */
> +	rcu_irq_enter();
> +
>  	/* In case another CPU set the tracer_frame on us */
>  	if (unlikely(!frame_size))
>  		this_size -= tracer_frame;
> @@ -169,6 +175,7 @@ check_stack(unsigned long ip, unsigned long *stack)
>  	}
>  
>   out:
> +	rcu_irq_exit();
>  	arch_spin_unlock(&max_stack_lock);
>  	local_irq_restore(flags);
>  }
> -- 
> 1.8.3.1
> 
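
The recursion described in the quoted message (the warning dumps the
stack, the dump runs the assertion again, which warns again) can be
modelled in userspace.  This is only a sketch under stated assumptions:
every name below is hypothetical, the depth cap exists solely so that the
model terminates, and the once-only variant shows one possible way a guard
can break the cycle (setting its flag before reporting), not necessarily
what the follow-up patch does.

```c
#include <stdbool.h>

#define MODEL_MAX_DEPTH 64	/* hypothetical cap so the model terminates */

int model_plain_reports;	/* warnings emitted by the plain variant */
int model_once_reports;		/* warnings emitted by the once-only variant */
bool model_condition_bad = true;	/* stands in for "RCU is not watching" */

void model_stack_dump_plain(int depth);
void model_stack_dump_once(void);

/* Plain variant: every failed check reports, and reporting re-checks. */
void model_assert_plain(int depth)
{
	if (model_condition_bad && depth < MODEL_MAX_DEPTH) {
		model_plain_reports++;
		model_stack_dump_plain(depth + 1);
	}
}

/* The stack dump looks up addresses, which runs the assertion again. */
void model_stack_dump_plain(int depth)
{
	model_assert_plain(depth);
}

/* Once-only variant: the flag is set before the report is produced. */
void model_assert_once(void)
{
	static bool warned;

	if (model_condition_bad && !warned) {
		warned = true;	/* re-entry from the dump is now silent */
		model_once_reports++;
		model_stack_dump_once();
	}
}

void model_stack_dump_once(void)
{
	model_assert_once();
}
```

In this model the plain variant keeps reporting until the artificial depth
cap stops it, while the once-only variant reports a single time no matter
how often it is re-entered.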

