public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/2] [GIT PULL] tracing: Fix RCU warnings in stack tracer
@ 2015-10-22 13:20 Steven Rostedt
  2015-10-22 13:20 ` [PATCH 1/2] tracing: Have stack tracer force RCU to be watching Steven Rostedt
  2015-10-22 13:20 ` [PATCH 2/2] tracing: Do not allow stack_tracer to record stack in NMI Steven Rostedt
  0 siblings, 2 replies; 4+ messages in thread
From: Steven Rostedt @ 2015-10-22 13:20 UTC
  To: linux-kernel; +Cc: Linus Torvalds, Ingo Molnar, Andrew Morton


Linus,

While running tests on other changes, the system locked up due to a flood
of warnings. They were caused by the stack tracer triggering a warning
about using rcu_dereference() when RCU was not watching. This can happen
because the stack tracer uses the function tracer to check each function,
and there are functions that may be called and traced after RCU has
stopped watching: namely, functions called just before going idle or
returning to userspace, after RCU has stopped watching the current CPU.

The first patch makes sure that RCU is watching when the stack tracer
uses RCU. The second patch is to make sure that the stack tracer does
not get called by functions in NMI, as it's not NMI safe.

Please pull the latest trace-fixes-v4.3-rc6 tree, which can be found at:


  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
trace-fixes-v4.3-rc6

Tag SHA1: 520ef22047724f1fae143e504b4da9737e50d32c
Head SHA1: 1904be1b6bb92058c8e00063dd59df2df294e258


Steven Rostedt (Red Hat) (2):
      tracing: Have stack tracer force RCU to be watching
      tracing: Do not allow stack_tracer to record stack in NMI

----
 kernel/trace/trace_stack.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

* [PATCH 1/2] tracing: Have stack tracer force RCU to be watching
  2015-10-22 13:20 [PATCH 0/2] [GIT PULL] tracing: Fix RCU warnings in stack tracer Steven Rostedt
@ 2015-10-22 13:20 ` Steven Rostedt
  2015-10-29  9:54   ` Paul E. McKenney
  2015-10-22 13:20 ` [PATCH 2/2] tracing: Do not allow stack_tracer to record stack in NMI Steven Rostedt
  1 sibling, 1 reply; 4+ messages in thread
From: Steven Rostedt @ 2015-10-22 13:20 UTC
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, stable,
	Peter Zijlstra, Paul E. McKenney

[-- Attachment #1: 0001-tracing-Have-stack-tracer-force-RCU-to-be-watching.patch --]
[-- Type: text/plain, Size: 2068 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

The stack tracer was triggering the WARN_ON() in module.c:

 static void module_assert_mutex_or_preempt(void)
 {
 #ifdef CONFIG_LOCKDEP
	if (unlikely(!debug_locks))
		return;

	WARN_ON(!rcu_read_lock_sched_held() &&
		!lockdep_is_held(&module_mutex));
 #endif
 }

The reason is that the stack tracer traces all function calls, and some of
those calls happen while exiting or entering user space and idle. Some of
these functions are called after RCU had already stopped watching, as RCU
does not watch userspace or idle CPUs.

If a max stack is hit, then save_stack_trace() is called, which will
check module addresses and call module_assert_mutex_or_preempt(), and then
trigger the warning. The sad part is that the warning itself will also do a
stack trace and trigger the same warning. That probably should be fixed.

The warning was added by 0be964be0d45 "module: Sanitize RCU usage and
locking", but this bug has probably been around longer. It was unlikely to
cause much harm before, but the new warning causes the system to lock up.
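
As a rough illustration of why bracketing the work with an enter/exit pair
silences the warning, here is a minimal userspace sketch. It is not kernel
code: every model_* name is hypothetical, "RCU is watching" is reduced to a
single nesting counter, and the real rcu_read_lock_sched_held() checks more
than this model does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these are the kernel's definitions. */
static int model_rcu_nesting;	/* > 0 means "RCU is watching" in this model */
static int model_warnings;	/* how many times the modeled warning fired */

static bool model_rcu_is_watching(void)
{
	return model_rcu_nesting > 0;
}

/* Stand-ins for rcu_irq_enter() and rcu_irq_exit(). */
static void model_rcu_irq_enter(void)
{
	model_rcu_nesting++;
}

static void model_rcu_irq_exit(void)
{
	assert(model_rcu_nesting > 0);	/* every exit must pair with an enter */
	model_rcu_nesting--;
}

/* Stands in for the RCU check reached through save_stack_trace(). */
static void model_save_stack_trace(void)
{
	if (!model_rcu_is_watching())
		model_warnings++;
}

/* Modeled check_stack(); with_fix selects whether the pair is used. */
static void model_check_stack(bool with_fix)
{
	if (with_fix)
		model_rcu_irq_enter();
	model_save_stack_trace();
	if (with_fix)
		model_rcu_irq_exit();
}
```

Calling model_check_stack(false) while the counter is zero bumps
model_warnings, while model_check_stack(true) does not and leaves the
counter balanced afterwards.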

Cc: stable@vger.kernel.org # 4.2+
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace_stack.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index b746399ab59c..5f29402bff0f 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -88,6 +88,12 @@ check_stack(unsigned long ip, unsigned long *stack)
 	local_irq_save(flags);
 	arch_spin_lock(&max_stack_lock);
 
+	/*
+	 * RCU may not be watching, make it see us.
+	 * The stack trace code uses rcu_sched.
+	 */
+	rcu_irq_enter();
+
 	/* In case another CPU set the tracer_frame on us */
 	if (unlikely(!frame_size))
 		this_size -= tracer_frame;
@@ -169,6 +175,7 @@ check_stack(unsigned long ip, unsigned long *stack)
 	}
 
  out:
+	rcu_irq_exit();
 	arch_spin_unlock(&max_stack_lock);
 	local_irq_restore(flags);
 }
-- 
2.6.1



* [PATCH 2/2] tracing: Do not allow stack_tracer to record stack in NMI
  2015-10-22 13:20 [PATCH 0/2] [GIT PULL] tracing: Fix RCU warnings in stack tracer Steven Rostedt
  2015-10-22 13:20 ` [PATCH 1/2] tracing: Have stack tracer force RCU to be watching Steven Rostedt
@ 2015-10-22 13:20 ` Steven Rostedt
  1 sibling, 0 replies; 4+ messages in thread
From: Steven Rostedt @ 2015-10-22 13:20 UTC
  To: linux-kernel; +Cc: Linus Torvalds, Ingo Molnar, Andrew Morton

[-- Attachment #1: 0002-tracing-Do-not-allow-stack_tracer-to-record-stack-in.patch --]
[-- Type: text/plain, Size: 1021 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

The stack tracer code should not be executed within an NMI, as it grabs
spinlocks, and stack tracing in an NMI can cause a deadlock. Although this
is safe on x86_64, because it does not perform stack traces when the task
struct stack is not in use (interrupts and NMIs), it may be an issue for
NMIs on i386 and other archs where the NMI runs on the same stack as the
task.
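
The deadlock scenario can be sketched with a small single-CPU userspace
model. This is only an illustration under stated assumptions: all model_*
names are hypothetical, the lock is a plain flag standing in for
max_stack_lock, and "would deadlock" is returned as a value where a real
spinlock would spin forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-CPU model; not the kernel's definitions. */
static bool model_lock_held;	/* stands in for max_stack_lock */
static bool model_in_nmi;	/* stands in for in_nmi() */

enum model_result {
	MODEL_RECORDED,
	MODEL_SKIPPED_NMI,
	MODEL_WOULD_DEADLOCK,
};

/* Modeled check_stack(); with_fix selects whether NMI context bails out. */
static enum model_result model_check_stack_nmi(bool with_fix)
{
	if (with_fix && model_in_nmi)
		return MODEL_SKIPPED_NMI;

	if (model_lock_held)
		return MODEL_WOULD_DEADLOCK;	/* the holder cannot run to unlock */

	model_lock_held = true;
	/* ... the stack would be recorded here ... */
	model_lock_held = false;
	return MODEL_RECORDED;
}

/* An NMI arrives while the interrupted code already holds the lock. */
static enum model_result model_nmi_while_locked(bool with_fix)
{
	enum model_result r;

	assert(!model_lock_held);	/* the model starts with the lock free */
	model_lock_held = true;		/* interrupted context took the lock */
	model_in_nmi = true;		/* disabling interrupts does not mask NMIs */
	r = model_check_stack_nmi(with_fix);
	model_in_nmi = false;
	model_lock_held = false;	/* interrupted context resumes, unlocks */
	return r;
}
```

Without the early return, model_nmi_while_locked(false) reports the
deadlock case; with it, the NMI path is skipped and normal context still
records as before.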

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace_stack.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 5f29402bff0f..8abf1ba18085 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -85,6 +85,10 @@ check_stack(unsigned long ip, unsigned long *stack)
 	if (!object_is_on_stack(stack))
 		return;
 
+	/* Can't do this from NMI context (can cause deadlocks) */
+	if (in_nmi())
+		return;
+
 	local_irq_save(flags);
 	arch_spin_lock(&max_stack_lock);
 
-- 
2.6.1



* Re: [PATCH 1/2] tracing: Have stack tracer force RCU to be watching
  2015-10-22 13:20 ` [PATCH 1/2] tracing: Have stack tracer force RCU to be watching Steven Rostedt
@ 2015-10-29  9:54   ` Paul E. McKenney
  0 siblings, 0 replies; 4+ messages in thread
From: Paul E. McKenney @ 2015-10-29  9:54 UTC
  To: Steven Rostedt
  Cc: linux-kernel, Linus Torvalds, Ingo Molnar, Andrew Morton, stable,
	Peter Zijlstra

On Thu, Oct 22, 2015 at 09:20:45AM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
> 
> The stack tracer was triggering the WARN_ON() in module.c:
> 
>  static void module_assert_mutex_or_preempt(void)
>  {
>  #ifdef CONFIG_LOCKDEP
> 	if (unlikely(!debug_locks))
> 		return;
> 
> 	WARN_ON(!rcu_read_lock_sched_held() &&
> 		!lockdep_is_held(&module_mutex));
>  #endif
>  }
> 
> The reason is that the stack tracer traces all function calls, and some of
> those calls happen while exiting or entering user space and idle. Some of
> these functions are called after RCU had already stopped watching, as RCU
> does not watch userspace or idle CPUs.
> 
> If a max stack is hit, then save_stack_trace() is called, which will
> check module addresses and call module_assert_mutex_or_preempt(), and then
> trigger the warning. The sad part is that the warning itself will also do a
> stack trace and trigger the same warning. That probably should be fixed.
> 
> The warning was added by 0be964be0d45 "module: Sanitize RCU usage and
> locking", but this bug has probably been around longer. It was unlikely to
> cause much harm before, but the new warning causes the system to lock up.
> 
> Cc: stable@vger.kernel.org # 4.2+
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

It does look like it will make RCU pay attention in the region of interest.

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  kernel/trace/trace_stack.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index b746399ab59c..5f29402bff0f 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -88,6 +88,12 @@ check_stack(unsigned long ip, unsigned long *stack)
>  	local_irq_save(flags);
>  	arch_spin_lock(&max_stack_lock);
> 
> +	/*
> +	 * RCU may not be watching, make it see us.
> +	 * The stack trace code uses rcu_sched.
> +	 */
> +	rcu_irq_enter();
> +
>  	/* In case another CPU set the tracer_frame on us */
>  	if (unlikely(!frame_size))
>  		this_size -= tracer_frame;
> @@ -169,6 +175,7 @@ check_stack(unsigned long ip, unsigned long *stack)
>  	}
> 
>   out:
> +	rcu_irq_exit();
>  	arch_spin_unlock(&max_stack_lock);
>  	local_irq_restore(flags);
>  }
> -- 
> 2.6.1
> 
> 

