Re: [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org, ngo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	stable@vger.kernel.org
Subject: Re: [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer
Date: Fri, 22 Sep 2017 23:07:37 -0700	[thread overview]
Message-ID: <20170923060737.GC3521@linux.vnet.ibm.com> (raw)
In-Reply-To: <20170922212739.6d7d4d01@gandalf.local.home>

On Fri, Sep 22, 2017 at 09:27:39PM -0400, Steven Rostedt wrote:
> On Fri, 22 Sep 2017 15:54:55 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Fri, Sep 22, 2017 at 06:15:47PM -0400, Steven Rostedt wrote:
> > > From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> > > 
> > > While debugging some RCU issues with the stack tracer, it was discovered
> > > that the problem was much more than with the stack tracer itself, but with
> > > the saving of the stack trace, which could happen from any WARN() as well.
> > > The problem was fixed within kernel_text_address().
> > > 
> > > One of the bugs that was discovered was that the stack tracer called
> > > rcu_enter_irq() unconditionally. Paul McKenney said that could cause issues
> > > as well. Instead of adding logic to only call rcu_enter_irq() if RCU is not
> > > watching from within the stack tracer, since the core issue has been fixed
> > > (within save_stack_trace()), we can simply remove all the logic in the stack
> > > tracer that deals with RCU work arounds.  
> > 
> > I must confess that I am having some difficulty parsing this paragraph,
> > especially the last sentence...
> > 
> > Does this capture it?
> > 
> > 	One problem is that the stack tracer called rcu_irq_enter()
> > 	unconditionally, which is problematic if RCU's last
> > 	not-watching-to-watching transition was carried out by
> > 	rcu_nmi_enter.	In that case, rcu_irq_enter() actually switches
> 
> I thought the rcu_irq_enter() after rcu_nmi_enter() was a separate bug.
> Your original complaint was that I called rcu_irq_enter()
> unconditionally, and wanted me to only call it if RCU wasn't watching.
> 
> But, the new code could possibly have this get called after
> rcu_nmi_enter() because we are calling it without in_nmi() being set.

You are correct.

The initial bug was rcu_irq_enter() being invoked recursively due
to tracing of some of its called functions, which caused RCU to not
have been watching when it should have.  I then advised you to switch
to rcu_nmi_enter(), unaware that there would be calls to rcu_irq_enter()
between the rcu_nmi_enter() and its matching rcu_nmi_exit().  Which
led us to the second problem, described in my suggested paragraph.

> > 	RCU back to the not-watching state for this CPU, which results
> > 	in lockdep splats complaining about rcu_read_lock() being
> > 	used on an idle (not-watched) CPU.  The first patch of this
> > 	series addressed this problem by having rcu_irq_enter() and
> > 	rcu_irq_exit() refrain from doing anything when rcu_nmi_enter()
> > 	caused RCU to start watching this CPU.	The third patch in this
> > 	series caused save_stack_trace() to invoke rcu_nmi_enter() and
> > 	rcu_nmi_exit() as needed, so this fourth patch now removes the
> > 	rcu_irq_enter() and rcu_irq_exit() from within the stack tracer.
> > 
> > One further question...  Can I now remove the rcu_irq_enter_disabled()
> > logic?
> 
> After this goes in. Yes. But that doesn't need to be a stable change.

Good, at least a little simplification out of this.  ;-)

> > > Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > Cc: stable@vger.kernel.org
> > > Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
> > > Suggested-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>  
> > 
> > With the hard-to-parse paragraph fixed:
> > 
> > Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> Thanks, but we may need to go back and forth a bit to get the change
> log correct.
> 
> Remember the first bug. The one that was fixed by changing
> rcu_irq_enter() to rcu_nmi_enter()?
> 
> =============================
>  WARNING: suspicious RCU usage
>  4.13.0-rc7-test+ #117 Tainted: G        W      
>  -----------------------------
>  /work/git/linux-trace.git/arch/x86/kernel/traps.c:305 entry code didn't wake RCU!
> 
>  other info that might help us debug this:
> 
> 
>  RCU used illegally from idle CPU!
>  rcu_scheduler_active = 2, debug_locks = 1
>  RCU used illegally from extended quiescent state!
>  no locks held by swapper/1/0.
> 
>  stack backtrace:
>  CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W       4.13.0-rc7-test+ #117
>  Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
>  Call Trace:
>   dump_stack+0x86/0xcf
>   lockdep_rcu_suspicious+0xc5/0x100
>   do_error_trap+0x125/0x130
>   ? do_error_trap+0x5/0x130
>   ? trace_hardirqs_off_thunk+0x1a/0x1c
>   ? do_invalid_op+0x5/0x30
>   do_invalid_op+0x20/0x30
>   invalid_op+0x1e/0x30
>  RIP: 0010:module_assert_mutex_or_preempt+0x34/0x40
>  RSP: 0018:ffffc900006abc58 EFLAGS: 00010046
>  RAX: 0000000000000000 RBX: ffffffffa000a077 RCX: 0000000000000002
>  RDX: 0000000000000000 RSI: 00000000ffffffff RDI: 0000000000000046
>  RBP: ffffc900006abc58 R08: ffffc900006abf40 R09: 0000000000000000
>  R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>  R13: 0000000000000000 R14: ffff8801188d8040 R15: ffffffff81ed5720
>   ? 0xffffffffa000a077
>   ? module_assert_mutex_or_preempt+0x30/0x40
>   __module_address+0x2c/0xf0
>   ? 0xffffffffa000a077
>   __module_text_address+0x12/0x60
>   ? 0xffffffffa000a077
>   is_module_text_address+0x1f/0x50
>   ? 0xffffffffa000a077
>   __kernel_text_address+0x30/0x90
>   unwind_get_return_address+0x1f/0x30
>   __save_stack_trace+0x83/0xd0
>   ? 0xffffffffa000a077
>   ? rcu_dynticks_eqs_exit+0x5/0x40
>   save_stack_trace+0x1b/0x20
>   check_stack+0xf8/0x2f0
>   ? rcu_dynticks_eqs_enter+0x30/0x30
>   stack_trace_call+0x6e/0x80
>   0xffffffffa000a077
>   ? ftrace_graph_caller+0x78/0xa8
>   ? rcu_dynticks_eqs_exit+0x5/0x40
>   rcu_dynticks_eqs_exit+0x5/0x40
>   rcu_idle_exit+0xdf/0xf0
>   ? rcu_dynticks_eqs_exit+0x5/0x40
>   ? rcu_idle_exit+0xdf/0xf0
>   do_idle+0x128/0x200
>   cpu_startup_entry+0x1d/0x20
>   start_secondary+0x108/0x130
>   secondary_startup_64+0x9f/0x9f
> 
> This was caused by just using rcu_irq_enter(). Not sure if this will
> still be an issue or not. But because we now add an rcu_nmi_enter()
> without being in_nmi(), we probably should do this. This code doesn't
> run if in_nmi() is true, but it could run from the stack trace dump
> itself, and that now calls rcu_nmi_enter().

OK, how about the following?

	It turns out that functions called from rcu_irq_enter() can
	be subject to various kinds of tracing, which can result in
	rcu_irq_enter() being invoked recursively.  This recursion
	causes RCU to not have been watching when it should have,
	resulting in lockdep-RCU splats.  Switching from rcu_irq_enter()
	to rcu_nmi_enter() still resulted in failures because of calls
	to rcu_irq_enter() between the rcu_nmi_enter() and its matching
	rcu_nmi_exit().  Such calls again cause RCU to not be watching
	when it should have been.

	In particular, the stack tracer called rcu_irq_enter()
	unconditionally, which is problematic when RCU's last
	not-watching-to-watching transition was carried out by
	rcu_nmi_enter(), as will be the case when tracing uses
	rcu_nmi_enter() to cause RCU to start watching the current CPU.
	In that case, rcu_irq_enter() actually switches RCU back to
	the not-watching state for this CPU, which results in lockdep
	splats complaining about rcu_read_lock() being used on an idle
	(not-watched) CPU.  The first patch of this series addressed
	this problem by having rcu_irq_enter() and rcu_irq_exit()
	refrain from doing anything when rcu_nmi_enter() caused RCU to
	start watching this CPU.  The third patch in this series caused
	save_stack_trace() to invoke rcu_nmi_enter() and rcu_nmi_exit()
	as needed, so this fourth patch now removes the rcu_irq_enter()
	and rcu_irq_exit() from within the stack tracer.

> Actually, thinking about this more, this doesn't need to go in stable.
> As recursive rcu_irq_enter() calls should not hurt, and you now allow
> rcu_irq_enter() to be called even after a rcu_nmi_enter() right?

Yes, it is now the case that rcu_irq_enter() can be called even after
an rcu_nmi_enter() exited idle, because rcu_irq_enter() now checks for
this and takes an early exit if so.

But what is it about older kernels prevents the tracing-induced recursive
calls to rcu_irq_enter()?

							Thanx, Paul

next prev parent reply	other threads:[~2017-09-23  6:07 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-22 22:15 [PATCH 0/4] rcu/tracing/extable: Fix stack dump when RCU is not watching Steven Rostedt
2017-09-22 22:15 ` [PATCH 1/4] rcu: Allow for page faults in NMI handlers Steven Rostedt
2017-09-22 22:15 ` [PATCH 2/4] extable: Consolidate *kernel_text_address() functions Steven Rostedt
2017-09-22 22:40   ` Paul E. McKenney
2017-09-22 22:15 ` [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address() Steven Rostedt
2017-09-22 22:28   ` Josh Poimboeuf
2017-09-23  1:12     ` [PATCH 3/4 v2] " Steven Rostedt
2017-09-22 22:44   ` [PATCH 3/4] " Paul E. McKenney
2017-09-23  1:09     ` Steven Rostedt
2017-09-22 22:15 ` [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer Steven Rostedt
2017-09-22 22:54   ` Paul E. McKenney
2017-09-23  1:27     ` Steven Rostedt
2017-09-23  6:07       ` Paul E. McKenney [this message]
2017-09-23 11:22         ` Steven Rostedt
2017-09-23 17:15           ` Paul E. McKenney
  -- strict thread matches above, loose matches on Subject: below --
2017-09-23 20:56 [PATCH 0/4] [GIT PULL] tracing/rcu: Fix save_stack_trace() called when RCU is not watching Steven Rostedt
2017-09-23 20:56 ` [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170923060737.GC3521@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.