Re: [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer

stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org, ngo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	stable@vger.kernel.org
Subject: Re: [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer
Date: Fri, 22 Sep 2017 23:07:37 -0700	[thread overview]
Message-ID: <20170923060737.GC3521@linux.vnet.ibm.com> (raw)
In-Reply-To: <20170922212739.6d7d4d01@gandalf.local.home>

On Fri, Sep 22, 2017 at 09:27:39PM -0400, Steven Rostedt wrote:
> On Fri, 22 Sep 2017 15:54:55 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Fri, Sep 22, 2017 at 06:15:47PM -0400, Steven Rostedt wrote:
> > > From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> > > 
> > > While debugging some RCU issues with the stack tracer, it was discovered
> > > that the problem was much more than with the stack tracer itself, but with
> > > the saving of the stack trace, which could happen from any WARN() as well.
> > > The problem was fixed within kernel_text_address().
> > > 
> > > One of the bugs that was discovered was that the stack tracer called
> > > rcu_enter_irq() unconditionally. Paul McKenney said that could cause issues
> > > as well. Instead of adding logic to only call rcu_enter_irq() if RCU is not
> > > watching from within the stack tracer, since the core issue has been fixed
> > > (within save_stack_trace()), we can simply remove all the logic in the stack
> > > tracer that deals with RCU work arounds.  
> > 
> > I must confess that I am having some difficulty parsing this paragraph,
> > especially the last sentence...
> > 
> > Does this capture it?
> > 
> > 	One problem is that the stack tracer called rcu_irq_enter()
> > 	unconditionally, which is problematic if RCU's last
> > 	not-watching-to-watching transition was carried out by
> > 	rcu_nmi_enter.	In that case, rcu_irq_enter() actually switches
> 
> I thought the rcu_irq_enter() after rcu_nmi_enter() was a separate bug.
> Your original complaint was that I called rcu_irq_enter()
> unconditionally, and wanted me to only call it if RCU wasn't watching.
> 
> But, the new code could possibly have this get called after
> rcu_nmi_enter() because we are calling it without in_nmi() being set.

You are correct.

The initial bug was rcu_irq_enter() being invoked recursively due
to tracing of some of its called functions, which caused RCU to not
have been watching when it should have.  I then advised you to switch
to rcu_nmi_enter(), unaware that there would be calls to rcu_irq_enter()
between the rcu_nmi_enter() and its matching rcu_nmi_exit().  Which
led us to the second problem, described in my suggested paragraph.

> > 	RCU back to the not-watching state for this CPU, which results
> > 	in lockdep splats complaining about rcu_read_lock() being
> > 	used on an idle (not-watched) CPU.  The first patch of this
> > 	series addressed this problem by having rcu_irq_enter() and
> > 	rcu_irq_exit() refrain from doing anything when rcu_nmi_enter()
> > 	caused RCU to start watching this CPU.	The third patch in this
> > 	series caused save_stack_trace() to invoke rcu_nmi_enter() and
> > 	rcu_nmi_exit() as needed, so this fourth patch now removes the
> > 	rcu_irq_enter() and rcu_irq_exit() from within the stack tracer.
> > 
> > One further question...  Can I now remove the rcu_irq_enter_disabled()
> > logic?
> 
> After this goes in. Yes. But that doesn't need to be a stable change.

Good, at least a little simplification out of this.  ;-)

> > > Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > Cc: stable@vger.kernel.org
> > > Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
> > > Suggested-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>  
> > 
> > With the hard-to-parse paragraph fixed:
> > 
> > Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> Thanks, but we may need to go back and forth a bit to get the change
> log correct.
> 
> Remember the first bug. The one that was fixed by changing
> rcu_irq_enter() to rcu_nmi_enter()?
> 
> =============================
>  WARNING: suspicious RCU usage
>  4.13.0-rc7-test+ #117 Tainted: G        W      
>  -----------------------------
>  /work/git/linux-trace.git/arch/x86/kernel/traps.c:305 entry code didn't wake RCU!
> 
>  other info that might help us debug this:
> 
> 
>  RCU used illegally from idle CPU!
>  rcu_scheduler_active = 2, debug_locks = 1
>  RCU used illegally from extended quiescent state!
>  no locks held by swapper/1/0.
> 
>  stack backtrace:
>  CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W       4.13.0-rc7-test+ #117
>  Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
>  Call Trace:
>   dump_stack+0x86/0xcf
>   lockdep_rcu_suspicious+0xc5/0x100
>   do_error_trap+0x125/0x130
>   ? do_error_trap+0x5/0x130
>   ? trace_hardirqs_off_thunk+0x1a/0x1c
>   ? do_invalid_op+0x5/0x30
>   do_invalid_op+0x20/0x30
>   invalid_op+0x1e/0x30
>  RIP: 0010:module_assert_mutex_or_preempt+0x34/0x40
>  RSP: 0018:ffffc900006abc58 EFLAGS: 00010046
>  RAX: 0000000000000000 RBX: ffffffffa000a077 RCX: 0000000000000002
>  RDX: 0000000000000000 RSI: 00000000ffffffff RDI: 0000000000000046
>  RBP: ffffc900006abc58 R08: ffffc900006abf40 R09: 0000000000000000
>  R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>  R13: 0000000000000000 R14: ffff8801188d8040 R15: ffffffff81ed5720
>   ? 0xffffffffa000a077
>   ? module_assert_mutex_or_preempt+0x30/0x40
>   __module_address+0x2c/0xf0
>   ? 0xffffffffa000a077
>   __module_text_address+0x12/0x60
>   ? 0xffffffffa000a077
>   is_module_text_address+0x1f/0x50
>   ? 0xffffffffa000a077
>   __kernel_text_address+0x30/0x90
>   unwind_get_return_address+0x1f/0x30
>   __save_stack_trace+0x83/0xd0
>   ? 0xffffffffa000a077
>   ? rcu_dynticks_eqs_exit+0x5/0x40
>   save_stack_trace+0x1b/0x20
>   check_stack+0xf8/0x2f0
>   ? rcu_dynticks_eqs_enter+0x30/0x30
>   stack_trace_call+0x6e/0x80
>   0xffffffffa000a077
>   ? ftrace_graph_caller+0x78/0xa8
>   ? rcu_dynticks_eqs_exit+0x5/0x40
>   rcu_dynticks_eqs_exit+0x5/0x40
>   rcu_idle_exit+0xdf/0xf0
>   ? rcu_dynticks_eqs_exit+0x5/0x40
>   ? rcu_idle_exit+0xdf/0xf0
>   do_idle+0x128/0x200
>   cpu_startup_entry+0x1d/0x20
>   start_secondary+0x108/0x130
>   secondary_startup_64+0x9f/0x9f
> 
> This was caused by just using rcu_irq_enter(). Not sure if this will
> still be an issue or not. But because we now add an rcu_nmi_enter()
> without being in_nmi(), we probably should do this. This code doesn't
> run if in_nmi() is true, but it could run from the stack trace dump
> itself, and that now calls rcu_nmi_enter().

OK, how about the following?

	It turns out that functions called from rcu_irq_enter() can
	be subject to various kinds of tracing, which can result in
	rcu_irq_enter() being invoked recursively.  This recursion
	causes RCU to not have been watching when it should have,
	resulting in lockdep-RCU splats.  Switching from rcu_irq_enter()
	to rcu_nmi_enter() still resulted in failures because of calls
	to rcu_irq_enter() between the rcu_nmi_enter() and its matching
	rcu_nmi_exit().  Such calls again cause RCU to not be watching
	when it should have been.

	In particular, the stack tracer called rcu_irq_enter()
	unconditionally, which is problematic when RCU's last
	not-watching-to-watching transition was carried out by
	rcu_nmi_enter(), as will be the case when tracing uses
	rcu_nmi_enter() to cause RCU to start watching the current CPU.
	In that case, rcu_irq_enter() actually switches RCU back to
	the not-watching state for this CPU, which results in lockdep
	splats complaining about rcu_read_lock() being used on an idle
	(not-watched) CPU.  The first patch of this series addressed
	this problem by having rcu_irq_enter() and rcu_irq_exit()
	refrain from doing anything when rcu_nmi_enter() caused RCU to
	start watching this CPU.  The third patch in this series caused
	save_stack_trace() to invoke rcu_nmi_enter() and rcu_nmi_exit()
	as needed, so this fourth patch now removes the rcu_irq_enter()
	and rcu_irq_exit() from within the stack tracer.

> Actually, thinking about this more, this doesn't need to go in stable.
> As recursive rcu_irq_enter() calls should not hurt, and you now allow
> rcu_irq_enter() to be called even after a rcu_nmi_enter() right?

Yes, it is now the case that rcu_irq_enter() can be called even after
an rcu_nmi_enter() exited idle, because rcu_irq_enter() now checks for
this and takes an early exit if so.

But what is it about older kernels prevents the tracing-induced recursive
calls to rcu_irq_enter()?

							Thanx, Paul

next prev parent reply	other threads:[~2017-09-23  6:07 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20170922221543.900812109@goodmis.org>
2017-09-22 22:15 ` [PATCH 2/4] extable: Consolidate *kernel_text_address() functions Steven Rostedt
2017-09-22 22:40   ` Paul E. McKenney
2017-09-22 22:15 ` [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address() Steven Rostedt
2017-09-22 22:28   ` Josh Poimboeuf
2017-09-23  1:12     ` [PATCH 3/4 v2] " Steven Rostedt
2017-09-22 22:44   ` [PATCH 3/4] " Paul E. McKenney
2017-09-23  1:09     ` Steven Rostedt
2017-09-22 22:15 ` [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer Steven Rostedt
2017-09-22 22:54   ` Paul E. McKenney
2017-09-23  1:27     ` Steven Rostedt
2017-09-23  6:07       ` Paul E. McKenney [this message]
2017-09-23 11:22         ` Steven Rostedt
2017-09-23 17:15           ` Paul E. McKenney
     [not found] <20170923205621.811805556@goodmis.org>
2017-09-23 20:56 ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170923060737.GC3521@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).