linuxppc-dev.lists.ozlabs.org archive mirror
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Michael Ellerman <michael@ellerman.id.au>
Cc: Rojhalat Ibrahim <imr@rtschenk.de>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	linux-kernel@vger.kernel.org,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: Regression in RCU subsystem in latest mainline kernel
Date: Tue, 25 Jun 2013 09:03:32 -0700
Message-ID: <20130625160332.GA3828@linux.vnet.ibm.com>
In-Reply-To: <20130625074422.GB29957@concordia>

On Tue, Jun 25, 2013 at 05:44:23PM +1000, Michael Ellerman wrote:
> On Tue, Jun 25, 2013 at 05:19:14PM +1000, Michael Ellerman wrote:
> > 
> > Here's another trace from 3.10-rc7 plus a few local patches.
> 
> And here's another with CONFIG_RCU_CPU_STALL_INFO=y in case that's useful:
> 
> PASS running test_pmc5_6_overuse()
> INFO: rcu_sched self-detected stall on CPU
> 	8: (1 GPs behind) idle=8eb/140000000000002/0 softirq=215/220 

So this CPU has been out of action since before the beginning of the
current grace period ("1 GPs behind").  It is not idle, having taken
a pair of nested interrupts from process context (matching the stack
below).  This CPU has taken five softirqs since the last grace period
that it noticed (220 - 215), which makes it likely that the loop is
within the softirq handler.
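
For readers who want to repeat this reading on other reports, here is a
minimal sketch in Python that pulls the same facts out of the per-CPU
line.  It is an illustration, not kernel code.  The names
decode_stall_line and PROCESS_CONTEXT_BASE are hypothetical, and two
points are assumptions inferred from the reading above: that an odd
first idle= field means the CPU is non-idle, and that the second field
counts nested interrupts above a process-context baseline of
0x140000000000000.

```python
import re

# Assumption: value of the nesting counter for a task in process context
# with no interrupts taken. Inferred from the reading above, not from the
# report itself.
PROCESS_CONTEXT_BASE = 0x140000000000000

# Matches only the "(N GPs behind)" form of the per-CPU stall line.
STALL_LINE = re.compile(
    r"(?P<cpu>\d+): \((?P<gps>\d+) GPs behind\) "
    r"idle=(?P<dynticks>[0-9a-f]+)/(?P<nesting>[0-9a-f]+)/(?P<nmi>\d+) "
    r"softirq=(?P<snap>\d+)/(?P<now>\d+)"
)


def decode_stall_line(line):
    """Return the facts discussed above as a dictionary."""
    m = STALL_LINE.search(line)
    if m is None:
        raise ValueError("not a recognised per-CPU stall line")
    dynticks = int(m["dynticks"], 16)
    nesting = int(m["nesting"], 16)
    return {
        "cpu": int(m["cpu"]),
        "gps_behind": int(m["gps"]),
        # Assumption: an odd counter value means the CPU is not idle.
        "non_idle": dynticks % 2 == 1,
        # Interrupt levels stacked on top of process context.
        "irq_nesting": nesting - PROCESS_CONTEXT_BASE,
        # Softirqs taken since the last grace period this CPU noticed.
        "softirqs_since_gp": int(m["now"]) - int(m["snap"]),
    }


info = decode_stall_line(
    "\t8: (1 GPs behind) idle=8eb/140000000000002/0 softirq=215/220 "
)
print(info)
```

The sketch deliberately rejects lines it does not recognise rather than
guessing at other layouts of the stall message.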

> 	 (t=2100 jiffies g=18446744073709551583 c=18446744073709551582 q=13)

Assuming HZ=100, this stall has been going on for 21 seconds.  There
is a grace period in progress according to RCU's global state (which
this CPU is not yet aware of).  There are a total of 13 RCU callbacks
queued across the entire system.
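
The arithmetic behind that reading can be sketched as follows, again in
Python and purely as an illustration.  The function names are
hypothetical, HZ=100 is the same assumption as above, and treating
"g differs from c" as "a grace period is in progress" is my summary of
the reading above rather than something the report states.  The g and c
values are printed as unsigned 64-bit numbers, so the sketch also shows
them as signed values, which are easier to compare by eye.

```python
import re


def to_signed64(value):
    """Reinterpret an unsigned 64-bit counter as a signed value."""
    return value - (1 << 64) if value >= (1 << 63) else value


def decode_stall_summary(line, hz=100):
    """Decode the "(t=... g=... c=... q=...)" summary line."""
    m = re.search(r"t=(\d+) jiffies g=(\d+) c=(\d+) q=(\d+)", line)
    if m is None:
        raise ValueError("not a recognised stall summary line")
    t, g, c, q = (int(x) for x in m.groups())
    return {
        # Stall duration; hz is an assumption about the kernel config.
        "stall_seconds": t / hz,
        "gp_started": to_signed64(g),
        "gp_completed": to_signed64(c),
        # Assumption: started != completed means a grace period is running.
        "gp_in_progress": g != c,
        "callbacks_queued": q,
    }


summary = decode_stall_summary(
    "(t=2100 jiffies g=18446744073709551583 c=18446744073709551582 q=13)"
)
print(summary)
```

If HZ is something other than 100 on the machine in question, pass it as
the hz argument; the 21-second figure scales accordingly.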

If the system is at all responsive, I suggest using ftrace (either from
the boot command line or at runtime) to trace __do_softirq() and
hrtimer_interrupt().
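
A minimal sketch of the runtime variant, assuming the tracing control
files are reachable under /sys/kernel/debug/tracing and that the
function tracer is built into the kernel.  The helper name
configure_function_trace is hypothetical.  The symbol names may need
adjusting for the platform (check available_filter_functions in the
same directory for the exact spelling), and the script needs root.

```python
import os


def configure_function_trace(tracing_dir, functions):
    """Limit the function tracer to the given functions and start it."""

    def write(name, value):
        # Each control file takes a short text value followed by newline.
        with open(os.path.join(tracing_dir, name), "w") as f:
            f.write(value + "\n")

    # Set the filter first so that only these functions are traced
    # once the tracer is switched on.
    write("set_ftrace_filter", " ".join(functions))
    write("current_tracer", "function")
    write("tracing_on", "1")


# Example call (assumed path, run as root on the affected machine):
# configure_function_trace("/sys/kernel/debug/tracing",
#                          ["__do_softirq", "hrtimer_interrupt"])
```

The output can then be read from the trace file in the same directory.
For the boot-time variant, the ftrace= and ftrace_filter= kernel
parameters serve the same purpose; the exact values depend on the
kernel configuration.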

							Thanx, Paul

> cpu 0x8: Vector: 0  at [c0000003ea03eae0]
>     pc: c00000000011d9b0: .rcu_check_callbacks+0x450/0x910
>     lr: c00000000011d9b0: .rcu_check_callbacks+0x450/0x910
>     sp: c0000003ea03ec40
>    msr: 9000000000009032
>   current = 0xc0000003ebf9f4a0
>   paca    = 0xc00000000fdc2400	 softe: 0	 irq_happened: 0x00
>     pid   = 2444, comm = power8-events
> enter ? for help
> [c0000003ea03ed70] c000000000094cd0 .update_process_times+0x40/0x90
> [c0000003ea03ee00] c0000000000df050 .tick_sched_handle.isra.13+0x20/0xa0
> [c0000003ea03ee80] c0000000000df2bc .tick_sched_timer+0x5c/0xa0
> [c0000003ea03ef20] c0000000000b3728 .__run_hrtimer+0x98/0x260
> [c0000003ea03efc0] c0000000000b4738 .hrtimer_interrupt+0x138/0x3c0
> [c0000003ea03f0d0] c00000000001cd34 .timer_interrupt+0x124/0x2f0
> [c0000003ea03f180] c00000000000a4f4 restore_check_irq_replay+0x68/0xa8
> --- Exception: 901 (Decrementer) at c000000000093ad4 .run_timer_softirq+0x74/0x360
> [c0000003ea03f580] c000000000089ac4 .__do_softirq+0x174/0x350
> [c0000003ea03f6a0] c000000000089ea8 .irq_exit+0xb8/0x100
> [c0000003ea03f720] c00000000001cd68 .timer_interrupt+0x158/0x2f0
> [c0000003ea03f7d0] c00000000000a4f4 restore_check_irq_replay+0x68/0xa8
> --- Exception: 901 (Decrementer) at c00000000014a520 .task_function_call+0x60/0x70
> [c0000003ea03fac0] c00000000014a634 .perf_event_enable+0x104/0x1c0 (unreliable)
> [c0000003ea03fb70] c0000000001495ec .perf_event_for_each_child+0x5c/0xf0
> [c0000003ea03fc00] c00000000014cd78 .perf_ioctl+0x108/0x400
> [c0000003ea03fca0] c0000000001d9aa0 .do_vfs_ioctl+0xb0/0x740
> [c0000003ea03fd80] c0000000001da188 .SyS_ioctl+0x58/0xb0
> [c0000003ea03fe30] c000000000009d54 syscall_exit+0x0/0x98
> --- Exception: c01 (System Call) at 00001fffffee03d0
> SP (3ffff5e7cc90) is in userspace
> 
> 
> cheers
> 

Thread overview: 16+ messages
     [not found] <1626500.7WAVXjfS9F@pcimr>
     [not found] ` <20130614122800.GL5146@linux.vnet.ibm.com>
     [not found]   ` <1645938.As0LR1yeVd@pcimr>
2013-06-14 21:06     ` Regression in RCU subsystem in latest mainline kernel Steven Rostedt
2013-06-15  2:02       ` Benjamin Herrenschmidt
2013-06-15  2:17         ` Steven Rostedt
2013-06-15  2:21           ` Benjamin Herrenschmidt
2013-06-15  2:31             ` Steven Rostedt
2013-06-15  2:51               ` Paul E. McKenney
2013-06-17 13:21           ` Rojhalat Ibrahim
2013-06-17 13:51             ` Steven Rostedt
2013-06-17  7:42         ` Michael Ellerman
2013-06-19  4:09           ` Paul E. McKenney
2013-06-25  7:19             ` Michael Ellerman
2013-06-25  7:36               ` Benjamin Herrenschmidt
2013-06-25  7:44               ` Michael Ellerman
2013-06-25 16:03                 ` Paul E. McKenney [this message]
2013-06-26  8:10                   ` Michael Ellerman
2013-06-26 14:16                     ` Paul E. McKenney
