From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Paul Walmsley <paul@pwsan.com>
Cc: "Hilman, Kevin" <khilman@ti.com>,
	"<snijsure@grid-net.com>" <snijsure@grid-net.com>,
	"Bruce, Becky" <bbruce@ti.com>,
	"<linux-kernel@vger.kernel.org>" <linux-kernel@vger.kernel.org>,
	"Paul E. McKenney" <paul.mckenney@linaro.org>,
	"Shilimkar, Santosh" <santosh.shilimkar@ti.com>,
	"Hunter, Jon" <jon-hunter@ti.com>,
	"<linux-omap@vger.kernel.org>" <linux-omap@vger.kernel.org>,
	"<linux-arm-kernel@lists.infradead.org>"
	<linux-arm-kernel@lists.infradead.org>
Subject: Re: rcu self-detected stall messages on OMAP3, 4 boards
Date: Sat, 22 Sep 2012 12:52:53 -0700
Message-ID: <20120922195253.GD2934@linux.vnet.ibm.com>
In-Reply-To: <alpine.DEB.2.00.1209221813330.10663@utopia.booyaka.com>

On Sat, Sep 22, 2012 at 06:16:15PM +0000, Paul Walmsley wrote:
> Hi Paul
> 
> On Fri, 21 Sep 2012, Paul E. McKenney wrote:
> 
> > I am wondering if your system somehow figured out how to start a grace
> > period that had no RCU callbacks waiting for it.  If that happened,
> > then a CONFIG_NO_HZ=y system could in theory get into a state where all
> > CPUs are in dyntick-idle mode, so that none of them is doing anything
> > to force the grace period to complete.
> >
> > That should be easy to diagnose, anyway.  Please see below, which
> > includes the earlier diagnostic patch.
> 
> Here you go.
> 
> - Paul
> 
> [  248.902618] INFO: rcu_sched self-detected stall on CPU
> [  248.905456]  0: (1 ticks this GP) idle=933/1/0 
> [  248.907897]   (t=26570 jiffies g=11 c=10 q=0)

Bingo!!!  (q=0, in case you were wondering.  And thank you for testing this!)
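
For anyone reading along, the check behind that reading of the stall line
can be sketched roughly as follows.  This is only an illustration in Python,
not kernel code: the helper names parse_stall and matches_hypothesis are
made up for this example, and it assumes that g is the current grace-period
number, c the most recently completed one, and q the number of queued RCU
callbacks.

```python
import re

# Assumed field meanings (not taken from kernel source here):
#   t = jiffies reported with the stall, g = current grace-period number,
#   c = last completed grace-period number, q = queued RCU callbacks.
STALL_RE = re.compile(r"\(t=(\d+) jiffies g=(\d+) c=(\d+) q=(\d+)\)")


def parse_stall(line):
    """Extract t/g/c/q from a stall summary line, or return None."""
    m = STALL_RE.search(line)
    if m is None:
        return None
    t, g, c, q = (int(x) for x in m.groups())
    return {"t": t, "g": g, "c": c, "q": q}


def matches_hypothesis(info):
    """True if a grace period is in progress but no callbacks wait on it."""
    gp_in_progress = info["g"] != info["c"]
    no_callbacks = info["q"] == 0
    return gp_in_progress and no_callbacks


line = "[  248.907897]   (t=26570 jiffies g=11 c=10 q=0)"
info = parse_stall(line)
print(info, matches_hypothesis(info))
```

With the line quoted above, g=11 and c=10 would mean that grace period 11
has started but not completed, and q=0 that nothing is waiting on it, which
is the state I was guessing at.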

Strangely enough, I believe that I have inadvertently fixed this in
my -rcu tree:

git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/next

Nevertheless, if you get a chance to try it, I would be interested to
hear if my guess is correct.  The trick is that a kthread drives the
grace period in -rcu, regardless of whether or not there are callbacks.

However, the backport would not be something that -stable would be happy
with, so I will be putting together a fix for mainline.  This bug
has been in the kernel since about 2004; I am not sure why you didn't hit
it earlier.

							Thanx, Paul

> [  248.910339] [<c001bc90>] (unwind_backtrace+0x0/0xf0) from [<c00ad800>] (rcu_check_callbacks+0x220/0x714)
> [  248.915527] [<c00ad800>] (rcu_check_callbacks+0x220/0x714) from [<c00532a0>] (update_process_times+0x38/0x68)
> [  248.920928] [<c00532a0>] (update_process_times+0x38/0x68) from [<c008c9e8>] (tick_sched_timer+0x80/0xec)
> [  248.926116] [<c008c9e8>] (tick_sched_timer+0x80/0xec) from [<c0068ed4>] (__run_hrtimer+0x7c/0x1e0)
> [  248.930999] [<c0068ed4>] (__run_hrtimer+0x7c/0x1e0) from [<c0069cb8>] (hrtimer_interrupt+0x11c/0x2d0)
> [  248.936035] [<c0069cb8>] (hrtimer_interrupt+0x11c/0x2d0) from [<c001a3cc>] (twd_handler+0x30/0x44)
> [  248.940948] [<c001a3cc>] (twd_handler+0x30/0x44) from [<c00a7bd0>] (handle_percpu_devid_irq+0x90/0x13c)
> [  248.946075] [<c00a7bd0>] (handle_percpu_devid_irq+0x90/0x13c) from [<c00a4344>] (generic_handle_irq+0x30/0x48)
> [  248.951538] [<c00a4344>] (generic_handle_irq+0x30/0x48) from [<c0014e38>] (handle_IRQ+0x4c/0xac)
> [  248.956329] [<c0014e38>] (handle_IRQ+0x4c/0xac) from [<c00084cc>] (gic_handle_irq+0x28/0x5c)
> [  248.960937] [<c00084cc>] (gic_handle_irq+0x28/0x5c) from [<c04fb1a4>] (__irq_svc+0x44/0x5c)
> [  248.965484] Exception stack(0xc0729f58 to 0xc0729fa0)
> [  248.968231] 9f40:                                                       0003b832 00000001
> [  248.972686] 9f60: 00000000 c074a8e8 c0728000 c07c42c8 c05065a0 c074bdc8 00000000 411fc092
> [  248.977142] 9f80: c074bfe8 00000000 00000001 c0729fa0 0003b833 c0015130 20000113 ffffffff
> [  248.981597] [<c04fb1a4>] (__irq_svc+0x44/0x5c) from [<c0015130>] (default_idle+0x20/0x44)
> [  248.986083] [<c0015130>] (default_idle+0x20/0x44) from [<c001535c>] (cpu_idle+0x9c/0x114)
> [  248.990539] [<c001535c>] (cpu_idle+0x9c/0x114) from [<c06d77b0>] (start_kernel+0x2b4/0x304)
> 
