From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Nicholas Piggin <npiggin@gmail.com>
Cc: David Miller <davem@davemloft.net>,
Jonathan.Cameron@huawei.com, dzickus@redhat.com,
sfr@canb.auug.org.au, linuxarm@huawei.com,
abdhalee@linux.vnet.ibm.com, sparclinux@vger.kernel.org,
akpm@linux-foundation.org, linuxppc-dev@lists.ozlabs.org,
linux-arm-kernel@lists.infradead.org
Subject: Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
Date: Thu, 27 Jul 2017 05:49:13 -0700 [thread overview]
Message-ID: <20170727124913.GL3730@linux.vnet.ibm.com> (raw)
In-Reply-To: <20170727143400.23e4d2b2@roar.ozlabs.ibm.com>
On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:
> On Wed, 26 Jul 2017 18:42:14 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
>
> > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:
>
> > > Indeed, that really wouldn't explain how we end up with a RCU stall
> > > dump listing almost all of the cpus as having missed a grace period.
> >
> > I have seen stranger things, but admittedly not often.
>
> So the backtraces show the RCU gp thread in schedule_timeout.
>
> Are you sure that it's timeout has expired and it's not being scheduled,
> or could it be a bad (large) timeout (looks unlikely) or that it's being
> scheduled but not correctly noting gps on other CPUs?
>
> It's not in R state, so if it's not being scheduled at all, then it's
> because the timer has not fired:
Good point, Nick!
Jonathan, could you please reproduce collecting timer event tracing?
Thanx, Paul
> [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> [ 1984.638153] rcu_preempt S 0 9 2 0x00000000
> [ 1984.643626] Call trace:
> [ 1984.646059] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> [ 1984.651189] [<ffff000008962274>] __schedule+0x19c/0x5d8
> [ 1984.656400] [<ffff0000089626e8>] schedule+0x38/0xa0
> [ 1984.661266] [<ffff000008965844>] schedule_timeout+0x124/0x218
> [ 1984.667002] [<ffff000008121424>] rcu_gp_kthread+0x4fc/0x748
> [ 1984.672564] [<ffff0000080df0b4>] kthread+0xfc/0x128
> [ 1984.677429] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
>
WARNING: multiple messages have this Message-ID (diff)
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: linux-arm-kernel@lists.infradead.org
Subject: Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
Date: Thu, 27 Jul 2017 12:49:13 +0000 [thread overview]
Message-ID: <20170727124913.GL3730@linux.vnet.ibm.com> (raw)
In-Reply-To: <20170727143400.23e4d2b2@roar.ozlabs.ibm.com>
On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:
> On Wed, 26 Jul 2017 18:42:14 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
>
> > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:
>
> > > Indeed, that really wouldn't explain how we end up with a RCU stall
> > > dump listing almost all of the cpus as having missed a grace period.
> >
> > I have seen stranger things, but admittedly not often.
>
> So the backtraces show the RCU gp thread in schedule_timeout.
>
> Are you sure that it's timeout has expired and it's not being scheduled,
> or could it be a bad (large) timeout (looks unlikely) or that it's being
> scheduled but not correctly noting gps on other CPUs?
>
> It's not in R state, so if it's not being scheduled at all, then it's
> because the timer has not fired:
Good point, Nick!
Jonathan, could you please reproduce collecting timer event tracing?
Thanx, Paul
> [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> [ 1984.638153] rcu_preempt S 0 9 2 0x00000000
> [ 1984.643626] Call trace:
> [ 1984.646059] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> [ 1984.651189] [<ffff000008962274>] __schedule+0x19c/0x5d8
> [ 1984.656400] [<ffff0000089626e8>] schedule+0x38/0xa0
> [ 1984.661266] [<ffff000008965844>] schedule_timeout+0x124/0x218
> [ 1984.667002] [<ffff000008121424>] rcu_gp_kthread+0x4fc/0x748
> [ 1984.672564] [<ffff0000080df0b4>] kthread+0xfc/0x128
> [ 1984.677429] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
>
WARNING: multiple messages have this Message-ID (diff)
From: paulmck@linux.vnet.ibm.com (Paul E. McKenney)
To: linux-arm-kernel@lists.infradead.org
Subject: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
Date: Thu, 27 Jul 2017 05:49:13 -0700 [thread overview]
Message-ID: <20170727124913.GL3730@linux.vnet.ibm.com> (raw)
In-Reply-To: <20170727143400.23e4d2b2@roar.ozlabs.ibm.com>
On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:
> On Wed, 26 Jul 2017 18:42:14 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
>
> > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:
>
> > > Indeed, that really wouldn't explain how we end up with a RCU stall
> > > dump listing almost all of the cpus as having missed a grace period.
> >
> > I have seen stranger things, but admittedly not often.
>
> So the backtraces show the RCU gp thread in schedule_timeout.
>
> Are you sure that it's timeout has expired and it's not being scheduled,
> or could it be a bad (large) timeout (looks unlikely) or that it's being
> scheduled but not correctly noting gps on other CPUs?
>
> It's not in R state, so if it's not being scheduled at all, then it's
> because the timer has not fired:
Good point, Nick!
Jonathan, could you please reproduce collecting timer event tracing?
Thanx, Paul
> [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> [ 1984.638153] rcu_preempt S 0 9 2 0x00000000
> [ 1984.643626] Call trace:
> [ 1984.646059] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> [ 1984.651189] [<ffff000008962274>] __schedule+0x19c/0x5d8
> [ 1984.656400] [<ffff0000089626e8>] schedule+0x38/0xa0
> [ 1984.661266] [<ffff000008965844>] schedule_timeout+0x124/0x218
> [ 1984.667002] [<ffff000008121424>] rcu_gp_kthread+0x4fc/0x748
> [ 1984.672564] [<ffff0000080df0b4>] kthread+0xfc/0x128
> [ 1984.677429] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
>
next prev parent reply other threads:[~2017-07-27 12:50 UTC|newest]
Thread overview: 241+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-07-25 11:32 RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this? Jonathan Cameron
2017-07-25 12:26 ` Nicholas Piggin
2017-07-25 12:26 ` Nicholas Piggin
2017-07-25 13:46 ` Paul E. McKenney
2017-07-25 13:46 ` Paul E. McKenney
2017-07-25 13:46 ` Paul E. McKenney
2017-07-25 14:42 ` Jonathan Cameron
2017-07-25 14:42 ` Jonathan Cameron
2017-07-25 14:42 ` Jonathan Cameron
2017-07-25 15:12 ` Paul E. McKenney
2017-07-25 15:12 ` Paul E. McKenney
2017-07-25 15:12 ` Paul E. McKenney
2017-07-25 16:52 ` Jonathan Cameron
2017-07-25 16:52 ` Jonathan Cameron
2017-07-25 16:52 ` Jonathan Cameron
2017-07-25 21:10 ` David Miller
2017-07-25 21:10 ` David Miller
2017-07-25 21:10 ` David Miller
2017-07-26 3:55 ` Paul E. McKenney
2017-07-26 3:55 ` Paul E. McKenney
2017-07-26 3:55 ` Paul E. McKenney
2017-07-26 4:02 ` David Miller
2017-07-26 4:02 ` David Miller
2017-07-26 4:02 ` David Miller
2017-07-26 4:12 ` Paul E. McKenney
2017-07-26 4:12 ` Paul E. McKenney
2017-07-26 4:12 ` Paul E. McKenney
2017-07-26 8:16 ` Jonathan Cameron
2017-07-26 8:16 ` Jonathan Cameron
2017-07-26 8:16 ` Jonathan Cameron
2017-07-26 9:32 ` Jonathan Cameron
2017-07-26 9:32 ` Jonathan Cameron
2017-07-26 9:32 ` Jonathan Cameron
2017-07-26 12:28 ` Jonathan Cameron
2017-07-26 12:28 ` Jonathan Cameron
2017-07-26 12:28 ` Jonathan Cameron
2017-07-26 12:49 ` Jonathan Cameron
2017-07-26 12:49 ` Jonathan Cameron
2017-07-26 12:49 ` Jonathan Cameron
2017-07-26 14:14 ` Paul E. McKenney
2017-07-26 14:14 ` Paul E. McKenney
2017-07-26 14:14 ` Paul E. McKenney
2017-07-26 14:23 ` Jonathan Cameron
2017-07-26 14:23 ` Jonathan Cameron
2017-07-26 14:23 ` Jonathan Cameron
2017-07-26 15:33 ` Jonathan Cameron
2017-07-26 15:33 ` Jonathan Cameron
2017-07-26 15:33 ` Jonathan Cameron
2017-07-26 15:49 ` Paul E. McKenney
2017-07-26 15:49 ` Paul E. McKenney
2017-07-26 15:49 ` Paul E. McKenney
2017-07-26 16:54 ` David Miller
2017-07-26 16:54 ` David Miller
2017-07-26 16:54 ` David Miller
2017-07-26 17:13 ` Jonathan Cameron
2017-07-26 17:13 ` Jonathan Cameron
2017-07-26 17:13 ` Jonathan Cameron
2017-07-27 7:41 ` Jonathan Cameron
2017-07-27 7:41 ` Jonathan Cameron
2017-07-27 7:41 ` Jonathan Cameron
2017-07-26 17:50 ` Paul E. McKenney
2017-07-26 17:50 ` Paul E. McKenney
2017-07-26 17:50 ` Paul E. McKenney
2017-07-26 22:36 ` Paul E. McKenney
2017-07-26 22:36 ` Paul E. McKenney
2017-07-26 22:36 ` Paul E. McKenney
2017-07-26 22:45 ` David Miller
2017-07-26 22:45 ` David Miller
2017-07-26 22:45 ` David Miller
2017-07-26 23:15 ` Paul E. McKenney
2017-07-26 23:15 ` Paul E. McKenney
2017-07-26 23:15 ` Paul E. McKenney
2017-07-26 23:22 ` David Miller
2017-07-26 23:22 ` David Miller
2017-07-26 23:22 ` David Miller
2017-07-27 1:42 ` Paul E. McKenney
2017-07-27 1:42 ` Paul E. McKenney
2017-07-27 1:42 ` Paul E. McKenney
2017-07-27 4:34 ` Nicholas Piggin
2017-07-27 4:34 ` Nicholas Piggin
2017-07-27 4:34 ` Nicholas Piggin
2017-07-27 12:49 ` Paul E. McKenney [this message]
2017-07-27 12:49 ` Paul E. McKenney
2017-07-27 12:49 ` Paul E. McKenney
2017-07-27 13:49 ` Jonathan Cameron
2017-07-27 13:49 ` Jonathan Cameron
2017-07-27 13:49 ` Jonathan Cameron
2017-07-27 16:39 ` Jonathan Cameron
2017-07-27 16:39 ` Jonathan Cameron
2017-07-27 16:39 ` Jonathan Cameron
2017-07-27 16:52 ` Paul E. McKenney
2017-07-27 16:52 ` Paul E. McKenney
2017-07-27 16:52 ` Paul E. McKenney
2017-07-28 7:44 ` Jonathan Cameron
2017-07-28 7:44 ` Jonathan Cameron
2017-07-28 7:44 ` Jonathan Cameron
2017-07-28 12:54 ` Boqun Feng
2017-07-28 12:54 ` Boqun Feng
2017-07-28 12:54 ` Boqun Feng
2017-07-28 13:13 ` Jonathan Cameron
2017-07-28 13:13 ` Jonathan Cameron
2017-07-28 13:13 ` Jonathan Cameron
2017-07-28 14:55 ` Paul E. McKenney
2017-07-28 14:55 ` Paul E. McKenney
2017-07-28 14:55 ` Paul E. McKenney
2017-07-28 18:41 ` Paul E. McKenney
2017-07-28 18:41 ` Paul E. McKenney
2017-07-28 18:41 ` Paul E. McKenney
2017-07-28 19:09 ` Paul E. McKenney
2017-07-28 19:09 ` Paul E. McKenney
2017-07-28 19:09 ` Paul E. McKenney
2017-07-30 13:37 ` Boqun Feng
2017-07-30 13:37 ` Boqun Feng
2017-07-30 13:37 ` Boqun Feng
2017-07-30 16:59 ` Paul E. McKenney
2017-07-30 16:59 ` Paul E. McKenney
2017-07-30 16:59 ` Paul E. McKenney
2017-07-29 1:20 ` Boqun Feng
2017-07-29 1:20 ` Boqun Feng
2017-07-29 1:20 ` Boqun Feng
2017-07-28 18:42 ` David Miller
2017-07-28 18:42 ` David Miller
2017-07-28 18:42 ` David Miller
2017-07-28 13:08 ` Jonathan Cameron
2017-07-28 13:08 ` Jonathan Cameron
2017-07-28 13:24 ` Jonathan Cameron
2017-07-28 13:24 ` Jonathan Cameron
2017-07-28 13:24 ` Jonathan Cameron
2017-07-28 16:55 ` Paul E. McKenney
2017-07-28 16:55 ` Paul E. McKenney
2017-07-28 17:27 ` Jonathan Cameron
2017-07-28 17:27 ` Jonathan Cameron
2017-07-28 17:27 ` Jonathan Cameron
2017-07-28 19:03 ` Paul E. McKenney
2017-07-28 19:03 ` Paul E. McKenney
2017-07-28 19:03 ` Paul E. McKenney
2017-07-31 11:08 ` Jonathan Cameron
2017-07-31 11:08 ` Jonathan Cameron
2017-07-31 11:08 ` Jonathan Cameron
2017-07-31 15:04 ` Paul E. McKenney
2017-07-31 15:04 ` Paul E. McKenney
2017-07-31 15:04 ` Paul E. McKenney
2017-07-31 15:27 ` Jonathan Cameron
2017-07-31 15:27 ` Jonathan Cameron
2017-07-31 15:27 ` Jonathan Cameron
2017-08-01 18:46 ` Paul E. McKenney
2017-08-01 18:46 ` Paul E. McKenney
2017-08-01 18:46 ` Paul E. McKenney
2017-08-02 16:25 ` Jonathan Cameron
2017-08-02 16:25 ` Jonathan Cameron
2017-08-02 16:25 ` Jonathan Cameron
2017-08-15 15:47 ` Paul E. McKenney
2017-08-15 15:47 ` Paul E. McKenney
2017-08-15 15:47 ` Paul E. McKenney
2017-08-16 1:24 ` Jonathan Cameron
2017-08-16 1:24 ` Jonathan Cameron
2017-08-16 1:24 ` Jonathan Cameron
2017-08-16 12:43 ` Michael Ellerman
2017-08-16 12:43 ` Michael Ellerman
2017-08-16 12:43 ` Michael Ellerman
2017-08-16 12:56 ` Paul E. McKenney
2017-08-16 12:56 ` Paul E. McKenney
2017-08-16 12:56 ` Paul E. McKenney
2017-08-16 15:31 ` Nicholas Piggin
2017-08-16 15:31 ` Nicholas Piggin
2017-08-16 15:31 ` Nicholas Piggin
2017-08-16 16:27 ` Paul E. McKenney
2017-08-16 16:27 ` Paul E. McKenney
2017-08-16 16:27 ` Paul E. McKenney
2017-08-17 13:55 ` Michael Ellerman
2017-08-17 13:55 ` Michael Ellerman
2017-08-17 13:55 ` Michael Ellerman
2017-08-20 4:45 ` Nicholas Piggin
2017-08-20 4:45 ` Nicholas Piggin
2017-08-20 4:45 ` Nicholas Piggin
2017-08-20 5:01 ` David Miller
2017-08-20 5:01 ` David Miller
2017-08-20 5:01 ` David Miller
2017-08-20 5:04 ` Paul E. McKenney
2017-08-20 5:04 ` Paul E. McKenney
2017-08-20 5:04 ` Paul E. McKenney
2017-08-20 13:00 ` Nicholas Piggin
2017-08-20 13:00 ` Nicholas Piggin
2017-08-20 13:00 ` Nicholas Piggin
2017-08-20 18:35 ` Paul E. McKenney
2017-08-20 18:35 ` Paul E. McKenney
2017-08-20 18:35 ` Paul E. McKenney
2017-08-20 21:14 ` Paul E. McKenney
2017-08-20 21:14 ` Paul E. McKenney
2017-08-20 21:14 ` Paul E. McKenney
2017-08-21 0:52 ` Nicholas Piggin
2017-08-21 0:52 ` Nicholas Piggin
2017-08-21 0:52 ` Nicholas Piggin
2017-08-21 6:06 ` Nicholas Piggin
2017-08-21 6:06 ` Nicholas Piggin
2017-08-21 6:06 ` Nicholas Piggin
2017-08-21 10:18 ` Jonathan Cameron
2017-08-21 10:18 ` Jonathan Cameron
2017-08-21 10:18 ` Jonathan Cameron
2017-08-21 14:19 ` Nicholas Piggin
2017-08-21 14:19 ` Nicholas Piggin
2017-08-21 14:19 ` Nicholas Piggin
2017-08-21 15:02 ` Jonathan Cameron
2017-08-21 15:02 ` Jonathan Cameron
2017-08-21 15:02 ` Jonathan Cameron
2017-08-21 20:55 ` David Miller
2017-08-21 20:55 ` David Miller
2017-08-21 20:55 ` David Miller
2017-08-22 7:49 ` Jonathan Cameron
2017-08-22 7:49 ` Jonathan Cameron
2017-08-22 7:49 ` Jonathan Cameron
2017-08-22 8:51 ` Abdul Haleem
2017-08-22 8:51 ` Abdul Haleem
2017-08-22 8:51 ` Abdul Haleem
2017-08-22 15:26 ` Paul E. McKenney
2017-08-22 15:26 ` Paul E. McKenney
2017-08-22 15:26 ` Paul E. McKenney
2017-09-06 12:28 ` Paul E. McKenney
2017-09-06 12:28 ` Paul E. McKenney
2017-09-06 12:28 ` Paul E. McKenney
2017-08-22 0:38 ` Paul E. McKenney
2017-08-22 0:38 ` Paul E. McKenney
2017-08-22 0:38 ` Paul E. McKenney
2017-07-31 11:09 ` Jonathan Cameron
2017-07-31 11:09 ` Jonathan Cameron
2017-07-31 11:09 ` Jonathan Cameron
2017-07-31 11:55 ` Jonathan Cameron
2017-07-31 11:55 ` Jonathan Cameron
2017-07-31 11:55 ` Jonathan Cameron
2017-08-01 10:53 ` Jonathan Cameron
2017-08-01 10:53 ` Jonathan Cameron
2017-08-01 10:53 ` Jonathan Cameron
2017-07-26 16:48 ` David Miller
2017-07-26 16:48 ` David Miller
2017-07-26 16:48 ` David Miller
2017-07-26 3:53 ` Paul E. McKenney
2017-07-26 3:53 ` Paul E. McKenney
2017-07-26 3:53 ` Paul E. McKenney
2017-07-26 7:51 ` Jonathan Cameron
2017-07-26 7:51 ` Jonathan Cameron
2017-07-26 7:51 ` Jonathan Cameron
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170727124913.GL3730@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=Jonathan.Cameron@huawei.com \
--cc=abdhalee@linux.vnet.ibm.com \
--cc=akpm@linux-foundation.org \
--cc=davem@davemloft.net \
--cc=dzickus@redhat.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linuxarm@huawei.com \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=npiggin@gmail.com \
--cc=sfr@canb.auug.org.au \
--cc=sparclinux@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.