From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: John Stultz <john.stultz@linaro.org>
Cc: Linus Walleij <linus.walleij@linaro.org>,
Daniel Lezcano <daniel.lezcano@linaro.org>,
linux-kernel@vger.kernel.org
Subject: Re: RCU lockup in the SMP idle thread, help...
Date: Thu, 13 Sep 2012 09:58:44 -0700
Message-ID: <20120913165844.GW4257@linux.vnet.ibm.com>
In-Reply-To: <50520E8A.9030408@linaro.org>

On Thu, Sep 13, 2012 at 09:49:14AM -0700, John Stultz wrote:
> On 09/13/2012 05:36 AM, Linus Walleij wrote:
> >Hi Paul et al,
> >
> >I have this sporadic lockup in the SMP idle thread on ARM U8500:
> >
> >root@ME:/
> >root@ME:/
> >root@ME:/ INFO: rcu_preempt detected stalls on CPUs/tasks: { 0}
> >(detected by 1, t=23190 jiffies)
> >[<c0014710>] (unwind_backtrace+0x0/0xf8) from [<c0068624>]
> >(rcu_check_callbacks+0x69c/0x6e0)
> >[<c0068624>] (rcu_check_callbacks+0x69c/0x6e0) from [<c0029cbc>]
> >(update_process_times+0x38/0x4c)
> >[<c0029cbc>] (update_process_times+0x38/0x4c) from [<c0055088>]
> >(tick_sched_timer+0x80/0xe4)
> >[<c0055088>] (tick_sched_timer+0x80/0xe4) from [<c003c120>]
> >(__run_hrtimer.isra.18+0x44/0xd0)
> >[<c003c120>] (__run_hrtimer.isra.18+0x44/0xd0) from [<c003cae0>]
> >(hrtimer_interrupt+0x118/0x2b4)
> >[<c003cae0>] (hrtimer_interrupt+0x118/0x2b4) from [<c0013658>]
> >(twd_handler+0x30/0x44)
> >[<c0013658>] (twd_handler+0x30/0x44) from [<c0063834>]
> >(handle_percpu_devid_irq+0x80/0xa0)
> >[<c0063834>] (handle_percpu_devid_irq+0x80/0xa0) from [<c00601ec>]
> >(generic_handle_irq+0x2c/0x40)
> >[<c00601ec>] (generic_handle_irq+0x2c/0x40) from [<c000ef58>]
> >(handle_IRQ+0x4c/0xac)
> >[<c000ef58>] (handle_IRQ+0x4c/0xac) from [<c00084bc>] (gic_handle_irq+0x24/0x58)
> >[<c00084bc>] (gic_handle_irq+0x24/0x58) from [<c000dc80>] (__irq_svc+0x40/0x70)
> >Exception stack(0xcf851f88 to 0xcf851fd0)
> >1f80: 00000020 c05d5920 00000001 00000000 cf850000 cf850000
> >1fa0: c05f4d48 c02de0b4 c05d8d90 412fc091 cf850000 00000000 01000000 cf851fd0
> >1fc0: c000f234 c000f238 60000013 ffffffff
> >[<c000dc80>] (__irq_svc+0x40/0x70) from [<c000f238>] (default_idle+0x28/0x30)
> >[<c000f238>] (default_idle+0x28/0x30) from [<c000f438>] (cpu_idle+0x98/0xe4)
> >[<c000f438>] (cpu_idle+0x98/0xe4) from [<002d2ef4>] (0x2d2ef4)
> >
> >The hangup has been there in the v3.6-rc series for a while (probably
> >since the merge window).
> >
> >I haven't been able to bisect out why this is happening, because the bug
> >is pretty hazardous to check - you have to boot the system and leave it alone
> >or use it sporadically for a while. Then all of a sudden it happens.
> >
> >So: reproducible, but not deterministically reproducible (I hate this kind
> >of thing...)
> >
> >The code involved seems to be generic kernel code apart from the
> >ARM GIC and TWD timer drivers.
> >
> >Any hints or debug options I should switch on?
>
> I saw this once as well testing the fix to Daniel's deep idle hang
> issue (also on 32 bit).
>
> Really briefly looking at the code in rcutree.c, I'm curious if
> we're hitting a false positive on the 5 minute jiffies overflow?

Hmmm...  Might be.  Does the patch below help?

							Thanx, Paul

------------------------------------------------------------------------

rcu: Avoid spurious RCU CPU stall warnings

If a given CPU avoids the idle loop but also avoids starting a new
RCU grace period for a full minute, RCU can issue spurious RCU CPU
stall warnings.  This commit fixes this issue by adding a check for
an ongoing grace period, so that stall warnings are issued only while
a grace period is actually in progress.

Reported-by: Becky Bruce <bgillbruce@gmail.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 3d63d1c..aea3157 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -819,7 +819,8 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
 	j = ACCESS_ONCE(jiffies);
 	js = ACCESS_ONCE(rsp->jiffies_stall);
 	rnp = rdp->mynode;
-	if ((ACCESS_ONCE(rnp->qsmask) & rdp->grpmask) && ULONG_CMP_GE(j, js)) {
+	if (rcu_gp_in_progress(rsp) &&
+	    (ACCESS_ONCE(rnp->qsmask) & rdp->grpmask) && ULONG_CMP_GE(j, js)) {
 		/* We haven't checked in, so go dump stack. */
 		print_cpu_stall(rsp);
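
For anyone following the condition in the hunk above, here is a minimal
userspace sketch of the same three-part test.  It is not the kernel code:
struct stall_state, should_warn() and stall_selfcheck() are hypothetical
stand-ins for rcu_state/rcu_data and check_cpu_stall(), and the flag
gp_in_progress merely models what rcu_gp_in_progress(rsp) reports.  Only
ULONG_CMP_GE() is written the way I believe include/linux/rcupdate.h
defines it, as a wraparound-safe ">=" on unsigned long counters.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Wraparound-safe "a >= b": true when a is at most ULONG_MAX/2 ticks past b. */
#define ULONG_CMP_GE(a, b) (ULONG_MAX / 2 >= (a) - (b))

/* Hypothetical, simplified stand-in for the RCU state used by the check. */
struct stall_state {
	bool gp_in_progress;         /* models rcu_gp_in_progress(rsp) */
	unsigned long qsmask;        /* CPUs that still owe a quiescent state */
	unsigned long grpmask;       /* this CPU's bit within qsmask */
	unsigned long jiffies_stall; /* time at which a stall may be reported */
};

/* Mirrors the patched condition: all three parts must hold before warning. */
bool should_warn(const struct stall_state *s, unsigned long now)
{
	return s->gp_in_progress &&
	       (s->qsmask & s->grpmask) &&
	       ULONG_CMP_GE(now, s->jiffies_stall);
}

/* Small self-check: the deadline was set just before the counter wrapped. */
void stall_selfcheck(void)
{
	struct stall_state s = { true, 0x1UL, 0x1UL, ULONG_MAX - 5UL };

	/* "now" has wrapped to 5; a plain >= would wrongly say "not yet". */
	assert(should_warn(&s, 5UL));

	/* With no grace period in progress, stale state must not warn. */
	s.gp_in_progress = false;
	assert(!should_warn(&s, 5UL));
}
```

The point of the extra first operand is that qsmask and jiffies_stall are
left over from the previous grace period once it ends, so without the
gp_in_progress test a CPU can match stale state and warn spuriously.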