From: Peter Zijlstra <peterz@infradead.org>
To: Valdis.Kletnieks@vt.edu
Cc: akpm@linux-foundation.org, Ingo Molnar <mingo@elte.hu>,
linux-kernel@vger.kernel.org
Subject: Re: mmotm 2011-04-14 - lockdep splats in sched.c during boot
Date: Fri, 15 Apr 2011 17:52:21 +0200
Message-ID: <1302882741.2388.241.camel@twins>
In-Reply-To: <9629.1302879429@localhost>
On Fri, 2011-04-15 at 10:57 -0400, Valdis.Kletnieks@vt.edu wrote:
> On Thu, 14 Apr 2011 15:08:47 PDT, akpm@linux-foundation.org said:
> > The mm-of-the-moment snapshot 2011-04-14-15-08 has been uploaded to
> >
> > http://userweb.kernel.org/~akpm/mmotm/
>
> This throws at least two complaints about lockdep on the way up. I've had
> several complete hangs as well last night during boot following a WARN in
> sched.c, but didn't have netconsole or a camera handy at the time. Will follow up if I
> catch one.
That would be most appreciated; I merged two large series of scheduler
patches.
> Both whinges point at a 'for_each_domain()'. Not sure why I
> haven't seen mention on lkml before - what am I doing different?
Probably running a very fresh kernel.
> Splat number 1:
> [ 0.044382] smpboot cpu 1: start_ip = 99000
> [ 0.002999] calibrate_delay_direct() timer_rate_max=2526877 timer_rate_min=2526840 pre_start=520283431585 pre_end=520308700132
> [ 0.002999] calibrate_delay_direct() timer_rate_max=2526857 timer_rate_min=2526829 pre_start=520313753438 pre_end=520339021871
> [ 0.002999] calibrate_delay_direct() timer_rate_max=2526851 timer_rate_min=2526824 pre_start=520344075709 pre_end=520369344094
> [ 0.002999] calibrate_delay_direct() timer_rate_max=2526862 timer_rate_min=2526834 pre_start=520374397819 pre_end=520399666308
> [ 0.002999] calibrate_delay_direct() timer_rate_max=2526864 timer_rate_min=2526836 pre_start=520404719957 pre_end=520429988465
> [ 0.116010]
> [ 0.116011] ===================================================
> [ 0.116989] [ INFO: suspicious rcu_dereference_check() usage. ]
> [ 0.116989] ---------------------------------------------------
> [ 0.116989] kernel/sched.c:2426 invoked rcu_dereference_check() without protection!
> [ 0.116989]
> [ 0.116989] other info that might help us debug this:
> [ 0.116989]
> [ 0.116989]
> [ 0.116989] rcu_scheduler_active = 1, debug_locks = 1
> [ 0.116989] 2 locks held by swapper/1:
> [ 0.116989] #0: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff810394d2>] cpu_maps_update_begin+0x12/0x14
> [ 0.116989] #1: (&p->pi_lock){-.....}, at: [<ffffffff81032959>] try_to_wake_up+0x29/0x1aa
> [ 0.116989]
> [ 0.116989] stack backtrace:
> [ 0.116989] Pid: 1, comm: swapper Not tainted 2.6.39-rc3-mmotm0414 #1
> [ 0.116989] Call Trace:
> [ 0.116989] [<ffffffff81065bfc>] lockdep_rcu_dereference+0x9b/0xa4
> [ 0.116989] [<ffffffff8102acd0>] ttwu_stat+0xcc/0xf5
> [ 0.116989] [<ffffffff81032ab5>] try_to_wake_up+0x185/0x1aa
> [ 0.116989] [<ffffffff81b5540a>] ? migration_call+0x9e/0xd0
> [ 0.116989] [<ffffffff81564643>] ? _raw_spin_unlock_irqrestore+0x46/0x80
> [ 0.116989] [<ffffffff81032b06>] wake_up_process+0x10/0x12
> [ 0.116989] [<ffffffff81b56207>] cpu_stop_cpu_callback+0xe5/0x11b
> [ 0.116989] [<ffffffff81567abe>] notifier_call_chain+0x54/0x81
> [ 0.116989] [<ffffffff810596bc>] __raw_notifier_call_chain+0x9/0xb
> [ 0.116989] [<ffffffff815434d1>] __cpu_notify+0x1b/0x2d
> [ 0.116989] [<ffffffff81b55709>] _cpu_up.constprop.0+0xd1/0xe5
> [ 0.116989] [<ffffffff81b55757>] cpu_up+0x3a/0x47
> [ 0.116989] [<ffffffff81b2f3d2>] smp_init+0x41/0x93
> [ 0.116989] [<ffffffff81b1dbc5>] kernel_init+0x9d/0x15b
> [ 0.116989] [<ffffffff8156bb94>] kernel_thread_helper+0x4/0x10
> [ 0.116989] [<ffffffff81564d84>] ? retint_restore_args+0xe/0xe
> [ 0.116989] [<ffffffff81b1db28>] ? start_kernel+0x394/0x394
> [ 0.116989] [<ffffffff8156bb90>] ? gs_change+0xb/0xb
> [ 0.117089] NMI watchdog enabled, takes one hw-pmu counter.
> [ 0.119006] Brought up 2 CPUs
>
> Splat number 2:
> [ 1.179319] netconsole: remote ethernet address 00:b0:d0:c3:bd:a7
> [ 1.179430] netconsole: device eth0 not up yet, forcing it
> [ 1.247705] e1000e 0000:00:19.0: irq 46 for MSI/MSI-X
> [ 1.298111] e1000e 0000:00:19.0: irq 46 for MSI/MSI-X
> [ 1.298312]
> [ 1.298313] ===================================================
> [ 1.298516] [ INFO: suspicious rcu_dereference_check() usage. ]
> [ 1.298623] ---------------------------------------------------
> [ 1.298731] kernel/sched.c:1211 invoked rcu_dereference_check() without protection!
> [ 1.298858]
> [ 1.298858] other info that might help us debug this:
> [ 1.298859]
> [ 1.299152]
> [ 1.299152] rcu_scheduler_active = 1, debug_locks = 1
> [ 1.299294] 1 lock held by swapper/0:
> [ 1.299294] #0: (&(&base->lock)->rlock){-.-.-.}, at: [<ffffffff810443fd>] lock_timer_base+0x49/0x92
> [ 1.299294]
> [ 1.299294] stack backtrace:
> [ 1.299294] Pid: 0, comm: swapper Not tainted 2.6.39-rc3-mmotm0414 #1
> [ 1.299294] Call Trace:
> [ 1.299294] <IRQ> [<ffffffff81065bfc>] lockdep_rcu_dereference+0x9b/0xa4
> [ 1.299294] [<ffffffff810337a7>] get_nohz_timer_target+0x79/0xbe
> [ 1.299294] [<ffffffff810452ec>] __mod_timer+0xc7/0x16d
> [ 1.299294] [<ffffffff810454bf>] mod_timer+0x87/0x8e
> [ 1.299294] [<ffffffff8130814c>] e1000_intr_msi+0xa2/0xef
> [ 1.299294] [<ffffffff8108acab>] handle_irq_event_percpu+0xba/0x29f
> [ 1.299294] [<ffffffff8108aecc>] handle_irq_event+0x3c/0x5c
> [ 1.299294] [<ffffffff810193c6>] ? ack_APIC_irq+0x10/0x12
> [ 1.299294] [<ffffffff8108d197>] handle_edge_irq+0xf4/0x121
> [ 1.299294] [<ffffffff810031aa>] handle_irq+0x122/0x133
> [ 1.299294] [<ffffffff81002fdf>] do_IRQ+0x48/0xa0
> [ 1.299294] [<ffffffff81564cd3>] common_interrupt+0x13/0x13
> [ 1.299294] <EOI> [<ffffffff81008009>] ? default_idle+0x52/0x89
> [ 1.299294] [<ffffffff81008007>] ? default_idle+0x50/0x89
> [ 1.299294] [<ffffffff8100084c>] cpu_idle+0x87/0x102
> [ 1.299294] [<ffffffff81535587>] rest_init+0xcb/0xd2
> [ 1.299294] [<ffffffff815354bc>] ? csum_partial_copy_generic+0x16c/0x16c
> [ 1.299294] [<ffffffff81b1db1d>] start_kernel+0x389/0x394
> [ 1.299294] [<ffffffff81b1d29f>] x86_64_start_reservations+0xaf/0xb3
> [ 1.299294] [<ffffffff81b1d393>] x86_64_start_kernel+0xf0/0xf7
> [ 1.309814] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>
The patch below should cure those two, I think.
---
kernel/sched.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/kernel/sched.c b/kernel/sched.c
index 0cfe031..cd06b53 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1208,11 +1208,13 @@ int get_nohz_timer_target(void)
 	int i;
 	struct sched_domain *sd;
+	rcu_read_lock();
 	for_each_domain(cpu, sd) {
 		for_each_cpu(i, sched_domain_span(sd))
 			if (!idle_cpu(i))
 				return i;
 	}
+	rcu_read_unlock();
 	return cpu;
 }
 /*
@@ -2415,12 +2417,14 @@ ttwu_stat(struct task_struct *p, int cpu, int wake_flags)
 	struct sched_domain *sd;
 	schedstat_inc(p, se.statistics.nr_wakeups_remote);
+	rcu_read_lock();
 	for_each_domain(this_cpu, sd) {
 		if (cpumask_test_cpu(cpu, sched_domain_span(sd))) {
 			schedstat_inc(sd, ttwu_wake_remote);
 			break;
 		}
 	}
+	rcu_read_unlock();
 }
 #endif /* CONFIG_SMP */
Thread overview: 12+ messages
2011-04-14 22:08 mmotm 2011-04-14-15-08 uploaded akpm
2011-04-14 22:08 ` akpm
2011-04-15 14:57 ` mmotm 2011-04-14 - lockdep splats in sched.c during boot Valdis.Kletnieks
2011-04-15 15:52 ` Peter Zijlstra [this message]
2011-04-19 12:06 ` [tip:sched/core] sched: Fix sched_domain iterations vs. RCU tip-bot for Peter Zijlstra
2011-04-15 18:53 ` mmotm 2011-04-14 - hangs during boot Valdis.Kletnieks
2011-04-15 19:06 ` Peter Zijlstra
2011-04-15 19:28 ` Valdis.Kletnieks
2011-04-15 19:35 ` Peter Zijlstra
2011-04-15 19:38 ` Valdis.Kletnieks
2011-04-15 15:50 ` mmotm 2011-04-14-15-08 uploaded (leds) Randy Dunlap
2011-04-15 16:12 ` mmotm 2011-04-14-15-08 uploaded (staging/gma500) Randy Dunlap