* 2.6.17 x86_64 regression - reboot fails due to deadlock
@ 2006-07-06 16:18 Mr. Berkley Shands
2006-07-06 16:23 ` Arjan van de Ven
2006-07-07 0:24 ` Andrew Morton
0 siblings, 2 replies; 3+ messages in thread
From: Mr. Berkley Shands @ 2006-07-06 16:18 UTC (permalink / raw)
To: linux-kernel; +Cc: Dave Lloyd
With a SuperMicro H8DC8 (nvidia chipset), Dual Opteron 285's, 16GB,
Centos 4.3 -
Under 2.6.16 both the tyan 2895 and the supermicro H8DC8 reboot
correctly: machine_restart() in kernel/sys.c gets called. But with the
changes to sys.c under 2.6.17, a new path is introduced, calling
void kernel_restart_prepare(char *cmd), which calls
blocking_notifier_call_chain(&reboot_notifier_list, SYS_RESTART, cmd);
(line 588). That call looks at the first element of the notifier list
and blocks forever, but ONLY on the supermicro.
The tyan, a very similar motherboard, does not deadlock. It returns and
still calls machine_restart().
So on the supermicro neither reboot nor "shutdown -fh now" actually
gets to the BIOS calls.

On the supermicro, the hang is in this loop (linux-2.6.17/kernel/sys.c):

static int __kprobes notifier_call_chain(struct notifier_block **nl,
		unsigned long val, void *v)
{
	int ret = NOTIFY_DONE;
	struct notifier_block *nb;

	nb = rcu_dereference(*nl);
	while (nb) {
		/* this is the deadlock for the first entry */
		ret = nb->notifier_call(nb, val, v);
		if ((ret & NOTIFY_STOP_MASK) == NOTIFY_STOP_MASK)
			break;
		nb = rcu_dereference(nb->next);
	}
	return ret;
}

I see that 2.6.18 reworks this code further.
If I want to hurt myself really, really badly, disabling the call to
blocking_notifier_call_chain(&reboot_notifier_list,...
restores the reboot/power off functions.
In kdb, the system sits idle awaiting something to schedule, but nothing
will schedule since there is
a deadlock on the supermicro. Any clues as to how to find which notifier
is deadlocked?
berkley
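
A minimal user-space sketch of the loop quoted above may help when reasoning about its control flow. It is not the kernel code: the RCU accessors and __kprobes are dropped, the NOTIFY_* values are assumed to match include/linux/notifier.h, and the callbacks cb_ok and cb_stop, the counter calls_made, and the helper demo_chain are hypothetical names added only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to match include/linux/notifier.h. */
#define NOTIFY_DONE      0x0000
#define NOTIFY_OK        0x0001
#define NOTIFY_STOP_MASK 0x8000
#define NOTIFY_STOP      (NOTIFY_OK | NOTIFY_STOP_MASK)

struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long val, void *v);
	struct notifier_block *next;
};

/* Hypothetical counter: lets the demo observe how many callbacks ran. */
static int calls_made;

/* Hypothetical callback that lets the walk continue. */
static int cb_ok(struct notifier_block *nb, unsigned long val, void *v)
{
	(void)nb; (void)val; (void)v;
	calls_made++;
	return NOTIFY_OK;
}

/* Hypothetical callback that asks the walk to stop. */
static int cb_stop(struct notifier_block *nb, unsigned long val, void *v)
{
	(void)nb; (void)val; (void)v;
	calls_made++;
	return NOTIFY_STOP;
}

/* Same loop shape as the quoted function, without RCU. */
static int notifier_call_chain(struct notifier_block **nl,
			       unsigned long val, void *v)
{
	int ret = NOTIFY_DONE;
	struct notifier_block *nb;

	assert(nl != NULL);
	nb = *nl;
	while (nb) {
		/* The walk cannot advance until this call returns. */
		ret = nb->notifier_call(nb, val, v);
		if ((ret & NOTIFY_STOP_MASK) == NOTIFY_STOP_MASK)
			break;
		nb = nb->next;
	}
	return ret;
}

/* Build the chain ok -> stop -> ok, run it, and return the result. */
static int demo_chain(void)
{
	struct notifier_block c = { cb_ok, NULL };
	struct notifier_block b = { cb_stop, &c };
	struct notifier_block a = { cb_ok, &b };
	struct notifier_block *head = &a;

	calls_made = 0;
	return notifier_call_chain(&head, 1UL, NULL);
}
```

Calling demo_chain() walks three blocks; the second returns NOTIFY_STOP, so the third is never invoked. The walk is strictly sequential, so a callback that never returns, as in the report, would also keep every later notifier from running and would keep the caller from reaching machine_restart().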
* Re: 2.6.17 x86_64 regression - reboot fails due to deadlock
2006-07-06 16:18 2.6.17 x86_64 regression - reboot fails due to deadlock Mr. Berkley Shands
@ 2006-07-06 16:23 ` Arjan van de Ven
2006-07-07 0:24 ` Andrew Morton
1 sibling, 0 replies; 3+ messages in thread
From: Arjan van de Ven @ 2006-07-06 16:23 UTC (permalink / raw)
To: Mr. Berkley Shands; +Cc: linux-kernel, Dave Lloyd

> In kdb, the system sits idle awaiting something to schedule, but nothing
> will schedule since there is
> a deadlock on the supermicro. Any clues as to how to find which notifier
> is deadlocked?

Hi,

if it's really a deadlock, then lockdep (new in 2.6.18-rc1) ought to
find it... just enable the various locking debug options in the -rc1
kernel and... it's active.

Greetings,
Arjan van de Ven

* Re: 2.6.17 x86_64 regression - reboot fails due to deadlock
2006-07-06 16:18 2.6.17 x86_64 regression - reboot fails due to deadlock Mr. Berkley Shands
2006-07-06 16:23 ` Arjan van de Ven
@ 2006-07-07 0:24 ` Andrew Morton
1 sibling, 0 replies; 3+ messages in thread
From: Andrew Morton @ 2006-07-07 0:24 UTC (permalink / raw)
To: Mr. Berkley Shands; +Cc: linux-kernel, dlloyd

"Mr. Berkley Shands" <bshands@exegy.com> wrote:
>
> With a SuperMicro H8DC8 (nvidia chipset), Dual Opteron 285's, 16GB,
> Centos 4.3 -
>
> Under 2.6.16 both the tyan 2895 and the supermicro H8DC8 both will
> reboot correctly,
> in kernel/sys.c machine_restart() gets called. But with the changes to
> sys.c under 2.6.17,
> a new path is introduced, calling void kernel_restart_prepare(char *cmd)
> which calls blocking_notifier_call_chain(&reboot_notifier_list,
> SYS_RESTART, cmd); (line 588)
> Which looks at the first element of the notifier list, and blocks
> forever. But ONLY on the supermicro.
> The tyan, a very similar motherboard does not deadlock. It returns and
> still calls machine_restart().
> So neither reboot nor "shutdown -fh now" actually get to the bios calls.
>
> on the supermicro, (linux-2.6.17/kernel/sys.c)
>
> static int __kprobes notifier_call_chain(struct notifier_block **nl,
> 		unsigned long val, void *v)
> {
> 	int ret = NOTIFY_DONE;
> 	struct notifier_block *nb;
>
> 	nb = rcu_dereference(*nl);
> 	while (nb) {
> 		ret = nb->notifier_call(nb, val, v); /* this is
> the deadlock for the first entry */
> 		if ((ret & NOTIFY_STOP_MASK) == NOTIFY_STOP_MASK)
> 			break;
> 		nb = rcu_dereference(nb->next);
> 	}
> 	return ret;
> }
>
> I see that 2.6.18 reworks this code further.
>
> If I want to hurt myself really, really badly, disabling the call to
> blocking_notifier_call_chain(&reboot_notifier_list,...
> restores the reboot/power off functions.
>
> In kdb, the system sits idle awaiting something to schedule, but nothing
> will schedule since there is
> a deadlock on the supermicro. Any clues as to how to find which notifier
> is deadlocked?
>

Are you able to do sysrq-T when it's stuck?

Something like this...

diff -puN kernel/sys.c~a kernel/sys.c
--- a/kernel/sys.c~a
+++ a/kernel/sys.c
@@ -70,6 +70,8 @@
 int overflowuid = DEFAULT_OVERFLOWUID;
 int overflowgid = DEFAULT_OVERFLOWGID;
 
+static int foo;
+
 #ifdef CONFIG_UID16
 EXPORT_SYMBOL(overflowuid);
 EXPORT_SYMBOL(overflowgid);
@@ -141,6 +143,9 @@ static int __kprobes notifier_call_chain
 	nb = rcu_dereference(*nl);
 	while (nb) {
 		next_nb = rcu_dereference(nb->next);
+		if (foo)
+			print_symbol("calling %s()\n",
+				(unsigned long)nb->notifier_call);
 		ret = nb->notifier_call(nb, val, v);
 		if ((ret & NOTIFY_STOP_MASK) == NOTIFY_STOP_MASK)
 			break;
@@ -590,6 +595,7 @@ EXPORT_SYMBOL_GPL(emergency_restart);
 
 static void kernel_restart_prepare(char *cmd)
 {
+	foo = 1;
 	blocking_notifier_call_chain(&reboot_notifier_list, SYS_RESTART, cmd);
 	system_state = SYSTEM_RESTART;
 	device_shutdown();
_

Be aware that there's a known lock_cpu_hotplug()-vs-cpufreq deadlock,
but afaik it's only been reported during suspend. Disabling
CONFIG_HOTPLUG_CPU might make a difference.
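
The idea behind the debugging patch above, announcing each callback before invoking it so that the last line printed names the one that never returned, can be tried in user space with a sketch like the following. Everything here is an assumption made for illustration: struct traced_block, its name field (standing in for the symbol lookup that print_symbol() does in the kernel), walk_traced, last_announced, and the two demo callbacks are hypothetical, and trace_enabled only plays the role of the foo flag.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed to match include/linux/notifier.h. */
#define NOTIFY_DONE      0x0000
#define NOTIFY_OK        0x0001
#define NOTIFY_STOP_MASK 0x8000

/* Hypothetical block type: carries a name because user space has no
 * print_symbol() to resolve a function address. */
struct traced_block {
	const char *name;
	int (*call)(unsigned long val, void *v);
	struct traced_block *next;
};

static int trace_enabled;           /* plays the role of "foo" */
static const char *last_announced;  /* hypothetical: last name printed */

static int walk_traced(struct traced_block *nb, unsigned long val, void *v)
{
	int ret = NOTIFY_DONE;

	while (nb) {
		if (trace_enabled) {
			assert(nb->name != NULL);
			/* Announce BEFORE the call. stderr is unbuffered, so
			 * the line is visible even if the call never returns. */
			fprintf(stderr, "calling %s()\n", nb->name);
			last_announced = nb->name;
		}
		ret = nb->call(val, v);
		if ((ret & NOTIFY_STOP_MASK) == NOTIFY_STOP_MASK)
			break;
		nb = nb->next;
	}
	return ret;
}

/* Hypothetical callbacks; both return normally in this demo. */
static int first_cb(unsigned long val, void *v)
{
	(void)val; (void)v;
	return NOTIFY_OK;
}

static int second_cb(unsigned long val, void *v)
{
	(void)val; (void)v;
	return NOTIFY_OK;
}

/* Run a two-entry chain with tracing on; return the last name announced. */
static const char *demo_trace(void)
{
	struct traced_block second = { "second_cb", second_cb, NULL };
	struct traced_block first = { "first_cb", first_cb, &second };

	trace_enabled = 1;
	last_announced = NULL;
	walk_traced(&first, 1UL, NULL);
	return last_announced;
}
```

In this demo both callbacks return, so demo_trace() reports the final entry. If one of them blocked instead, the walk would stop inside that call and its name would be the last line on stderr, which is the information the patch is meant to extract on the stuck machine.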