From: Andrea Arcangeli <andrea@suse.de>
To: Hannes Reinecke <Hannes.Reinecke@suse.de>
Cc: Dave Hansen <haveblue@us.ibm.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
uweigand@de.ibm.com
Subject: Re: Dumb question: BKL on reboot ?
Date: Sun, 24 Aug 2003 23:43:16 +0200
Message-ID: <20030824214316.GA1460@dualathlon.random>
In-Reply-To: <3F45BA87.1060902@suse.de>
Hi Hannes,
On Fri, Aug 22, 2003 at 08:39:03AM +0200, Hannes Reinecke wrote:
> Agreed, this smp_processor_id() == 0 thing is interesting. I'll try your
> suggestion and see how far I'll progress.
Ulrich pointed out to me that only cpu0 can call into the VM (thanks!),
and in turn the s390 code I touched was in fact correct (although it
looked suspect given the trace and the special ==0 case). After a closer
inspection my (last ;) conclusion is this (cut-and-paste from bugzilla):
----------------------
[..]
But the data.started counter already enforces a good enough guarantee
that CPU1 will execute
"signal_processor(smp_processor_id(), sigp_stop);" only once CPU0 is
already executing inside the IPI handler. So I can't imagine the IPI on
CPU0 getting lost.
The last explanation I can think of is that CPU0 executes the IPI,
waits for cpu_restart_map to become 0 (i.e. CPU1 offline), and
eventually executes this code correctly:
		do_store_status();
		/*
		 * Finally call reipl. Because we waited for all other
		 * cpus to enter this function we know that they do
		 * not hold any s390irq-locks (the cpus have been
		 * interrupted by an external interrupt and s390irq
		 * locks are always held disabled).
		 */
		reipl(S390_lowcore.ipl_device);
	}
	signal_processor(smp_processor_id(), sigp_stop);
but the box doesn't reboot, and CPU0 returns from the IPI and goes idle
until kupdate kicks in (it finds the first idle cpu and tries to take
the big kernel lock, which is correctly held by CPU1 in sys_reboot).
This looks like a bug in the s390 low-level code above: it just doesn't
reboot the machine.
----------------------
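To make the suspected sequence easier to follow, here is a minimal
user-space sketch in C. It is not the s390 kernel code: the names
model_restart, struct model and the two "works" knobs are hypothetical,
and the step where each CPU clears its own bit in the restart map is an
assumption about the part of the handler that the excerpt above does not
show. The busy-wait on cpu_restart_map is replaced by a CPU_WAITING
return value so the model can be stepped by hand.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum cpu_state {
	CPU_WAITING,		/* cpu0 still waiting for the other CPUs */
	CPU_STOPPED,		/* sigp_stop took effect */
	CPU_REIPLED,		/* reipl restarted the machine */
	CPU_BACK_TO_IDLE	/* handler returned: the suspected failure */
};

struct model {
	unsigned long restart_map;	/* bit set = CPU not yet in the handler */
	bool reipl_works;		/* assumption knob: does reipl restart? */
	bool stop_works_in_ipi;		/* assumption knob: sigp_stop from an IPI */
};

/*
 * Same shape as the quoted handler: cpu0 waits for the map to drain and
 * calls reipl, then every CPU falls through to sigp_stop.
 */
static enum cpu_state model_restart(struct model *m, int cpu, bool in_ipi)
{
	m->restart_map &= ~(1UL << cpu);	/* assumed: clear own bit */
	if (cpu == 0) {
		if (m->restart_map != 0)
			return CPU_WAITING;	/* the real code spins here */
		if (m->reipl_works)
			return CPU_REIPLED;
	}
	if (in_ipi && !m->stop_works_in_ipi)
		return CPU_BACK_TO_IDLE;
	return CPU_STOPPED;
}
```

With both knobs set to false, stepping cpu1 (the sys_reboot caller, not
in an IPI) and then cpu0 (in the IPI) leaves cpu1 stopped and cpu0 back
in the idle loop, which is the state described above.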
Maybe it's related to signal_processor(smp_processor_id(), sigp_stop)
failing inside IPI handlers, or something similarly arch-specific, dunno.
The patch removing the BKL from sys_reboot shouldn't help either if my
theory above is correct: when the IPI runs on cpu0, it's not even
running under the BKL. It's just that, with the machine not rebooting,
the first idle cpu (cpu0) will eventually execute kupdate, on average
after 2.5 sec, and it will end up in the state you found in lkcd, since
the BKL is held by cpu1 in sys_reboot and cpu1 is already offline (no
valid EIP in lkcd).
So if you want to test, one interesting piece of information you could
generate to confirm my theory above is to reproduce the very same hang
with the BKL removal patch applied to sys_reboot. I'm not sure how easy
it is to reproduce. If you can hang it again, it should no longer
deadlock in kupdate: you should find cpu0 stuck in the idle loop instead
of in sync_old_buffers.
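The expected outcome of that test can be written down as a small
user-space model in C. This is only a sketch of the reasoning, not
kernel code: struct bkl, model_lock_kernel and where_cpu0_ends_up are
hypothetical names, the lock is a plain owner field, and the unbounded
spin in lock_kernel is replaced by a bounded loop so the function
returns.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the big kernel lock; owner -1 means free. */
struct bkl {
	int owner;
};

/* Bounded model of spinning on the lock; nothing else runs meanwhile. */
static bool model_lock_kernel(struct bkl *l, int cpu, int max_spins)
{
	for (int i = 0; i < max_spins; i++) {
		if (l->owner == -1 || l->owner == cpu) {
			l->owner = cpu;
			return true;
		}
	}
	return false;	/* owner never released the lock: stuck */
}

/*
 * Assumes the reboot already failed and cpu1 is offline. Returns where
 * cpu0 would be found once kupdate has been scheduled on it.
 */
static const char *where_cpu0_ends_up(bool sys_reboot_takes_bkl)
{
	struct bkl lock = { .owner = -1 };

	if (sys_reboot_takes_bkl)
		lock.owner = 1;	/* taken by cpu1, never released */

	/* kupdate runs on idle cpu0 and wants the BKL */
	if (!model_lock_kernel(&lock, 0, 1000))
		return "sync_old_buffers";

	lock.owner = -1;	/* kupdate finishes and drops the lock */
	return "idle loop";
}
```

In this model the patch only changes where cpu0 is found after the hang;
it does not make the reboot succeed.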
It may be something slightly different going wrong too, but I'm still
convinced the lock_kernel in sys_reboot is absolutely innocent and needed
(at least in 2.4).
Andrea