public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Cc: Andi Kleen <ak@suse.de>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@osdl.org>, "Raj, Ashok" <ashok.raj@intel.com>,
	Stephen Hemminger <shemminger@osdl.org>
Subject: Re: [PATCH] i386/x86_64: Don't IPI to offline cpus on shutdown
Date: Sun, 27 Nov 2005 19:27:50 -0700	[thread overview]
Message-ID: <m1psolfqvt.fsf@ebiederm.dsl.xmission.com> (raw)
In-Reply-To: <Pine.LNX.4.61.0511270836120.20046@montezuma.fsmlabs.com> (Zwane Mwaikambo's message of "Sun, 27 Nov 2005 08:45:47 -0800 (PST)")

Zwane Mwaikambo <zwane@arm.linux.org.uk> writes:

> On Sun, 27 Nov 2005, Eric W. Biederman wrote:
>
>> Andi Kleen <ak@suse.de> writes:
>> 
>> > On Sun, Nov 27, 2005 at 02:05:45AM -0800, Zwane Mwaikambo wrote:
>> >> http://bugzilla.kernel.org/show_bug.cgi?id=5203
>> >> 
>> >> There is a small race during SMP shutdown between the processor issuing 
>> >> the shutdown and the other processors clearing themselves off the 
>> >> cpu_online_map as they do this without using the normal cpu offline 
>> >> synchronisation. To avoid this we should wait for all the other processors
>> >> to clear their corresponding bits and then proceed. This way we can safely
>> >> make the cpu_online test in smp_send_reschedule, it's safe during normal 
>> >> runtime as smp_send_reschedule is called with a lock held / preemption 
>> >> disabled.
>> >
>> > Looking at the backtrace in the bug - how can sys_reboot call do_exit???
>> > I would say the problem is in whatever causes that. It shouldn't 
>> > do that. sys_reboot shouldn't schedule, it's that simple.
>> > Your patch is just papering over that real bug.
>> 
>> sys_reboot in the case of halt (after everything else is done)
>> has directly called do_exit for years.
>> 
>> There are some very subtle interactions there.  The one
>> I always remember (having found it the hard way) is that
>> interrupts must be left on so ctrl-alt-del still works.
>> 
>> This do_exit may be something similar, although I can't
>> think of how it could be useful.
>
> I wondered the same thing, perhaps i am papering over the bug, what 
> exactly is do_exit doing there?

So I just looked at a couple of kernels I have lying around.
the do_exit has been in halt since 1.2.?  In 1.0 the halt
case was not even implemented.  

So I think not coping with the do_exit is a problem, as every
other architecture and every other kernel has been able to.

To handle ctrl-alt-del after a halt (or any other action) being able
to schedule sounds important.

As Andi Kleen indicated the problem with do_exit is that it schedules.
I strongly disagree that simply scheduling after halting is a bug.

So why are we calling smp_send_stop from machine_halt?

We don't.

Looking more closely at the bug report the problem here
is that halt -p is called which triggers not a halt but
an attempt to power off.  

machine_power_off calls machine_shutdown which calls smp_send_stop.

If pm_power_off is set we should never make it out machine_power_off
to the call of do_exit.  So pm_power_off must not be set in this case.
When pm_power_off is not set we expect machine_power_off to devolve
into machine_halt.

So how do we fix this?

Playing too much with smp_send_stop is dangerous because it
must also be safe to be called from panic.

It looks like the obviously correct fix is to only call
machine_shutdown when pm_power_off is defined.  Doing
that will make Andi's assumption about not scheduling
true and generally simplify what must be supported.

This turns machine_power_off into a noop like machine_halt
when pm_power_off is not defined.

If the expected behavior is that sys_reboot(LINUX_REBOOT_CMD_POWER_OFF)
becomes sys_reboot(LINUX_REBOOT_CMD_HALT) if pm_power_off is NULL
this is not quite a comprehensive fix as we pass a different parameter
to the reboot notifier and we set system_state to a different value
before calling device_shutdown().

Unfortunately any fix more comprehensive I can think of is not
obviously correct.  The core problem is that there is no architecture
independent way to detect if machine_power will become a noop, without
calling it.

Eric

diff --git a/arch/i386/kernel/reboot.c b/arch/i386/kernel/reboot.c
index 350ea66..2a8cb44 100644
--- a/arch/i386/kernel/reboot.c
+++ b/arch/i386/kernel/reboot.c
@@ -12,6 +12,7 @@
 #include <linux/efi.h>
 #include <linux/dmi.h>
 #include <linux/ctype.h>
+#include <linux/pm.h>
 #include <asm/uaccess.h>
 #include <asm/apic.h>
 #include <asm/desc.h>
@@ -347,10 +348,10 @@ void machine_halt(void)
 
 void machine_power_off(void)
 {
-	machine_shutdown();
-
-	if (pm_power_off)
+	if (pm_power_off) {
+		machine_shutdown();
 		pm_power_off();
+	}
 }
 
 
diff --git a/arch/x86_64/kernel/reboot.c b/arch/x86_64/kernel/reboot.c
index 75235ed..57117b8 100644
--- a/arch/x86_64/kernel/reboot.c
+++ b/arch/x86_64/kernel/reboot.c
@@ -6,6 +6,7 @@
 #include <linux/kernel.h>
 #include <linux/ctype.h>
 #include <linux/string.h>
+#include <linux/pm.h>
 #include <asm/io.h>
 #include <asm/kdebug.h>
 #include <asm/delay.h>
@@ -154,10 +155,11 @@ void machine_halt(void)
 
 void machine_power_off(void)
 {
-	if (!reboot_force) {
-		machine_shutdown();
-	}
-	if (pm_power_off)
+	if (pm_power_off) {
+		if (!reboot_force) {
+			machine_shutdown();
+		}
 		pm_power_off();
+	}
 }
 

  reply	other threads:[~2005-11-28  2:28 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-11-27 10:05 [PATCH] i386/x86_64: Don't IPI to offline cpus on shutdown Zwane Mwaikambo
2005-11-27 13:58 ` Andi Kleen
2005-11-27 14:14   ` Eric W. Biederman
2005-11-27 16:45     ` Zwane Mwaikambo
2005-11-28  2:27       ` Eric W. Biederman [this message]
2005-11-28  5:33         ` Andi Kleen
2005-11-28 17:29         ` Zwane Mwaikambo
     [not found]         ` <Pine.LNX.4.64.0512021221210.13220@montezuma.fsmlabs.com>
     [not found]           ` <m1iru7dlww.fsf@ebiederm.dsl.xmission.com>
     [not found]             ` <Pine.LNX.4.64.0512050014000.6637@montezuma.fsmlabs.com>
     [not found]               ` <m1zmncb0n5.fsf@ebiederm.dsl.xmission.com>
     [not found]                 ` <Pine.LNX.4.64.0512072158500.2557@montezuma.fsmlabs.com>
     [not found]                   ` <m1vey0azeu.fsf@ebiederm.dsl.xmission.com>
     [not found]                     ` <Pine.LNX.4.64.0512072249000.2557@montezuma.fsmlabs.com>
2005-12-08  7:39                       ` [PATCH] Don't attempt to power off if power off is not implemented Eric W. Biederman
1970-01-02  3:13                         ` Pavel Machek
2005-12-14 17:12                           ` Eric W. Biederman
2005-12-11 22:57                         ` Paul Jackson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=m1psolfqvt.fsf@ebiederm.dsl.xmission.com \
    --to=ebiederm@xmission.com \
    --cc=ak@suse.de \
    --cc=akpm@osdl.org \
    --cc=ashok.raj@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=shemminger@osdl.org \
    --cc=zwane@arm.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox