From: ebiederm@xmission.com (Eric W. Biederman)
To: Cliff Wickman <cpw@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>,
andi@firstfloor.org, tglx@linutronix.de,
linux-kernel@vger.kernel.org,
the arch/x86 maintainers <x86@kernel.org>
Subject: Re: [PATCH] X86: reboot-notify additions
Date: Thu, 19 Jun 2008 14:50:49 -0700 [thread overview]
Message-ID: <m1d4mdksau.fsf@frodo.ebiederm.org> (raw)
In-Reply-To: <20080619145453.GA11929@sgi.com> (Cliff Wickman's message of "Thu, 19 Jun 2008 09:54:53 -0500")
Cliff Wickman <cpw@sgi.com> writes:
> On Thu, Jun 19, 2008 at 01:02:14PM +0200, Ingo Molnar wrote:
>>
>> * Cliff Wickman <cpw@sgi.com> wrote:
>>
>> > From: Cliff Wickman <cpw@sgi.com>
>> >
>> > X86 reboot-notify additions.
>
> As Andi Kleen pointed out, this is not X86-specific. (it started out
> to be, but what I hoped to achieve for X86 turns out to be generic).
>
>> > This patch adds scans of the "reboot_notifier_list" callback chain in
>> > a three other places where the kernel is being stopped and/or restarted.
>> >
>> > Adds calls to blocking_notifier_call_chain() in:
>> > crash_kexec(), emergency_restart(), sys_kexec_load()
>> >
>> > In the crash_kexec() and emergency_restart() cases it is indicated to the
>> > called-back function that the system is not in a sane state, so that
>> > it can avoid taking a lock or some such potentially blocking action.
>> >
>> > These callbacks are important to a partition system. The stopped kernel
> needs
>> > to inform other partitions of their need to disconnect (stop sharing
> memory).
>> >
>> > Diffed against 2.6.26-rc6
>> >
>> > Signed-off-by: Cliff Wickman <cpw@sgi.com>
>> > ---
>> > include/linux/notifier.h | 4 ++++
>> > kernel/kexec.c | 5 +++++
>> > kernel/sys.c | 1 +
>> > 3 files changed, 10 insertions(+)
>> >
>> > Index: linux/include/linux/notifier.h
>> > ===================================================================
>> > --- linux.orig/include/linux/notifier.h
>> > +++ linux/include/linux/notifier.h
>> > @@ -202,6 +202,10 @@ static inline int notifier_to_errno(int
>> > #define SYS_RESTART SYS_DOWN
>> > #define SYS_HALT 0x0002 /* Notify of system halt */
>> > #define SYS_POWER_OFF 0x0003 /* Notify of system power off */
>> > +#define SYS_INSANE 0x0004 /* Notify of system error/panic/oops */
>> > +/* For the SYS_INSANE case, no locks should be taken by the called-back
>> > + * function. The kernel is ready for an immediate reboot.
>> > + */
>> >
>> > #define NETLINK_URELEASE 0x0001 /* Unicast netlink socket released */
>> >
>> > Index: linux/kernel/kexec.c
>> > ===================================================================
>> > --- linux.orig/kernel/kexec.c
>> > +++ linux/kernel/kexec.c
>> > @@ -1001,6 +1001,9 @@ asmlinkage long sys_kexec_load(unsigned
>> > if (result)
>> > goto out;
>> > }
>> > +
>> > + blocking_notifier_call_chain(&reboot_notifier_list, SYS_RESTART, NULL);
>> > +
>> > /* Install the new kernel, and Uninstall the old */
>> > image = xchg(dest_image, image);
>> >
>> > @@ -1068,6 +1071,8 @@ void crash_kexec(struct pt_regs *regs)
>> > if (!locked) {
>> > if (kexec_crash_image) {
>> > struct pt_regs fixed_regs;
>> > + blocking_notifier_call_chain(&reboot_notifier_list,
>> > + SYS_INSANE, NULL);
>> > crash_setup_regs(&fixed_regs, regs);
>> > crash_save_vmcoreinfo();
>> > machine_crash_shutdown(&fixed_regs);
>> > Index: linux/kernel/sys.c
>> > ===================================================================
>> > --- linux.orig/kernel/sys.c
>> > +++ linux/kernel/sys.c
>> > @@ -270,6 +270,7 @@ out_unlock:
>> > */
>> > void emergency_restart(void)
>> > {
>> > + blocking_notifier_call_chain(&reboot_notifier_list, SYS_INSANE, NULL);
>> > machine_emergency_restart();
>> > }
>> > EXPORT_SYMBOL_GPL(emergency_restart);
>>
>> i dont think this is a good idea. reboot_notifier_list is a blocking
>> notifier, i.e. it comes with a notifier->rwsem read-write mutex that is
>> taken when blocking_notifier_call_chain() is executed.
>>
>> i.e. this patch puts a sleeping mutex operation (a down_read()) into a
>> highly critical code path of the kernel. This will decrease the
>> reliability of the kernel.
>
> Andi pointed this out, too.
>
> For these emergency cases (I'll change "SYS_INSANE" to "SYS_EMERGENCY")
> I probably should be using raw_notifier_call_chain(), which requires
> a slightly different form of list header but doesn't try to protect
> against someone else adding to the notifier list.
>>
>> what exactly are you trying to achieve?
>>
>> Ingo
>
> The impetus for these additions is to call back a driver in every case
> that the kernel is going down. In a partitioned system we need such a
> driver to inform all other partitions that they need to disconnect from
> the rebooting/halting/panicing partition (kernel image). If they are not
> informed, they may bring themselves crashing down as well.
> (xpc is such a cross_partition driver)
Cool. Someone who wants this kind of functionality and has code in
the kernel. Perhaps we can have a reasonable discussion about this.
The last time this came up people wanted a hook so they could support
their out of tree blobs in an enterprise kernel.
emergency_restart only happens or only should happen with explicit admin
request Sysreq-r. Or when a watchdog detects the system is borked.
By design it is not expected to call drivers. The kexec on panic
case is similar.
sys_kexec_load is a ridiculous place to call any kind of reboot notifier
because it is a prep function that doesn't require any kind of connection
to rebooting.
As far as this being a generic problem I half agree. It seems to depend
on your platform if you need something like this.
With that said. I suggest we have a single platform specific function
that can be called. Possibly something like pm_power_off. It is
insanely important that we be able to audit all of the code on these
code paths, and that a random loadable driver not be able to get in
and mess things up.
Eric
next prev parent reply other threads:[~2008-06-19 22:01 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-06-18 22:03 [PATCH] X86: reboot-notify additions Cliff Wickman
2008-06-19 3:02 ` Andi Kleen
2008-06-19 11:02 ` Ingo Molnar
2008-06-19 14:54 ` Cliff Wickman
2008-06-19 21:50 ` Eric W. Biederman [this message]
2008-06-20 15:16 ` Cliff Wickman
2008-06-20 15:43 ` Ingo Molnar
2008-06-20 18:01 ` Eric W. Biederman
2008-06-20 19:18 ` Cliff Wickman
2008-06-20 21:33 ` Eric W. Biederman
2008-06-20 11:45 ` Ingo Molnar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=m1d4mdksau.fsf@frodo.ebiederm.org \
--to=ebiederm@xmission.com \
--cc=andi@firstfloor.org \
--cc=cpw@sgi.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox