All of lore.kernel.org
 help / color / mirror / Atom feed
From: Cliff Wickman <cpw@sgi.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>,
	andi@firstfloor.org, tglx@linutronix.de,
	linux-kernel@vger.kernel.org,
	the arch/x86 maintainers <x86@kernel.org>
Subject: Re: [PATCH] X86: reboot-notify additions
Date: Fri, 20 Jun 2008 10:16:50 -0500	[thread overview]
Message-ID: <20080620151650.GA5606@sgi.com> (raw)
In-Reply-To: <m1d4mdksau.fsf@frodo.ebiederm.org>

On Thu, Jun 19, 2008 at 02:50:49PM -0700, Eric W. Biederman wrote:
> Cliff Wickman <cpw@sgi.com> writes:
> 
> >> > This patch adds scans of the "reboot_notifier_list" callback chain in
> >> > a three other places where the kernel is being stopped and/or restarted.
> >> > 
> >> > Adds calls to blocking_notifier_call_chain() in:
> >> >    crash_kexec(), emergency_restart(), sys_kexec_load()
> >> > 
> >> > In the crash_kexec() and emergency_restart() cases it is indicated to the
> >> > called-back function that the system is not in a sane state, so that
> >> > it can avoid taking a lock or some such potentially blocking action.
> >> > 
> >> > These callbacks are important to a partition system. The stopped kernel
> > needs
> >> > to inform other partitions of their need to disconnect (stop sharing
> > memory).
> >> > 
> >> > Index: linux/kernel/kexec.c
> >> > ===================================================================
> >> > --- linux.orig/kernel/kexec.c
> >> > +++ linux/kernel/kexec.c
> >> > @@ -1001,6 +1001,9 @@ asmlinkage long sys_kexec_load(unsigned 
> >> >  		if (result)
> >> >  			goto out;
> >> >  	}
> >> > +
> >> > +	blocking_notifier_call_chain(&reboot_notifier_list, SYS_RESTART, NULL);
> >> > +
> >> >  	/* Install the new kernel, and  Uninstall the old */
> >> >  	image = xchg(dest_image, image);
> >> >  

I withdraw the above chunk.  As Eric pointed out, sys_kexec_load() is
not where the new kernel is started.  That is done in kernel_kexec()
which already runs the reboot_notifier_list.

> >> i dont think this is a good idea. reboot_notifier_list is a blocking 
> >> notifier, i.e. it comes with a notifier->rwsem read-write mutex that is 
> >> taken when blocking_notifier_call_chain() is executed.
> >> 
> >> i.e. this patch puts a sleeping mutex operation (a down_read()) into a 
> >> highly critical code path of the kernel. This will decrease the 
> >> reliability of the kernel.
> >
> > Andi pointed this out, too.
> >
> > For these emergency cases (I'll change "SYS_INSANE" to "SYS_EMERGENCY")
> > I probably should be using raw_notifier_call_chain(), which requires
> > a slightly different form of list header but doesn't try to protect
> > against someone else adding to the notifier list.
> >> 
> >> what exactly are you trying to achieve?
> >> 
> >> 	Ingo
> >
> > The impetus for these additions is to call back a driver in every case
> > that the kernel is going down.  In a partitioned system we need such a
> > driver to inform all other partitions that they need to disconnect from
> > the rebooting/halting/panicing partition (kernel image).  If they are not
> > informed, they may bring themselves crashing down as well.
> > (xpc is such a cross_partition driver)
> 
> Cool.  Someone who wants this kind of functionality and has code in
> the kernel.  Perhaps we can have a reasonable discussion about this.
> The last time this came up people wanted a hook so they could support
> their out of tree blobs in an enterprise kernel.
> 
> emergency_restart only happens or only should happen with explicit admin
> request Sysreq-r.  Or when a watchdog detects the system is borked.
> By design it is not expected to call drivers.  The kexec on panic
> case is similar.

I suppose one could trust that someone with superuser permission would
not stop one partition of a multi-partitioned system in a cavalier manner.
I'm inclined to think we should run the reboot_notifier_list even in those
situations.

But definitely on some watchdog timeout event.  Some kind of mechanism
should be invoked to communicate the stoppage.
 
> sys_kexec_load is a ridiculous place to call any kind of reboot notifier
> because it is a prep function that doesn't require any kind of connection
> to rebooting.

agree
done
 
> As far as this being a generic problem I half agree.  It seems to depend
> on your platform if you need something like this.
> 
> With that said.  I suggest we have a single platform specific function 
> that can be called.  Possibly something like pm_power_off.  It is
> insanely important that we be able to audit all of the code on these
> code paths, and that a random loadable driver not be able to get in
> and mess things up.

The panic() function has the panic_notifier_list for those cases where
crash_kexec() does not find a crash kernel to exec.

That leaves holes for watchdog-type events and crash_kexec().
Can you elborate on the problem with running a non-blocking scan of 
the reboot_notifier_list in those situations?

What do you have in mind as a platform specific function, that would
be an improvement over the reboot_notifier_list?




My current (v2) proposed patch for using the reboot_notifier_list as
this mechanism looks like this:
(and I'm not sure if using atomic_notifier_call_chain() might be a better
 alternative to raw_notifier_call_chain())


Subject: [PATCHv2] reboot-notify additions

reboot-notify additions

This patch adds scans of the "reboot_notifier_list" callback chain in
the remaining places where the kernel is being stopped and/or restarted.

Adds 2 calls to raw_notifier_call_chain() in:
   crash_kexec(), emergency_restart()


Diffed against 2.6.26-rc6

Signed-off-by: Cliff Wickman <cpw@sgi.com>
---
 include/linux/notifier.h |    5 +++++
 kernel/kexec.c           |    6 ++++++
 kernel/sys.c             |    7 +++++++
 3 files changed, 18 insertions(+)

Index: linux/include/linux/notifier.h
===================================================================
--- linux.orig/include/linux/notifier.h
+++ linux/include/linux/notifier.h
@@ -202,6 +202,11 @@ static inline int notifier_to_errno(int 
 #define SYS_RESTART	SYS_DOWN
 #define SYS_HALT	0x0002	/* Notify of system halt */
 #define SYS_POWER_OFF	0x0003	/* Notify of system power off */
+#define SYS_EMERGENCY	0x0004	/* Notify of system error/panic/oops */
+/*
+ * For the SYS_EMERGENCY case, no locks should be taken by the called-back
+ * function.
+ */
 
 #define NETLINK_URELEASE	0x0001	/* Unicast netlink socket released */
 
Index: linux/kernel/kexec.c
===================================================================
--- linux.orig/kernel/kexec.c
+++ linux/kernel/kexec.c
@@ -1063,11 +1063,17 @@ void crash_kexec(struct pt_regs *regs)
 	 * If the crash kernel was not located in a fixed area
 	 * of memory the xchg(&kexec_crash_image) would be
 	 * sufficient.  But since I reuse the memory...
+	 *
+	 * The reboot_notifier_list uses a header for a blocking-form scan.
+	 * Use a local header suitable for a non-blocking scan.
 	 */
 	locked = xchg(&kexec_lock, 1);
 	if (!locked) {
 		if (kexec_crash_image) {
 			struct pt_regs fixed_regs;
+        		struct raw_notifier_head rh;
+        		rh.head = reboot_notifier_list.head;
+        		raw_notifier_call_chain(&rh, SYS_EMERGENCY, NULL);
 			crash_setup_regs(&fixed_regs, regs);
 			crash_save_vmcoreinfo();
 			machine_crash_shutdown(&fixed_regs);
Index: linux/kernel/sys.c
===================================================================
--- linux.orig/kernel/sys.c
+++ linux/kernel/sys.c
@@ -267,9 +267,16 @@ out_unlock:
  *	reboot the system.  This is called when we know we are in
  *	trouble so this is our best effort to reboot.  This is
  *	safe to call in interrupt context.
+ *
+ *	The reboot_notifier_list uses a header for a blocking-form scan.
+ *	Use a local header suitable for a non-blocking scan.
  */
 void emergency_restart(void)
 {
+	struct raw_notifier_head rh;
+
+	rh.head = reboot_notifier_list.head;
+	raw_notifier_call_chain(&rh, SYS_EMERGENCY, NULL);
 	machine_emergency_restart();
 }
 EXPORT_SYMBOL_GPL(emergency_restart);

-- 
Cliff Wickman
Silicon Graphics, Inc.
cpw@sgi.com
(651) 683-3824

  reply	other threads:[~2008-06-20 15:17 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-06-18 22:03 [PATCH] X86: reboot-notify additions Cliff Wickman
2008-06-19  3:02 ` Andi Kleen
2008-06-19 11:02 ` Ingo Molnar
2008-06-19 14:54   ` Cliff Wickman
2008-06-19 21:50     ` Eric W. Biederman
2008-06-20 15:16       ` Cliff Wickman [this message]
2008-06-20 15:43         ` Ingo Molnar
2008-06-20 18:01         ` Eric W. Biederman
2008-06-20 19:18           ` Cliff Wickman
2008-06-20 21:33             ` Eric W. Biederman
2008-06-20 11:45     ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080620151650.GA5606@sgi.com \
    --to=cpw@sgi.com \
    --cc=andi@firstfloor.org \
    --cc=ebiederm@xmission.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.