From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752743AbbCWNvA (ORCPT <rfc822;w@1wt.eu>);
	Mon, 23 Mar 2015 09:51:00 -0400
Received: from mail-wi0-f173.google.com ([209.85.212.173]:37115 "EHLO
	mail-wi0-f173.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752631AbbCWNuv (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Mon, 23 Mar 2015 09:50:51 -0400
Date: Mon, 23 Mar 2015 14:50:46 +0100
From: Ingo Molnar <mingo@kernel.org>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>,
        =?utf-8?B?IkhhdGF5YW1hLCBEYWlzdWtlL+eVkeWxsSDlpKfovJQi?= 
	<d.hatayama@jp.fujitsu.com>,
        ebiederm@xmission.com, masami.hiramatsu.pt@hitachi.com,
        hidehiro.kawai.ez@hitachi.com, linux-kernel@vger.kernel.org,
        kexec@lists.infradead.org, akpm@linux-foundation.org, mingo@redhat.com,
        bp@suse.de
Subject: Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers"
 option issue in oops path
Message-ID: <20150323135046.GA25012@gmail.com>
References: <54F9D645.2050008@jp.fujitsu.com>
 <20150323034752.GD2068@dhcp-16-105.nay.redhat.com>
 <20150323071943.GA22765@gmail.com>
 <20150323133710.GA3172@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <20150323133710.GA3172@redhat.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org


* Vivek Goyal <vgoyal@redhat.com> wrote:

> On Mon, Mar 23, 2015 at 08:19:43AM +0100, Ingo Molnar wrote:
> > 
> > * Baoquan He <bhe@redhat.com> wrote:
> > 
> > > CC more people ...
> > > 
> > > On 03/07/15 at 01:31am, "Hatayama, Daisuke/畑山 大輔" wrote:
> > > > The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced
> > > > "crash_kexec_post_notifiers" kernel boot option, which toggles
> > > > whether panic() calls crash_kexec() before panic_notifiers and dump
> > > > kmsg or after.
> > > > 
> > > > The problem is that the commit overlooks panic_on_oops kernel boot
> > > > option. If it is enabled, crash_kexec() is called directly without
> > > > going through panic() in oops path.
> > > > 
> > > > To fix this issue, this patch adds a check to
> > > > "crash_kexec_post_notifiers" in the condition of kexec_should_crash().
> > > > 
> > > > Also, put a comment in kexec_should_crash() to explain the
> > > > non-obvious parts of this patch.
> > > > 
> > > > Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
> > > > Acked-by: Baoquan He <bhe@redhat.com>
> > > > Tested-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
> > > > Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> > > > ---
> > > >  include/linux/kernel.h |  3 +++
> > > >  kernel/kexec.c         | 11 +++++++++++
> > > >  kernel/panic.c         |  2 +-
> > > >  3 files changed, 15 insertions(+), 1 deletion(-)
> > 
> > This is hack upon hack, but why was this crap merged in the first 
> > place?
> > 
> > I see two problems just by cursory review:
> > 
> > 1)
> > 
> > Firstly, the real bug in:
> > 
> >   f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers")
> > 
> > Was that crash_kexec() was called unconditionally after notifiers were 
> > called, which should be fixed via the simple patch below (untested). 
> > Looks much simpler than your fix.
> > 
> 
> Hi Ingo,
> 
> Agreed. Your patch looks good.

In case you want that simpler fix and need my SOB:

  Signed-off-by: Ingo Molnar <mingo@kernel.org>

(but I have not tested it.)
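
For readers without the patch at hand, the ordering being discussed can
be modelled in user space roughly as below. This is only a toy sketch,
not the patch itself: model_panic(), model_crash_kexec(), model_notifiers()
and the "fixed" flag are hypothetical names, and it assumes the fix makes
the second crash_kexec() call conditional on crash_kexec_post_notifiers.
It also assumes crash_kexec() returns (for example because no crash
kernel is loaded), so that both call sites can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy user-space model; none of this is kernel code. */
static char trace[64];

/* Append one step marker to the trace of calls made so far. */
static void record(const char *step)
{
	assert(strlen(trace) + strlen(step) < sizeof(trace));
	strncat(trace, step, sizeof(trace) - strlen(trace) - 1);
}

/* Hypothetical stand-ins for crash_kexec() and the panic notifiers. */
static void model_crash_kexec(void) { record("K"); }
static void model_notifiers(void)   { record("N"); }

/*
 * Model of the ordering inside panic().
 * post_notifiers: stands for the crash_kexec_post_notifiers boot option.
 * fixed:          false = second call is unconditional,
 *                 true  = second call only if post_notifiers is set
 *                         (assumed shape of the simpler fix).
 */
static const char *model_panic(bool post_notifiers, bool fixed)
{
	trace[0] = '\0';

	if (!post_notifiers)
		model_crash_kexec();	/* default: kexec before notifiers */

	model_notifiers();

	if (!fixed || post_notifiers)
		model_crash_kexec();	/* kexec after notifiers */

	return trace;
}
```

With the option off, model_panic(false, false) records "KNK" (a second,
unintended call after the notifiers), while model_panic(false, true)
records "KN". With the option on, both variants record "NK".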

> > Secondly, and more importantly, the whole premise of commit 
> > f06e5153f4ae is broken IMHO:
> > 
> >  "This can help rare situations where kdump fails because of unstable
> >   crashed kernel or hardware failure (memory corruption on critical
> >   data/code)"
> > 
> > wtf?
> > 
> > If the kernel crashed due to a kernel bug, then the kernel booting 
> > up in whatever hardware state should be able to do a clean bootup. The 
> > fix for those 'rare situations' should be to fix the real bug (for 
> > example by making hardware driver init (or deinit) sequences more 
> > robust), not to paper it over by ordering around crash-time sequences 
> > ...
> > 
> > If it crashed due to some hardware failure, there's literally an 
> > infinite number of failure modes that may or may not be impacted by 
> > kexec crash-time handling ordering. We don't want to put a zillion 
> > such flags into the kernel proper just to allow the perturbation of 
> > the kernel.
> 
> I think one of the motivations behind this patch was the call to kmsg_dump().
> Some vendors have been wanting to have the capability to save kernel logs
> to some NVRAM before transition to second kernel happens. Their argument
> is that kdump does not succeed all the time and if kdump does not succeed
> then at least they have something to work with (kernel logs retrieved
> from pstore interface).

Doesn't pstore attach itself to printk directly? AFAICS it does:

 fs/pstore/platform.c:   register_console(&pstore_console);

so the printk log leading up to and including the crash should be 
available, regardless of this patch. What am I missing?
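
The reasoning here is that a registered console sees every message as it
is printed, so nothing has to run at panic time for the log to be
captured. A toy user-space model of that idea, under stated assumptions:
model_register_console(), model_printk() and persistent_console are
hypothetical names that only imitate the shape of register_console(), and
the real pstore console frontend may depend on kernel configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CONSOLES 4

/* Toy console: just a write callback (not the kernel's struct console). */
struct model_console {
	void (*write)(const char *msg);
};

static struct model_console *consoles[MAX_CONSOLES];
static size_t nr_consoles;

/* Add a console to the list; returns 0 on success, -1 if the list is full. */
static int model_register_console(struct model_console *con)
{
	assert(con && con->write);
	if (nr_consoles == MAX_CONSOLES)
		return -1;
	consoles[nr_consoles++] = con;
	return 0;
}

/* Every message is forwarded to every registered console as it is printed. */
static void model_printk(const char *msg)
{
	for (size_t i = 0; i < nr_consoles; i++)
		consoles[i]->write(msg);
}

/* Stand-in for a persistent store such as NVRAM. */
static char persistent_log[128];

static void persistent_write(const char *msg)
{
	strncat(persistent_log, msg,
		sizeof(persistent_log) - strlen(persistent_log) - 1);
}

static struct model_console persistent_console = {
	.write = persistent_write,
};
```

Once persistent_console is registered, each model_printk() call lands in
persistent_log immediately, which is why the messages leading up to a
crash would already be stored before any panic-time hook runs.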

> Not that I agree fully with this as problem might happen while we 
> try to run panic_notifiers or kmsg_dump hooks and never transition 
> into kdump kernel.

btw., this is the big problem with 'notifiers' in general: they are 
opaque with barely any semantics defined, and a source of constant 
confusion.

> And some developers have literally been pushing for years to allow 
> panic notifiers to run before crash_kexec(). 
> Eric Biederman has been pushing back saying it reduces the 
> reliability of kdump operation so this is not acceptable.

So what do those notifiers do?

Thanks,

	Ingo