From: Balbir Singh <bsingharora@gmail.com>
To: Michael Neuling <mikey@neuling.org>
Cc: mpe@ellerman.id.au, Vipin K Parashar <vipin@linux.vnet.ibm.com>,
	Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>,
	linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH] powernv: Avoid checkstop on HMI and MCE
Date: Wed, 25 Oct 2017 23:56:48 +1100
Message-ID: <20171025235648.1152bff3@MiWiFi-R3-srv>
In-Reply-To: <20171024092005.3861-1-mikey@neuling.org>

On Tue, 24 Oct 2017 20:20:05 +1100
Michael Neuling <mikey@neuling.org> wrote:

> On an unrecoverable HMI or MCE only generate a checkstop (via
> PLATFORM ERROR opal reboot call) when panic_on_oops is set.
> 
> We currently generate a checkstop as an attempt for the FSP to grab a
> dump and then reboot us. Unfortunately this never works and no one
> I've talked to has ever seen a resulting dump, let alone got useful
> information from it.
> 
> Even worse, the checkstop gets in the way of debugging real
> problems. If we hit a software bug that results in this, we get no
> opportunity to debug it live. Similarly if the bug is due to hardware
> that is not in the dump (say PCI or NVLINK GPU), we get no information
> in the dump about that hardware.
> 
> So let's remove it unless someone sets panic_on_oops.
> 
> Signed-off-by: Michael Neuling <mikey@neuling.org>
> ---
>  arch/powerpc/platforms/powernv/opal-hmi.c | 6 ++++++
>  arch/powerpc/platforms/powernv/opal.c     | 4 ++++
>  2 files changed, 10 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/powernv/opal-hmi.c b/arch/powerpc/platforms/powernv/opal-hmi.c
> index c9e1a4ff29..23780970d0 100644
> --- a/arch/powerpc/platforms/powernv/opal-hmi.c
> +++ b/arch/powerpc/platforms/powernv/opal-hmi.c
> @@ -29,6 +29,7 @@
>  #include <asm/opal.h>
>  #include <asm/cputable.h>
>  #include <asm/machdep.h>
> +#include <asm/bug.h>
>  
>  #include "powernv.h"
>  
> @@ -284,6 +285,11 @@ static void hmi_event_handler(struct work_struct *work)
>  			print_hmi_event_info(hmi_evt);
>  		}
>  
> +		if (!panic_on_oops) {
> +			die("Unrecoverable HMI exception", NULL, SIGBUS);
> +			return;
> +		}
> +

If panic_on_oops is set, we checkstop, not panic! Passing NULL to die() will
cause arch_uprobe_exception_notify() to complain.
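
To make the two points above concrete, here is a rough userspace sketch of
the control flow in the quoted hunk. It is not kernel code: mock_regs,
mock_exception_notify(), mock_die() and handle_unrecoverable() are all
made-up stand-ins, and the notifier only assumes that a NULL regs pointer is
treated as a caller bug and warned about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; none of these are the kernel's real definitions. */
struct mock_regs { unsigned long nip; };

enum outcome { OUT_DIE, OUT_CHECKSTOP };

static int warnings;	/* counts complaints raised by the mock notifier */

/* Assumed behaviour: a die notifier that treats NULL regs as a caller bug. */
static void mock_exception_notify(const struct mock_regs *regs)
{
	if (regs == NULL) {
		warnings++;
		fprintf(stderr, "WARN: die notifier called with NULL regs\n");
	}
}

/* Stand-in for die(): runs the notifier, then reports the oops. */
static void mock_die(const char *msg, const struct mock_regs *regs)
{
	mock_exception_notify(regs);
	fprintf(stderr, "Oops: %s\n", msg);
}

/* Same shape as the quoted hunk: die() unless panic_on_oops is set. */
static enum outcome handle_unrecoverable(int panic_on_oops,
					 const struct mock_regs *regs)
{
	if (!panic_on_oops) {
		mock_die("Unrecoverable HMI exception", regs);
		return OUT_DIE;
	}
	/* Falls through to the pre-existing path, i.e. the checkstop. */
	return OUT_CHECKSTOP;
}
```

Calling handle_unrecoverable(0, NULL) takes the die path and bumps the
warning counter, while handle_unrecoverable(1, NULL) never reaches a panic
and ends up on the checkstop path.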

We could respin this a bit, and I can send an updated patch if there is interest.


Balbir Singh.
