From: Nicholas Piggin <npiggin@gmail.com>
To: Christophe LEROY <christophe.leroy@c-s.fr>
Cc: Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com>,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v2 3/3] powerpc: machine check interrupt is a non-maskable interrupt
Date: Tue, 9 Oct 2018 15:30:58 +1000
Message-ID: <20181009153058.2564e7a1@roar.ozlabs.ibm.com>
In-Reply-To: <ccf61fc2-1f2e-7b67-c481-fa4c00a65aae@c-s.fr>

On Tue, 9 Oct 2018 06:46:30 +0200
Christophe LEROY <christophe.leroy@c-s.fr> wrote:
> On 09/10/2018 at 06:32, Nicholas Piggin wrote:
> > On Mon, 8 Oct 2018 17:39:11 +0200
> > Christophe LEROY <christophe.leroy@c-s.fr> wrote:
> >
> >> Hi Nick,
> >>
> >> On 19/07/2017 at 08:59, Nicholas Piggin wrote:
> >>> Use nmi_enter similarly to system reset interrupts. This uses the
> >>> printk NMI buffers and turns off various debugging facilities, which
> >>> helps avoid tripping on ourselves or other CPUs.
> >>>
> >>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> >>> ---
> >>> arch/powerpc/kernel/traps.c | 9 ++++++---
> >>> 1 file changed, 6 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> >>> index 2849c4f50324..6d31f9d7c333 100644
> >>> --- a/arch/powerpc/kernel/traps.c
> >>> +++ b/arch/powerpc/kernel/traps.c
> >>> @@ -789,8 +789,10 @@ int machine_check_generic(struct pt_regs *regs)
> >>>
> >>>  void machine_check_exception(struct pt_regs *regs)
> >>>  {
> >>> -	enum ctx_state prev_state = exception_enter();
> >>>  	int recover = 0;
> >>> +	bool nested = in_nmi();
> >>> +	if (!nested)
> >>> +		nmi_enter();
> >>
> >> This alters preempt_count, so when die() is called
> >> in_interrupt() returns true although the trap didn't happen in
> >> interrupt context, and oops_end() panics with "Fatal exception in
> >> interrupt" instead of gently sending SIGBUS to the faulting app.
> >
> > Thanks for tracking that down.
> >
> >> Any idea how to fix this?
> >
> > I would say we have to deliver the SIGBUS by hand.
> >
> > 	if ((user_mode(regs)))
> > 		_exception(SIGBUS, regs, BUS_MCEERR_AR, regs->nip);
> > 	else
> > 		die("Machine check", regs, SIGBUS);
> >
>
> And what about all the other things done by die()?
>
> And what if it is a kernel thread?
>
> In one of my boards, I have a kernel thread regularly checking the HW,
> and if it gets a machine check I expect it to gently stop and the die
> notification to be delivered to all registered notifiers.
>
> Before this patch, this was working well.

I guess the alternative is that we could check regs->trap for a machine
check in the die test. The complication is having to account for an MCE
taken in an interrupt handler.

	if (in_interrupt()) {
		if (!IS_MCHECK_EXC(regs) || (irq_count() - (NMI_OFFSET + HARDIRQ_OFFSET)))
			panic("Fatal exception in interrupt");
	}

Something like that might work for you? We need a ppc64 macro for the
MCE, and can probably add something like in_nmi_from_interrupt() for
the second part of the test.

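To make the arithmetic in that test concrete, here is a minimal userspace
sketch, not kernel code. Everything prefixed model_ or MODEL_ is
hypothetical and exists only for illustration. The preempt_count field
widths are assumed from include/linux/preempt.h of that era, nmi_enter()
is assumed to add NMI_OFFSET + HARDIRQ_OFFSET, and 0x200 is assumed as the
machine check vector that an IS_MCHECK_EXC()-style macro would match.

```c
#include <assert.h>
#include <stdbool.h>

/* Field widths: an assumption modelled on include/linux/preempt.h. */
#define PREEMPT_BITS	8
#define SOFTIRQ_BITS	8
#define HARDIRQ_BITS	4
#define NMI_BITS	1

#define SOFTIRQ_SHIFT	(PREEMPT_BITS)
#define HARDIRQ_SHIFT	(SOFTIRQ_SHIFT + SOFTIRQ_BITS)
#define NMI_SHIFT	(HARDIRQ_SHIFT + HARDIRQ_BITS)

#define SOFTIRQ_OFFSET	(1UL << SOFTIRQ_SHIFT)
#define HARDIRQ_OFFSET	(1UL << HARDIRQ_SHIFT)
#define NMI_OFFSET	(1UL << NMI_SHIFT)

#define SOFTIRQ_MASK	(((1UL << SOFTIRQ_BITS) - 1) << SOFTIRQ_SHIFT)
#define HARDIRQ_MASK	(((1UL << HARDIRQ_BITS) - 1) << HARDIRQ_SHIFT)
#define NMI_MASK	(((1UL << NMI_BITS) - 1) << NMI_SHIFT)

/* Hypothetical trap vectors used by the model. */
#define MODEL_TRAP_MCHECK	0x200UL
#define MODEL_TRAP_PROGRAM	0x700UL

struct model_regs {
	unsigned long trap;
};

/* Stand-in for the per-task preempt_count. */
static unsigned long model_preempt_count;

/* Same masking as irq_count(): hardirq, softirq and NMI fields only. */
static unsigned long model_irq_count(void)
{
	return model_preempt_count & (HARDIRQ_MASK | SOFTIRQ_MASK | NMI_MASK);
}

/*
 * Hypothetical in_nmi_from_interrupt(): true when we are in an NMI and
 * something beyond the NMI's own contribution is left in irq_count().
 */
static bool model_in_nmi_from_interrupt(void)
{
	if (!(model_preempt_count & NMI_MASK))
		return false;
	return (model_irq_count() - (NMI_OFFSET + HARDIRQ_OFFSET)) != 0;
}

/* The proposed die test: would we hit "Fatal exception in interrupt"? */
static bool model_die_would_panic(const struct model_regs *regs)
{
	if (!model_irq_count())		/* models !in_interrupt() */
		return false;
	return regs->trap != MODEL_TRAP_MCHECK || model_in_nmi_from_interrupt();
}

/* Simulate an exception with vector 'trap' arriving in context 'ctx'. */
static bool model_exception_panics(unsigned long ctx, unsigned long trap)
{
	struct model_regs regs = { .trap = trap };
	bool panics;

	model_preempt_count = ctx;
	if (trap == MODEL_TRAP_MCHECK)	/* models nmi_enter() in the handler */
		model_preempt_count += NMI_OFFSET + HARDIRQ_OFFSET;
	panics = model_die_would_panic(&regs);
	model_preempt_count = 0;
	return panics;
}
```

In this model, model_exception_panics(0, MODEL_TRAP_MCHECK) is false, so
an MCE taken from task context would fall through to the normal die path,
while model_exception_panics(HARDIRQ_OFFSET, MODEL_TRAP_MCHECK) is true
because the interrupted handler's HARDIRQ_OFFSET is still left over after
subtracting the NMI's own contribution.
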
Thanks,
Nick

Thread overview: 21+ messages
2017-07-19 6:59 [PATCH v2 0/3] machine check handling improvements Nicholas Piggin
2017-07-19 6:59 ` [PATCH v2 1/3] powerpc/powernv: handle the platform error reboot in ppc_md.restart Nicholas Piggin
2017-07-19 7:16 ` Nicholas Piggin
2017-07-20 5:39 ` Mahesh Jagannath Salgaonkar
2017-08-31 11:36 ` [v2, " Michael Ellerman
2017-07-19 6:59 ` [PATCH v2 2/3] powerpc/powernv: machine check use kernel crash path Nicholas Piggin
2017-07-20 7:14 ` Mahesh Jagannath Salgaonkar
2017-07-19 6:59 ` [PATCH v2 3/3] powerpc: machine check interrupt is a non-maskable interrupt Nicholas Piggin
2018-10-08 15:39 ` Christophe LEROY
2018-10-09 4:32 ` Nicholas Piggin
2018-10-09 4:46 ` Christophe LEROY
2018-10-09 5:30 ` Nicholas Piggin [this message]
2018-10-09 9:36 ` Christophe Leroy
2018-10-09 11:16 ` Nicholas Piggin
2018-10-09 12:01 ` Christophe LEROY
2018-10-09 12:14 ` Nicholas Piggin
2018-10-11 14:23 ` Christophe LEROY
2018-10-11 14:31 ` Christophe LEROY
2018-10-13 8:29 ` Christophe Leroy
2018-10-13 8:48 ` Nicholas Piggin
2018-10-13 8:56 ` Christophe LEROY