From: Ganesh <ganeshgr@linux.ibm.com>
To: Nicholas Piggin <npiggin@gmail.com>, linuxppc-dev@lists.ozlabs.org
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Subject: Re: [PATCH v1] powerpc/64s: Fix unrecoverable MCE crash
Date: Thu, 23 Sep 2021 23:52:16 +0530
Message-ID: <de062f8e-e99b-04ec-5d9d-0c31d3cd4c2a@linux.ibm.com>
In-Reply-To: <20210922020247.209409-1-npiggin@gmail.com>
On 9/22/21 7:32 AM, Nicholas Piggin wrote:
> The machine check handler is not considered NMI on 64s. The early
> handler is the true NMI handler, and then it schedules the
> machine_check_exception handler to run when interrupts are enabled.
>
> This works fine except the case of an unrecoverable MCE, where the true
> NMI is taken when MSR[RI] is clear, it can not recover to schedule the
> next handler, so it calls machine_check_exception directly so something
> might be done about it.
>
> Calling an async handler from NMI context can result in irq state and
> other things getting corrupted. This can also trigger the BUG at
> arch/powerpc/include/asm/interrupt.h:168.
>
> Fix this by just making the 64s machine_check_exception handler an NMI
> like it is on other subarchs.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
Hi Nick,
If I inject a control memory access error in an LPAR on top of this patch
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210906084303.183921-1-ganeshgr@linux.ibm.com/
I see the following warning trace:
WARNING: CPU: 130 PID: 7122 at arch/powerpc/include/asm/interrupt.h:319 machine_check_exception+0x310/0x340
Modules linked in:
CPU: 130 PID: 7122 Comm: inj_access_err Kdump: loaded Tainted: G M 5.15.0-rc2-cma-00054-g4a0d59fbaf71-dirty #22
NIP: c00000000002f980 LR: c00000000002f7e8 CTR: c000000000a31860
REGS: c0000039fe51bb20 TRAP: 0700 Tainted: G M (5.15.0-rc2-cma-00054-g4a0d59fbaf71-dirty)
MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 88000222 XER: 20040000
CFAR: c00000000002f844 IRQMASK: 0
GPR00: c00000000002f798 c0000039fe51bdc0 c0000000020d0000 0000000000000001
GPR04: 0000000000000000 4000000000000002 4000000000000000 00000000000019af
GPR08: 00000077e5ad0000 0000000000000000 c0000077ee16c700 0000000000000080
GPR12: 0000000088000222 c0000077ee16c700 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 0000000000000000 0000000000000000 c0000000020fecd8 0000000000000000
GPR28: 0000000000000000 0000000000000001 0000000000000001 c0000039fe51be80
NIP [c00000000002f980] machine_check_exception+0x310/0x340
LR [c00000000002f7e8] machine_check_exception+0x178/0x340
Call Trace:
[c0000039fe51bdc0] [c00000000002f798] machine_check_exception+0x128/0x340 (unreliable)
[c0000039fe51be10] [c0000000000086ec] machine_check_common+0x1ac/0x1b0
--- interrupt: 200 at 0x10000968
NIP: 0000000010000968 LR: 0000000010000958 CTR: 0000000000000000
REGS: c0000039fe51be80 TRAP: 0200 Tainted: G M (5.15.0-rc2-cma-00054-g4a0d59fbaf71-dirty)
MSR: 8000000002a0f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 22000824 XER: 00000000
CFAR: 000000000000021c DAR: 00007fffb00c0000 DSISR: 02000008 IRQMASK: 0
GPR00: 0000000022000824 00007fffc9647770 0000000010027f00 00007fffb00c0000
GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR08: 0000000000000000 00007fffb00c0000 0000000000000001 0000000000000000
GPR12: 0000000000000000 00007fffb015a330 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 0000000000000000 0000000000000000 0000000000000000 000000001000085c
GPR28: 00007fffc9647d18 0000000000000001 00000000100009b0 00007fffc9647770
NIP [0000000010000968] 0x10000968
LR [0000000010000958] 0x10000958
--- interrupt: 200
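
For reference when reading the two MSR lines in the trace above, here is a
minimal sketch that decodes a raw 64s MSR value into flag names. It is not
kernel code: the helper name decode_msr and the MSR_BITS table are made up
for illustration, and the table covers only the flags that appear in this
trace plus HV. The bit positions are taken from the MSR_*_LG definitions in
arch/powerpc/include/asm/reg.h. Set bits that are not in the table are
silently ignored.

```python
# Illustrative decoder for the MSR values printed in the trace.
# Assumption: bit positions match the MSR_*_LG definitions in
# arch/powerpc/include/asm/reg.h; only a subset of flags is listed.
MSR_BITS = [
    ("SF", 63),   # 64-bit mode
    ("HV", 60),   # hypervisor state
    ("VEC", 25),  # AltiVec available
    ("VSX", 23),  # VSX available
    ("EE", 15),   # external interrupts enabled
    ("PR", 14),   # problem state (userspace)
    ("FP", 13),   # floating point available
    ("ME", 12),   # machine check enabled
    ("IR", 5),    # instruction relocation
    ("DR", 4),    # data relocation
    ("RI", 1),    # recoverable interrupt
    ("LE", 0),    # little endian
]


def decode_msr(value):
    """Return the names of the known flags set in a raw MSR value."""
    return [name for name, bit in MSR_BITS if value & (1 << bit)]


if __name__ == "__main__":
    # The two MSR values from the trace: the WARNING frame and the
    # interrupted userspace frame.
    for msr in (0x8000000000029033, 0x8000000002A0F033):
        print(f"{msr:016x} <{','.join(decode_msr(msr))}>")
```

Both values decode to the flag lists printed in the trace, and RI is set in
both frames.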