From: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
To: Nicholas Piggin <npiggin@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v3 0/4] Fixes for 3 separate NMI reentrancy bugs
Date: Tue, 26 Feb 2019 12:21:10 +0530
Message-ID: <20190226065110.GA13164@sathnaga86.in.ibm.com>
In-Reply-To: <20190226060901.18715-1-npiggin@gmail.com>

On Tue, Feb 26, 2019 at 04:08:57PM +1000, Nicholas Piggin wrote:
> This series fixes several similar but unrelated bugs with NMIs
> clobbering live registers without noticing it, because MSR[RI] is set.
> Pretty rare bugs, but serious silent corruption consequences.
>
> For the most part these can be observed and tested quite easily
> with the mambo simulator, except that it does not seem to follow
> the architecture wrt leaving MSR[RI] unchanged for HV interrupts.
> Mambo clears MSR[RI], so you have to account for that manually.
>
> Since v1:
> - Fixed several build bugs.
>
> Since v2:
> - Improved changelog and comments.
> - Fixed the NIA test for virt mode interrupts.

I hit the crash below on a Power8 box, with this series built on the linuxppc merge branch using `ppc64le_defconfig`:

UnknownStateTransition: Something happened system state="8" and we transitioned to UNKNOWN state. Review the following for more details
Message="OpTestSystem in run_IPLing and Exception="Kernel OOPS (machine in state '5'): Oops: Kernel access of bad area, sig: 11 [#1]
[ 0.000000] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7-gf46b87021 #1
[ 0.000000] NIP: c000000000c1306c LR: c000000000c12f64 CTR: c00000000033d860
[ 0.000000] REGS: c0000000014878b0 TRAP: 0380 Not tainted (5.0.0-rc7-gf46b87021)
[ 0.000000] MSR: 9000000000001033 <SF,HV,ME,IR,DR,RI,LE> CR: 28002224 XER: 00000000
[ 0.000000] CFAR: c000000000c12f7c IRQMASK: 1
[ 0.000000] GPR00: c000000000c12f64 c000000001487b40 c000000001488400 f000000000000000
[ 0.000000] GPR04: c000000001487b18 c000000001487b20 0000000000000000 c000000001388400
[ 0.000000] GPR08: f000000000000000 f000000000000008 0000000000000000 0000000800000000
[ 0.000000] GPR12: c0000000015e1ed0 c000000001670000 0000000000000000 0000000000000000
[ 0.000000] GPR16: 0000000000000000 0000000000000000 c0000000015e0d40 0000000000000001
[ 0.000000] GPR20: ffffffffffffffff ffffffffffffffff 0000000008000000 c000000001413b90
[ 0.000000] GPR24: c000000001413b98 007ffff000000000 0000000000080000 0000000000000000
[ 0.000000] GPR28: 0000000000000000 0000000000000000 007ffff000001000 0000000000000000
[ 0.000000] NIP [c000000000c1306c] memmap_init_zone+0x258/0x308
[ 0.000000] LR [c000000000c12f64] memmap_init_zone+0x150/0x308
[ 0.000000] Call Trace:
[ 0.000000] [c000000001487b40] [c000000000c12f64] memmap_init_zone+0x150/0x308 (unreliable)
[ 0.000000] [c000000001487be0] [c000000000f87acc] free_area_init_node+0x480/0x518
[ 0.000000] [c000000001487cf0] [c000000000f88630] free_area_init_nodes+0x838/0x940
[ 0.000000] [c000000001487e10] [c000000000f6340c] paging_init+0x8c/0xa8
[ 0.000000] [c000000001487e80] [c000000000f5bc00] setup_arch+0x3b4/0x3f0
[ 0.000000] [c000000001487ef0] [c000000000f53b68] start_kernel+0x94/0x630
[ 0.000000] [c000000001487f90] [c00000000000b37c] start_here_common+0x1c/0x520
[ 0.000000] Instruction dump:
[ 0.000000] 71290002 41820014 ebea0008 7cc6fa14 78df8402 48000070 3d22000c 7bea3664
[ 0.000000] 39299d20 e9090000 7c685214 39230008 <fa290010> fa290018 fa290020 fa290030
[ 0.000000] random: get_random_bytes called from print_oops_end_marker+0x40/0x80 with crng_init=0
[ 0.000000] ---[ end trace 0000000000000000 ]---
[ 0.000000]
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.000000] Rebooting in 10 seconds" caused the system to go to UNKNOWN_BAD and the system will be stopping."
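
For reference, the MSR value in the register dump above can be decoded by hand to confirm the flag list the kernel printed, including RI. The following is only a minimal sketch: the helper name `decode_msr` and the small bit table are hypothetical, the table covers just a handful of MSR bits, and none of this is kernel code.

```python
# Hypothetical helper: decode a few 64-bit PowerPC MSR flags from a raw value.
# The table is deliberately partial (an assumption for illustration); it lists
# only the bits needed to read the oops above, plus EE and PR.
MSR_BITS = [
    ("SF", 63),  # 64-bit mode
    ("HV", 60),  # hypervisor state
    ("EE", 15),  # external interrupts enabled
    ("PR", 14),  # problem (user) state
    ("ME", 12),  # machine check enable
    ("IR", 5),   # instruction relocation
    ("DR", 4),   # data relocation
    ("RI", 1),   # recoverable interrupt
    ("LE", 0),   # little-endian
]


def decode_msr(msr):
    """Return the names of the known flags set in msr, highest bit first."""
    return [name for name, bit in MSR_BITS if msr & (1 << bit)]


if __name__ == "__main__":
    # MSR value taken from the oops in this mail.
    print(",".join(decode_msr(0x9000000000001033)))
```

Run against the value from the oops, this should list the same flags the kernel printed, with neither EE nor PR set.
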
Regards,
-Satheesh.
>
> Nicholas Piggin (4):
>   powerpc/64s: Fix HV NMI vs HV interrupt recoverability test
>   powerpc/64s: system reset interrupt preserve HSRRs
>   powerpc/64s: Prepare to handle data interrupts vs d-side MCE
>     reentrancy
>   powerpc/64s: Fix data interrupts vs d-side MCE reentrancy
>
>  arch/powerpc/include/asm/asm-prototypes.h |  8 ++
>  arch/powerpc/include/asm/nmi.h            |  2 +
>  arch/powerpc/kernel/exceptions-64s.S      | 92 +++++++++++++++++++----
>  arch/powerpc/kernel/mce.c                 |  3 +
>  arch/powerpc/kernel/traps.c               | 91 +++++++++++++++++++++-
>  5 files changed, 179 insertions(+), 17 deletions(-)
>
> --
> 2.18.0
>