* [RESEND PATCH] For machine check occurring while in guest, KVM layer tries recovery
From: Mahesh J Salgaonkar @ 2015-03-17 9:27 UTC
To: linuxppc-dev, Paul Mackerras, Michael Ellerman
Cc: David Gibson, Alexander Graf
From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
For a machine check that occurs while in the guest, the KVM layer attempts
recovery and delivers an MCE to the guest if recovery fails. For recovered
errors we simply resume the guest. However, an MCE may hit the guest with
MSR(RI=0), which means the interrupt is not recoverable: the guest cannot
continue normally and should take its panic path. The current implementation
does not check for MSR(RI=0), so the guest can crash with "Bad kernel stack
pointer" instead of a machine check oops message:
[26281.490060] Bad kernel stack pointer 3fff9ccce5b0 at c00000000000490c
[26281.490434] Oops: Bad kernel stack pointer, sig: 6 [#1]
[26281.490472] SMP NR_CPUS=2048 NUMA pSeries
This patch fixes the issue by checking for MSR(RI=0) in the KVM layer and
forwarding the unrecoverable interrupt to the guest, which then panics with
a proper machine check Oops message.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index bb94e6f..258f46d 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -2063,7 +2063,6 @@ machine_check_realmode:
mr r3, r9 /* get vcpu pointer */
bl kvmppc_realmode_machine_check
nop
- cmpdi r3, 0 /* Did we handle MCE ? */
ld r9, HSTATE_KVM_VCPU(r13)
li r12, BOOK3S_INTERRUPT_MACHINE_CHECK
/*
@@ -2076,13 +2075,18 @@ machine_check_realmode:
* The old code used to return to host for unhandled errors which
* was causing guest to hang with soft lockups inside guest and
* makes it difficult to recover guest instance.
+ *
+ * if we receive machine check with MSR(RI=0) then deliver it to
+ * guest as machine check causing guest to crash.
*/
- ld r10, VCPU_PC(r9)
ld r11, VCPU_MSR(r9)
+ andi. r10, r11, MSR_RI /* check for unrecoverable exception */
+ beq 1f /* Deliver a machine check to guest */
+ ld r10, VCPU_PC(r9)
+ cmpdi r3, 0 /* Did we handle MCE ? */
bne 2f /* Continue guest execution. */
/* If not, deliver a machine check. SRR0/1 are already set */
- li r10, BOOK3S_INTERRUPT_MACHINE_CHECK
- ld r11, VCPU_MSR(r9)
+1: li r10, BOOK3S_INTERRUPT_MACHINE_CHECK
bl kvmppc_msr_interrupt
2: b fast_interrupt_c_return
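
For readers less familiar with the assembly, the branch order in the patched path can be modelled roughly in C as below. This is only an illustrative sketch, not kernel code: the enum and the function name mce_decide are hypothetical, `handled` stands for a non-zero return value in r3 from kvmppc_realmode_machine_check, and MSR_RI is assumed to be the 0x2 bit of the guest MSR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of the MSR[RI] (recoverable interrupt) bit. */
#define MSR_RI 0x2ULL

/* Hypothetical names for the two outcomes of the patched path. */
enum mce_action {
	MCE_RESUME_GUEST,     /* "bne 2f": continue guest execution */
	MCE_DELIVER_TO_GUEST  /* label 1: inject a machine check into the guest */
};

/*
 * Model of the decision made after the real-mode handler returns.
 * guest_msr: the MSR value loaded from VCPU_MSR.
 * handled:   true if the real-mode handler reported the MCE as handled.
 */
static enum mce_action mce_decide(uint64_t guest_msr, bool handled)
{
	/* RI clear: unrecoverable, always deliver, even if "handled". */
	if (!(guest_msr & MSR_RI))
		return MCE_DELIVER_TO_GUEST;

	/* RI set: resume only if the handler recovered the error. */
	return handled ? MCE_RESUME_GUEST : MCE_DELIVER_TO_GUEST;
}
```

The point of the ordering is that the MSR_RI test comes before the "did we handle it" test, so a handled error with RI=0 is still delivered to the guest.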
* Re: [RESEND PATCH] For machine check occurring while in guest, KVM layer tries recovery
From: Paul Mackerras @ 2015-03-23 3:32 UTC
To: Mahesh J Salgaonkar; +Cc: linuxppc-dev, David Gibson, Alexander Graf
On Tue, Mar 17, 2015 at 02:57:48PM +0530, Mahesh J Salgaonkar wrote:
> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>
> and deliver MCE to guest if recovery is failed. For recovered errors
> we just go back to normal functioning of guest. But there are cases
> where we may hit MCE in guest with MSR(RI=0), which means MCE interrupt is
> not recoverable and guest can not function normally it should go down to
> panic path. The current implementation does not have check for MSR(RI=0)
> which can cause guest to crash with Bad kernel stack pointer instead of
> machine check oops message.
>
> [26281.490060] Bad kernel stack pointer 3fff9ccce5b0 at c00000000000490c
> [26281.490434] Oops: Bad kernel stack pointer, sig: 6 [#1]
> [26281.490472] SMP NR_CPUS=2048 NUMA pSeries
>
> This patch fixes this issue by checking MSR(RI=0) in KVM layer and forwarding
> unrecoverable interrupt to guest which then panics with proper machine check
> Oops message.
>
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> ---
> arch/powerpc/kvm/book3s_hv_rmhandlers.S | 12 ++++++++----
> 1 file changed, 8 insertions(+), 4 deletions(-)
The patch itself is fine, but you need a proper headline (something
like "KVM: PPC: Book3S HV: Inform guest of unrecoverable machine
checks" perhaps) as the subject of the email, and you need to post the
patch to both the kvm@vger.kernel.org list and the
kvm-ppc@vger.kernel.org list. Also, the English in the patch
description could use some improvement.
Acked-by: Paul Mackerras <paulus@samba.org>