From: Daniel Axtens <dja@axtens.net>
To: Aravinda Prasad <aravinda@linux.vnet.ibm.com>,
paulus@ozlabs.org, linuxppc-dev@ozlabs.org,
kvm-ppc@vger.kernel.org
Cc: mahesh@linux.vnet.ibm.com, david@gibson.dropbear.id.au,
agraf@suse.de, michaele@au1.ibm.com, kvm@vger.kernel.org
Subject: Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception
Date: Thu, 12 Nov 2015 13:24:19 +1100 [thread overview]
Message-ID: <876118ymy4.fsf@gamma.ozlabs.ibm.com> (raw)
In-Reply-To: <20151111165845.3721.98296.stgit@aravindap>
[-- Attachment #1: Type: text/plain, Size: 5571 bytes --]
Aravinda Prasad <aravinda@linux.vnet.ibm.com> writes:
> This patch modifies KVM to cause a guest exit with
> KVM_EXIT_NMI instead of immediately delivering a 0x200
> interrupt to guest upon machine check exception in
> guest address. Exiting the guest enables QEMU to build
> error log and deliver machine check exception to guest
> OS (either via guest OS registered machine check
> handler or via 0x200 guest OS interrupt vector).
>
> This approach simplifies the delivering of machine
> check exception to guest OS compared to the earlier approach
> of KVM directly invoking 0x200 guest interrupt vector.
> In the earlier approach QEMU patched the 0x200 interrupt
> vector during boot. The patched code at 0x200 issued a
> private hcall to pass the control to QEMU to build the
> error log.
>
> This design/approach is based on the feedback for the
> QEMU patches to handle machine check exception. Details
> of earlier approach of handling machine check exception
> in QEMU and related discussions can be found at:
>
> https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html
I've poked at the MCE code, but not the KVM MCE code, so I may be
mistaken here, but I'm not clear on how this handles errors that the
guest can recover without terminating.
For example, a Linux guest can handle a UE in guest userspace by killing
the guest process. A hypthetical non-linux guest with a microkernel
could even survive UEs in drivers.
It sounds from your patch like you're changing this behaviour. Is this
right?
Regards,
Daniel
>
> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
> ---
> arch/powerpc/kvm/book3s_hv.c | 12 +++---------
> arch/powerpc/kvm/book3s_hv_rmhandlers.S | 31 +++++++++++--------------------
> 2 files changed, 14 insertions(+), 29 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 2280497..1b1dff0 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -859,15 +859,9 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu,
> r = RESUME_GUEST;
> break;
> case BOOK3S_INTERRUPT_MACHINE_CHECK:
> - /*
> - * Deliver a machine check interrupt to the guest.
> - * We have to do this, even if the host has handled the
> - * machine check, because machine checks use SRR0/1 and
> - * the interrupt might have trashed guest state in them.
> - */
> - kvmppc_book3s_queue_irqprio(vcpu,
> - BOOK3S_INTERRUPT_MACHINE_CHECK);
> - r = RESUME_GUEST;
> + /* Exit to guest with KVM_EXIT_NMI as exit reason */
> + run->exit_reason = KVM_EXIT_NMI;
> + r = RESUME_HOST;
> break;
> case BOOK3S_INTERRUPT_PROGRAM:
> {
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index b98889e..672b4f6 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -147,7 +147,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
> addi r1, r1, 112
> ld r7, HSTATE_HOST_MSR(r13)
>
> - cmpwi cr1, r12, BOOK3S_INTERRUPT_MACHINE_CHECK
> cmpwi r12, BOOK3S_INTERRUPT_EXTERNAL
> beq 11f
> cmpwi cr2, r12, BOOK3S_INTERRUPT_HMI
> @@ -160,7 +159,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
> mtmsrd r6, 1 /* Clear RI in MSR */
> mtsrr0 r8
> mtsrr1 r7
> - beq cr1, 13f /* machine check */
> RFI
>
> /* On POWER7, we have external interrupts set to use HSRR0/1 */
> @@ -168,8 +166,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
> mtspr SPRN_HSRR1, r7
> ba 0x500
>
> -13: b machine_check_fwnmi
> -
> 14: mtspr SPRN_HSRR0, r8
> mtspr SPRN_HSRR1, r7
> b hmi_exception_after_realmode
> @@ -2381,24 +2377,19 @@ machine_check_realmode:
> ld r9, HSTATE_KVM_VCPU(r13)
> li r12, BOOK3S_INTERRUPT_MACHINE_CHECK
> /*
> - * Deliver unhandled/fatal (e.g. UE) MCE errors to guest through
> - * machine check interrupt (set HSRR0 to 0x200). And for handled
> - * errors (no-fatal), just go back to guest execution with current
> - * HSRR0 instead of exiting guest. This new approach will inject
> - * machine check to guest for fatal error causing guest to crash.
> - *
> - * The old code used to return to host for unhandled errors which
> - * was causing guest to hang with soft lockups inside guest and
> - * makes it difficult to recover guest instance.
> + * Deliver unhandled/fatal (e.g. UE) MCE errors to guest
> + * by exiting the guest with KVM_EXIT_NMI exit reason (exit
> + * reason set later based on trap). For handled errors
> + * (no-fatal), go back to guest execution with current HSRR0
> + * instead of exiting the guest. This approach will cause
> + * the guest to exit in case of fatal machine check error.
> */
> - ld r10, VCPU_PC(r9)
> + bne 2f /* Continue guest execution? */
> + /* If not, exit the guest. SRR0/1 are already set */
> + b mc_cont
> +2: ld r10, VCPU_PC(r9)
> ld r11, VCPU_MSR(r9)
> - bne 2f /* Continue guest execution. */
> - /* If not, deliver a machine check. SRR0/1 are already set */
> - li r10, BOOK3S_INTERRUPT_MACHINE_CHECK
> - ld r11, VCPU_MSR(r9)
> - bl kvmppc_msr_interrupt
> -2: b fast_interrupt_c_return
> + b fast_interrupt_c_return
>
> /*
> * Check the reason we woke from nap, and take appropriate action.
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 859 bytes --]
next prev parent reply other threads:[~2015-11-12 2:24 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-11-11 16:58 [PATCH] KVM: PPC: Exit guest upon fatal machine check exception Aravinda Prasad
2015-11-12 2:24 ` Daniel Axtens [this message]
2015-11-12 3:38 ` David Gibson
2015-11-12 4:32 ` Aravinda Prasad
2015-11-12 4:43 ` David Gibson
2015-11-12 17:52 ` Aravinda Prasad
2015-11-13 1:50 ` David Gibson
2015-11-13 6:26 ` Aravinda Prasad
2015-11-13 7:38 ` Thomas Huth
2015-11-13 11:25 ` Aravinda Prasad
2015-11-12 4:58 ` Daniel Axtens
2015-11-12 17:22 ` Aravinda Prasad
2015-11-12 21:37 ` Daniel Axtens
2015-11-13 4:58 ` Aravinda Prasad
2015-11-12 3:34 ` David Gibson
2015-11-12 5:18 ` Aravinda Prasad
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=876118ymy4.fsf@gamma.ozlabs.ibm.com \
--to=dja@axtens.net \
--cc=agraf@suse.de \
--cc=aravinda@linux.vnet.ibm.com \
--cc=david@gibson.dropbear.id.au \
--cc=kvm-ppc@vger.kernel.org \
--cc=kvm@vger.kernel.org \
--cc=linuxppc-dev@ozlabs.org \
--cc=mahesh@linux.vnet.ibm.com \
--cc=michaele@au1.ibm.com \
--cc=paulus@ozlabs.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).