From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michael Ellerman
Subject: Re: [v2,1/2] KVM: PPC: Book3S HV: Simplify machine check handling
Date: Fri, 22 Feb 2019 20:48:07 +1100 (AEDT)
Message-ID: <445RNR3184z9sP2@ozlabs.org>
References: <20190221023849.7zra6dhii6fele6i@oak.ozlabs.ibm.com>
To: Paul Mackerras, linuxppc-dev@ozlabs.org, kvm@vger.kernel.org, kvm-ppc@vger.kernel.org
In-Reply-To: <20190221023849.7zra6dhii6fele6i@oak.ozlabs.ibm.com>
Errors-To: linuxppc-dev-bounces+glppe-linuxppc-embedded-2=m.gmane.org@lists.ozlabs.org
Sender: "Linuxppc-dev"
List-Id: kvm.vger.kernel.org

On Thu, 2019-02-21 at 02:38:49 UTC, Paul Mackerras wrote:
> This makes the handling of machine check interrupts that occur inside
> a guest simpler and more robust, with less done in assembler code and
> in real mode.
>
> Now, when a machine check occurs inside a guest, we always get the
> machine check event struct and put a copy in the vcpu struct for the
> vcpu where the machine check occurred. We no longer call
> machine_check_queue_event() from kvmppc_realmode_mc_power7(), because
> on POWER8, when a vcpu is running on an offline secondary thread and
> we call machine_check_queue_event(), that calls irq_work_queue(),
> which doesn't work because the CPU is offline, but instead triggers
> the WARN_ON(lazy_irq_pending()) in pnv_smp_cpu_kill_self() (which
> fires again and again because nothing clears the condition).
>
> All that machine_check_queue_event() actually does is to cause the
> event to be printed to the console. For a machine check occurring in
> the guest, we now print the event in kvmppc_handle_exit_hv()
> instead.
>
> The assembly code at label machine_check_realmode now just calls C
> code and then continues exiting the guest. We no longer either
> synthesize a machine check for the guest in assembly code or return
> to the guest without a machine check.
>
> The code in kvmppc_handle_exit_hv() is extended to handle the case
> where the guest is not FWNMI-capable. In that case we now always
> synthesize a machine check interrupt for the guest. Previously, if
> the host thinks it has recovered the machine check fully, it would
> return to the guest without any notification that the machine check
> had occurred. If the machine check was caused by some action of the
> guest (such as creating duplicate SLB entries), it is much better to
> tell the guest that it has caused a problem. Therefore we now always
> generate a machine check interrupt for guests that are not
> FWNMI-capable.
>
> Reviewed-by: Aravinda Prasad
> Reviewed-by: Mahesh Salgaonkar
> Signed-off-by: Paul Mackerras

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/884dfb722db899e36d8c382783347aab

cheers
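
For readers skimming the archive, the policy change for guests that are not FWNMI-capable can be sketched as a small standalone C model. This is not the kernel code and is not taken from the patch: every name below (`struct mc_exit`, `mc_policy_before`, `mc_policy_after`, the `mc_action` values) is hypothetical. The FWNMI-capable path is deliberately left unmodelled, and the "before" behaviour for an unrecovered machine check is an assumption inferred from the quoted message.

```c
#include <stdbool.h>

/* Hypothetical model of the policy in the quoted commit message.
 * None of these names exist in the kernel. */
enum mc_action {
    MC_RESUME_SILENTLY,  /* re-enter the guest with no notification */
    MC_DELIVER_TO_GUEST, /* synthesize a machine check interrupt for the guest */
    MC_NOT_MODELLED      /* FWNMI-capable guest: path not covered by this sketch */
};

struct mc_exit {
    bool guest_fwnmi_capable; /* assumed flag: guest is FWNMI-capable */
    bool host_recovered;      /* assumed flag: host thinks it recovered fully */
};

/* Behaviour before the patch, as described in the message: a fully
 * recovered machine check was hidden from a non-FWNMI guest. Delivering
 * the interrupt in the unrecovered case is an assumption. */
static enum mc_action mc_policy_before(const struct mc_exit *e)
{
    if (e->guest_fwnmi_capable)
        return MC_NOT_MODELLED;
    return e->host_recovered ? MC_RESUME_SILENTLY : MC_DELIVER_TO_GUEST;
}

/* Behaviour after the patch: a non-FWNMI guest is always told, whether or
 * not the host thinks it recovered. */
static enum mc_action mc_policy_after(const struct mc_exit *e)
{
    if (e->guest_fwnmi_capable)
        return MC_NOT_MODELLED;
    return MC_DELIVER_TO_GUEST;
}
```

The only input on which the two functions differ is a non-FWNMI guest with `host_recovered` set, which is the case the message singles out (for example, a guest creating duplicate SLB entries).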