From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alexander Graf
Subject: Re: [PATCH v2] KVM: PPC: Book3S HV: Handle guest-caused machine checks on POWER7 without panicking
Date: Tue, 27 Nov 2012 00:20:08 +0100
Message-ID:
References: <20121122092442.GA31117@bloggs.ozlabs.ibm.com> <20121122092555.GB31117@bloggs.ozlabs.ibm.com> <5E924DFD-5B00-4B3C-9933-575665FBD8B4@suse.de> <20121124083750.GD23537@bloggs.ozlabs.ibm.com> <2951D66E-5A98-455A-92FD-808C2083987E@suse.de> <20121126231813.GE3370@bloggs.ozlabs.ibm.com>
Mime-Version: 1.0 (Apple Message framework v1278)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8BIT
Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
To: Paul Mackerras
Return-path:
In-Reply-To: <20121126231813.GE3370@bloggs.ozlabs.ibm.com>
Sender: kvm-ppc-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org

On 27.11.2012, at 00:18, Paul Mackerras wrote:

> On Tue, Nov 27, 2012 at 12:16:28AM +0100, Alexander Graf wrote:
>>
>> On 24.11.2012, at 09:37, Paul Mackerras wrote:
>>
>>> Currently, if a machine check interrupt happens while we are in the
>>> guest, we exit the guest and call the host's machine check handler,
>>> which tends to cause the host to panic. Some machine checks can be
>>> triggered by the guest; for example, if the guest creates two entries
>>> in the SLB that map the same effective address, and then accesses that
>>> effective address, the CPU will take a machine check interrupt.
>>>
>>> To handle this better, when a machine check happens inside the guest,
>>> we call a new function, kvmppc_realmode_machine_check(), while still in
>>> real mode before exiting the guest. On POWER7, it handles the cases
>>> that the guest can trigger, either by flushing and reloading the SLB,
>>> or by flushing the TLB, and then it delivers the machine check interrupt
>>> directly to the guest without going back to the host.
>>> On POWER7, the
>>> OPAL firmware patches the machine check interrupt vector so that it
>>> gets control first, and it leaves behind its analysis of the situation
>>> in a structure pointed to by the opal_mc_evt field of the paca. The
>>> kvmppc_realmode_machine_check() function looks at this, and if OPAL
>>> reports that there was no error, or that it has handled the error, we
>>> also go straight back to the guest with a machine check. We have to
>>> deliver a machine check to the guest since the machine check interrupt
>>> might have trashed valid values in SRR0/1.
>>>
>>> If the machine check is one we can't handle in real mode, and one that
>>> OPAL hasn't already handled, or on PPC970, we exit the guest and call
>>> the host's machine check handler. We do this by jumping to the
>>> machine_check_fwnmi label, rather than absolute address 0x200, because
>>> we don't want to re-execute OPAL's handler on POWER7. On PPC970, the
>>> two are equivalent because address 0x200 just contains a branch.
>>>
>>> Then, if the host machine check handler decides that the system can
>>> continue executing, kvmppc_handle_exit() delivers a machine check
>>> interrupt to the guest -- once again to let the guest know that SRR0/1
>>> have been modified.
>>>
>>> Signed-off-by: Paul Mackerras
>>
>> Thanks for the semantic explanations :). From that POV things are clear and good with me now.
>> That leaves only checkpatch ;)
>>
>>
>> WARNING: please, no space before tabs
>> #142: FILE: arch/powerpc/kvm/book3s_hv_ras.c:21:
>> +#define SRR1_MC_IFETCH_SLBMULTI ^I3^I/* SLB multi-hit */$
>>
>> WARNING: please, no space before tabs
>> #143: FILE: arch/powerpc/kvm/book3s_hv_ras.c:22:
>> +#define SRR1_MC_IFETCH_SLBPARMULTI ^I4^I/* SLB parity + multi-hit */$
>>
>> WARNING: min() should probably be min_t(u32, slb->persistent, SLB_MIN_SIZE)
>> #168: FILE: arch/powerpc/kvm/book3s_hv_ras.c:47:
>> + n = min(slb->persistent, (u32) SLB_MIN_SIZE);
>>
>> total: 0 errors, 3 warnings, 357 lines checked
>
> Phooey. Do you want me to resubmit the patch, or will you fix it up?

Hrm. Promise to run checkpatch yourself next time and I'll fix it up for you this time ;)

Alex
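
For reference, the third checkpatch warning asks for the cast to move out of the argument and into the macro. The following is only a user-space sketch of that idea, not the kernel's code: sketch_min_t, struct sketch_slb_shadow and sketch_slb_reload_count are hypothetical names, the macro is a simplified stand-in for the kernel's min_t(), and the value given for SLB_MIN_SIZE is an assumption.

```c
#include <stdint.h>

/* Stand-in for the kernel constant; the value here is an assumption. */
#define SLB_MIN_SIZE 32

/*
 * Simplified stand-in for the kernel's min_t(): cast both operands to one
 * named type, then compare. Unlike the real macro, this one evaluates its
 * arguments more than once, so only pass side-effect-free expressions.
 */
#define sketch_min_t(type, x, y) \
	((type)(x) < (type)(y) ? (type)(x) : (type)(y))

/* Hypothetical, cut-down stand-in for the SLB shadow structure. */
struct sketch_slb_shadow {
	uint32_t persistent;	/* number of persistent SLB entries */
};

/* Number of entries to consider, capped at SLB_MIN_SIZE. */
static uint32_t sketch_slb_reload_count(const struct sketch_slb_shadow *slb)
{
	/* Instead of min(slb->persistent, (u32) SLB_MIN_SIZE). */
	return sketch_min_t(uint32_t, slb->persistent, SLB_MIN_SIZE);
}
```

Naming the type once in the macro keeps both operands at the same width and signedness without a hand-written cast on one side, which is what the warning is nudging towards.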