From: Avi Kivity
Subject: Re: [PATCH] QEMU-KVM: MCE: Relay UCR MCE to guest
Date: Tue, 08 Sep 2009 09:44:40 +0300
Message-ID: <4AA5FD58.3020506@redhat.com>
References: <1252312353.14648.731.camel@yhuang-dev.sh.intel.com>
In-Reply-To: <1252312353.14648.731.camel@yhuang-dev.sh.intel.com>
To: Huang Ying
Cc: Andi Kleen, Anthony Liguori, "kvm@vger.kernel.org"
Sender: kvm-owner@vger.kernel.org

On 09/07/2009 11:32 AM, Huang Ying wrote:
> UCR (uncorrected recovery) MCE is supported in recent Intel CPUs,
> where some hardware error such as some memory error can be reported
> without PCC (processor context corrupted). To recover from such MCE,
> the corresponding memory will be unmapped, and all processes accessing
> the memory will be killed via SIGBUS.
>
> For KVM, if QEMU/KVM is killed, all guest processes will be killed
> too. So we relay SIGBUS from host OS to guest system via a UCR MCE
> injection. Then guest OS can isolate corresponding memory and kill
> necessary guest processes only. SIGBUS sent to main thread (not VCPU
> threads) will be broadcast to all VCPU threads as UCR MCE.
>

Won't the guest be confused by the broadcast? How does real hardware work?

>
> +static void sigbus_handler(int n, struct signalfd_siginfo *siginfo, void *ctx)
> +{
> +    if (siginfo->ssi_code == BUS_MCEERR_AO) {
> +        uint64_t status;
> +        unsigned long paddr;
> +        CPUState *cenv;
> +
> +        /* Hope we are lucky for AO MCE */
> +        if (kvm_addr_userspace_to_phys((unsigned long)siginfo->ssi_addr,
> +                                       &paddr)) {
> +            fprintf(stderr, "Hardware memory error for memory used by "
> +                    "QEMU itself instead of guest system!: %llx\n",
> +                    (unsigned long long)siginfo->ssi_addr);
> +            return;
> +        }
> +        status = MCI_STATUS_VAL | MCI_STATUS_UC | MCI_STATUS_EN
> +            | MCI_STATUS_MISCV | MCI_STATUS_ADDRV | MCI_STATUS_S
> +            | 0xc0;
> +        kvm_inject_x86_mce(first_cpu, 9, status,
> +                           MCG_STATUS_MCIP | MCG_STATUS_RIPV, paddr,
> +                           (MCM_ADDR_PHYS << 6) | 0xc);
>

This is a vcpu ioctl, yes? If so, it must be called from the vcpu thread.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
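
To illustrate the point about vcpu ioctls: one common way to satisfy a "must run on the vcpu thread" rule is for the signal-handling thread to hand the work to the target thread and wait for it to finish, rather than calling the ioctl directly. The following is only a minimal pthread sketch of that hand-off pattern, not QEMU's actual mechanism. Every name in it (struct vcpu_worker, run_on_vcpu, fake_inject_mce, relay_demo) is hypothetical, and fake_inject_mce merely records which thread ran it where a real handler would issue the ioctl. It also assumes a single requester at a time.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VCPU thread with a one-slot work queue. */
struct vcpu_worker {
    pthread_t thread;
    pthread_mutex_t lock;
    pthread_cond_t cond;
    void (*func)(void *);   /* pending work, NULL if none */
    void *arg;
    bool done;              /* set once the pending work has run */
    bool stop;              /* asks the worker loop to exit */
};

/* Body of the "vcpu" thread: run queued work here, never in the caller. */
static void *vcpu_worker_loop(void *opaque)
{
    struct vcpu_worker *w = opaque;

    pthread_mutex_lock(&w->lock);
    for (;;) {
        while (!w->func && !w->stop)
            pthread_cond_wait(&w->cond, &w->lock);
        if (w->func) {
            void (*func)(void *) = w->func;
            void *arg = w->arg;

            pthread_mutex_unlock(&w->lock);
            func(arg);                      /* executes on this thread */
            pthread_mutex_lock(&w->lock);
            w->func = NULL;
            w->done = true;
            pthread_cond_broadcast(&w->cond);
            continue;
        }
        break;                              /* stop requested, no work left */
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

/* Called from another thread (e.g. the one that received SIGBUS):
 * queue func and block until the worker thread has executed it. */
static void run_on_vcpu(struct vcpu_worker *w, void (*func)(void *), void *arg)
{
    pthread_mutex_lock(&w->lock);
    w->func = func;
    w->arg = arg;
    w->done = false;
    pthread_cond_broadcast(&w->cond);
    while (!w->done)
        pthread_cond_wait(&w->cond, &w->lock);
    pthread_mutex_unlock(&w->lock);
}

/* Placeholder for the real injection; only records the executing thread. */
struct inject_req {
    pthread_t ran_on;
};

static void fake_inject_mce(void *arg)
{
    struct inject_req *req = arg;
    req->ran_on = pthread_self();
}

/* Returns 1 if the work ran on the worker thread and not on the caller,
 * 0 if it ran elsewhere, -1 if the worker thread could not be created. */
static int relay_demo(void)
{
    struct vcpu_worker w = { .func = NULL };
    struct inject_req req;
    int ok;

    pthread_mutex_init(&w.lock, NULL);
    pthread_cond_init(&w.cond, NULL);
    if (pthread_create(&w.thread, NULL, vcpu_worker_loop, &w) != 0)
        return -1;

    run_on_vcpu(&w, fake_inject_mce, &req);
    ok = pthread_equal(req.ran_on, w.thread) &&
         !pthread_equal(req.ran_on, pthread_self());

    pthread_mutex_lock(&w.lock);
    w.stop = true;
    pthread_cond_broadcast(&w.cond);
    pthread_mutex_unlock(&w.lock);
    pthread_join(w.thread, NULL);
    pthread_cond_destroy(&w.cond);
    pthread_mutex_destroy(&w.lock);
    return ok;
}
```

Calling relay_demo() from a main() (built with -pthread) exercises the hand-off once. A broadcast to several vcpus would repeat run_on_vcpu for each worker.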