From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dean Nelson
Subject: Re: [patch uq/master 7/8] MCE: Relay UCR MCE to guest
Date: Fri, 08 Oct 2010 07:02:47 -0500
Message-ID: <4CAF0867.2060204@redhat.com>
References: <20101004185447.891324545@redhat.com> <20101004185715.167557459@redhat.com> <4CABD7CC.6030909@jp.fujitsu.com> <20101006160531.GB4277@amt.cnet> <4CACBB94.10200@redhat.com> <4CAD417B.7060808@jp.fujitsu.com> <1286507754.7768.66.camel@yhuang-dev>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Hidetoshi Seto, Marcelo Tosatti, "kvm@vger.kernel.org", "qemu-devel@nongnu.org"
To: Huang Ying
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:36098 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753211Ab0JHMCx (ORCPT ); Fri, 8 Oct 2010 08:02:53 -0400
In-Reply-To: <1286507754.7768.66.camel@yhuang-dev>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 10/07/2010 10:15 PM, Huang Ying wrote:
> Hi, Seto,
>
> On Thu, 2010-10-07 at 11:41 +0800, Hidetoshi Seto wrote:
>> (2010/10/07 3:10), Dean Nelson wrote:
>>> When I applied a patch to the guest's kernel which forces mce_ser to be
>>> set, as if MCG_SER_P was set (see __mcheck_cpu_cap_init()), I found
>>> that when the memory page was 'owned' by a guest process, the process
>>> would be killed (if the page was dirty), and the guest would stay
>>> running. The HWPoisoned page would be sidelined and not cause any more
>>> issues.
>>
>> Excellent.
>> So while guest kernel knows which page is poisoned, guest processes
>> are controlled not to touch the page.
>>
>> ... Therefore rebooting the vm and renewing kernel will lost the
>> information where is poisoned.
>
> Yes. That is an issue. Dean suggests that make qemu-kvm to refuse reboot
> the guest if there is poisoned page and ask for user to intervention. I
> have another idea to replace the poison pages with good pages when
> reboot, that is, recover without user intervention.

Hi, Huang,

I much prefer the replacing of the poisoned pages with good pages on
reboot, over the refusing to reboot. So definitely go with your idea.

Thanks,
Dean