From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=35462 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1P4Bf4-0000iU-W2
	for qemu-devel@nongnu.org; Fri, 08 Oct 2010 08:02:56 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1P4Bf3-0002rB-I2
	for qemu-devel@nongnu.org; Fri, 08 Oct 2010 08:02:54 -0400
Received: from mx1.redhat.com ([209.132.183.28]:40522)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from ) id 1P4Bf3-0002r4-Bs
	for qemu-devel@nongnu.org; Fri, 08 Oct 2010 08:02:53 -0400
Message-ID: <4CAF0867.2060204@redhat.com>
Date: Fri, 08 Oct 2010 07:02:47 -0500
From: Dean Nelson
MIME-Version: 1.0
References: <20101004185447.891324545@redhat.com>
	<20101004185715.167557459@redhat.com>
	<4CABD7CC.6030909@jp.fujitsu.com>
	<20101006160531.GB4277@amt.cnet>
	<4CACBB94.10200@redhat.com>
	<4CAD417B.7060808@jp.fujitsu.com>
	<1286507754.7768.66.camel@yhuang-dev>
In-Reply-To: <1286507754.7768.66.camel@yhuang-dev>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: [Qemu-devel] Re: [patch uq/master 7/8] MCE: Relay UCR MCE to guest
List-Id: qemu-devel.nongnu.org
To: Huang Ying
Cc: Hidetoshi Seto, Marcelo Tosatti, "qemu-devel@nongnu.org", "kvm@vger.kernel.org"

On 10/07/2010 10:15 PM, Huang Ying wrote:
> Hi, Seto,
>
> On Thu, 2010-10-07 at 11:41 +0800, Hidetoshi Seto wrote:
>> (2010/10/07 3:10), Dean Nelson wrote:
>>> When I applied a patch to the guest's kernel which forces mce_ser to be
>>> set, as if MCG_SER_P was set (see __mcheck_cpu_cap_init()), I found
>>> that when the memory page was 'owned' by a guest process, the process
>>> would be killed (if the page was dirty), and the guest would stay
>>> running. The HWPoisoned page would be sidelined and not cause any more
>>> issues.
>>
>> Excellent.
>> So while guest kernel knows which page is poisoned, guest processes
>> are controlled not to touch the page.
>>
>> ... Therefore rebooting the vm and renewing kernel will lost the
>> information where is poisoned.
>
> Yes. That is an issue. Dean suggests that make qemu-kvm to refuse reboot
> the guest if there is poisoned page and ask for user to intervention. I
> have another idea to replace the poison pages with good pages when
> reboot, that is, recover without user intervention.

Hi, Huang,

I much prefer replacing the poisoned pages with good pages on reboot
over refusing to reboot. So definitely go with your idea.

Thanks,
Dean