From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=49216 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1PnRoA-0007Wj-N1
	for qemu-devel@nongnu.org; Thu, 10 Feb 2011 03:23:23 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	id 1PnRo4-0004DO-SD
	for qemu-devel@nongnu.org; Thu, 10 Feb 2011 03:23:18 -0500
Received: from fmmailgate01.web.de ([217.72.192.221]:55713)
	by eggs.gnu.org with esmtp (Exim 4.71) id 1PnRo4-0004Cj-Jg
	for qemu-devel@nongnu.org; Thu, 10 Feb 2011 03:23:16 -0500
Message-ID: <4D53A032.9000702@web.de>
Date: Thu, 10 Feb 2011 09:22:10 +0100
From: Jan Kiszka
MIME-Version: 1.0
References: <1297220431.5180.15.camel@yhuang-dev> <4D52498D.9060706@web.de> <1297297678.17407.3.camel@yhuang-dev>
In-Reply-To: <1297297678.17407.3.camel@yhuang-dev>
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature";
	boundary="------------enig297243E7C850959C9C61C1CD"
Sender: jan.kiszka@web.de
Subject: [Qemu-devel] Re: [PATCH uq/master -v2 2/2] KVM, MCE, unpoison memory address across reboot
List-Id: qemu-devel.nongnu.org
To: Huang Ying
Cc: "kvm@vger.kernel.org", Dean Nelson, Marcelo Tosatti,
	"qemu-devel@nongnu.org", Anthony Liguori, Andi Kleen, Avi Kivity

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enig297243E7C850959C9C61C1CD
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On 2011-02-10 01:27, Huang Ying wrote:
> On Wed, 2011-02-09 at 16:00 +0800, Jan Kiszka wrote:
>> On 2011-02-09 04:00, Huang Ying wrote:
>>> In Linux kernel HWPoison processing implementation, the virtual
>>> address in processes mapping the error physical memory page is marked
>>> as HWPoison. So that, the further accessing to the virtual
>>> address will kill corresponding processes with SIGBUS.
>>>
>>> If the error physical memory page is used by a KVM guest, the SIGBUS
>>> will be sent to QEMU, and QEMU will simulate a MCE to report that
>>> memory error to the guest OS. If the guest OS can not recover from
>>> the error (for example, the page is accessed by kernel code), guest OS
>>> will reboot the system. But because the underlying host virtual
>>> address backing the guest physical memory is still poisoned, if the
>>> guest system accesses the corresponding guest physical memory even
>>> after rebooting, the SIGBUS will still be sent to QEMU and MCE will be
>>> simulated. That is, guest system can not recover via rebooting.
>>
>> Yeah, saw this already during my test...
>>
>>>
>>> In fact, across rebooting, the contents of guest physical memory page
>>> need not to be kept. We can allocate a new host physical page to
>>> back the corresponding guest physical address.
>>
>> I just wondering what would be architecturally suboptimal if we simply
>> remapped on SIGBUS directly. Would save us at least the bookkeeping.
>
> Because we can not change the content of memory silently during guest OS
> running, this may corrupts guest OS data structure and even ruins disk
> contents. But during rebooting, all guest OS state are discarded.

I was not talking about remapping more than just the pages that became
inaccessible, just like you do now. But I guess the problem is rather
that insane guests continuing to access those pages before reboot should
also still receive MCEs.
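
For illustration only, the bookkeeping variant discussed above could look
roughly like the sketch below. This is not the code from the patch: the
names record_poisoned_page() and remap_poisoned_pages() and the fixed-size
table are hypothetical, and it assumes guest RAM is backed by private
anonymous memory. On SIGBUS only the page address is remembered, so the
page stays poisoned and further guest accesses keep raising MCEs; at reset
each recorded page is replaced in place with a fresh zero-filled one.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAX_POISONED 64  /* hypothetical limit, chosen for the sketch */

static void *poisoned_pages[MAX_POISONED];
static size_t poisoned_count;

/* SIGBUS path: remember the page-aligned host address, do not remap yet. */
int record_poisoned_page(void *addr)
{
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0 || poisoned_count == MAX_POISONED) {
        return -1;
    }
    poisoned_pages[poisoned_count++] =
        (void *)((uintptr_t)addr & ~((uintptr_t)page - 1));
    return 0;
}

/* Reset path: map a new anonymous page over each recorded address. */
int remap_poisoned_pages(void)
{
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0) {
        return -1;
    }
    for (size_t i = 0; i < poisoned_count; i++) {
        /* MAP_FIXED replaces the old mapping at exactly this address. */
        void *p = mmap(poisoned_pages[i], (size_t)page,
                       PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
        if (p == MAP_FAILED) {
            return -1;
        }
    }
    poisoned_count = 0;  /* all recorded pages are usable again */
    return 0;
}
```

Remapping directly in the SIGBUS path would drop the table, but it would
also hand the running guest a silently zeroed page, which is the concern
raised in the quoted reply.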

Jan


--------------enig297243E7C850959C9C61C1CD
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.15 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org/

iEYEARECAAYFAk1ToEkACgkQitSsb3rl5xSVRgCgmraoybXHM34Oej0tjxv1XORF
F7UAnj2XAxChJ55jTdpVW2NqZoM2wBlb
=hWHD
-----END PGP SIGNATURE-----

--------------enig297243E7C850959C9C61C1CD--