Re: [PATCH uq/master -v2 2/2] KVM, MCE, unpoison memory address across reboot

From: Jan Kiszka <jan.kiszka@web.de>
To: Huang Ying <ying.huang@intel.com>
Cc: Avi Kivity <avi@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Anthony Liguori <aliguori@linux.vnet.ibm.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Dean Nelson <dnelson@redhat.com>,
	Andi Kleen <andi@firstfloor.org>
Subject: Re: [PATCH uq/master -v2 2/2] KVM, MCE, unpoison memory address across reboot
Date: Thu, 10 Feb 2011 09:22:10 +0100
Message-ID: <4D53A032.9000702@web.de>
In-Reply-To: <1297297678.17407.3.camel@yhuang-dev>

On 2011-02-10 01:27, Huang Ying wrote:
> On Wed, 2011-02-09 at 16:00 +0800, Jan Kiszka wrote:
>> On 2011-02-09 04:00, Huang Ying wrote:
>>> In the Linux kernel's HWPoison implementation, the virtual address
>>> in each process that maps the faulty physical memory page is marked
>>> as HWPoison, so that any further access to that virtual address
>>> kills the corresponding process with SIGBUS.
>>>
>>> If the faulty physical memory page is used by a KVM guest, the SIGBUS
>>> is sent to QEMU, and QEMU simulates an MCE to report the memory
>>> error to the guest OS.  If the guest OS cannot recover from the
>>> error (for example, the page is accessed by kernel code), the guest
>>> OS reboots the system.  But because the underlying host virtual
>>> address backing the guest physical memory is still poisoned, if the
>>> guest accesses the corresponding guest physical memory after
>>> rebooting, the SIGBUS is still sent to QEMU and an MCE is still
>>> simulated.  That is, the guest cannot recover by rebooting.
>>
>> Yeah, saw this already during my test...
>>
>>>
>>> In fact, the contents of the guest physical memory page need not
>>> be kept across a reboot.  We can allocate a new host physical page
>>> to back the corresponding guest physical address.
>>
>> I was just wondering what would be architecturally suboptimal if we
>> simply remapped on SIGBUS directly. It would at least save us the
>> bookkeeping.
> 
> Because we cannot silently change the contents of memory while the
> guest OS is running; this may corrupt guest OS data structures and even
> ruin disk contents.  But during a reboot, all guest OS state is discarded.

I was not talking about remapping more than just the pages that became
inaccessible, just like you do now. But I guess the problem is rather
that insane guests continuing to access those pages before reboot should
also still receive MCEs.

Jan
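
A minimal sketch of the bookkeeping approach discussed in this thread, assuming the host virtual address of each poisoned page is recorded when the SIGBUS is handled and that all recorded pages are given fresh backing memory at guest reset. This is not the code from the patch: the names poison_record() and poison_remap_all() and the list layout are hypothetical, and plain anonymous mmap() stands in for whatever RAM backing QEMU actually uses.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Hypothetical bookkeeping entry: one host virtual range reported as poisoned. */
struct poisoned_page {
    void *host_addr;             /* page-aligned host virtual address */
    size_t len;                  /* length of the range in bytes */
    struct poisoned_page *next;
};

static struct poisoned_page *poisoned_list;

/*
 * Remember a poisoned range. Intended to run in normal context after the
 * SIGBUS has been handed over, not inside the signal handler itself,
 * because malloc() is not async-signal-safe.
 * Returns 0 on success (or if already recorded), -1 on allocation failure.
 */
static int poison_record(void *host_addr, size_t len)
{
    struct poisoned_page *p;

    for (p = poisoned_list; p != NULL; p = p->next) {
        if (p->host_addr == host_addr) {
            return 0;            /* already tracked */
        }
    }
    p = malloc(sizeof(*p));
    if (p == NULL) {
        return -1;
    }
    p->host_addr = host_addr;
    p->len = len;
    p->next = poisoned_list;
    poisoned_list = p;
    return 0;
}

/*
 * At guest reset: map fresh zero-filled anonymous memory over every recorded
 * range (MAP_FIXED replaces the old mapping at the same address) and drop
 * the bookkeeping. Returns the number of ranges that could not be remapped.
 */
static int poison_remap_all(void)
{
    int failures = 0;

    while (poisoned_list != NULL) {
        struct poisoned_page *p = poisoned_list;
        void *r = mmap(p->host_addr, p->len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
        if (r == MAP_FAILED) {
            failures++;
        }
        poisoned_list = p->next;
        free(p);
    }
    return failures;
}
```

In this sketch the SIGBUS path would only call poison_record() and still inject the MCE, so a guest that keeps touching the page before reboot keeps receiving MCEs; poison_remap_all() would run only from the reset path, when guest state is discarded anyway.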

