* Re: [PATCH 4/5] Xen/MCE: Abort live migration when vMCE occur
2012-10-22 11:32 ` George Dunlap
@ 2012-10-24 14:30 ` Liu, Jinsong
2012-10-29 15:21 ` [Patch 4/5] X86/vMCE: handle broken page occurred before migration Liu, Jinsong
2012-10-29 15:22 ` [PATCH 5/5] Xen/MCE: handle broken page occurs during migration Liu, Jinsong
2 siblings, 0 replies; 23+ messages in thread
From: Liu, Jinsong @ 2012-10-24 14:30 UTC (permalink / raw)
To: George Dunlap
Cc: Christoph Egger, xen-devel@lists.xensource.com, Keir (Xen.org),
Ian Campbell, Ian Jackson, Jan Beulich
George Dunlap wrote:
> On 19/10/12 21:32, Liu, Jinsong wrote:
>>> Wouldn't your patch 5 be sufficient to deal with this case? It
>>> seems like the broken page would get marked as such, and then get
>>> marked broken on the receiving side, wouldn't it?
>>>
>>> -George
>> It seems not. Patch 4 handles the case where the MCE occurs during
>> migration: in that case the broken page would be mapped (at that time
>> the page is still a good page) and copied to the target. Patch 5
>> handles the case where the MCE occurs before migration: in that case
>> the broken page would not be mapped and so would not be copied to the
>> target.
>
> In the "during migration", there are actually two cases to consider:
> 1. The page breaks before the domain save code maps it.
> 2. The page breaks after the domain save code has mapped it once
>
> Patch 5 will detect a broken page when it tries to map it, and send it
> as type "broken", without data.
>
> So in the case of #1, it will be taken care of by patch 5 without any
> changes.
Yes, exactly.
>
> In the case of #2, it seems like we could probably modify patch 5 to
> handle it. If we mark a page dirty, then the domain save code will
> try to send it again. When it tries to map it, it will discover that
> the page has been marked "broken", and will send it as a "broken"
> page, without data. As long as the domain restore code marks the
> already-received page as "broken" when it receives this message, then
> everything should work as normal.
>
> What do you think?
>
> -George
Yep, sounds perfect! I will update and test later.
Thanks,
Jinsong
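
The flow George describes for case #2 can be sketched as a toy model. This is only an illustration under stated assumptions, not the libxc save/restore code: the structs, the REC_* record types and every function name below are hypothetical stand-ins. It shows a page that breaks after it was already sent being marked dirty, resent as a data-less "broken" record, and then marked broken on the receiving side even though its data had arrived earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PAGES 4

/* Toy stand-ins for the pfn types on the wire (hypothetical). */
enum rec_type { REC_NORMAL, REC_BROKEN };

struct record {
    size_t pfn;
    enum rec_type type;   /* REC_BROKEN records carry no page data */
};

struct sender {
    bool dirty[NR_PAGES];
    bool broken[NR_PAGES];
};

struct receiver {
    bool has_data[NR_PAGES];
    bool broken[NR_PAGES];
};

/* A vMCE hits a page: mark it broken and dirty so the next round resends it. */
static void sender_page_breaks(struct sender *s, size_t pfn)
{
    assert(pfn < NR_PAGES);
    s->broken[pfn] = true;
    s->dirty[pfn] = true;
}

/* One copy round: emit one record per dirty page and clear its dirty bit.
 * Returns the number of records written to out[]. */
static size_t sender_round(struct sender *s, struct record *out)
{
    size_t n = 0;

    for (size_t pfn = 0; pfn < NR_PAGES; pfn++) {
        if (!s->dirty[pfn])
            continue;
        out[n].pfn = pfn;
        out[n].type = s->broken[pfn] ? REC_BROKEN : REC_NORMAL;
        n++;
        s->dirty[pfn] = false;
    }
    return n;
}

/* Restore side: a "broken" record overrides any copy received earlier. */
static void receiver_apply(struct receiver *r, const struct record *rec)
{
    assert(rec->pfn < NR_PAGES);
    if (rec->type == REC_BROKEN) {
        r->broken[rec->pfn] = true;
        r->has_data[rec->pfn] = false;
    } else {
        r->has_data[rec->pfn] = true;
    }
}
```

In this model the receiver needs no special ordering logic: whichever record for a pfn arrives last decides the final state, which is why resending through the dirty bitmap is enough.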
^ permalink raw reply [flat|nested] 23+ messages in thread
* [Patch 4/5] X86/vMCE: handle broken page occurred before migration
2012-10-22 11:32 ` George Dunlap
2012-10-24 14:30 ` Liu, Jinsong
@ 2012-10-29 15:21 ` Liu, Jinsong
2012-10-29 16:35 ` Jan Beulich
` (3 more replies)
2012-10-29 15:22 ` [PATCH 5/5] Xen/MCE: handle broken page occurs during migration Liu, Jinsong
2 siblings, 4 replies; 23+ messages in thread
From: Liu, Jinsong @ 2012-10-29 15:21 UTC (permalink / raw)
To: George Dunlap, Ian Jackson
Cc: Christoph Egger, xen-devel@lists.xensource.com, Keir (Xen.org),
Ian Campbell, Jan Beulich
[-- Attachment #1: Type: text/plain, Size: 8706 bytes --]
X86/vMCE: handle broken page occurred before migration
This patch handles a guest broken page that occurs before migration.
At the sender, the broken page would be mapped but not copied to the target
(otherwise it may trigger a more serious error, say, an SRAR error).
Its pfn_type and pfn number would still be transferred to the target
so that the target can take appropriate action.
At the target, it would set the p2m type to p2m_ram_broken for the broken
page, so that if the guest accesses the broken page again, it would kill
itself as expected.
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
diff -r e27a6d53ac15 tools/libxc/xc_domain.c
--- a/tools/libxc/xc_domain.c Thu Oct 11 01:52:33 2012 +0800
+++ b/tools/libxc/xc_domain.c Thu Oct 25 05:49:10 2012 +0800
@@ -283,6 +283,22 @@
return ret;
}
+/* set broken page p2m */
+int xc_set_broken_page_p2m(xc_interface *xch,
+ uint32_t domid,
+ unsigned long pfn)
+{
+ int ret;
+ DECLARE_DOMCTL;
+
+ domctl.cmd = XEN_DOMCTL_set_broken_page_p2m;
+ domctl.domain = (domid_t)domid;
+ domctl.u.set_broken_page_p2m.pfn = pfn;
+ ret = do_domctl(xch, &domctl);
+
+ return ret ? -1 : 0;
+}
+
/* get info from hvm guest for save */
int xc_domain_hvm_getcontext(xc_interface *xch,
uint32_t domid,
diff -r e27a6d53ac15 tools/libxc/xc_domain_restore.c
--- a/tools/libxc/xc_domain_restore.c Thu Oct 11 01:52:33 2012 +0800
+++ b/tools/libxc/xc_domain_restore.c Thu Oct 25 05:49:10 2012 +0800
@@ -962,9 +962,15 @@
countpages = count;
for (i = oldcount; i < buf->nr_pages; ++i)
- if ((buf->pfn_types[i] & XEN_DOMCTL_PFINFO_LTAB_MASK) == XEN_DOMCTL_PFINFO_XTAB
- ||(buf->pfn_types[i] & XEN_DOMCTL_PFINFO_LTAB_MASK) == XEN_DOMCTL_PFINFO_XALLOC)
+ {
+ unsigned long pagetype;
+
+ pagetype = buf->pfn_types[i] & XEN_DOMCTL_PFINFO_LTAB_MASK;
+ if ( pagetype == XEN_DOMCTL_PFINFO_XTAB ||
+ pagetype == XEN_DOMCTL_PFINFO_BROKEN ||
+ pagetype == XEN_DOMCTL_PFINFO_XALLOC )
--countpages;
+ }
if (!countpages)
return count;
@@ -1200,6 +1206,17 @@
/* a bogus/unmapped/allocate-only page: skip it */
continue;
+ if ( pagetype == XEN_DOMCTL_PFINFO_BROKEN )
+ {
+ if ( xc_set_broken_page_p2m(xch, dom, pfn) )
+ {
+ ERROR("Set p2m for broken page failed, "
+ "dom=%d, pfn=%lx\n", dom, pfn);
+ goto err_mapped;
+ }
+ continue;
+ }
+
if (pfn_err[i])
{
ERROR("unexpected PFN mapping failure pfn %lx map_mfn %lx p2m_mfn %lx",
diff -r e27a6d53ac15 tools/libxc/xc_domain_save.c
--- a/tools/libxc/xc_domain_save.c Thu Oct 11 01:52:33 2012 +0800
+++ b/tools/libxc/xc_domain_save.c Thu Oct 25 05:49:10 2012 +0800
@@ -1277,6 +1277,13 @@
if ( !hvm )
gmfn = pfn_to_mfn(gmfn);
+ if ( pfn_type[j] == XEN_DOMCTL_PFINFO_BROKEN )
+ {
+ pfn_type[j] |= pfn_batch[j];
+ ++run;
+ continue;
+ }
+
if ( pfn_err[j] )
{
if ( pfn_type[j] == XEN_DOMCTL_PFINFO_XTAB )
@@ -1371,8 +1378,12 @@
}
}
- /* skip pages that aren't present or are alloc-only */
+ /*
+ * skip pages that aren't present,
+ * or are broken, or are alloc-only
+ */
if ( pagetype == XEN_DOMCTL_PFINFO_XTAB
+ || pagetype == XEN_DOMCTL_PFINFO_BROKEN
|| pagetype == XEN_DOMCTL_PFINFO_XALLOC )
continue;
diff -r e27a6d53ac15 tools/libxc/xenctrl.h
--- a/tools/libxc/xenctrl.h Thu Oct 11 01:52:33 2012 +0800
+++ b/tools/libxc/xenctrl.h Thu Oct 25 05:49:10 2012 +0800
@@ -575,6 +575,17 @@
xc_domaininfo_t *info);
/**
+ * This function sets the p2m type for a broken page
+ * @parm xch a handle to an open hypervisor interface
+ * @parm domid the domain id which broken page belong to
+ * @parm pfn the pfn number of the broken page
+ * @return 0 on success, -1 on failure
+ */
+int xc_set_broken_page_p2m(xc_interface *xch,
+ uint32_t domid,
+ unsigned long pfn);
+
+/**
* This function returns information about the context of a hvm domain
* @parm xch a handle to an open hypervisor interface
* @parm domid the domain to get information from
diff -r e27a6d53ac15 xen/arch/x86/domctl.c
--- a/xen/arch/x86/domctl.c Thu Oct 11 01:52:33 2012 +0800
+++ b/xen/arch/x86/domctl.c Thu Oct 25 05:49:10 2012 +0800
@@ -209,12 +209,18 @@
for ( j = 0; j < k; j++ )
{
unsigned long type = 0;
+ p2m_type_t t;
- page = get_page_from_gfn(d, arr[j], NULL, P2M_ALLOC);
+ page = get_page_from_gfn(d, arr[j], &t, P2M_ALLOC);
if ( unlikely(!page) ||
unlikely(is_xen_heap_page(page)) )
- type = XEN_DOMCTL_PFINFO_XTAB;
+ {
+ if ( p2m_is_broken(t) )
+ type = XEN_DOMCTL_PFINFO_BROKEN;
+ else
+ type = XEN_DOMCTL_PFINFO_XTAB;
+ }
else
{
switch( page->u.inuse.type_info & PGT_type_mask )
@@ -235,6 +241,9 @@
if ( page->u.inuse.type_info & PGT_pinned )
type |= XEN_DOMCTL_PFINFO_LPINTAB;
+
+ if ( page->count_info & PGC_broken )
+ type = XEN_DOMCTL_PFINFO_BROKEN;
}
if ( page )
@@ -1568,6 +1577,28 @@
}
break;
+ case XEN_DOMCTL_set_broken_page_p2m:
+ {
+ struct domain *d;
+ p2m_type_t pt;
+ unsigned long pfn;
+
+ d = rcu_lock_domain_by_id(domctl->domain);
+ if ( d != NULL )
+ {
+ pfn = domctl->u.set_broken_page_p2m.pfn;
+
+ get_gfn_query(d, pfn, &pt);
+ p2m_change_type(d, pfn, pt, p2m_ram_broken);
+ put_gfn(d, pfn);
+
+ rcu_unlock_domain(d);
+ }
+ else
+ ret = -ESRCH;
+ }
+ break;
+
default:
ret = iommu_do_domctl(domctl, u_domctl);
break;
diff -r e27a6d53ac15 xen/include/public/domctl.h
--- a/xen/include/public/domctl.h Thu Oct 11 01:52:33 2012 +0800
+++ b/xen/include/public/domctl.h Thu Oct 25 05:49:10 2012 +0800
@@ -136,6 +136,7 @@
#define XEN_DOMCTL_PFINFO_LPINTAB (0x1U<<31)
#define XEN_DOMCTL_PFINFO_XTAB (0xfU<<28) /* invalid page */
#define XEN_DOMCTL_PFINFO_XALLOC (0xeU<<28) /* allocate-only page */
+#define XEN_DOMCTL_PFINFO_BROKEN (0xdU<<28) /* broken page */
#define XEN_DOMCTL_PFINFO_PAGEDTAB (0x8U<<28)
#define XEN_DOMCTL_PFINFO_LTAB_MASK (0xfU<<28)
@@ -835,6 +836,12 @@
typedef struct xen_domctl_set_access_required xen_domctl_set_access_required_t;
DEFINE_XEN_GUEST_HANDLE(xen_domctl_set_access_required_t);
+struct xen_domctl_set_broken_page_p2m {
+ uint64_aligned_t pfn;
+};
+typedef struct xen_domctl_set_broken_page_p2m xen_domctl_set_broken_page_p2m_t;
+DEFINE_XEN_GUEST_HANDLE(xen_domctl_set_broken_page_p2m_t);
+
struct xen_domctl {
uint32_t cmd;
#define XEN_DOMCTL_createdomain 1
@@ -900,6 +907,7 @@
#define XEN_DOMCTL_set_access_required 64
#define XEN_DOMCTL_audit_p2m 65
#define XEN_DOMCTL_set_virq_handler 66
+#define XEN_DOMCTL_set_broken_page_p2m 67
#define XEN_DOMCTL_gdbsx_guestmemio 1000
#define XEN_DOMCTL_gdbsx_pausevcpu 1001
#define XEN_DOMCTL_gdbsx_unpausevcpu 1002
@@ -955,6 +963,7 @@
struct xen_domctl_audit_p2m audit_p2m;
struct xen_domctl_set_virq_handler set_virq_handler;
struct xen_domctl_gdbsx_memio gdbsx_guest_memio;
+ struct xen_domctl_set_broken_page_p2m set_broken_page_p2m;
struct xen_domctl_gdbsx_pauseunp_vcpu gdbsx_pauseunp_vcpu;
struct xen_domctl_gdbsx_domstatus gdbsx_domstatus;
uint8_t pad[128];
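
As a reading aid for the hunks above, here is a minimal sketch of how the new type value sits in the top nibble of a pfn_type word and how the "no page data follows" test works. The four constants are copied from the domctl.h hunk; the helper functions, their names, and the use of uint32_t (with the pfn assumed to fit in the low 28 bits) are assumptions made for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the xen/include/public/domctl.h hunk above. */
#define XEN_DOMCTL_PFINFO_XTAB      (0xfU << 28) /* invalid page */
#define XEN_DOMCTL_PFINFO_XALLOC    (0xeU << 28) /* allocate-only page */
#define XEN_DOMCTL_PFINFO_BROKEN    (0xdU << 28) /* broken page */
#define XEN_DOMCTL_PFINFO_LTAB_MASK (0xfU << 28)

/* Hypothetical helper: combine a type with a pfn, in the spirit of
 * "pfn_type[j] |= pfn_batch[j]" on the save side.
 * Assumes the pfn fits in the low 28 bits. */
static uint32_t pack_pfn_type(uint32_t type, uint32_t pfn)
{
    assert((pfn & XEN_DOMCTL_PFINFO_LTAB_MASK) == 0);
    return type | pfn;
}

/* Hypothetical helper: recover the pfn by masking off the type nibble. */
static uint32_t unpack_pfn(uint32_t pfn_type)
{
    return pfn_type & ~XEN_DOMCTL_PFINFO_LTAB_MASK;
}

/* Hypothetical helper: true when the record carries no page contents,
 * mirroring the three-way test added to the save and restore loops. */
static bool page_has_no_data(uint32_t pfn_type)
{
    uint32_t pagetype = pfn_type & XEN_DOMCTL_PFINFO_LTAB_MASK;

    return pagetype == XEN_DOMCTL_PFINFO_XTAB ||
           pagetype == XEN_DOMCTL_PFINFO_BROKEN ||
           pagetype == XEN_DOMCTL_PFINFO_XALLOC;
}
```

For example, packing XEN_DOMCTL_PFINFO_BROKEN with pfn 0x1234 gives 0xd0001234, which the receiver can split back into the type and the pfn without any page data on the wire.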
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Patch 4/5] X86/vMCE: handle broken page occurred before migration
2012-10-29 15:21 ` [Patch 4/5] X86/vMCE: handle broken page occurred before migration Liu, Jinsong
@ 2012-10-29 16:35 ` Jan Beulich
2012-10-29 17:19 ` Liu, Jinsong
2012-10-30 9:02 ` Jan Beulich
` (2 subsequent siblings)
3 siblings, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2012-10-29 16:35 UTC (permalink / raw)
To: Jinsong Liu
Cc: Christoph Egger, Keir (Xen.org), Ian Campbell, George Dunlap,
Ian Jackson, xen-devel
>>> On 29.10.12 at 16:21, "Liu, Jinsong" <jinsong.liu@intel.com> wrote:
> X86/vMCE: handle broken page occurred before migration
>
> This patch handles guest broken page which occur before migration.
>
> At sender, the broken page would be mapped but not copied to target
> (otherwise it may trigger more serious error, say, SRAR error).
> While its pfn_type and pfn number would be transferred to target
> so that target take appropriate action.
>
> At target, it would set p2m as p2m_ram_broken for broken page, so that
> if guest access the broken page again, it would kill itself as expected.
>
> Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
So I continue to be confused - wasn't the agreement you
reached with George that patch 5 re-done makes patch 4
unnecessary?
Jan
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Patch 4/5] X86/vMCE: handle broken page occurred before migration
2012-10-29 16:35 ` Jan Beulich
@ 2012-10-29 17:19 ` Liu, Jinsong
0 siblings, 0 replies; 23+ messages in thread
From: Liu, Jinsong @ 2012-10-29 17:19 UTC (permalink / raw)
To: Jan Beulich
Cc: Christoph Egger, Keir (Xen.org), Ian Campbell, George Dunlap,
Ian Jackson, xen-devel
Jan Beulich wrote:
>>>> On 29.10.12 at 16:21, "Liu, Jinsong" <jinsong.liu@intel.com> wrote:
>> X86/vMCE: handle broken page occurred before migration
>>
>> This patch handles guest broken page which occur before migration.
>>
>> At sender, the broken page would be mapped but not copied to target
>> (otherwise it may trigger more serious error, say, SRAR error).
>> While its pfn_type and pfn number would be transferred to target
>> so that target take appropriate action.
>>
>> At target, it would set p2m as p2m_ram_broken for broken page, so
>> that if guest access the broken page again, it would kill itself as
>> expected.
>>
>> Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
>
> So I continue to be confused - wasn't the agreement you
> reached with George that patch 5 re-done makes patch 4
> unnecessary?
>
No, the agreement is:
the old patch 5 doesn't need to be redone; it's OK for handling 'vmce occurs before migration';
the old patch 4 needs a small update: it handles 'vmce occurs during migration', but must be changed to 'not abort migration'.
===============
BTW, this time I swapped the order of patches 4 and 5, since the new approach for 'vmce occurs during migration' relies on some logic of 'vmce occurs before migration'.
So the latest patches 4 and 5 are:
patch 4 (same as old patch 5, no update): handle 'vmce occurred before migration';
patch 5 (updated old patch 4, according to George's suggestion): handle 'vmce occurs during migration'. It is a small update to the old patch 4: it doesn't abort migration; instead it marks the broken page in the dirty bitmap, so that at the copypages stage of migration the pfn_type and pfn number of the broken page are transferred to the target, which then takes appropriate action.
Thanks,
Jinsong
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Patch 4/5] X86/vMCE: handle broken page occurred before migration
2012-10-29 15:21 ` [Patch 4/5] X86/vMCE: handle broken page occurred before migration Liu, Jinsong
2012-10-29 16:35 ` Jan Beulich
@ 2012-10-30 9:02 ` Jan Beulich
2012-10-31 10:55 ` Liu, Jinsong
2012-10-30 9:25 ` George Dunlap
2012-10-30 9:27 ` George Dunlap
3 siblings, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2012-10-30 9:02 UTC (permalink / raw)
To: Jinsong Liu
Cc: Christoph Egger, xen-devel@lists.xensource.com, Keir (Xen.org),
Ian Campbell, George Dunlap, Ian Jackson
>>> On 29.10.12 at 16:21, "Liu, Jinsong" <jinsong.liu@intel.com> wrote:
> @@ -1568,6 +1577,28 @@
> }
> break;
>
> + case XEN_DOMCTL_set_broken_page_p2m:
> + {
> + struct domain *d;
> + p2m_type_t pt;
> + unsigned long pfn;
> +
> + d = rcu_lock_domain_by_id(domctl->domain);
> + if ( d != NULL )
> + {
> + pfn = domctl->u.set_broken_page_p2m.pfn;
> +
> + get_gfn_query(d, pfn, &pt);
Is it correct to ignore the return value here, and to act on any
value returned in "pt"?
> + p2m_change_type(d, pfn, pt, p2m_ram_broken);
What if the operation failed (i.e. you get back a type not
matching "pt")? This can happen because __get_gfn_type_access(),
other than what p2m_change_type() does, is not just a plain call
to p2m->get_entry().
Jan
> + put_gfn(d, pfn);
> +
> + rcu_unlock_domain(d);
> + }
> + else
> + ret = -ESRCH;
> + }
> + break;
> +
> default:
> ret = iommu_do_domctl(domctl, u_domctl);
> break;
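
The two checks Jan asks about can be illustrated with a toy model. This is a sketch under stated assumptions, not hypervisor code and not the follow-up patch: the toy_* types and functions are hypothetical, and toy_change_type() merely mimics the behaviour Jan describes, where the caller gets back the previous type and must compare it with the type it expected.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NR_GFNS 4

/* Toy stand-ins for p2m types; not the hypervisor's definitions. */
enum toy_p2m_type { TOY_RAM_RW, TOY_MMIO, TOY_RAM_BROKEN };

struct toy_p2m {
    enum toy_p2m_type type[NR_GFNS];
};

/* Change the type only if the current type equals 'expected'.
 * Returns the type found, so a mismatch is visible to the caller. */
static enum toy_p2m_type toy_change_type(struct toy_p2m *p2m, size_t gfn,
                                         enum toy_p2m_type expected,
                                         enum toy_p2m_type new_type)
{
    enum toy_p2m_type prev = p2m->type[gfn];

    if (prev == expected)
        p2m->type[gfn] = new_type;
    return prev;
}

/* Mark a gfn broken, checking both the queried type and the change result. */
static int toy_set_broken(struct toy_p2m *p2m, size_t gfn)
{
    enum toy_p2m_type pt;

    if (gfn >= NR_GFNS)
        return -EINVAL;

    pt = p2m->type[gfn];            /* stands in for the type query */
    if (pt == TOY_RAM_BROKEN)
        return 0;                   /* already broken: nothing to do */
    if (pt != TOY_RAM_RW)
        return -EINVAL;             /* do not act on arbitrary types */

    /* In a real hypervisor the type could change between the query and
     * the update; this single-threaded model never takes this branch. */
    if (toy_change_type(p2m, gfn, pt, TOY_RAM_BROKEN) != pt)
        return -EAGAIN;

    return 0;
}
```

Treating an already-broken page as success is a design choice of this sketch; it keeps the operation idempotent if the same broken-page record is applied twice.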
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Patch 4/5] X86/vMCE: handle broken page occurred before migration
2012-10-30 9:02 ` Jan Beulich
@ 2012-10-31 10:55 ` Liu, Jinsong
0 siblings, 0 replies; 23+ messages in thread
From: Liu, Jinsong @ 2012-10-31 10:55 UTC (permalink / raw)
To: Jan Beulich
Cc: Christoph Egger, xen-devel@lists.xensource.com, Keir (Xen.org),
Ian Campbell, George Dunlap, Ian Jackson
Jan Beulich wrote:
>>>> On 29.10.12 at 16:21, "Liu, Jinsong" <jinsong.liu@intel.com> wrote:
>> @@ -1568,6 +1577,28 @@
>> }
>> break;
>>
>> + case XEN_DOMCTL_set_broken_page_p2m:
>> + {
>> + struct domain *d;
>> + p2m_type_t pt;
>> + unsigned long pfn;
>> +
>> + d = rcu_lock_domain_by_id(domctl->domain);
>> + if ( d != NULL )
>> + {
>> + pfn = domctl->u.set_broken_page_p2m.pfn;
>> +
>> + get_gfn_query(d, pfn, &pt);
>
> Is it correct to ignore the return value here, and to act on any
> value returned in "pt"?
>
>> + p2m_change_type(d, pfn, pt, p2m_ram_broken);
>
> What if the operation failed (i.e. you get back a type not
> matching "pt")? This can happen because __get_gfn_type_access(),
> other than what p2m_change_type() does, is not just a plain call
> to p2m->get_entry().
>
Updated accordingly: I added a sanity check and will send it out later.
Thanks,
Jinsong
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Patch 4/5] X86/vMCE: handle broken page occurred before migration
2012-10-29 15:21 ` [Patch 4/5] X86/vMCE: handle broken page occurred before migration Liu, Jinsong
2012-10-29 16:35 ` Jan Beulich
2012-10-30 9:02 ` Jan Beulich
@ 2012-10-30 9:25 ` George Dunlap
2012-10-30 9:27 ` George Dunlap
3 siblings, 0 replies; 23+ messages in thread
From: George Dunlap @ 2012-10-30 9:25 UTC (permalink / raw)
To: Liu, Jinsong
Cc: Christoph Egger, xen-devel@lists.xensource.com, Keir (Xen.org),
Ian Campbell, Ian Jackson, Jan Beulich
Jinsong,
I'm at UDS now, but I'll try to review the new patches in the next few days.
If you end up sending these patches again, could you please send them in
a more normal "patchbomb-style" format? I.e., with a "00/02" header
describing what's new in the series, and then naming them 01/02 and
02/02 (instead of 4 and 5, when 1-3 have been applied for months)?
The easiest way to do this is to use hg's patchbomb extension; there's a
description of how to set it up here:
http://wiki.xen.org/wiki/SubmittingXenPatches
It's a few minutes to set up, but it's well worth it both for us and for
you.
Thanks,
-George
On 29/10/12 16:21, Liu, Jinsong wrote:
> X86/vMCE: handle broken page occurred before migration
>
> This patch handles guest broken page which occur before migration.
>
> At sender, the broken page would be mapped but not copied to target
> (otherwise it may trigger more serious error, say, SRAR error).
> While its pfn_type and pfn number would be transferred to target
> so that target take appropriate action.
>
> At target, it would set p2m as p2m_ram_broken for broken page, so that
> if guest access the broken page again, it would kill itself as expected.
>
> Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
>
> diff -r e27a6d53ac15 tools/libxc/xc_domain.c
> --- a/tools/libxc/xc_domain.c Thu Oct 11 01:52:33 2012 +0800
> +++ b/tools/libxc/xc_domain.c Thu Oct 25 05:49:10 2012 +0800
> @@ -283,6 +283,22 @@
> return ret;
> }
>
> +/* set broken page p2m */
> +int xc_set_broken_page_p2m(xc_interface *xch,
> + uint32_t domid,
> + unsigned long pfn)
> +{
> + int ret;
> + DECLARE_DOMCTL;
> +
> + domctl.cmd = XEN_DOMCTL_set_broken_page_p2m;
> + domctl.domain = (domid_t)domid;
> + domctl.u.set_broken_page_p2m.pfn = pfn;
> + ret = do_domctl(xch, &domctl);
> +
> + return ret ? -1 : 0;
> +}
> +
> /* get info from hvm guest for save */
> int xc_domain_hvm_getcontext(xc_interface *xch,
> uint32_t domid,
> diff -r e27a6d53ac15 tools/libxc/xc_domain_restore.c
> --- a/tools/libxc/xc_domain_restore.c Thu Oct 11 01:52:33 2012 +0800
> +++ b/tools/libxc/xc_domain_restore.c Thu Oct 25 05:49:10 2012 +0800
> @@ -962,9 +962,15 @@
>
> countpages = count;
> for (i = oldcount; i < buf->nr_pages; ++i)
> - if ((buf->pfn_types[i] & XEN_DOMCTL_PFINFO_LTAB_MASK) == XEN_DOMCTL_PFINFO_XTAB
> - ||(buf->pfn_types[i] & XEN_DOMCTL_PFINFO_LTAB_MASK) == XEN_DOMCTL_PFINFO_XALLOC)
> + {
> + unsigned long pagetype;
> +
> + pagetype = buf->pfn_types[i] & XEN_DOMCTL_PFINFO_LTAB_MASK;
> + if ( pagetype == XEN_DOMCTL_PFINFO_XTAB ||
> + pagetype == XEN_DOMCTL_PFINFO_BROKEN ||
> + pagetype == XEN_DOMCTL_PFINFO_XALLOC )
> --countpages;
> + }
>
> if (!countpages)
> return count;
> @@ -1200,6 +1206,17 @@
> /* a bogus/unmapped/allocate-only page: skip it */
> continue;
>
> + if ( pagetype == XEN_DOMCTL_PFINFO_BROKEN )
> + {
> + if ( xc_set_broken_page_p2m(xch, dom, pfn) )
> + {
> + ERROR("Set p2m for broken page failed, "
> + "dom=%d, pfn=%lx\n", dom, pfn);
> + goto err_mapped;
> + }
> + continue;
> + }
> +
> if (pfn_err[i])
> {
> ERROR("unexpected PFN mapping failure pfn %lx map_mfn %lx p2m_mfn %lx",
> diff -r e27a6d53ac15 tools/libxc/xc_domain_save.c
> --- a/tools/libxc/xc_domain_save.c Thu Oct 11 01:52:33 2012 +0800
> +++ b/tools/libxc/xc_domain_save.c Thu Oct 25 05:49:10 2012 +0800
> @@ -1277,6 +1277,13 @@
> if ( !hvm )
> gmfn = pfn_to_mfn(gmfn);
>
> + if ( pfn_type[j] == XEN_DOMCTL_PFINFO_BROKEN )
> + {
> + pfn_type[j] |= pfn_batch[j];
> + ++run;
> + continue;
> + }
> +
> if ( pfn_err[j] )
> {
> if ( pfn_type[j] == XEN_DOMCTL_PFINFO_XTAB )
> @@ -1371,8 +1378,12 @@
> }
> }
>
> - /* skip pages that aren't present or are alloc-only */
> + /*
> + * skip pages that aren't present,
> + * or are broken, or are alloc-only
> + */
> if ( pagetype == XEN_DOMCTL_PFINFO_XTAB
> + || pagetype == XEN_DOMCTL_PFINFO_BROKEN
> || pagetype == XEN_DOMCTL_PFINFO_XALLOC )
> continue;
>
> diff -r e27a6d53ac15 tools/libxc/xenctrl.h
> --- a/tools/libxc/xenctrl.h Thu Oct 11 01:52:33 2012 +0800
> +++ b/tools/libxc/xenctrl.h Thu Oct 25 05:49:10 2012 +0800
> @@ -575,6 +575,17 @@
> xc_domaininfo_t *info);
>
> /**
> + * This function set p2m for broken page
> + * &parm xch a handle to an open hypervisor interface
> + * @parm domid the domain id which broken page belong to
> + * @parm pfn the pfn number of the broken page
> + * @return 0 on success, -1 on failure
> + */
> +int xc_set_broken_page_p2m(xc_interface *xch,
> + uint32_t domid,
> + unsigned long pfn);
> +
> +/**
> * This function returns information about the context of a hvm domain
> * @parm xch a handle to an open hypervisor interface
> * @parm domid the domain to get information from
> diff -r e27a6d53ac15 xen/arch/x86/domctl.c
> --- a/xen/arch/x86/domctl.c Thu Oct 11 01:52:33 2012 +0800
> +++ b/xen/arch/x86/domctl.c Thu Oct 25 05:49:10 2012 +0800
> @@ -209,12 +209,18 @@
> for ( j = 0; j < k; j++ )
> {
> unsigned long type = 0;
> + p2m_type_t t;
>
> - page = get_page_from_gfn(d, arr[j], NULL, P2M_ALLOC);
> + page = get_page_from_gfn(d, arr[j], &t, P2M_ALLOC);
>
> if ( unlikely(!page) ||
> unlikely(is_xen_heap_page(page)) )
> - type = XEN_DOMCTL_PFINFO_XTAB;
> + {
> + if ( p2m_is_broken(t) )
> + type = XEN_DOMCTL_PFINFO_BROKEN;
> + else
> + type = XEN_DOMCTL_PFINFO_XTAB;
> + }
> else
> {
> switch( page->u.inuse.type_info & PGT_type_mask )
> @@ -235,6 +241,9 @@
>
> if ( page->u.inuse.type_info & PGT_pinned )
> type |= XEN_DOMCTL_PFINFO_LPINTAB;
> +
> + if ( page->count_info & PGC_broken )
> + type = XEN_DOMCTL_PFINFO_BROKEN;
> }
>
> if ( page )
> @@ -1568,6 +1577,28 @@
> }
> break;
>
> + case XEN_DOMCTL_set_broken_page_p2m:
> + {
> + struct domain *d;
> + p2m_type_t pt;
> + unsigned long pfn;
> +
> + d = rcu_lock_domain_by_id(domctl->domain);
> + if ( d != NULL )
> + {
> + pfn = domctl->u.set_broken_page_p2m.pfn;
> +
> + get_gfn_query(d, pfn, &pt);
> + p2m_change_type(d, pfn, pt, p2m_ram_broken);
> + put_gfn(d, pfn);
> +
> + rcu_unlock_domain(d);
> + }
> + else
> + ret = -ESRCH;
> + }
> + break;
> +
> default:
> ret = iommu_do_domctl(domctl, u_domctl);
> break;
> diff -r e27a6d53ac15 xen/include/public/domctl.h
> --- a/xen/include/public/domctl.h Thu Oct 11 01:52:33 2012 +0800
> +++ b/xen/include/public/domctl.h Thu Oct 25 05:49:10 2012 +0800
> @@ -136,6 +136,7 @@
> #define XEN_DOMCTL_PFINFO_LPINTAB (0x1U<<31)
> #define XEN_DOMCTL_PFINFO_XTAB (0xfU<<28) /* invalid page */
> #define XEN_DOMCTL_PFINFO_XALLOC (0xeU<<28) /* allocate-only page */
> +#define XEN_DOMCTL_PFINFO_BROKEN (0xdU<<28) /* broken page */
> #define XEN_DOMCTL_PFINFO_PAGEDTAB (0x8U<<28)
> #define XEN_DOMCTL_PFINFO_LTAB_MASK (0xfU<<28)
>
> @@ -835,6 +836,12 @@
> typedef struct xen_domctl_set_access_required xen_domctl_set_access_required_t;
> DEFINE_XEN_GUEST_HANDLE(xen_domctl_set_access_required_t);
>
> +struct xen_domctl_set_broken_page_p2m {
> + uint64_aligned_t pfn;
> +};
> +typedef struct xen_domctl_set_broken_page_p2m xen_domctl_set_broken_page_p2m_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_domctl_set_broken_page_p2m_t);
> +
> struct xen_domctl {
> uint32_t cmd;
> #define XEN_DOMCTL_createdomain 1
> @@ -900,6 +907,7 @@
> #define XEN_DOMCTL_set_access_required 64
> #define XEN_DOMCTL_audit_p2m 65
> #define XEN_DOMCTL_set_virq_handler 66
> +#define XEN_DOMCTL_set_broken_page_p2m 67
> #define XEN_DOMCTL_gdbsx_guestmemio 1000
> #define XEN_DOMCTL_gdbsx_pausevcpu 1001
> #define XEN_DOMCTL_gdbsx_unpausevcpu 1002
> @@ -955,6 +963,7 @@
> struct xen_domctl_audit_p2m audit_p2m;
> struct xen_domctl_set_virq_handler set_virq_handler;
> struct xen_domctl_gdbsx_memio gdbsx_guest_memio;
> + struct xen_domctl_set_broken_page_p2m set_broken_page_p2m;
> struct xen_domctl_gdbsx_pauseunp_vcpu gdbsx_pauseunp_vcpu;
> struct xen_domctl_gdbsx_domstatus gdbsx_domstatus;
> uint8_t pad[128];
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Patch 4/5] X86/vMCE: handle broken page occurred before migration
2012-10-29 15:21 ` [Patch 4/5] X86/vMCE: handle broken page occurred before migration Liu, Jinsong
` (2 preceding siblings ...)
2012-10-30 9:25 ` George Dunlap
@ 2012-10-30 9:27 ` George Dunlap
2012-10-31 10:58 ` Liu, Jinsong
3 siblings, 1 reply; 23+ messages in thread
From: George Dunlap @ 2012-10-30 9:27 UTC (permalink / raw)
To: Liu, Jinsong
Cc: Christoph Egger, xen-devel@lists.xensource.com, Keir (Xen.org),
Ian Campbell, Ian Jackson, Jan Beulich
Jinsong,
I'm at UDS now, but I'll try to review the new patches in the next few days.
If you end up sending these patches again, could you please send them in
a more normal "patchbomb-style" format? I.e., with a "00/02" header
describing what's new in the series, and then naming them 01/02 and
02/02 (instead of 4 and 5, when 1-3 have been applied for months)?
The easiest way to do this is to use hg's patchbomb extension; there's a
description of how to set it up here:
http://wiki.xen.org/wiki/SubmittingXenPatches
It's a few minutes to set up, but it's well worth it both for us and for
you.
Thanks,
-George
On Mon, Oct 29, 2012 at 4:21 PM, Liu, Jinsong <jinsong.liu@intel.com> wrote:
> X86/vMCE: handle broken page occurred before migration
>
> This patch handles guest pages that were already broken before migration.
>
> On the sender, a broken page is mapped but its contents are not copied to
> the target (otherwise the copy may trigger a more serious error, e.g. an
> SRAR error). Only its pfn_type and pfn are transferred to the target,
> so that the target can take appropriate action.
>
> On the target, the p2m type of the broken page is set to p2m_ram_broken, so
> that if the guest accesses the broken page again it kills itself, as expected.
>
> Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
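To make the receive-side accounting described above concrete, here is a minimal sketch (not part of the patch). It decodes the type bits of a pfn_types[] entry and counts how many entries in a batch are followed by page data. The XEN_DOMCTL_PFINFO_* values are copied from the patch below. The helper names (pfinfo_type, pfinfo_has_data, count_data_pages) are hypothetical and are not libxc functions, and 32-bit entries are assumed for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Type encodings as defined in the patch (xen/include/public/domctl.h). */
#define XEN_DOMCTL_PFINFO_XTAB      (0xfU<<28) /* invalid page */
#define XEN_DOMCTL_PFINFO_XALLOC    (0xeU<<28) /* allocate-only page */
#define XEN_DOMCTL_PFINFO_BROKEN    (0xdU<<28) /* broken page */
#define XEN_DOMCTL_PFINFO_LTAB_MASK (0xfU<<28)

/* Hypothetical helper: extract the type bits from one entry. */
static uint32_t pfinfo_type(uint32_t entry)
{
    return entry & XEN_DOMCTL_PFINFO_LTAB_MASK;
}

/*
 * Hypothetical helper: nonzero if the entry is followed by page contents.
 * Invalid, allocate-only and broken pages are sent as a type and pfn only.
 */
static int pfinfo_has_data(uint32_t entry)
{
    uint32_t type = pfinfo_type(entry);

    return type != XEN_DOMCTL_PFINFO_XTAB &&
           type != XEN_DOMCTL_PFINFO_BROKEN &&
           type != XEN_DOMCTL_PFINFO_XALLOC;
}

/* Hypothetical helper: number of entries in a batch that carry data. */
static size_t count_data_pages(const uint32_t *entries, size_t n)
{
    size_t i, count = 0;

    assert(entries != NULL || n == 0);
    for ( i = 0; i < n; i++ )
        if ( pfinfo_has_data(entries[i]) )
            count++;

    return count;
}
```

For a batch holding two normal pages, one broken page and one invalid page, count_data_pages() returns 2. This mirrors the way the restore loop in the patch decrements countpages for XTAB, BROKEN and XALLOC entries.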
>
> diff -r e27a6d53ac15 tools/libxc/xc_domain.c
> --- a/tools/libxc/xc_domain.c Thu Oct 11 01:52:33 2012 +0800
> +++ b/tools/libxc/xc_domain.c Thu Oct 25 05:49:10 2012 +0800
> @@ -283,6 +283,22 @@
> return ret;
> }
>
> +/* set broken page p2m */
> +int xc_set_broken_page_p2m(xc_interface *xch,
> + uint32_t domid,
> + unsigned long pfn)
> +{
> + int ret;
> + DECLARE_DOMCTL;
> +
> + domctl.cmd = XEN_DOMCTL_set_broken_page_p2m;
> + domctl.domain = (domid_t)domid;
> + domctl.u.set_broken_page_p2m.pfn = pfn;
> + ret = do_domctl(xch, &domctl);
> +
> + return ret ? -1 : 0;
> +}
> +
> /* get info from hvm guest for save */
> int xc_domain_hvm_getcontext(xc_interface *xch,
> uint32_t domid,
> diff -r e27a6d53ac15 tools/libxc/xc_domain_restore.c
> --- a/tools/libxc/xc_domain_restore.c Thu Oct 11 01:52:33 2012 +0800
> +++ b/tools/libxc/xc_domain_restore.c Thu Oct 25 05:49:10 2012 +0800
> @@ -962,9 +962,15 @@
>
> countpages = count;
> for (i = oldcount; i < buf->nr_pages; ++i)
> - if ((buf->pfn_types[i] & XEN_DOMCTL_PFINFO_LTAB_MASK) == XEN_DOMCTL_PFINFO_XTAB
> - ||(buf->pfn_types[i] & XEN_DOMCTL_PFINFO_LTAB_MASK) == XEN_DOMCTL_PFINFO_XALLOC)
> + {
> + unsigned long pagetype;
> +
> + pagetype = buf->pfn_types[i] & XEN_DOMCTL_PFINFO_LTAB_MASK;
> + if ( pagetype == XEN_DOMCTL_PFINFO_XTAB ||
> + pagetype == XEN_DOMCTL_PFINFO_BROKEN ||
> + pagetype == XEN_DOMCTL_PFINFO_XALLOC )
> --countpages;
> + }
>
> if (!countpages)
> return count;
> @@ -1200,6 +1206,17 @@
> /* a bogus/unmapped/allocate-only page: skip it */
> continue;
>
> + if ( pagetype == XEN_DOMCTL_PFINFO_BROKEN )
> + {
> + if ( xc_set_broken_page_p2m(xch, dom, pfn) )
> + {
> + ERROR("Set p2m for broken page failed, "
> + "dom=%d, pfn=%lx\n", dom, pfn);
> + goto err_mapped;
> + }
> + continue;
> + }
> +
> if (pfn_err[i])
> {
> ERROR("unexpected PFN mapping failure pfn %lx map_mfn %lx p2m_mfn %lx",
> diff -r e27a6d53ac15 tools/libxc/xc_domain_save.c
> --- a/tools/libxc/xc_domain_save.c Thu Oct 11 01:52:33 2012 +0800
> +++ b/tools/libxc/xc_domain_save.c Thu Oct 25 05:49:10 2012 +0800
> @@ -1277,6 +1277,13 @@
> if ( !hvm )
> gmfn = pfn_to_mfn(gmfn);
>
> + if ( pfn_type[j] == XEN_DOMCTL_PFINFO_BROKEN )
> + {
> + pfn_type[j] |= pfn_batch[j];
> + ++run;
> + continue;
> + }
> +
> if ( pfn_err[j] )
> {
> if ( pfn_type[j] == XEN_DOMCTL_PFINFO_XTAB )
> @@ -1371,8 +1378,12 @@
> }
> }
>
> - /* skip pages that aren't present or are alloc-only */
> + /*
> + * skip pages that aren't present,
> + * or are broken, or are alloc-only
> + */
> if ( pagetype == XEN_DOMCTL_PFINFO_XTAB
> + || pagetype == XEN_DOMCTL_PFINFO_BROKEN
> || pagetype == XEN_DOMCTL_PFINFO_XALLOC )
> continue;
>
> diff -r e27a6d53ac15 tools/libxc/xenctrl.h
> --- a/tools/libxc/xenctrl.h Thu Oct 11 01:52:33 2012 +0800
> +++ b/tools/libxc/xenctrl.h Thu Oct 25 05:49:10 2012 +0800
> @@ -575,6 +575,17 @@
> xc_domaininfo_t *info);
>
> /**
> + * This function sets the p2m type of a broken page
> + * @parm xch a handle to an open hypervisor interface
> + * @parm domid the id of the domain the broken page belongs to
> + * @parm pfn the pfn of the broken page
> + * @return 0 on success, -1 on failure
> + */
> +int xc_set_broken_page_p2m(xc_interface *xch,
> + uint32_t domid,
> + unsigned long pfn);
> +
> +/**
> * This function returns information about the context of a hvm domain
> * @parm xch a handle to an open hypervisor interface
> * @parm domid the domain to get information from
> diff -r e27a6d53ac15 xen/arch/x86/domctl.c
> --- a/xen/arch/x86/domctl.c Thu Oct 11 01:52:33 2012 +0800
> +++ b/xen/arch/x86/domctl.c Thu Oct 25 05:49:10 2012 +0800
> @@ -209,12 +209,18 @@
> for ( j = 0; j < k; j++ )
> {
> unsigned long type = 0;
> + p2m_type_t t;
>
> - page = get_page_from_gfn(d, arr[j], NULL, P2M_ALLOC);
> + page = get_page_from_gfn(d, arr[j], &t, P2M_ALLOC);
>
> if ( unlikely(!page) ||
> unlikely(is_xen_heap_page(page)) )
> - type = XEN_DOMCTL_PFINFO_XTAB;
> + {
> + if ( p2m_is_broken(t) )
> + type = XEN_DOMCTL_PFINFO_BROKEN;
> + else
> + type = XEN_DOMCTL_PFINFO_XTAB;
> + }
> else
> {
> switch( page->u.inuse.type_info & PGT_type_mask )
> @@ -235,6 +241,9 @@
>
> if ( page->u.inuse.type_info & PGT_pinned )
> type |= XEN_DOMCTL_PFINFO_LPINTAB;
> +
> + if ( page->count_info & PGC_broken )
> + type = XEN_DOMCTL_PFINFO_BROKEN;
> }
>
> if ( page )
> @@ -1568,6 +1577,28 @@
> }
> break;
>
> + case XEN_DOMCTL_set_broken_page_p2m:
> + {
> + struct domain *d;
> + p2m_type_t pt;
> + unsigned long pfn;
> +
> + d = rcu_lock_domain_by_id(domctl->domain);
> + if ( d != NULL )
> + {
> + pfn = domctl->u.set_broken_page_p2m.pfn;
> +
> + get_gfn_query(d, pfn, &pt);
> + p2m_change_type(d, pfn, pt, p2m_ram_broken);
> + put_gfn(d, pfn);
> +
> + rcu_unlock_domain(d);
> + }
> + else
> + ret = -ESRCH;
> + }
> + break;
> +
> default:
> ret = iommu_do_domctl(domctl, u_domctl);
> break;
> diff -r e27a6d53ac15 xen/include/public/domctl.h
> --- a/xen/include/public/domctl.h Thu Oct 11 01:52:33 2012 +0800
> +++ b/xen/include/public/domctl.h Thu Oct 25 05:49:10 2012 +0800
> @@ -136,6 +136,7 @@
> #define XEN_DOMCTL_PFINFO_LPINTAB (0x1U<<31)
> #define XEN_DOMCTL_PFINFO_XTAB (0xfU<<28) /* invalid page */
> #define XEN_DOMCTL_PFINFO_XALLOC (0xeU<<28) /* allocate-only page */
> +#define XEN_DOMCTL_PFINFO_BROKEN (0xdU<<28) /* broken page */
> #define XEN_DOMCTL_PFINFO_PAGEDTAB (0x8U<<28)
> #define XEN_DOMCTL_PFINFO_LTAB_MASK (0xfU<<28)
>
> @@ -835,6 +836,12 @@
> typedef struct xen_domctl_set_access_required xen_domctl_set_access_required_t;
> DEFINE_XEN_GUEST_HANDLE(xen_domctl_set_access_required_t);
>
> +struct xen_domctl_set_broken_page_p2m {
> + uint64_aligned_t pfn;
> +};
> +typedef struct xen_domctl_set_broken_page_p2m xen_domctl_set_broken_page_p2m_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_domctl_set_broken_page_p2m_t);
> +
> struct xen_domctl {
> uint32_t cmd;
> #define XEN_DOMCTL_createdomain 1
> @@ -900,6 +907,7 @@
> #define XEN_DOMCTL_set_access_required 64
> #define XEN_DOMCTL_audit_p2m 65
> #define XEN_DOMCTL_set_virq_handler 66
> +#define XEN_DOMCTL_set_broken_page_p2m 67
> #define XEN_DOMCTL_gdbsx_guestmemio 1000
> #define XEN_DOMCTL_gdbsx_pausevcpu 1001
> #define XEN_DOMCTL_gdbsx_unpausevcpu 1002
> @@ -955,6 +963,7 @@
> struct xen_domctl_audit_p2m audit_p2m;
> struct xen_domctl_set_virq_handler set_virq_handler;
> struct xen_domctl_gdbsx_memio gdbsx_guest_memio;
> + struct xen_domctl_set_broken_page_p2m set_broken_page_p2m;
> struct xen_domctl_gdbsx_pauseunp_vcpu gdbsx_pauseunp_vcpu;
> struct xen_domctl_gdbsx_domstatus gdbsx_domstatus;
> uint8_t pad[128];
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
>
* Re: [Patch 4/5] X86/vMCE: handle broken page occurred before migration
2012-10-30 9:27 ` George Dunlap
@ 2012-10-31 10:58 ` Liu, Jinsong
0 siblings, 0 replies; 23+ messages in thread
From: Liu, Jinsong @ 2012-10-31 10:58 UTC (permalink / raw)
To: George Dunlap
Cc: Christoph Egger, xen-devel@lists.xensource.com, Keir (Xen.org),
Ian Campbell, Ian Jackson, Jan Beulich
George Dunlap wrote:
> Jinsong,
>
> I'm at UDS now, but I'll try to review the new patches in the next
> few days.
>
> If you end up sending these patches again, could you please send them
> in
> a more normal "patchbomb-style" format? I.e., with a "00/02" header
> describing what's new in the series, and then naming them 01/02 and
> 02/02 (instead of 4 and 5, when 1-3 have been applied for months)?
>
> The easiest way to do this is to use hg's patchbomb extension;
> there's a description of how to set it up here:
>
> http://wiki.xen.org/wiki/SubmittingXenPatches
>
> It's a few minutes to set up, but it's well worth it both for us and
> for you.
>
> Thanks,
> -George
Thanks. I have just updated the patches per Jan's comments, and will re-send them in patchbomb style.
Jinsong
* [PATCH 5/5] Xen/MCE: handle broken page occurs during migration
2012-10-22 11:32 ` George Dunlap
2012-10-24 14:30 ` Liu, Jinsong
2012-10-29 15:21 ` [Patch 4/5] X86/vMCE: handle broken page occurred before migration Liu, Jinsong
@ 2012-10-29 15:22 ` Liu, Jinsong
2 siblings, 0 replies; 23+ messages in thread
From: Liu, Jinsong @ 2012-10-29 15:22 UTC (permalink / raw)
To: George Dunlap, Ian Jackson
Cc: Christoph Egger, xen-devel@lists.xensource.com, Keir (Xen.org),
Ian Campbell, Jan Beulich
[-- Attachment #1: Type: text/plain, Size: 6934 bytes --]
Xen/MCE: handle broken page occurs during migration
This patch handles a page that breaks during migration.
It monitors the critical section of live migration (from the vMCE point of view,
the copypages stage of migration is the critical section; the other stages are not).
If a vMCE occurs in this critical section, the broken page is marked in the
dirty bitmap, so that the copypages stage transfers its pfn_type and pfn
to the target, which then takes appropriate action.
On the target, the p2m type of the broken page is set to p2m_ram_broken, so that if
the guest accesses the broken page again it kills itself, as expected.
Suggested-by: George Dunlap <george.dunlap@eu.citrix.com>
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
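For readers following the flow, here is a minimal toy model of the mechanism described above (a sketch only, not part of the patch). The struct and the toy_* functions are hypothetical stand-ins: the dirty[] array stands for the log-dirty bitmap, broken[] for the broken-page state, and toy_copy_pass() for one pass of the copypages loop. Only the vmce_monitor flag and the -EBUSY/-EINVAL return values are taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PAGES 8

enum send_kind { SEND_NONE, SEND_DATA, SEND_BROKEN };

/* Hypothetical stand-in for the per-domain state involved. */
struct toy_domain {
    bool vmce_monitor;     /* mirrors d->arch.vmce_monitor */
    bool dirty[NR_PAGES];  /* stand-in for the log-dirty bitmap */
    bool broken[NR_PAGES]; /* stand-in for the broken-page state */
};

/* Mirrors the start/end domctls: starting twice or ending twice fails. */
static int toy_monitor_set(struct toy_domain *d, bool start)
{
    if ( start && d->vmce_monitor )
        return -EBUSY;
    if ( !start && !d->vmce_monitor )
        return -EINVAL;
    d->vmce_monitor = start;
    return 0;
}

/* Models the vMCE path: mark the page dirty if monitored, then broken. */
static void toy_vmce(struct toy_domain *d, size_t pfn)
{
    assert(pfn < NR_PAGES);
    if ( d->vmce_monitor )
        d->dirty[pfn] = true;
    d->broken[pfn] = true;
}

/* Models one copypages pass: dirty pages are (re)sent, broken ones by type only. */
static void toy_copy_pass(struct toy_domain *d, enum send_kind out[NR_PAGES])
{
    size_t pfn;

    for ( pfn = 0; pfn < NR_PAGES; pfn++ )
    {
        out[pfn] = SEND_NONE;
        if ( !d->dirty[pfn] )
            continue;
        d->dirty[pfn] = false;
        out[pfn] = d->broken[pfn] ? SEND_BROKEN : SEND_DATA;
    }
}
```

In this model, a page that was already sent as data and then breaks while the monitor is active is picked up by the next pass and re-sent as "broken", without data. If the monitor is not active, toy_vmce() leaves the page clean and the save loop never revisits it.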
diff -r 3313ee9f6142 tools/libxc/xc_domain.c
--- a/tools/libxc/xc_domain.c Thu Oct 25 05:49:11 2012 +0800
+++ b/tools/libxc/xc_domain.c Tue Oct 30 06:07:05 2012 +0800
@@ -299,6 +299,24 @@
return ret ? -1 : 0;
}
+/* start/end vmce monitor */
+int xc_domain_vmce_monitor(xc_interface *xch,
+ uint32_t domid,
+ uint32_t start)
+{
+ int ret;
+ DECLARE_DOMCTL;
+
+ if ( start )
+ domctl.cmd = XEN_DOMCTL_vmce_monitor_start;
+ else
+ domctl.cmd = XEN_DOMCTL_vmce_monitor_end;
+ domctl.domain = (domid_t)domid;
+ ret = do_domctl(xch, &domctl);
+
+ return ret ? -1 : 0;
+}
+
/* get info from hvm guest for save */
int xc_domain_hvm_getcontext(xc_interface *xch,
uint32_t domid,
diff -r 3313ee9f6142 tools/libxc/xc_domain_save.c
--- a/tools/libxc/xc_domain_save.c Thu Oct 25 05:49:11 2012 +0800
+++ b/tools/libxc/xc_domain_save.c Tue Oct 30 06:07:05 2012 +0800
@@ -1109,6 +1109,13 @@
goto out;
}
+ /* Start vmce monitor */
+ if ( xc_domain_vmce_monitor(xch, dom, 1) )
+ {
+ PERROR("Error starting vmce monitor");
+ goto out;
+ }
+
copypages:
#define wrexact(fd, buf, len) write_buffer(xch, last_iter, ob, (fd), (buf), (len))
#define wruncached(fd, live, buf, len) write_uncached(xch, last_iter, ob, (fd), (buf), (len))
@@ -1582,6 +1589,13 @@
DPRINTF("All memory is saved\n");
+ /* End vmce monitor */
+ if ( xc_domain_vmce_monitor(xch, dom, 0) )
+ {
+ PERROR("Error ending vmce monitor");
+ goto out;
+ }
+
/* After last_iter, buffer the rest of pagebuf & tailbuf data into a
* separate output buffer and flush it after the compressed page chunks.
*/
diff -r 3313ee9f6142 tools/libxc/xenctrl.h
--- a/tools/libxc/xenctrl.h Thu Oct 25 05:49:11 2012 +0800
+++ b/tools/libxc/xenctrl.h Tue Oct 30 06:07:05 2012 +0800
@@ -586,6 +586,17 @@
unsigned long pfn);
/**
+ * This function starts/ends monitoring of vMCE events.
+ * @parm xch a handle to an open hypervisor interface
+ * @parm domid the id of the monitored domain
+ * @parm start nonzero to start monitoring, 0 to end it
+ * @return <0 on failure, 0 on success
+ */
+int xc_domain_vmce_monitor(xc_interface *xch,
+ uint32_t domid,
+ uint32_t start);
+
+/**
* This function returns information about the context of a hvm domain
* @parm xch a handle to an open hypervisor interface
* @parm domid the domain to get information from
diff -r 3313ee9f6142 xen/arch/x86/cpu/mcheck/mce_intel.c
--- a/xen/arch/x86/cpu/mcheck/mce_intel.c Thu Oct 25 05:49:11 2012 +0800
+++ b/xen/arch/x86/cpu/mcheck/mce_intel.c Tue Oct 30 06:07:05 2012 +0800
@@ -342,6 +342,22 @@
goto vmce_failed;
}
+ if ( unlikely(d->arch.vmce_monitor) )
+ {
+ /*
+ * A vMCE occurred during migration.
+ *
+ * Mark the broken page in the dirty bitmap, so that the copypages
+ * stage of migration revisits it and transfers its pfn_type and
+ * pfn to the target, which then takes appropriate action.
+ *
+ * The target sets the p2m type of the broken page to
+ * p2m_ram_broken, so that if the guest accesses the page again
+ * it kills itself, as expected.
+ */
+ paging_mark_dirty(d, mfn);
+ }
+
if ( unmmap_broken_page(d, _mfn(mfn), gfn) )
{
printk("Unmap broken memory %lx for DOM%d failed\n",
diff -r 3313ee9f6142 xen/arch/x86/domctl.c
--- a/xen/arch/x86/domctl.c Thu Oct 25 05:49:11 2012 +0800
+++ b/xen/arch/x86/domctl.c Tue Oct 30 06:07:05 2012 +0800
@@ -1599,6 +1599,44 @@
}
break;
+ case XEN_DOMCTL_vmce_monitor_start:
+ {
+ struct domain *d;
+
+ d = rcu_lock_domain_by_id(domctl->domain);
+ if ( d != NULL )
+ {
+ if ( d->arch.vmce_monitor )
+ ret = -EBUSY;
+ else
+ d->arch.vmce_monitor = 1;
+
+ rcu_unlock_domain(d);
+ }
+ else
+ ret = -ESRCH;
+ }
+ break;
+
+ case XEN_DOMCTL_vmce_monitor_end:
+ {
+ struct domain *d;
+
+ d = rcu_lock_domain_by_id(domctl->domain);
+ if ( d != NULL )
+ {
+ if ( !d->arch.vmce_monitor )
+ ret = -EINVAL;
+ else
+ d->arch.vmce_monitor = 0;
+
+ rcu_unlock_domain(d);
+ }
+ else
+ ret = -ESRCH;
+ }
+ break;
+
default:
ret = iommu_do_domctl(domctl, u_domctl);
break;
diff -r 3313ee9f6142 xen/include/asm-x86/domain.h
--- a/xen/include/asm-x86/domain.h Thu Oct 25 05:49:11 2012 +0800
+++ b/xen/include/asm-x86/domain.h Tue Oct 30 06:07:05 2012 +0800
@@ -279,6 +279,10 @@
bool_t has_32bit_shinfo;
/* Domain cannot handle spurious page faults? */
bool_t suppress_spurious_page_faults;
+ /* Monitoring guest memory copy of migration
+ * = 0 - not monitoring
+ * = 1 - monitoring */
+ bool_t vmce_monitor;
/* Continuable domain_relinquish_resources(). */
enum {
diff -r 3313ee9f6142 xen/include/public/domctl.h
--- a/xen/include/public/domctl.h Thu Oct 25 05:49:11 2012 +0800
+++ b/xen/include/public/domctl.h Tue Oct 30 06:07:05 2012 +0800
@@ -908,6 +908,8 @@
#define XEN_DOMCTL_audit_p2m 65
#define XEN_DOMCTL_set_virq_handler 66
#define XEN_DOMCTL_set_broken_page_p2m 67
+#define XEN_DOMCTL_vmce_monitor_start 68
+#define XEN_DOMCTL_vmce_monitor_end 69
#define XEN_DOMCTL_gdbsx_guestmemio 1000
#define XEN_DOMCTL_gdbsx_pausevcpu 1001
#define XEN_DOMCTL_gdbsx_unpausevcpu 1002
[-- Attachment #2: 5_vmce_during_migration.patch --]
[-- Type: application/octet-stream, Size: 6739 bytes --]