* [PATCH v4 1/9] xen: introduce DOMDYING_locked state
2014-12-03 17:16 [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec Vitaly Kuznetsov
@ 2014-12-03 17:16 ` Vitaly Kuznetsov
2014-12-03 17:16 ` [PATCH v4 2/9] xen: introduce SHUTDOWN_soft_reset shutdown reason Vitaly Kuznetsov
` (10 subsequent siblings)
11 siblings, 0 replies; 28+ messages in thread
From: Vitaly Kuznetsov @ 2014-12-03 17:16 UTC (permalink / raw)
To: xen-devel
Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
Andrew Cooper, Ian Jackson, Tim Deegan, David Vrabel, Jan Beulich,
Wei Liu
A new dying state is required to indicate that a particular domain
is dying but its cleanup procedure has not been started yet. This state
can be set from outside of domain_kill().
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
xen/common/domain.c | 1 +
xen/include/xen/sched.h | 3 ++-
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 4a62c1d..c13a7cf 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -603,6 +603,7 @@ int domain_kill(struct domain *d)
switch ( d->is_dying )
{
case DOMDYING_alive:
+ case DOMDYING_locked:
domain_pause(d);
d->is_dying = DOMDYING_dying;
spin_barrier(&d->domain_lock);
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 46fc6e3..a42d0b8 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -369,7 +369,8 @@ struct domain
/* Is this guest being debugged by dom0? */
bool_t debugger_attached;
/* Is this guest dying (i.e., a zombie)? */
- enum { DOMDYING_alive, DOMDYING_dying, DOMDYING_dead } is_dying;
+ enum { DOMDYING_alive, DOMDYING_locked, DOMDYING_dying, DOMDYING_dead }
+ is_dying;
/* Domain is paused by controller software? */
int controller_pause_count;
/* Domain's VCPUs are pinned 1:1 to physical CPUs? */
--
1.9.3
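Editor's sketch (not part of the patch): the state handling above can be modelled in a few lines of standalone C. The names toy_domain, toy_domain_lock() and toy_domain_kill() are hypothetical, and the cleanup step is reduced to a state change; the only point is that a domain already marked DOMDYING_locked enters the kill path exactly like a live one.

```c
#include <assert.h>

/* Same ordering as the enum in the patch. */
enum domdying { DOMDYING_alive, DOMDYING_locked, DOMDYING_dying, DOMDYING_dead };

/* Hypothetical, heavily reduced stand-in for struct domain. */
struct toy_domain {
    enum domdying is_dying;
    int pause_count;              /* stand-in for domain_pause() bookkeeping */
};

/* Hypothetical helper: mark the domain as dying before any cleanup starts. */
static void toy_domain_lock(struct toy_domain *d)
{
    assert(d->is_dying == DOMDYING_alive);
    d->is_dying = DOMDYING_locked;
}

/* Simplified kill path: alive and locked domains are treated identically. */
static int toy_domain_kill(struct toy_domain *d)
{
    switch ( d->is_dying )
    {
    case DOMDYING_alive:
    case DOMDYING_locked:
        d->pause_count++;
        d->is_dying = DOMDYING_dying;
        /* fall through */
    case DOMDYING_dying:
        /* The real code relinquishes resources here and may need retries. */
        d->is_dying = DOMDYING_dead;
        /* fall through */
    case DOMDYING_dead:
        break;
    }

    return 0;
}
```

Calling toy_domain_lock() and then toy_domain_kill() on a live toy domain leaves it in DOMDYING_dead with a pause count of one, the same end state as killing it directly.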
* [PATCH v4 2/9] xen: introduce SHUTDOWN_soft_reset shutdown reason
2014-12-03 17:16 [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec Vitaly Kuznetsov
2014-12-03 17:16 ` [PATCH v4 1/9] xen: introduce DOMDYING_locked state Vitaly Kuznetsov
@ 2014-12-03 17:16 ` Vitaly Kuznetsov
2014-12-03 17:16 ` [PATCH v4 3/9] libxl: support " Vitaly Kuznetsov
` (9 subsequent siblings)
11 siblings, 0 replies; 28+ messages in thread
From: Vitaly Kuznetsov @ 2014-12-03 17:16 UTC (permalink / raw)
To: xen-devel
Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
Andrew Cooper, Ian Jackson, Tim Deegan, David Vrabel, Jan Beulich,
Wei Liu
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
xen/common/shutdown.c | 7 +++++++
xen/include/public/sched.h | 3 ++-
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/xen/common/shutdown.c b/xen/common/shutdown.c
index 94d4c53..5c3a158 100644
--- a/xen/common/shutdown.c
+++ b/xen/common/shutdown.c
@@ -71,6 +71,13 @@ void hwdom_shutdown(u8 reason)
break; /* not reached */
}
+ case SHUTDOWN_soft_reset:
+ {
+ printk("Domain 0 did soft reset but it is unsupported, rebooting.\n");
+ machine_restart(0);
+ break; /* not reached */
+ }
+
default:
{
printk("Domain 0 shutdown (unknown reason %u): ", reason);
diff --git a/xen/include/public/sched.h b/xen/include/public/sched.h
index 4000ac9..800c808 100644
--- a/xen/include/public/sched.h
+++ b/xen/include/public/sched.h
@@ -159,7 +159,8 @@ DEFINE_XEN_GUEST_HANDLE(sched_watchdog_t);
#define SHUTDOWN_suspend 2 /* Clean up, save suspend info, kill. */
#define SHUTDOWN_crash 3 /* Tell controller we've crashed. */
#define SHUTDOWN_watchdog 4 /* Restart because watchdog time expired. */
-#define SHUTDOWN_MAX 4 /* Maximum valid shutdown reason. */
+#define SHUTDOWN_soft_reset 5 /* Soft reset, rebuild keeping memory content */
+#define SHUTDOWN_MAX 5 /* Maximum valid shutdown reason. */
/* ` } */
#endif /* __XEN_PUBLIC_SCHED_H__ */
--
1.9.3
* [PATCH v4 3/9] libxl: support SHUTDOWN_soft_reset shutdown reason
2014-12-03 17:16 [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec Vitaly Kuznetsov
2014-12-03 17:16 ` [PATCH v4 1/9] xen: introduce DOMDYING_locked state Vitaly Kuznetsov
2014-12-03 17:16 ` [PATCH v4 2/9] xen: introduce SHUTDOWN_soft_reset shutdown reason Vitaly Kuznetsov
@ 2014-12-03 17:16 ` Vitaly Kuznetsov
2014-12-03 17:16 ` [PATCH v4 4/9] xen: introduce XEN_DOMCTL_devour Vitaly Kuznetsov
` (8 subsequent siblings)
11 siblings, 0 replies; 28+ messages in thread
From: Vitaly Kuznetsov @ 2014-12-03 17:16 UTC (permalink / raw)
To: xen-devel
Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
Andrew Cooper, Ian Jackson, Tim Deegan, David Vrabel, Jan Beulich,
Wei Liu
Use the letter 't' to indicate a domain in this state.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
tools/libxl/libxl_types.idl | 1 +
tools/libxl/xl_cmdimpl.c | 2 +-
tools/python/xen/lowlevel/xl/xl.c | 1 +
3 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index f7fc695..4a0e2be 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -175,6 +175,7 @@ libxl_shutdown_reason = Enumeration("shutdown_reason", [
(2, "suspend"),
(3, "crash"),
(4, "watchdog"),
+ (5, "soft_reset"),
], init_val = "LIBXL_SHUTDOWN_REASON_UNKNOWN")
libxl_vga_interface_type = Enumeration("vga_interface_type", [
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 0e754e7..53611dc 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -3497,7 +3497,7 @@ static void list_domains(int verbose, int context, int claim, int numa,
const libxl_dominfo *info, int nb_domain)
{
int i;
- static const char shutdown_reason_letters[]= "-rscw";
+ static const char shutdown_reason_letters[]= "-rscwt";
libxl_bitmap nodemap;
libxl_physinfo physinfo;
diff --git a/tools/python/xen/lowlevel/xl/xl.c b/tools/python/xen/lowlevel/xl/xl.c
index 32f982a..7c61160 100644
--- a/tools/python/xen/lowlevel/xl/xl.c
+++ b/tools/python/xen/lowlevel/xl/xl.c
@@ -784,6 +784,7 @@ PyMODINIT_FUNC initxl(void)
_INT_CONST_LIBXL(m, SHUTDOWN_REASON_SUSPEND);
_INT_CONST_LIBXL(m, SHUTDOWN_REASON_CRASH);
_INT_CONST_LIBXL(m, SHUTDOWN_REASON_WATCHDOG);
+ _INT_CONST_LIBXL(m, SHUTDOWN_REASON_SOFT_RESET);
genwrap__init(m);
}
--
1.9.3
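Editor's sketch (not part of the patch): assuming the letter string is indexed by the numeric shutdown reason, appending 't' makes reason 5 (soft_reset) map to that letter. The helper name toy_reason_letter() and the '?' fallback for out-of-range values are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Letter string as it looks after the patch. */
static const char shutdown_reason_letters[] = "-rscwt";

/* Hypothetical helper: map a shutdown reason code to its display letter. */
static char toy_reason_letter(unsigned int reason)
{
    /* sizeof counts the trailing NUL, hence the -1. */
    size_t nr_letters = sizeof(shutdown_reason_letters) - 1;

    return reason < nr_letters ? shutdown_reason_letters[reason] : '?';
}
```

With this table, reason 4 (watchdog) yields 'w' and reason 5 (soft_reset) yields 't'; any larger value falls back to '?'.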
* [PATCH v4 4/9] xen: introduce XEN_DOMCTL_devour
2014-12-03 17:16 [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec Vitaly Kuznetsov
` (2 preceding siblings ...)
2014-12-03 17:16 ` [PATCH v4 3/9] libxl: support " Vitaly Kuznetsov
@ 2014-12-03 17:16 ` Vitaly Kuznetsov
2014-12-04 0:50 ` Julien Grall
2014-12-04 11:01 ` Julien Grall
2014-12-03 17:16 ` [PATCH v4 5/9] libxc: support XEN_DOMCTL_devour Vitaly Kuznetsov
` (7 subsequent siblings)
11 siblings, 2 replies; 28+ messages in thread
From: Vitaly Kuznetsov @ 2014-12-03 17:16 UTC (permalink / raw)
To: xen-devel
Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
Andrew Cooper, Ian Jackson, Tim Deegan, David Vrabel, Jan Beulich,
Wei Liu
New operation sets the 'recipient' domain which will recieve all
memory pages from a particular domain and kills the original domain.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
xen/common/domain.c | 3 +++
xen/common/domctl.c | 33 +++++++++++++++++++++++++++++++++
xen/common/page_alloc.c | 28 ++++++++++++++++++++++++----
xen/include/public/domctl.h | 15 +++++++++++++++
xen/include/xen/sched.h | 2 ++
5 files changed, 77 insertions(+), 4 deletions(-)
diff --git a/xen/common/domain.c b/xen/common/domain.c
index c13a7cf..f26267a 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -825,6 +825,9 @@ static void complete_domain_destroy(struct rcu_head *head)
if ( d->target != NULL )
put_domain(d->target);
+ if ( d->recipient != NULL )
+ put_domain(d->recipient);
+
evtchn_destroy_final(d);
radix_tree_destroy(&d->pirq_tree, free_pirq_struct);
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index f15dcfe..7e7fb47 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -1177,6 +1177,39 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
}
break;
+ case XEN_DOMCTL_devour:
+ {
+ struct domain *recipient_dom;
+
+ if ( !d->recipient )
+ {
+ recipient_dom = get_domain_by_id(op->u.devour.recipient);
+ if ( recipient_dom == NULL )
+ {
+ ret = -ESRCH;
+ break;
+ }
+
+ if ( recipient_dom->tot_pages != 0 )
+ {
+ put_domain(recipient_dom);
+ ret = -EINVAL;
+ break;
+ }
+ /*
+ * Make sure no allocation/remapping is ongoing and set is_dying
+ * flag to prevent such actions in future.
+ */
+ spin_lock(&d->page_alloc_lock);
+ d->is_dying = DOMDYING_locked;
+ d->recipient = recipient_dom;
+ smp_wmb(); /* make sure recipient was set before domain_kill() */
+ spin_unlock(&d->page_alloc_lock);
+ }
+ ret = domain_kill(d);
+ }
+ break;
+
default:
ret = arch_do_domctl(op, d, u_domctl);
break;
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 7b4092d..7eb4404 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1707,6 +1707,7 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
{
struct domain *d = page_get_owner(pg);
unsigned int i;
+ unsigned long mfn, gmfn;
bool_t drop_dom_ref;
ASSERT(!in_irq());
@@ -1764,13 +1765,32 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
scrub = 1;
}
- if ( unlikely(scrub) )
- for ( i = 0; i < (1 << order); i++ )
- scrub_one_page(&pg[i]);
+ if ( !d || !d->recipient || d->recipient->is_dying )
+ {
+ if ( unlikely(scrub) )
+ for ( i = 0; i < (1 << order); i++ )
+ scrub_one_page(&pg[i]);
- free_heap_pages(pg, order);
+ free_heap_pages(pg, order);
+ }
+ else
+ {
+ mfn = page_to_mfn(pg);
+ gmfn = mfn_to_gmfn(d, mfn);
+
+ page_set_owner(pg, NULL);
+ if ( assign_pages(d->recipient, pg, order, 0) )
+ /* assign_pages reports the error by itself */
+ goto out;
+
+ if ( guest_physmap_add_page(d->recipient, gmfn, mfn, order) )
+ printk(XENLOG_G_INFO
+ "Failed to add MFN %lx (GFN %lx) to Dom%d's physmap\n",
+ mfn, gmfn, d->recipient->domain_id);
+ }
}
+out:
if ( drop_dom_ref )
put_domain(d);
}
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 57e2ed7..871fa5e 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -995,6 +995,19 @@ struct xen_domctl_psr_cmt_op {
typedef struct xen_domctl_psr_cmt_op xen_domctl_psr_cmt_op_t;
DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cmt_op_t);
+/*
+ * XEN_DOMCTL_devour - kills the domain reassigning all of its domheap pages
+ * to the 'recipient' domain. Pages from xen heap belonging to the domain
+ * are not copied. Reassigned pages are mapped to the same GMFNs in the
+ * recipient domain as they were mapped in the original. The recipient domain
+ * is supposed to not have any domheap pages to avoid MFN-GMFN collisions.
+ */
+struct xen_domctl_devour {
+ domid_t recipient;
+};
+typedef struct xen_domctl_devour xen_domctl_devour_t;
+DEFINE_XEN_GUEST_HANDLE(xen_domctl_devour_t);
+
struct xen_domctl {
uint32_t cmd;
#define XEN_DOMCTL_createdomain 1
@@ -1070,6 +1083,7 @@ struct xen_domctl {
#define XEN_DOMCTL_setvnumainfo 74
#define XEN_DOMCTL_psr_cmt_op 75
#define XEN_DOMCTL_arm_configure_domain 76
+#define XEN_DOMCTL_devour 77
#define XEN_DOMCTL_gdbsx_guestmemio 1000
#define XEN_DOMCTL_gdbsx_pausevcpu 1001
#define XEN_DOMCTL_gdbsx_unpausevcpu 1002
@@ -1135,6 +1149,7 @@ struct xen_domctl {
struct xen_domctl_gdbsx_domstatus gdbsx_domstatus;
struct xen_domctl_vnuma vnuma;
struct xen_domctl_psr_cmt_op psr_cmt_op;
+ struct xen_domctl_devour devour;
uint8_t pad[128];
} u;
};
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index a42d0b8..552e4a3 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -366,6 +366,8 @@ struct domain
bool_t is_privileged;
/* Which guest this guest has privileges on */
struct domain *target;
+ /* Which guest receives freed memory pages */
+ struct domain *recipient;
/* Is this guest being debugged by dom0? */
bool_t debugger_attached;
/* Is this guest dying (i.e., a zombie)? */
--
1.9.3
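Editor's sketch (not part of the patch): the decision made in the patched free_domheap_pages() can be modelled without any hypervisor internals. All toy_* names are hypothetical, and page ownership is reduced to a counter; the predicate itself is the one from the diff above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct domain. */
struct toy_domain {
    int is_dying;                 /* non-zero once the domain is going away */
    unsigned long tot_pages;
    struct toy_domain *recipient; /* NULL unless a devour is in progress */
};

enum toy_action { TOY_FREE_TO_HEAP, TOY_REASSIGN };

/* Same predicate as in the patch: no owner, no recipient, or a dying one. */
static enum toy_action toy_free_action(const struct toy_domain *d)
{
    if ( !d || !d->recipient || d->recipient->is_dying )
        return TOY_FREE_TO_HEAP;

    return TOY_REASSIGN;
}

/* Release 2^order pages from d, handing them to the recipient if possible. */
static enum toy_action toy_free_pages(struct toy_domain *d, unsigned int order)
{
    enum toy_action act = toy_free_action(d);
    unsigned long nr = 1UL << order;

    if ( d )
    {
        assert(d->tot_pages >= nr);
        d->tot_pages -= nr;
    }

    if ( act == TOY_REASSIGN )
        d->recipient->tot_pages += nr;

    return act;
}
```

As long as the recipient is alive, pages move from the source counter to the recipient counter; once the recipient is marked dying, the same call falls back to the ordinary free path.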
* Re: [PATCH v4 4/9] xen: introduce XEN_DOMCTL_devour
2014-12-03 17:16 ` [PATCH v4 4/9] xen: introduce XEN_DOMCTL_devour Vitaly Kuznetsov
@ 2014-12-04 0:50 ` Julien Grall
2014-12-04 10:19 ` David Vrabel
2014-12-04 15:12 ` Vitaly Kuznetsov
2014-12-04 11:01 ` Julien Grall
1 sibling, 2 replies; 28+ messages in thread
From: Julien Grall @ 2014-12-04 0:50 UTC (permalink / raw)
To: Vitaly Kuznetsov, xen-devel
Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
Andrew Cooper, Ian Jackson, Tim Deegan, David Vrabel, Jan Beulich,
Wei Liu
Hi Vitaly,
On 03/12/2014 17:16, Vitaly Kuznetsov wrote:
> New operation sets the 'recipient' domain which will recieve all
s/recieve/receive/
> memory pages from a particular domain and kills the original domain.
>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
> @@ -1764,13 +1765,32 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
[..]
> + else
> + {
> + mfn = page_to_mfn(pg);
> + gmfn = mfn_to_gmfn(d, mfn);
> +
> + page_set_owner(pg, NULL);
> + if ( assign_pages(d->recipient, pg, order, 0) )
> + /* assign_pages reports the error by itself */
> + goto out;
> +
> + if ( guest_physmap_add_page(d->recipient, gmfn, mfn, order) )
On ARM, mfn_to_gmfn will always return the mfn. This would result in adding
a 1:1 mapping in the recipient domain.
But ... only DOM0 has its memory mapped 1:1. So this code may blow up
the P2M of the recipient domain.
I'm not an x86 expert, but this may also happen when the recipient
domain is using translated page mode (i.e. HVM/PVHVM).
Regards,
--
Julien Grall
* Re: [PATCH v4 4/9] xen: introduce XEN_DOMCTL_devour
2014-12-04 0:50 ` Julien Grall
@ 2014-12-04 10:19 ` David Vrabel
2014-12-04 10:52 ` Julien Grall
2014-12-04 15:12 ` Vitaly Kuznetsov
1 sibling, 1 reply; 28+ messages in thread
From: David Vrabel @ 2014-12-04 10:19 UTC (permalink / raw)
To: Julien Grall, Vitaly Kuznetsov, xen-devel
Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
Andrew Cooper, Ian Jackson, Tim Deegan, Jan Beulich, Wei Liu
On 04/12/14 00:50, Julien Grall wrote:
> Hi Vitaly,
>
> On 03/12/2014 17:16, Vitaly Kuznetsov wrote:
>> New operation sets the 'recipient' domain which will recieve all
>
> s/recieve/receive/
>
>> memory pages from a particular domain and kills the original domain.
>>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>> @@ -1764,13 +1765,32 @@ void free_domheap_pages(struct page_info *pg,
>> unsigned int order)
>
> [..]
>
>> + else
>> + {
>> + mfn = page_to_mfn(pg);
>> + gmfn = mfn_to_gmfn(d, mfn);
>> +
>> + page_set_owner(pg, NULL);
>> + if ( assign_pages(d->recipient, pg, order, 0) )
>> + /* assign_pages reports the error by itself */
>> + goto out;
>> +
>> + if ( guest_physmap_add_page(d->recipient, gmfn, mfn,
>> order) )
>
> On ARM, mfn_to_gmfn will always return the mfn. This would result to add
> a 1:1 mapping in the recipient domain.
>
> But ... only DOM0 has its memory mapped 1:1. So this code may blow up
> the P2M of the recipient domain.
>
> I'm not an x86 expert, but this may also happen when the recipient
> domain is using translated page mode (i.e HVM/PVHM).
mfn_to_gmfn() does the correct thing on x86 as it does a m2p lookup.
David
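Editor's sketch (not part of the thread): the two behaviours being contrasted here can be shown with a toy table. The table contents and all toy_* names are invented for illustration and are not Xen's real M2P layout or macros. A lookup-based translation returns the guest frame recorded for a machine frame, while an identity translation simply hands the machine frame back.

```c
#include <assert.h>

#define TOY_NR_FRAMES 8UL

/* Toy machine-to-physical table: index is the MFN, value is the guest frame. */
static const unsigned long toy_m2p[TOY_NR_FRAMES] = {
    0x100, 0x101, 0x102, 0x103, 0x200, 0x201, 0x202, 0x203
};

/* Lookup-based translation, in the spirit of what is described for x86. */
static unsigned long toy_mfn_to_gmfn_lookup(unsigned long mfn)
{
    assert(mfn < TOY_NR_FRAMES);
    return toy_m2p[mfn];
}

/* Identity translation, as described for ARM in this thread. */
static unsigned long toy_mfn_to_gmfn_identity(unsigned long mfn)
{
    return mfn;
}
```

For machine frame 4 the lookup gives guest frame 0x200, whereas the identity version gives 4. Feeding the latter into a physmap update would place the page at a guest address the original domain never used, which is the concern raised for ARM.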
* Re: [PATCH v4 4/9] xen: introduce XEN_DOMCTL_devour
2014-12-04 10:19 ` David Vrabel
@ 2014-12-04 10:52 ` Julien Grall
0 siblings, 0 replies; 28+ messages in thread
From: Julien Grall @ 2014-12-04 10:52 UTC (permalink / raw)
To: David Vrabel, Vitaly Kuznetsov, xen-devel
Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
Andrew Cooper, Ian Jackson, Tim Deegan, Jan Beulich, Wei Liu
On 04/12/2014 10:19, David Vrabel wrote:
> On 04/12/14 00:50, Julien Grall wrote:
>> Hi Vitaly,
>>
>> On 03/12/2014 17:16, Vitaly Kuznetsov wrote:
>>> New operation sets the 'recipient' domain which will recieve all
>>
>> s/recieve/receive/
>>
>>> memory pages from a particular domain and kills the original domain.
>>>
>>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>>> ---
>>> @@ -1764,13 +1765,32 @@ void free_domheap_pages(struct page_info *pg,
>>> unsigned int order)
>>
>> [..]
>>
>>> + else
>>> + {
>>> + mfn = page_to_mfn(pg);
>>> + gmfn = mfn_to_gmfn(d, mfn);
>>> +
>>> + page_set_owner(pg, NULL);
>>> + if ( assign_pages(d->recipient, pg, order, 0) )
>>> + /* assign_pages reports the error by itself */
>>> + goto out;
>>> +
>>> + if ( guest_physmap_add_page(d->recipient, gmfn, mfn,
>>> order) )
>>
>> On ARM, mfn_to_gmfn will always return the mfn. This would result to add
>> a 1:1 mapping in the recipient domain.
>>
>> But ... only DOM0 has its memory mapped 1:1. So this code may blow up
>> the P2M of the recipient domain.
>>
>> I'm not an x86 expert, but this may also happen when the recipient
>> domain is using translated page mode (i.e HVM/PVHM).
>
> mfn_to_gmfn() does the correct thing on x86 as it does a m2p lookup.
Is it because machine_to_phys_mapping caches the translation for a dying
domain?
--
Julien Grall
* Re: [PATCH v4 4/9] xen: introduce XEN_DOMCTL_devour
2014-12-04 0:50 ` Julien Grall
2014-12-04 10:19 ` David Vrabel
@ 2014-12-04 15:12 ` Vitaly Kuznetsov
2014-12-04 15:54 ` Julien Grall
1 sibling, 1 reply; 28+ messages in thread
From: Vitaly Kuznetsov @ 2014-12-04 15:12 UTC (permalink / raw)
To: Julien Grall
Cc: Wei Liu, Andrew Jones, Keir Fraser, Ian Campbell,
Stefano Stabellini, Andrew Cooper, Ian Jackson, Tim Deegan,
David Vrabel, Jan Beulich, xen-devel
Julien Grall <julien.grall@linaro.org> writes:
> Hi Vitaly,
>
> On 03/12/2014 17:16, Vitaly Kuznetsov wrote:
>> New operation sets the 'recipient' domain which will recieve all
>
> s/recieve/receive/
>
>> memory pages from a particular domain and kills the original domain.
>>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>> @@ -1764,13 +1765,32 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
>
> [..]
>
>> + else
>> + {
>> + mfn = page_to_mfn(pg);
>> + gmfn = mfn_to_gmfn(d, mfn);
>> +
>> + page_set_owner(pg, NULL);
>> + if ( assign_pages(d->recipient, pg, order, 0) )
>> + /* assign_pages reports the error by itself */
>> + goto out;
>> +
>> + if ( guest_physmap_add_page(d->recipient, gmfn, mfn, order) )
>
> On ARM, mfn_to_gmfn will always return the mfn. This would result to
> add a 1:1 mapping in the recipient domain.
>
> But ... only DOM0 has its memory mapped 1:1. So this code may blow up
> the P2M of the recipient domain.
I know almost nothing about ARM, so please bear with me. So, for a guest
domain the mapping is not 1:1 (the guest sees different addresses), but
mfn_to_gmfn() doesn't return these addresses? I was under the impression
that it is not x86-specific, and I can see it used in e.g. getdomaininfo()
and memory_exchange(). How can one figure out the mapping then?
Anyway, what I want to do here is: when this page is freed I want to
reassign it to our newly-created guest at the exact same address it was
mapped in the original domain.
>
> I'm not an x86 expert, but this may also happen when the recipient
> domain is using translated page mode (i.e HVM/PVHM).
PVHVM is the main target here (as kexec is unsupported for PV) and it
kinda works. mfn_to_gmfn() returns gmfn != mfn.
BTW, what's the current state of affairs with kexec and ARM guests? I
suppose we would have similar problems there: vcpu_info, event channels, ...
>
> Regards,
--
Vitaly
* Re: [PATCH v4 4/9] xen: introduce XEN_DOMCTL_devour
2014-12-04 15:12 ` Vitaly Kuznetsov
@ 2014-12-04 15:54 ` Julien Grall
0 siblings, 0 replies; 28+ messages in thread
From: Julien Grall @ 2014-12-04 15:54 UTC (permalink / raw)
To: Vitaly Kuznetsov
Cc: Wei Liu, Andrew Jones, Keir Fraser, Ian Campbell,
Stefano Stabellini, Andrew Cooper, Ian Jackson, Tim Deegan,
David Vrabel, Jan Beulich, xen-devel
On 04/12/14 15:12, Vitaly Kuznetsov wrote:
> Julien Grall <julien.grall@linaro.org> writes:
>
>> Hi Vitaly,
>>
>> On 03/12/2014 17:16, Vitaly Kuznetsov wrote:
>>> New operation sets the 'recipient' domain which will recieve all
>>
>> s/recieve/receive/
>>
>>> memory pages from a particular domain and kills the original domain.
>>>
>>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>>> ---
>>> @@ -1764,13 +1765,32 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
>>
>> [..]
>>
>>> + else
>>> + {
>>> + mfn = page_to_mfn(pg);
>>> + gmfn = mfn_to_gmfn(d, mfn);
>>> +
>>> + page_set_owner(pg, NULL);
>>> + if ( assign_pages(d->recipient, pg, order, 0) )
>>> + /* assign_pages reports the error by itself */
>>> + goto out;
>>> +
>>> + if ( guest_physmap_add_page(d->recipient, gmfn, mfn, order) )
>>
>> On ARM, mfn_to_gmfn will always return the mfn. This would result to
>> add a 1:1 mapping in the recipient domain.
>>
>> But ... only DOM0 has its memory mapped 1:1. So this code may blow up
>> the P2M of the recipient domain.
>
> I know almost nothing about ARM so please bear with me. So, for a guest
> domain the mapping is not 1:1 (so guest sees different addresses) but
> mfn_to_gmfn() doesn't returs these addresses?
Yes. I guess it's because this macro is not really used (see why below).
Ian, Stefano: any idea why mfn_to_gmfn is not correctly implemented on ARM?
> I was under an impression
> it is not x86-specific and I can see its usage in e.g. getdomaininfo(),
> memory_exchange(),..
I can find two places in the common code:
- getdomaininfo: The returned value is obviously buggy, though it doesn't
seem to be used on the toolstack side for ARM.
- memory_exchange: AFAIK this is not supported right now.
> How can one figure out the mapping then?
AFAIK, there is no way on ARM to get a GMFN from an MFN in Xen.
Maybe we should implement mfn_to_gmfn correctly on ARM? Ian, Stefano,
any thoughts?
> Anyway, what I want to do here is: when this page is freed I want to
> reassign it to our newly-created guest at the exact same address it was
> mapped in the original domain.
> BTW, what's the current state of affairs with kexec and ARM guest?
AFAIK, nobody has worked on it for Xen ARM. I don't even know the status
for Kexec on ARM.
> I
> suppose we should have similar problems: vcpu_info, event channels,
> ..
An ARM guest is very similar to a PVH guest. Every problem you fix
now means less work when we add support for ARM ;).
Regards,
--
Julien Grall
* Re: [PATCH v4 4/9] xen: introduce XEN_DOMCTL_devour
2014-12-03 17:16 ` [PATCH v4 4/9] xen: introduce XEN_DOMCTL_devour Vitaly Kuznetsov
2014-12-04 0:50 ` Julien Grall
@ 2014-12-04 11:01 ` Julien Grall
2014-12-04 14:48 ` Vitaly Kuznetsov
1 sibling, 1 reply; 28+ messages in thread
From: Julien Grall @ 2014-12-04 11:01 UTC (permalink / raw)
To: Vitaly Kuznetsov, xen-devel
Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
Andrew Cooper, Ian Jackson, Tim Deegan, David Vrabel, Jan Beulich,
Wei Liu
On 03/12/2014 17:16, Vitaly Kuznetsov wrote:
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index a42d0b8..552e4a3 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -366,6 +366,8 @@ struct domain
> bool_t is_privileged;
> /* Which guest this guest has privileges on */
> struct domain *target;
> + /* Which guest receives freed memory pages */
It took me a while to understand that the recipient domain is a newly
created domain, right? It might be worth adding a word about that here
(and maybe in assign_pages).
With that in mind, the code in assign_pages makes more sense.
Regards,
--
Julien Grall
* Re: [PATCH v4 4/9] xen: introduce XEN_DOMCTL_devour
2014-12-04 11:01 ` Julien Grall
@ 2014-12-04 14:48 ` Vitaly Kuznetsov
0 siblings, 0 replies; 28+ messages in thread
From: Vitaly Kuznetsov @ 2014-12-04 14:48 UTC (permalink / raw)
To: Julien Grall
Cc: Wei Liu, Andrew Jones, Keir Fraser, Ian Campbell,
Stefano Stabellini, Andrew Cooper, Ian Jackson, Tim Deegan,
David Vrabel, Jan Beulich, xen-devel
Julien Grall <julien.grall@linaro.org> writes:
> On 03/12/2014 17:16, Vitaly Kuznetsov wrote:
>> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
>> index a42d0b8..552e4a3 100644
>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -366,6 +366,8 @@ struct domain
>> bool_t is_privileged;
>> /* Which guest this guest has privileges on */
>> struct domain *target;
>> + /* Which guest receives freed memory pages */
>
> It took me a while to understand that the recipient domain is a newly
> created domain, right? It might be worth to add a word here (and maybe
> in assign_pages).
Sure, will add. In case you think renaming 'recipient' to something else
makes sense I'm all in.
>
> With that in mind, the code in assign_pages makes more sense.
>
> Regards,
--
Vitaly
* [PATCH v4 5/9] libxc: support XEN_DOMCTL_devour
2014-12-03 17:16 [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec Vitaly Kuznetsov
` (3 preceding siblings ...)
2014-12-03 17:16 ` [PATCH v4 4/9] xen: introduce XEN_DOMCTL_devour Vitaly Kuznetsov
@ 2014-12-03 17:16 ` Vitaly Kuznetsov
2014-12-03 17:16 ` [PATCH v4 6/9] libxl: add libxl__domain_soft_reset_destroy_old() Vitaly Kuznetsov
` (6 subsequent siblings)
11 siblings, 0 replies; 28+ messages in thread
From: Vitaly Kuznetsov @ 2014-12-03 17:16 UTC (permalink / raw)
To: xen-devel
Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
Andrew Cooper, Ian Jackson, Tim Deegan, David Vrabel, Jan Beulich,
Wei Liu
Introduce new xc_domain_devour() function to support XEN_DOMCTL_devour.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
tools/libxc/include/xenctrl.h | 14 ++++++++++++++
tools/libxc/xc_domain.c | 13 +++++++++++++
2 files changed, 27 insertions(+)
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 0ad8b8d..a789de3 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -558,6 +558,20 @@ int xc_domain_unpause(xc_interface *xch,
int xc_domain_destroy(xc_interface *xch,
uint32_t domid);
+/**
+ * This function sets a 'recipient' domain for a domain (when the source domain
+ * releases memory it is being reassigned to the recipient domain instead of
+ * being freed) and kills the original domain. The destination domain is supposed
+ * to have enough max_mem and no pages assigned.
+ *
+ * @parm xch a handle to an open hypervisor interface
+ * @parm domid the source domain id
+ * @parm recipient the destination domain id
+ * @return 0 on success, -1 on failure
+ */
+int xc_domain_devour(xc_interface *xch,
+ uint32_t domid, uint32_t recipient);
+
/**
* This function resumes a suspended domain. The domain should have
diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index b864872..5949725 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -122,6 +122,19 @@ int xc_domain_destroy(xc_interface *xch,
return ret;
}
+int xc_domain_devour(xc_interface *xch, uint32_t domid, uint32_t recipient)
+{
+ int ret;
+ DECLARE_DOMCTL;
+ domctl.cmd = XEN_DOMCTL_devour;
+ domctl.domain = (domid_t)domid;
+ domctl.u.devour.recipient = (domid_t)recipient;
+ do {
+ ret = do_domctl(xch, &domctl);
+ } while ( ret && (errno == EAGAIN) );
+ return ret;
+}
+
int xc_domain_shutdown(xc_interface *xch,
uint32_t domid,
int reason)
--
1.9.3
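Editor's sketch (not part of the patch): xc_domain_devour() repeats the domctl for as long as it fails with EAGAIN, presumably because the hypervisor-side teardown may need several passes. The loop can be exercised in isolation with a stub; toy_do_domctl() and its counters are hypothetical stand-ins and not the libxc API.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stub: fails with EAGAIN a fixed number of times, then succeeds. */
static int toy_failures_left = 2;
static int toy_calls_made;

static int toy_do_domctl(void)
{
    toy_calls_made++;

    if ( toy_failures_left > 0 )
    {
        toy_failures_left--;
        errno = EAGAIN;
        return -1;
    }

    return 0;
}

/* Same loop shape as xc_domain_devour(): retry only while errno is EAGAIN. */
static int toy_devour(void)
{
    int ret;

    do {
        ret = toy_do_domctl();
    } while ( ret && (errno == EAGAIN) );

    return ret;
}
```

With two injected EAGAIN failures the stub is called three times before toy_devour() returns 0. Any other errno value would end the loop immediately and be reported to the caller as a failure.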
* [PATCH v4 6/9] libxl: add libxl__domain_soft_reset_destroy_old()
2014-12-03 17:16 [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec Vitaly Kuznetsov
` (4 preceding siblings ...)
2014-12-03 17:16 ` [PATCH v4 5/9] libxc: support XEN_DOMCTL_devour Vitaly Kuznetsov
@ 2014-12-03 17:16 ` Vitaly Kuznetsov
2014-12-04 16:20 ` Wei Liu
2014-12-03 17:16 ` [PATCH v4 7/9] libxc: introduce soft reset for HVM domains Vitaly Kuznetsov
` (5 subsequent siblings)
11 siblings, 1 reply; 28+ messages in thread
From: Vitaly Kuznetsov @ 2014-12-03 17:16 UTC (permalink / raw)
To: xen-devel
Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
Andrew Cooper, Ian Jackson, Tim Deegan, David Vrabel, Jan Beulich,
Wei Liu
The new libxl__domain_soft_reset_destroy_old() is an internal-only
version of libxl_domain_destroy(). It follows the same domain
destroy path with one difference: xc_domain_destroy() is skipped,
so the domain is not actually destroyed.
Add a soft_reset flag to the libxl__domain_destroy_state structure
to support the change.
The original libxl_domain_destroy() function could easily be
modified to support the new flag, but I'm trying to avoid that as
it is part of the public API.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
tools/libxl/libxl.c | 32 +++++++++++++++++++++++++++-----
tools/libxl/libxl_internal.h | 4 ++++
2 files changed, 31 insertions(+), 5 deletions(-)
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index f84f7c2..c2bd730 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -1437,6 +1437,23 @@ int libxl_domain_destroy(libxl_ctx *ctx, uint32_t domid,
return AO_INPROGRESS;
}
+int libxl__domain_soft_reset_destroy_old(libxl_ctx *ctx, uint32_t domid,
+ const libxl_asyncop_how *ao_how)
+{
+ AO_CREATE(ctx, domid, ao_how);
+ libxl__domain_destroy_state *dds;
+
+ GCNEW(dds);
+ dds->ao = ao;
+ dds->domid = domid;
+ dds->callback = domain_destroy_cb;
+ dds->soft_reset = 1;
+ libxl__domain_destroy(egc, dds);
+
+ return AO_INPROGRESS;
+}
+
+
static void domain_destroy_cb(libxl__egc *egc, libxl__domain_destroy_state *dds,
int rc)
{
@@ -1612,6 +1629,7 @@ static void devices_destroy_cb(libxl__egc *egc,
{
STATE_AO_GC(drs->ao);
libxl__destroy_domid_state *dis = CONTAINER_OF(drs, *dis, drs);
+ libxl__domain_destroy_state *dds = CONTAINER_OF(dis, *dds, domain);
libxl_ctx *ctx = CTX;
uint32_t domid = dis->domid;
char *dom_path;
@@ -1650,11 +1668,15 @@ static void devices_destroy_cb(libxl__egc *egc,
}
libxl__userdata_destroyall(gc, domid);
- rc = xc_domain_destroy(ctx->xch, domid);
- if (rc < 0) {
- LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc, "xc_domain_destroy failed for %d", domid);
- rc = ERROR_FAIL;
- goto out;
+ if (!dds->soft_reset)
+ {
+ rc = xc_domain_destroy(ctx->xch, domid);
+ if (rc < 0) {
+ LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc,
+ "xc_domain_destroy failed for %d", domid);
+ rc = ERROR_FAIL;
+ goto out;
+ }
}
rc = 0;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index a38f695..f29ed83 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2969,6 +2969,7 @@ struct libxl__domain_destroy_state {
int stubdom_finished;
libxl__destroy_domid_state domain;
int domain_finished;
+ int soft_reset;
};
/*
@@ -3132,6 +3133,9 @@ _hidden void libxl__domain_save_device_model(libxl__egc *egc,
_hidden const char *libxl__device_model_savefile(libxl__gc *gc, uint32_t domid);
+_hidden int libxl__domain_soft_reset_destroy_old(libxl_ctx *ctx, uint32_t domid,
+ const libxl_asyncop_how *ao_how);
+
/*
* Convenience macros.
--
1.9.3
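Editor's sketch (not part of the patch): the effect of the soft_reset flag is that the toolstack-side cleanup runs as usual while the final hypervisor destroy call is skipped. The structure and function names below are hypothetical reductions of the libxl ones, with the destroy call replaced by a counter.

```c
#include <assert.h>

/* Hypothetical reduction of libxl__domain_destroy_state. */
struct toy_destroy_state {
    unsigned int domid;
    int soft_reset;               /* non-zero: keep the domain in the hypervisor */
};

static int toy_hypervisor_destroy_calls;

/* Hypothetical stand-in for xc_domain_destroy(); always succeeds here. */
static int toy_xc_domain_destroy(unsigned int domid)
{
    (void)domid;
    toy_hypervisor_destroy_calls++;
    return 0;
}

/* Final step of the destroy path, mirroring the patched devices_destroy_cb(). */
static int toy_finish_destroy(const struct toy_destroy_state *dds)
{
    /* Toolstack-side cleanup would already have run in both cases. */
    if ( !dds->soft_reset )
    {
        if ( toy_xc_domain_destroy(dds->domid) < 0 )
            return -1;
    }

    return 0;
}
```

A state with soft_reset set completes successfully without touching the counter, while an ordinary destroy increments it once.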
* Re: [PATCH v4 6/9] libxl: add libxl__domain_soft_reset_destroy_old()
2014-12-03 17:16 ` [PATCH v4 6/9] libxl: add libxl__domain_soft_reset_destroy_old() Vitaly Kuznetsov
@ 2014-12-04 16:20 ` Wei Liu
0 siblings, 0 replies; 28+ messages in thread
From: Wei Liu @ 2014-12-04 16:20 UTC (permalink / raw)
To: Vitaly Kuznetsov
Cc: Wei Liu, Andrew Jones, Keir Fraser, Ian Campbell,
Stefano Stabellini, Andrew Cooper, Ian Jackson, Tim Deegan,
David Vrabel, Jan Beulich, xen-devel
On Wed, Dec 03, 2014 at 06:16:18PM +0100, Vitaly Kuznetsov wrote:
> New libxl__domain_soft_reset_destroy_old() is an internal-only
> version of libxl_domain_destroy() which follows the same domain
> destroy path with the only difference: xc_domain_destroy() is
> being avoided so the domain is not actually being destroyed.
>
> Add soft_reset flag to libxl__domain_destroy_state structure
> to support the change.
>
> The original libxl_domain_destroy() function could be easily
> modified to support new flag but I'm trying to avoid that as
> it is part of public API.
>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
> tools/libxl/libxl.c | 32 +++++++++++++++++++++++++++-----
> tools/libxl/libxl_internal.h | 4 ++++
> 2 files changed, 31 insertions(+), 5 deletions(-)
>
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index f84f7c2..c2bd730 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -1437,6 +1437,23 @@ int libxl_domain_destroy(libxl_ctx *ctx, uint32_t domid,
> return AO_INPROGRESS;
> }
>
> +int libxl__domain_soft_reset_destroy_old(libxl_ctx *ctx, uint32_t domid,
> + const libxl_asyncop_how *ao_how)
> +{
Internal function takes gc, not ctx.
> + AO_CREATE(ctx, domid, ao_how);
If you want to use libxl context, use CTX macro.
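The two remarks above can be illustrated with a stand-alone sketch. It assumes that the gc records the context that owns it and that CTX is shorthand for looking that owner up from the gc in scope. Every type and name below (sketch_ctx, sketch_gc, SKETCH_CTX, sketch_internal_op) is a hypothetical stand-in, not libxl's real definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for libxl_ctx and libxl__gc. */
struct sketch_ctx {
    int xch;                    /* stands in for the libxc handle */
};

struct sketch_gc {
    struct sketch_ctx *owner;   /* context this gc belongs to */
};

/* Mirrors the idea behind CTX: derive the context from the gc in scope. */
#define SKETCH_CTX (gc->owner)

/* Internal-style helper: it takes a gc, never a ctx. */
static int sketch_internal_op(struct sketch_gc *gc, unsigned int domid)
{
    assert(gc != NULL && gc->owner != NULL);
    /* The context is reached through the macro, not through a parameter. */
    return SKETCH_CTX->xch + (int)domid;
}
```

A caller that only holds a gc can thus reach the context without it being passed separately, which is presumably the point of both comments.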
> + libxl__domain_destroy_state *dds;
> +
> + GCNEW(dds);
> + dds->ao = ao;
> + dds->domid = domid;
> + dds->callback = domain_destroy_cb;
> + dds->soft_reset = 1;
> + libxl__domain_destroy(egc, dds);
> +
> + return AO_INPROGRESS;
> +}
> +
> +
> static void domain_destroy_cb(libxl__egc *egc, libxl__domain_destroy_state *dds,
> int rc)
> {
> @@ -1612,6 +1629,7 @@ static void devices_destroy_cb(libxl__egc *egc,
> {
> STATE_AO_GC(drs->ao);
> libxl__destroy_domid_state *dis = CONTAINER_OF(drs, *dis, drs);
> + libxl__domain_destroy_state *dds = CONTAINER_OF(dis, *dds, domain);
> libxl_ctx *ctx = CTX;
> uint32_t domid = dis->domid;
> char *dom_path;
> @@ -1650,11 +1668,15 @@ static void devices_destroy_cb(libxl__egc *egc,
> }
> libxl__userdata_destroyall(gc, domid);
>
> - rc = xc_domain_destroy(ctx->xch, domid);
> - if (rc < 0) {
> - LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc, "xc_domain_destroy failed for %d", domid);
> - rc = ERROR_FAIL;
> - goto out;
> + if (!dds->soft_reset)
> + {
Coding style.
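The remark presumably refers to the opening brace placed on its own line after the `if`; the libxl coding style keeps it on the same line as the condition. A minimal stand-alone sketch of the same control flow with that layout, using hypothetical stand-ins (sketch_destroy_fn, SKETCH_ERROR_FAIL, sketch_finish_destroy) instead of the real libxl and libxc calls:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for libxl's ERROR_FAIL. */
#define SKETCH_ERROR_FAIL (-3)

/* Hypothetical stand-in for xc_domain_destroy(): negative on failure. */
typedef int (*sketch_destroy_fn)(unsigned int domid);

static int sketch_destroy_ok(unsigned int domid)   { (void)domid; return 0; }
static int sketch_destroy_fail(unsigned int domid) { (void)domid; return -1; }

static int sketch_finish_destroy(int soft_reset, unsigned int domid,
                                 sketch_destroy_fn destroy)
{
    int rc;

    assert(destroy != NULL);

    /* Opening brace on the same line as the condition. */
    if (!soft_reset) {
        rc = destroy(domid);
        if (rc < 0) {
            rc = SKETCH_ERROR_FAIL;
            goto out;
        }
    }

    rc = 0;
out:
    return rc;
}
```

With soft_reset set, the destroy callback is skipped entirely, so a failing callback does not affect the result.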
Wei.
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH v4 7/9] libxc: introduce soft reset for HVM domains
2014-12-03 17:16 [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec Vitaly Kuznetsov
` (5 preceding siblings ...)
2014-12-03 17:16 ` [PATCH v4 6/9] libxl: add libxl__domain_soft_reset_destroy_old() Vitaly Kuznetsov
@ 2014-12-03 17:16 ` Vitaly Kuznetsov
2014-12-03 17:16 ` [PATCH v4 8/9] libxl: soft reset support Vitaly Kuznetsov
` (4 subsequent siblings)
11 siblings, 0 replies; 28+ messages in thread
From: Vitaly Kuznetsov @ 2014-12-03 17:16 UTC (permalink / raw)
To: xen-devel
Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
Andrew Cooper, Ian Jackson, Tim Deegan, David Vrabel, Jan Beulich,
Wei Liu
Add a new xc_domain_soft_reset() function which performs a so-called 'soft
reset' of an HVM domain. It is performed as follows:
- Save the HVM context and all HVM params;
- Devour the original domain with XEN_DOMCTL_devour;
- Wait until the original domain dies or has no pages left;
- Restore the HVM context and HVM params, and seed the grant table.
After that the domain resumes execution from where SHUTDOWN_soft_reset was
called.
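The waiting step in the list above could also be written as a bounded poll that tells an error apart from "no pages left". A minimal sketch under that assumption; the callback type, the result codes and the fake page sources are all hypothetical and are not part of libxc:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical callback: returns the number of pages still owned by the
 * source domain, or a negative value on error.
 */
typedef long (*sketch_page_count_fn)(void *opaque);

enum sketch_drain_result {
    SKETCH_DRAIN_DONE    =  0,  /* no pages left */
    SKETCH_DRAIN_ERROR   = -1,  /* the page count could not be read */
    SKETCH_DRAIN_TIMEOUT = -2   /* pages remained after max_polls checks */
};

static enum sketch_drain_result
sketch_wait_for_drain(sketch_page_count_fn count, void *opaque,
                      unsigned int max_polls)
{
    unsigned int i;

    assert(count != NULL);

    for (i = 0; i < max_polls; i++) {
        long left = count(opaque);

        if (left < 0)
            return SKETCH_DRAIN_ERROR;
        if (left == 0)
            return SKETCH_DRAIN_DONE;
        /* A real caller would sleep here before polling again. */
    }
    return SKETCH_DRAIN_TIMEOUT;
}

/* Fake page source for illustration: loses one page per poll. */
static long sketch_fake_count(void *opaque)
{
    long *pages = opaque;

    return (*pages > 0) ? (*pages)-- : 0;
}

/* Fake page source that always reports an error. */
static long sketch_failing_count(void *opaque)
{
    (void)opaque;
    return -1;
}
```

Keeping the error case separate from the zero case means a failed query is not mistaken for a completed transfer, and the poll limit keeps the caller from waiting forever.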
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
tools/libxc/Makefile | 1 +
tools/libxc/include/xenguest.h | 20 +++
tools/libxc/xc_domain_soft_reset.c | 282 +++++++++++++++++++++++++++++++++++++
3 files changed, 303 insertions(+)
create mode 100644 tools/libxc/xc_domain_soft_reset.c
diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index bd2ca6c..8f8abd6 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -52,6 +52,7 @@ GUEST_SRCS-y += xc_offline_page.c xc_compression.c
else
GUEST_SRCS-y += xc_nomigrate.c
endif
+GUEST_SRCS-y += xc_domain_soft_reset.c
vpath %.c ../../xen/common/libelf
CFLAGS += -I../../xen/common/libelf
diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 40bbac8..770cd10 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -131,6 +131,26 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
* of the new domain is automatically appended to the filename,
* separated by a ".".
*/
+
+/**
+ * This function performs a soft reset of a domain. During soft reset all
+ * of the source domain's memory is reassigned to the destination domain,
+ * and the HVM context and HVM params are copied.
+ *
+ * @parm xch a handle to an open hypervisor interface
+ * @parm source_dom the id of the source domain
+ * @parm dest_dom the id of the destination domain
+ * @parm console_domid the id of the domain handling console
+ * @parm console_mfn returned with the mfn of the console page
+ * @parm store_domid the id of the domain handling store
+ * @parm store_mfn returned with the mfn of the store page
+ * @return 0 on success, 1 on failure
+ */
+int xc_domain_soft_reset(xc_interface *xch, uint32_t source_dom,
+ uint32_t dest_dom, domid_t console_domid,
+ unsigned long *console_mfn, domid_t store_domid,
+ unsigned long *store_mfn);
+
#define XC_DEVICE_MODEL_RESTORE_FILE "/var/lib/xen/qemu-resume"
/**
diff --git a/tools/libxc/xc_domain_soft_reset.c b/tools/libxc/xc_domain_soft_reset.c
new file mode 100644
index 0000000..24d0b48
--- /dev/null
+++ b/tools/libxc/xc_domain_soft_reset.c
@@ -0,0 +1,282 @@
+/******************************************************************************
+ * xc_domain_soft_reset.c
+ *
+ * Do soft reset.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation;
+ * version 2.1 of the License.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <inttypes.h>
+#include <time.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <sys/time.h>
+
+#include "xc_private.h"
+#include "xc_core.h"
+#include "xc_bitops.h"
+#include "xc_dom.h"
+#include "xg_private.h"
+#include "xg_save_restore.h"
+
+#include <xen/hvm/params.h>
+
+#define SLEEP_INT 1
+
+int xc_domain_soft_reset(xc_interface *xch, uint32_t source_dom,
+ uint32_t dest_dom, domid_t console_domid,
+ unsigned long *console_mfn, domid_t store_domid,
+ unsigned long *store_mfn)
+{
+ xc_dominfo_t old_info, new_info;
+ int rc = 1;
+
+ uint32_t hvm_buf_size = 0;
+ uint8_t *hvm_buf = NULL;
+ unsigned long console_pfn, store_pfn, io_pfn, buffio_pfn;
+ unsigned long max_gpfn;
+ uint64_t hvm_params[HVM_NR_PARAMS];
+ xen_pfn_t sharedinfo_pfn;
+
+ DPRINTF("%s: soft reset domid %u -> %u", __func__, source_dom, dest_dom);
+
+ if ( xc_domain_getinfo(xch, source_dom, 1, &old_info) != 1 )
+ {
+ PERROR("Could not get old domain info");
+ return 1;
+ }
+
+ if ( xc_domain_getinfo(xch, dest_dom, 1, &new_info) != 1 )
+ {
+ PERROR("Could not get new domain info");
+ return 1;
+ }
+
+ if ( !old_info.hvm || !new_info.hvm )
+ {
+ PERROR("Soft reset is supported for HVM only");
+ return 1;
+ }
+
+ max_gpfn = xc_domain_maximum_gpfn(xch, source_dom);
+
+ sharedinfo_pfn = old_info.shared_info_frame;
+ if ( xc_get_pfn_type_batch(xch, source_dom, 1, &sharedinfo_pfn) )
+ {
+ PERROR("xc_get_pfn_type_batch failed");
+ goto out;
+ }
+
+ hvm_buf_size = xc_domain_hvm_getcontext(xch, source_dom, 0, 0);
+ if ( hvm_buf_size == -1 )
+ {
+ PERROR("Couldn't get HVM context size from Xen");
+ goto out;
+ }
+
+ hvm_buf = malloc(hvm_buf_size);
+ if ( !hvm_buf )
+ {
+ ERROR("Couldn't allocate memory");
+ goto out;
+ }
+
+ if ( xc_domain_hvm_getcontext(xch, source_dom, hvm_buf,
+ hvm_buf_size) == -1 )
+ {
+ PERROR("HVM:Could not get hvm buffer");
+ goto out;
+ }
+
+ xc_hvm_param_get(xch, source_dom, HVM_PARAM_STORE_PFN,
+ &hvm_params[HVM_PARAM_STORE_PFN]);
+ store_pfn = hvm_params[HVM_PARAM_STORE_PFN];
+ *store_mfn = store_pfn;
+
+ xc_hvm_param_get(xch, source_dom,
+ HVM_PARAM_CONSOLE_PFN,
+ &hvm_params[HVM_PARAM_CONSOLE_PFN]);
+ console_pfn = hvm_params[HVM_PARAM_CONSOLE_PFN];
+ *console_mfn = console_pfn;
+
+ xc_hvm_param_get(xch, source_dom, HVM_PARAM_BUFIOREQ_PFN,
+ &hvm_params[HVM_PARAM_BUFIOREQ_PFN]);
+ buffio_pfn = hvm_params[HVM_PARAM_BUFIOREQ_PFN];
+
+ xc_hvm_param_get(xch, source_dom, HVM_PARAM_IOREQ_PFN,
+ &hvm_params[HVM_PARAM_IOREQ_PFN]);
+ io_pfn = hvm_params[HVM_PARAM_IOREQ_PFN];
+
+ xc_hvm_param_get(xch, source_dom, HVM_PARAM_IDENT_PT,
+ &hvm_params[HVM_PARAM_IDENT_PT]);
+
+ xc_hvm_param_get(xch, source_dom, HVM_PARAM_PAGING_RING_PFN,
+ &hvm_params[HVM_PARAM_PAGING_RING_PFN]);
+
+ xc_hvm_param_get(xch, source_dom, HVM_PARAM_ACCESS_RING_PFN,
+ &hvm_params[HVM_PARAM_ACCESS_RING_PFN]);
+
+ xc_hvm_param_get(xch, source_dom, HVM_PARAM_VM86_TSS,
+ &hvm_params[HVM_PARAM_VM86_TSS]);
+
+ xc_hvm_param_get(xch, source_dom, HVM_PARAM_ACPI_IOPORTS_LOCATION,
+ &hvm_params[HVM_PARAM_ACPI_IOPORTS_LOCATION]);
+
+ xc_hvm_param_get(xch, source_dom, HVM_PARAM_VIRIDIAN,
+ &hvm_params[HVM_PARAM_VIRIDIAN]);
+
+ xc_hvm_param_get(xch, source_dom, HVM_PARAM_PAE_ENABLED,
+ &hvm_params[HVM_PARAM_PAE_ENABLED]);
+
+ xc_hvm_param_get(xch, source_dom, HVM_PARAM_STORE_EVTCHN,
+ &hvm_params[HVM_PARAM_STORE_EVTCHN]);
+
+ xc_hvm_param_get(xch, source_dom, HVM_PARAM_IOREQ_SERVER_PFN,
+ &hvm_params[HVM_PARAM_IOREQ_SERVER_PFN]);
+
+ xc_hvm_param_get(xch, source_dom, HVM_PARAM_NR_IOREQ_SERVER_PAGES,
+ &hvm_params[HVM_PARAM_NR_IOREQ_SERVER_PAGES]);
+
+ xc_hvm_param_get(xch, source_dom, HVM_PARAM_VM_GENERATION_ID_ADDR,
+ &hvm_params[HVM_PARAM_VM_GENERATION_ID_ADDR]);
+
+ rc = xc_domain_devour(xch, source_dom, dest_dom);
+ if ( rc != 0 )
+ {
+ PERROR("failed to devour original domain, rc=%d", rc);
+ goto out;
+ }
+
+ while ( 1 )
+ {
+ sleep(SLEEP_INT);
+ if ( xc_get_tot_pages(xch, source_dom) <= 0 )
+ {
+ DPRINTF("All pages were transferred");
+ break;
+ }
+ }
+
+
+ if ( sharedinfo_pfn == XEN_DOMCTL_PFINFO_XTAB)
+ {
+ /*
+ * The shared info frame is removed when the guest maps shared info, so
+ * this page is likely XEN_DOMCTL_PFINFO_XTAB; in that case we need to
+ * replace it with an empty page.
+ */
+
+ if ( xc_domain_populate_physmap_exact(xch, dest_dom, 1, 0, 0,
+ &old_info.shared_info_frame) )
+ {
+ PERROR("failed to populate pfn %lx (shared info)", old_info.shared_info_frame);
+ goto out;
+ }
+ }
+
+ if ( xc_domain_hvm_setcontext(xch, dest_dom, hvm_buf,
+ hvm_buf_size) == -1 )
+ {
+ PERROR("HVM:Could not set hvm buffer");
+ goto out;
+ }
+
+ if ( store_pfn )
+ xc_clear_domain_page(xch, dest_dom, store_pfn);
+
+ if ( console_pfn )
+ xc_clear_domain_page(xch, dest_dom, console_pfn);
+
+ if ( buffio_pfn )
+ xc_clear_domain_page(xch, dest_dom, buffio_pfn);
+
+ if ( io_pfn )
+ xc_clear_domain_page(xch, dest_dom, io_pfn);
+
+ xc_hvm_param_set(xch, dest_dom, HVM_PARAM_STORE_PFN,
+ hvm_params[HVM_PARAM_STORE_PFN]);
+
+ xc_hvm_param_set(xch, dest_dom,
+ HVM_PARAM_CONSOLE_PFN,
+ hvm_params[HVM_PARAM_CONSOLE_PFN]);
+
+ xc_hvm_param_set(xch, dest_dom, HVM_PARAM_BUFIOREQ_PFN,
+ hvm_params[HVM_PARAM_BUFIOREQ_PFN]);
+
+ xc_hvm_param_set(xch, dest_dom, HVM_PARAM_IOREQ_PFN,
+ hvm_params[HVM_PARAM_IOREQ_PFN]);
+
+ xc_hvm_param_set(xch, dest_dom, HVM_PARAM_IDENT_PT,
+ hvm_params[HVM_PARAM_IDENT_PT]);
+
+ xc_hvm_param_set(xch, dest_dom, HVM_PARAM_PAGING_RING_PFN,
+ hvm_params[HVM_PARAM_PAGING_RING_PFN]);
+
+ xc_hvm_param_set(xch, dest_dom, HVM_PARAM_ACCESS_RING_PFN,
+ hvm_params[HVM_PARAM_ACCESS_RING_PFN]);
+
+ xc_hvm_param_set(xch, dest_dom, HVM_PARAM_VM86_TSS,
+ hvm_params[HVM_PARAM_VM86_TSS]);
+
+ xc_hvm_param_set(xch, dest_dom, HVM_PARAM_ACPI_IOPORTS_LOCATION,
+ hvm_params[HVM_PARAM_ACPI_IOPORTS_LOCATION]);
+
+ xc_hvm_param_set(xch, dest_dom, HVM_PARAM_VIRIDIAN,
+ hvm_params[HVM_PARAM_VIRIDIAN]);
+
+ xc_hvm_param_set(xch, dest_dom, HVM_PARAM_PAE_ENABLED,
+ hvm_params[HVM_PARAM_PAE_ENABLED]);
+
+ xc_hvm_param_set(xch, dest_dom, HVM_PARAM_STORE_EVTCHN,
+ hvm_params[HVM_PARAM_STORE_EVTCHN]);
+
+ xc_hvm_param_set(xch, dest_dom, HVM_PARAM_IOREQ_SERVER_PFN,
+ hvm_params[HVM_PARAM_IOREQ_SERVER_PFN]);
+
+ xc_hvm_param_set(xch, dest_dom, HVM_PARAM_NR_IOREQ_SERVER_PAGES,
+ hvm_params[HVM_PARAM_NR_IOREQ_SERVER_PAGES]);
+
+ xc_hvm_param_set(xch, dest_dom, HVM_PARAM_VM_GENERATION_ID_ADDR,
+ hvm_params[HVM_PARAM_VM_GENERATION_ID_ADDR]);
+
+ if (xc_dom_gnttab_hvm_seed(xch, dest_dom, console_pfn, store_pfn,
+ console_domid, store_domid))
+ {
+ PERROR("error seeding hvm grant table");
+ goto out;
+ }
+
+ rc = 0;
+out:
+ if (hvm_buf) free(hvm_buf);
+
+ if ( (rc != 0) && (dest_dom != 0) ) {
+ PERROR("Failed to perform soft reset, destroying domain %d",
+ dest_dom);
+ xc_domain_destroy(xch, dest_dom);
+ }
+
+ return !!rc;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
--
1.9.3
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v4 8/9] libxl: soft reset support
2014-12-03 17:16 [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec Vitaly Kuznetsov
` (6 preceding siblings ...)
2014-12-03 17:16 ` [PATCH v4 7/9] libxc: introduce soft reset for HVM domains Vitaly Kuznetsov
@ 2014-12-03 17:16 ` Vitaly Kuznetsov
2014-12-04 16:01 ` Wei Liu
2014-12-03 17:16 ` [PATCH v4 9/9] xsm: add XEN_DOMCTL_devour support Vitaly Kuznetsov
` (3 subsequent siblings)
11 siblings, 1 reply; 28+ messages in thread
From: Vitaly Kuznetsov @ 2014-12-03 17:16 UTC (permalink / raw)
To: xen-devel
Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
Andrew Cooper, Ian Jackson, Tim Deegan, David Vrabel, Jan Beulich,
Wei Liu
Perform a soft reset when a domain shuts down with SHUTDOWN_soft_reset.
Migrate the content with xc_domain_soft_reset(), then reload the device model
and the toolstack state.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
tools/libxl/libxl.h | 6 +++
tools/libxl/libxl_create.c | 103 +++++++++++++++++++++++++++++++++++++++----
tools/libxl/libxl_internal.h | 4 ++
tools/libxl/xl_cmdimpl.c | 22 ++++++++-
4 files changed, 124 insertions(+), 11 deletions(-)
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 41d6e8d..c802635 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -919,6 +919,12 @@ int static inline libxl_domain_create_restore_0x040200(
#endif
+int libxl_domain_soft_reset(libxl_ctx *ctx, libxl_domain_config *d_config,
+ uint32_t *domid, uint32_t domid_old,
+ const libxl_asyncop_how *ao_how,
+ const libxl_asyncprogress_how *aop_console_how)
+ LIBXL_EXTERNAL_CALLERS_ONLY;
+
/* A progress report will be made via ao_console_how, of type
* domain_create_console_available, when the domain's primary
* console is available and can be connected to.
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 1198225..b1e809b 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -25,6 +25,8 @@
#include <xen/hvm/hvm_info_table.h>
#include <xen/hvm/e820.h>
+#define INVALID_DOMID ~0
+
int libxl__domain_create_info_setdefault(libxl__gc *gc,
libxl_domain_create_info *c_info)
{
@@ -903,6 +905,9 @@ static void initiate_domain_create(libxl__egc *egc,
if (restore_fd >= 0) {
LOG(DEBUG, "restoring, not running bootloader");
domcreate_bootloader_done(egc, &dcs->bl, 0);
+ } else if (dcs->domid_soft_reset != INVALID_DOMID) {
+ LOG(DEBUG, "soft reset, not running bootloader");
+ domcreate_bootloader_done(egc, &dcs->bl, 0);
} else {
LOG(DEBUG, "running bootloader");
dcs->bl.callback = domcreate_bootloader_done;
@@ -951,6 +956,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
libxl_domain_config *const d_config = dcs->guest_config;
libxl_domain_build_info *const info = &d_config->b_info;
const int restore_fd = dcs->restore_fd;
+ const uint32_t domid_soft_reset = dcs->domid_soft_reset;
libxl__domain_build_state *const state = &dcs->build_state;
libxl__srm_restore_autogen_callbacks *const callbacks =
&dcs->shs.callbacks.restore.a;
@@ -974,7 +980,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
dcs->dmss.dm.callback = domcreate_devmodel_started;
dcs->dmss.callback = domcreate_devmodel_started;
- if ( restore_fd < 0 ) {
+ if ( (restore_fd < 0) && (domid_soft_reset == INVALID_DOMID) ) {
rc = libxl__domain_build(gc, d_config, domid, state);
domcreate_rebuild_done(egc, dcs, rc);
return;
@@ -1004,14 +1010,74 @@ static void domcreate_bootloader_done(libxl__egc *egc,
rc = ERROR_INVAL;
goto out;
}
- libxl__xc_domain_restore(egc, dcs,
- hvm, pae, superpages);
+ if ( restore_fd >= 0 ) {
+ libxl__xc_domain_restore(egc, dcs,
+ hvm, pae, superpages);
+ } else {
+ libxl__xc_domain_soft_reset(egc, dcs);
+ }
+
return;
out:
libxl__xc_domain_restore_done(egc, dcs, rc, 0, 0);
}
+void libxl__xc_domain_soft_reset(libxl__egc *egc,
+ libxl__domain_create_state *dcs)
+{
+ STATE_AO_GC(dcs->ao);
+ libxl_ctx *ctx = libxl__gc_owner(gc);
+ const uint32_t domid_soft_reset = dcs->domid_soft_reset;
+ const uint32_t domid = dcs->guest_domid;
+ libxl_domain_config *const d_config = dcs->guest_config;
+ libxl_domain_build_info *const info = &d_config->b_info;
+ uint8_t *buf;
+ uint32_t len;
+ uint32_t console_domid, store_domid;
+ unsigned long store_mfn, console_mfn;
+ int rc;
+ struct libxl__domain_suspend_state *dss;
+
+ GCNEW(dss);
+
+ dss->ao = ao;
+ dss->domid = domid_soft_reset;
+ dss->dm_savefile = GCSPRINTF("/var/lib/xen/qemu-save.%d",
+ domid_soft_reset);
+
+ if (info->type == LIBXL_DOMAIN_TYPE_HVM) {
+ rc = libxl__domain_suspend_device_model(gc, dss);
+ if (rc) goto out;
+ }
+
+ console_domid = dcs->build_state.console_domid;
+ store_domid = dcs->build_state.store_domid;
+
+ libxl__domain_soft_reset_destroy_old(ctx, domid_soft_reset, 0);
+
+ rc = xc_domain_soft_reset(ctx->xch, domid_soft_reset, domid, console_domid,
+ &console_mfn, store_domid, &store_mfn);
+ if (rc) goto out;
+
+ libxl__qmp_cleanup(gc, domid_soft_reset);
+
+ dcs->build_state.store_mfn = store_mfn;
+ dcs->build_state.console_mfn = console_mfn;
+
+ rc = libxl__toolstack_save(domid_soft_reset, &buf, &len, dss);
+ if (rc) goto out;
+
+ rc = libxl__toolstack_restore(domid, buf, len, &dcs->shs);
+ if (rc) goto out;
+out:
+ /*
+ * Now pretend we did normal restore and simply call
+ * libxl__xc_domain_restore_done().
+ */
+ libxl__xc_domain_restore_done(egc, dcs, rc, 0, 0);
+}
+
void libxl__srm_callout_callback_restore_results(unsigned long store_mfn,
unsigned long console_mfn, void *user)
{
@@ -1037,6 +1103,7 @@ void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
/* convenience aliases */
const uint32_t domid = dcs->guest_domid;
+ const uint32_t domid_soft_reset = dcs->domid_soft_reset;
libxl_domain_config *const d_config = dcs->guest_config;
libxl_domain_build_info *const info = &d_config->b_info;
libxl__domain_build_state *const state = &dcs->build_state;
@@ -1089,9 +1156,12 @@ void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
if (ret)
goto out;
- if (info->type == LIBXL_DOMAIN_TYPE_HVM) {
+ if (info->type == LIBXL_DOMAIN_TYPE_HVM && fd != -1) {
state->saved_state = GCSPRINTF(
XC_DEVICE_MODEL_RESTORE_FILE".%d", domid);
+ } else if (domid_soft_reset != INVALID_DOMID) {
+ state->saved_state = GCSPRINTF(
+ "/var/lib/xen/qemu-save.%d", domid_soft_reset);
}
out:
@@ -1100,9 +1170,12 @@ out:
libxl__file_reference_unmap(&state->pv_ramdisk);
}
- esave = errno;
- libxl_fd_set_nonblock(ctx, fd, 0);
- errno = esave;
+ if ( fd != -1 ) {
+ esave = errno;
+ libxl_fd_set_nonblock(ctx, fd, 0);
+ errno = esave;
+ }
+
domcreate_rebuild_done(egc, dcs, ret);
}
@@ -1495,6 +1568,7 @@ static void domain_create_cb(libxl__egc *egc,
static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
uint32_t *domid,
int restore_fd, int checkpointed_stream,
+ uint32_t domid_old,
const libxl_asyncop_how *ao_how,
const libxl_asyncprogress_how *aop_console_how)
{
@@ -1507,6 +1581,7 @@ static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
libxl_domain_config_init(&cdcs->dcs.guest_config_saved);
libxl_domain_config_copy(ctx, &cdcs->dcs.guest_config_saved, d_config);
cdcs->dcs.restore_fd = restore_fd;
+ cdcs->dcs.domid_soft_reset = domid_old;
cdcs->dcs.callback = domain_create_cb;
cdcs->dcs.checkpointed_stream = checkpointed_stream;
libxl__ao_progress_gethow(&cdcs->dcs.aop_console_how, aop_console_how);
@@ -1535,7 +1610,7 @@ int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config,
const libxl_asyncop_how *ao_how,
const libxl_asyncprogress_how *aop_console_how)
{
- return do_domain_create(ctx, d_config, domid, -1, 0,
+ return do_domain_create(ctx, d_config, domid, -1, 0, INVALID_DOMID,
ao_how, aop_console_how);
}
@@ -1546,7 +1621,17 @@ int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config,
const libxl_asyncprogress_how *aop_console_how)
{
return do_domain_create(ctx, d_config, domid, restore_fd,
- params->checkpointed_stream, ao_how, aop_console_how);
+ params->checkpointed_stream, INVALID_DOMID,
+ ao_how, aop_console_how);
+}
+
+int libxl_domain_soft_reset(libxl_ctx *ctx, libxl_domain_config *d_config,
+ uint32_t *domid, uint32_t domid_old,
+ const libxl_asyncop_how *ao_how,
+ const libxl_asyncprogress_how *aop_console_how)
+{
+ return do_domain_create(ctx, d_config, domid, -1, 0, domid_old,
+ ao_how, aop_console_how);
}
/*
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index f29ed83..e40bba1 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3069,6 +3069,7 @@ struct libxl__domain_create_state {
libxl_domain_config *guest_config;
libxl_domain_config guest_config_saved; /* vanilla config */
int restore_fd;
+ uint32_t domid_soft_reset;
libxl__domain_create_cb *callback;
libxl_asyncprogress_how aop_console_how;
/* private to domain_create */
@@ -3123,6 +3124,9 @@ _hidden void libxl__xc_domain_restore(libxl__egc *egc,
* If rc!=0, retval and errnoval are undefined. */
_hidden void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
int rc, int retval, int errnoval);
+/* calls libxl__xc_domain_restore_done when done */
+_hidden void libxl__xc_domain_soft_reset(libxl__egc *egc,
+ libxl__domain_create_state *dcs);
/* Each time the dm needs to be saved, we must call suspend and then save */
_hidden int libxl__domain_suspend_device_model(libxl__gc *gc,
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 53611dc..eb833f0 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -2043,7 +2043,8 @@ static void reload_domain_config(uint32_t domid,
}
/* Returns 1 if domain should be restarted,
- * 2 if domain should be renamed then restarted, or 0
+ * 2 if domain should be renamed then restarted,
+ * 3 if domain performed soft reset, or 0
* Can update r_domid if domain is destroyed etc */
static int handle_domain_death(uint32_t *r_domid,
libxl_event *event,
@@ -2069,6 +2070,9 @@ static int handle_domain_death(uint32_t *r_domid,
case LIBXL_SHUTDOWN_REASON_WATCHDOG:
action = d_config->on_watchdog;
break;
+ case LIBXL_SHUTDOWN_REASON_SOFT_RESET:
+ LOG("Domain performed soft reset.");
+ return 3;
default:
LOG("Unknown shutdown reason code %d. Destroying domain.",
event->u.domain_shutdown.shutdown_reason);
@@ -2285,6 +2289,7 @@ static void evdisable_disk_ejects(libxl_evgen_disk_eject **diskws,
static uint32_t create_domain(struct domain_create *dom_info)
{
uint32_t domid = INVALID_DOMID;
+ uint32_t domid_old = INVALID_DOMID;
libxl_domain_config d_config;
@@ -2510,7 +2515,18 @@ start:
* restore/migrate-receive it again.
*/
restoring = 0;
- }else{
+ } else if (domid_old != INVALID_DOMID) {
+ /* Do soft reset */
+ d_config.b_info.nodemap.size = 0;
+ ret = libxl_domain_soft_reset(ctx, &d_config,
+ &domid, domid_old,
+ 0, 0);
+
+ if ( ret ) {
+ goto error_out;
+ }
+ domid_old = INVALID_DOMID;
+ } else {
ret = libxl_domain_create_new(ctx, &d_config, &domid,
0, autoconnect_console_how);
}
@@ -2574,6 +2590,8 @@ start:
event->u.domain_shutdown.shutdown_reason,
event->u.domain_shutdown.shutdown_reason);
switch (handle_domain_death(&domid, event, &d_config)) {
+ case 3:
+ domid_old = domid;
case 2:
if (!preserve_domain(&domid, event, &d_config)) {
/* If we fail then exit leaving the old domain in place. */
--
1.9.3
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH v4 8/9] libxl: soft reset support
2014-12-03 17:16 ` [PATCH v4 8/9] libxl: soft reset support Vitaly Kuznetsov
@ 2014-12-04 16:01 ` Wei Liu
2014-12-05 10:45 ` Vitaly Kuznetsov
0 siblings, 1 reply; 28+ messages in thread
From: Wei Liu @ 2014-12-04 16:01 UTC (permalink / raw)
To: Vitaly Kuznetsov
Cc: Wei Liu, Andrew Jones, Keir Fraser, Ian Campbell,
Stefano Stabellini, Andrew Cooper, Ian Jackson, Tim Deegan,
David Vrabel, Jan Beulich, xen-devel
(I've skipped the internal implementation since I don't know what's
required to fulfil soft reset.)
On Wed, Dec 03, 2014 at 06:16:20PM +0100, Vitaly Kuznetsov wrote:
[...]
> + libxl__domain_create_state *dcs);
>
> /* Each time the dm needs to be saved, we must call suspend and then save */
> _hidden int libxl__domain_suspend_device_model(libxl__gc *gc,
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index 53611dc..eb833f0 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -2043,7 +2043,8 @@ static void reload_domain_config(uint32_t domid,
> }
>
> /* Returns 1 if domain should be restarted,
> - * 2 if domain should be renamed then restarted, or 0
> + * 2 if domain should be renamed then restarted,
> + * 3 if domain performed soft reset, or 0
> * Can update r_domid if domain is destroyed etc */
> static int handle_domain_death(uint32_t *r_domid,
> libxl_event *event,
> @@ -2069,6 +2070,9 @@ static int handle_domain_death(uint32_t *r_domid,
> case LIBXL_SHUTDOWN_REASON_WATCHDOG:
> action = d_config->on_watchdog;
> break;
> + case LIBXL_SHUTDOWN_REASON_SOFT_RESET:
> + LOG("Domain performed soft reset.");
> + return 3;
Would it be useful to provide an "on_soft_reset" option in xl? Will the
admin be interested in performing some other action when a domain does a
soft reset? Say, for security reasons the admin may want to prohibit a
domain from soft resetting itself.
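One way such an option could feed into the return codes of handle_domain_death() is sketched below. The policy enum and all names are hypothetical; nothing here is an existing xl option or libxl type, and only the numeric return values follow the comment in the patch.

```c
#include <assert.h>

/* Hypothetical per-domain policy; not an existing xl option. */
enum sketch_on_soft_reset {
    SKETCH_SOFT_RESET_ALLOW,    /* let the guest soft reset itself */
    SKETCH_SOFT_RESET_DESTROY   /* treat the request as a plain shutdown */
};

/* Return codes in the style documented for handle_domain_death(). */
enum sketch_death_result {
    SKETCH_DEATH_NONE           = 0,
    SKETCH_DEATH_RESTART        = 1,
    SKETCH_DEATH_RENAME_RESTART = 2,
    SKETCH_DEATH_SOFT_RESET     = 3
};

static enum sketch_death_result
sketch_handle_soft_reset(enum sketch_on_soft_reset policy)
{
    assert(policy == SKETCH_SOFT_RESET_ALLOW ||
           policy == SKETCH_SOFT_RESET_DESTROY);

    switch (policy) {
    case SKETCH_SOFT_RESET_ALLOW:
        return SKETCH_DEATH_SOFT_RESET;
    case SKETCH_SOFT_RESET_DESTROY:
        break;
    }
    /* Soft reset is prohibited: report that no restart should happen. */
    return SKETCH_DEATH_NONE;
}
```

With a policy of this kind, an admin who wants to prohibit soft reset would get the same outcome as for a domain that is not restarted.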
> default:
> LOG("Unknown shutdown reason code %d. Destroying domain.",
> event->u.domain_shutdown.shutdown_reason);
> @@ -2285,6 +2289,7 @@ static void evdisable_disk_ejects(libxl_evgen_disk_eject **diskws,
> static uint32_t create_domain(struct domain_create *dom_info)
> {
> uint32_t domid = INVALID_DOMID;
> + uint32_t domid_old = INVALID_DOMID;
>
> libxl_domain_config d_config;
>
> @@ -2510,7 +2515,18 @@ start:
> * restore/migrate-receive it again.
> */
> restoring = 0;
> - }else{
> + } else if (domid_old != INVALID_DOMID) {
> + /* Do soft reset */
> + d_config.b_info.nodemap.size = 0;
What's the reason for doing this?
If you encounter problem with this it should probably be fixed in libxl.
Wei.
> + ret = libxl_domain_soft_reset(ctx, &d_config,
> + &domid, domid_old,
> + 0, 0);
> +
> + if ( ret ) {
> + goto error_out;
> + }
> + domid_old = INVALID_DOMID;
> + } else {
> ret = libxl_domain_create_new(ctx, &d_config, &domid,
> 0, autoconnect_console_how);
> }
> @@ -2574,6 +2590,8 @@ start:
> event->u.domain_shutdown.shutdown_reason,
> event->u.domain_shutdown.shutdown_reason);
> switch (handle_domain_death(&domid, event, &d_config)) {
> + case 3:
> + domid_old = domid;
> case 2:
> if (!preserve_domain(&domid, event, &d_config)) {
> /* If we fail then exit leaving the old domain in place. */
> --
> 1.9.3
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 8/9] libxl: soft reset support
2014-12-04 16:01 ` Wei Liu
@ 2014-12-05 10:45 ` Vitaly Kuznetsov
0 siblings, 0 replies; 28+ messages in thread
From: Vitaly Kuznetsov @ 2014-12-05 10:45 UTC (permalink / raw)
To: Wei Liu
Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
Andrew Cooper, Ian Jackson, Tim Deegan, David Vrabel, Jan Beulich,
xen-devel
Wei Liu <wei.liu2@citrix.com> writes:
> (I've skipped the internal implementation since I don't know what's
> required to fulfil soft reset.)
>
> On Wed, Dec 03, 2014 at 06:16:20PM +0100, Vitaly Kuznetsov wrote:
> [...]
>> + libxl__domain_create_state *dcs);
>>
>> /* Each time the dm needs to be saved, we must call suspend and then save */
>> _hidden int libxl__domain_suspend_device_model(libxl__gc *gc,
>> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
>> index 53611dc..eb833f0 100644
>> --- a/tools/libxl/xl_cmdimpl.c
>> +++ b/tools/libxl/xl_cmdimpl.c
>> @@ -2043,7 +2043,8 @@ static void reload_domain_config(uint32_t domid,
>> }
>>
>> /* Returns 1 if domain should be restarted,
>> - * 2 if domain should be renamed then restarted, or 0
>> + * 2 if domain should be renamed then restarted,
>> + * 3 if domain performed soft reset, or 0
>> * Can update r_domid if domain is destroyed etc */
>> static int handle_domain_death(uint32_t *r_domid,
>> libxl_event *event,
>> @@ -2069,6 +2070,9 @@ static int handle_domain_death(uint32_t *r_domid,
>> case LIBXL_SHUTDOWN_REASON_WATCHDOG:
>> action = d_config->on_watchdog;
>> break;
>> + case LIBXL_SHUTDOWN_REASON_SOFT_RESET:
>> + LOG("Domain performed soft reset.");
>> + return 3;
>
> Would it be useful to provide an "on_soft_reset" option in xl? Will the
> admin be interested in performing some other action when a domain does a
> soft reset? Say, for security reasons the admin may want to prohibit a
> domain from soft resetting itself.
>
Makes sense, let's add it.
>> default:
>> LOG("Unknown shutdown reason code %d. Destroying domain.",
>> event->u.domain_shutdown.shutdown_reason);
>> @@ -2285,6 +2289,7 @@ static void evdisable_disk_ejects(libxl_evgen_disk_eject **diskws,
>> static uint32_t create_domain(struct domain_create *dom_info)
>> {
>> uint32_t domid = INVALID_DOMID;
>> + uint32_t domid_old = INVALID_DOMID;
>>
>> libxl_domain_config d_config;
>>
>> @@ -2510,7 +2515,18 @@ start:
>> * restore/migrate-receive it again.
>> */
>> restoring = 0;
>> - }else{
>> + } else if (domid_old != INVALID_DOMID) {
>> + /* Do soft reset */
>> + d_config.b_info.nodemap.size = 0;
>
> What's the reason for doing this?
>
> If you encounter problem with this it should probably be fixed in
> libxl.
Ah, sorry, I forgot about this hackaround (which has been required since
194e7183, if I'm not mistaken). The root cause is that
reload_domain_config() was missing on the soft_reset path, so we were
hitting the "Can run NUMA placement only if the domain does not have any
NUMA node affinity set already" clause.
I will fix this along with the "on_soft_reset" implementation.
>
> Wei.
>
>> + ret = libxl_domain_soft_reset(ctx, &d_config,
>> + &domid, domid_old,
>> + 0, 0);
>> +
>> + if ( ret ) {
>> + goto error_out;
>> + }
>> + domid_old = INVALID_DOMID;
>> + } else {
>> ret = libxl_domain_create_new(ctx, &d_config, &domid,
>> 0, autoconnect_console_how);
>> }
>> @@ -2574,6 +2590,8 @@ start:
>> event->u.domain_shutdown.shutdown_reason,
>> event->u.domain_shutdown.shutdown_reason);
>> switch (handle_domain_death(&domid, event, &d_config)) {
>> + case 3:
>> + domid_old = domid;
>> case 2:
>> if (!preserve_domain(&domid, event, &d_config)) {
>> /* If we fail then exit leaving the old domain in place. */
>> --
>> 1.9.3
>>
>>
--
Vitaly
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH v4 9/9] xsm: add XEN_DOMCTL_devour support
2014-12-03 17:16 [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec Vitaly Kuznetsov
` (7 preceding siblings ...)
2014-12-03 17:16 ` [PATCH v4 8/9] libxl: soft reset support Vitaly Kuznetsov
@ 2014-12-03 17:16 ` Vitaly Kuznetsov
2014-12-04 11:17 ` [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec Olaf Hering
` (2 subsequent siblings)
11 siblings, 0 replies; 28+ messages in thread
From: Vitaly Kuznetsov @ 2014-12-03 17:16 UTC (permalink / raw)
To: xen-devel
Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
Andrew Cooper, Ian Jackson, Tim Deegan, David Vrabel, Jan Beulich,
Wei Liu
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
xen/common/domctl.c | 6 ++++++
xen/include/xsm/dummy.h | 6 ++++++
xen/include/xsm/xsm.h | 6 ++++++
xen/xsm/dummy.c | 1 +
xen/xsm/flask/hooks.c | 17 +++++++++++++++++
xen/xsm/flask/policy/access_vectors | 10 ++++++++++
6 files changed, 46 insertions(+)
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 7e7fb47..7c22e35 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -1190,6 +1190,12 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
break;
}
+ ret = xsm_devour(XSM_HOOK, d, recipient_dom);
+ if ( ret ) {
+ put_domain(recipient_dom);
+ break;
+ }
+
if ( recipient_dom->tot_pages != 0 )
{
put_domain(recipient_dom);
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index f20e89c..6e9e38b 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -113,6 +113,12 @@ static XSM_INLINE int xsm_set_target(XSM_DEFAULT_ARG struct domain *d, struct do
return xsm_default_action(action, current->domain, NULL);
}
+static XSM_INLINE int xsm_devour(XSM_DEFAULT_ARG struct domain *d, struct domain *e)
+{
+ XSM_ASSERT_ACTION(XSM_HOOK);
+ return xsm_default_action(action, current->domain, NULL);
+}
+
static XSM_INLINE int xsm_domctl(XSM_DEFAULT_ARG struct domain *d, int cmd)
{
XSM_ASSERT_ACTION(XSM_OTHER);
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 4ce089f..7db7433 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -58,6 +58,7 @@ struct xsm_operations {
int (*domctl_scheduler_op) (struct domain *d, int op);
int (*sysctl_scheduler_op) (int op);
int (*set_target) (struct domain *d, struct domain *e);
+ int (*devour) (struct domain *d, struct domain *e);
int (*domctl) (struct domain *d, int cmd);
int (*sysctl) (int cmd);
int (*readconsole) (uint32_t clear);
@@ -213,6 +214,11 @@ static inline int xsm_set_target (xsm_default_t def, struct domain *d, struct do
return xsm_ops->set_target(d, e);
}
+static inline int xsm_devour (xsm_default_t def, struct domain *d, struct domain *r)
+{
+ return xsm_ops->devour(d, r);
+}
+
static inline int xsm_domctl (xsm_default_t def, struct domain *d, int cmd)
{
return xsm_ops->domctl(d, cmd);
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 8eb3050..f3c2f9e 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -35,6 +35,7 @@ void xsm_fixup_ops (struct xsm_operations *ops)
set_to_dummy_if_null(ops, domctl_scheduler_op);
set_to_dummy_if_null(ops, sysctl_scheduler_op);
set_to_dummy_if_null(ops, set_target);
+ set_to_dummy_if_null(ops, devour);
set_to_dummy_if_null(ops, domctl);
set_to_dummy_if_null(ops, sysctl);
set_to_dummy_if_null(ops, readconsole);
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index d48463f..097c8c2 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -565,6 +565,21 @@ static int flask_set_target(struct domain *d, struct domain *t)
return rc;
}
+static int flask_devour(struct domain *d, struct domain *r)
+{
+ int rc;
+ struct domain_security_struct *dsec, *rsec;
+ dsec = d->ssid;
+ rsec = r->ssid;
+
+ rc = current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__SET_AS_SOURCE);
+ if ( rc )
+ return rc;
+ if ( r )
+ rc = current_has_perm(r, SECCLASS_DOMAIN2, DOMAIN2__SET_AS_RECIPIENT);
+ return rc;
+}
+
static int flask_domctl(struct domain *d, int cmd)
{
switch ( cmd )
@@ -580,6 +595,7 @@ static int flask_domctl(struct domain *d, int cmd)
#ifdef HAS_MEM_ACCESS
case XEN_DOMCTL_mem_event_op:
#endif
+ case XEN_DOMCTL_devour:
#ifdef CONFIG_X86
/* These have individual XSM hooks (arch/x86/domctl.c) */
case XEN_DOMCTL_shadow_op:
@@ -1512,6 +1528,7 @@ static struct xsm_operations flask_ops = {
.domctl_scheduler_op = flask_domctl_scheduler_op,
.sysctl_scheduler_op = flask_sysctl_scheduler_op,
.set_target = flask_set_target,
+ .devour = flask_devour,
.domctl = flask_domctl,
.sysctl = flask_sysctl,
.readconsole = flask_readconsole,
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index 1da9f63..64c3424 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -142,6 +142,8 @@ class domain
# target = the new target domain
# see also the domain2 make_priv_for and set_as_target checks
set_target
+# XEN_DOMCTL_devour
+ devour
# SCHEDOP_remote_shutdown
shutdown
# XEN_DOMCTL_set{,_machine}_address_size
@@ -196,6 +198,14 @@ class domain2
# source = the domain making the hypercall
# target = the new target domain
set_as_target
+# checked in XEN_DOMCTL_devour:
+# source = the domain making the hypercall
+# target = the new source domain
+ set_as_source
+# checked in XEN_DOMCTL_devour:
+# source = the domain making the hypercall
+# target = the new recipient domain
+ set_as_recipient
# XEN_DOMCTL_set_cpuid
set_cpuid
# XEN_DOMCTL_gettscinfo
--
1.9.3
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec
2014-12-03 17:16 [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec Vitaly Kuznetsov
` (8 preceding siblings ...)
2014-12-03 17:16 ` [PATCH v4 9/9] xsm: add XEN_DOMCTL_devour support Vitaly Kuznetsov
@ 2014-12-04 11:17 ` Olaf Hering
2014-12-04 14:29 ` Vitaly Kuznetsov
2014-12-04 11:55 ` Wei Liu
2014-12-11 14:24 ` Olaf Hering
11 siblings, 1 reply; 28+ messages in thread
From: Olaf Hering @ 2014-12-04 11:17 UTC (permalink / raw)
To: Vitaly Kuznetsov
Cc: Wei Liu, Andrew Jones, Keir Fraser, Ian Campbell,
Stefano Stabellini, Andrew Cooper, Ian Jackson, Tim Deegan,
David Vrabel, Jan Beulich, xen-devel
On Wed, Dec 03, Vitaly Kuznetsov wrote:
> Original description:
>
> When a PVHVM linux guest performs kexec there are lots of things which
> require taking care of:
> - shared info, vcpu_info
> - grants
> - event channels
> - ...
> Instead of taking care of all these things we can rebuild the domain
> performing kexec from scratch doing so-called soft-reboot.
>
> The idea was suggested by David Vrabel, Jan Beulich, and Konrad Rzeszutek Wilk.
>
> P.S. The patch series can be tested with PVHVM Linux guest with the following
> modifications:
It's not clear to me how the old kernel starts the new kernel.
How and where is that done?
Olaf
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec
2014-12-04 11:17 ` [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec Olaf Hering
@ 2014-12-04 14:29 ` Vitaly Kuznetsov
0 siblings, 0 replies; 28+ messages in thread
From: Vitaly Kuznetsov @ 2014-12-04 14:29 UTC (permalink / raw)
To: Olaf Hering
Cc: Wei Liu, Andrew Jones, Keir Fraser, Ian Campbell,
Stefano Stabellini, Andrew Cooper, Ian Jackson, Tim Deegan,
David Vrabel, Jan Beulich, xen-devel
Olaf Hering <olaf@aepfle.de> writes:
> On Wed, Dec 03, Vitaly Kuznetsov wrote:
>
>> Original description:
>>
>> When a PVHVM linux guest performs kexec there are lots of things which
>> require taking care of:
>> - shared info, vcpu_info
>> - grants
>> - event channels
>> - ...
>> Instead of taking care of all these things we can rebuild the domain
>> performing kexec from scratch doing so-called soft-reboot.
>>
>> The idea was suggested by David Vrabel, Jan Beulich, and Konrad Rzeszutek Wilk.
>>
>> P.S. The patch series can be tested with PVHVM Linux guest with the following
>> modifications:
>
> It's not clear to me how the old kernel starts the new kernel.
> How and where is that done?
It is done by the Linux kernel itself; I bring nothing new into the
picture. It all works like this:
1) The original guest does HYPERVISOR_sched_op(SCHEDOP_shutdown, r = { .reason =
SHUTDOWN_soft_reset})
2) All this rebuild machinery happens, including copying the HVM context
3) The new guest resumes from where the old one did the hypercall
4) The kernel does kexec and the new kernel is booted.
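
The four steps above can be sketched as a small, self-contained model. This is not Xen or libxl code: the struct, the enum, and both function names are hypothetical stand-ins invented for illustration, and the real series does the work through the shutdown hypercall, XEN_DOMCTL_devour and the toolstack. It only shows the ordering: the guest marks itself as shut down for soft reset, the toolstack moves context and memory to a fresh domain, and the fresh domain resumes at the saved point.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model only; none of these names exist in Xen. */
enum model_state { MODEL_RUNNING, MODEL_SHUTDOWN_SOFT_RESET, MODEL_DEAD };

struct model_hvm_context {
    unsigned long rip;            /* stand-in for the saved HVM context */
};

struct model_domain {
    int domid;
    enum model_state state;
    struct model_hvm_context ctx;
    unsigned long nr_pages;
};

/* Step 1: the guest asks to be shut down with the soft reset reason. */
void model_guest_request_soft_reset(struct model_domain *d,
                                    unsigned long resume_rip)
{
    d->ctx.rip = resume_rip;
    d->state = MODEL_SHUTDOWN_SOFT_RESET;
}

/*
 * Steps 2 and 3: the toolstack rebuilds the domain. The context is copied,
 * the memory moves to the fresh domain, the old domain dies, and the fresh
 * domain resumes where the old one made the hypercall.
 * Returns false if the old domain did not request a soft reset.
 */
bool model_toolstack_soft_reset(struct model_domain *old_dom,
                                struct model_domain *fresh,
                                int fresh_domid)
{
    if (old_dom->state != MODEL_SHUTDOWN_SOFT_RESET)
        return false;

    fresh->domid = fresh_domid;
    fresh->ctx = old_dom->ctx;            /* copy HVM context */
    fresh->nr_pages = old_dom->nr_pages;  /* memory changes owner */
    old_dom->nr_pages = 0;
    old_dom->state = MODEL_DEAD;
    fresh->state = MODEL_RUNNING;         /* step 4 (kexec) then runs in the guest */
    return true;
}
```

In this model a caller would first invoke model_guest_request_soft_reset() on the old domain and then model_toolstack_soft_reset(); calling the latter on a domain that is still running is rejected.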
>
> Olaf
--
Vitaly
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec
2014-12-03 17:16 [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec Vitaly Kuznetsov
` (9 preceding siblings ...)
2014-12-04 11:17 ` [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec Olaf Hering
@ 2014-12-04 11:55 ` Wei Liu
2014-12-04 14:46 ` Vitaly Kuznetsov
2014-12-11 14:24 ` Olaf Hering
11 siblings, 1 reply; 28+ messages in thread
From: Wei Liu @ 2014-12-04 11:55 UTC (permalink / raw)
To: Vitaly Kuznetsov
Cc: Wei Liu, Andrew Jones, Keir Fraser, Ian Campbell,
Stefano Stabellini, Andrew Cooper, Ian Jackson, Tim Deegan,
David Vrabel, Jan Beulich, xen-devel
On Wed, Dec 03, 2014 at 06:16:12PM +0100, Vitaly Kuznetsov wrote:
> Changes from RFCv3:
> This is the first non-RFC series as no major concerns were expressed. I'm trying
> to address Jan's comments. Changes are:
> - Move from XEN_DOMCTL_set_recipient to XEN_DOMCTL_devour (I don't really like
> the name but nothing more appropriate came to my mind) which incorporates
> former XEN_DOMCTL_set_recipient and XEN_DOMCTL_destroydomain to prevent
> original domain from changing its allocations during transfer procedure.
> - Check in free_domheap_pages() that assign_pages() succeeded.
> - Change printk() in free_domheap_pages().
> - DOMDYING_locked state was introduced to support XEN_DOMCTL_devour.
> - xc_domain_soft_reset() got simplified a bit. Now we just wait for the original
> domain to die or lose all its pages.
> - rebased on top of current master branch.
>
> Changes from RFC/WIPv2:
>
> Here is a slightly different approach to memory reassignment. Instead of
> introducing new (and very doubtful) XENMEM_transfer operation introduce
> simple XEN_DOMCTL_set_recipient operation and do everything in free_domheap_pages()
> handler utilizing normal domain destroy path. This is better because:
> - The approach is general-enough
> - All memory pages are usually being freed when the domain is destroyed
> - No special grants handling required
> - Better supportability
>
> With regards to PV:
> Though XEN_DOMCTL_set_recipient works for both PV and HVM this patchset does not
> bring PV kexec/kdump support. xc_domain_soft_reset() is limited to work with HVM
> domains only. The main reason for that is: it is (in theory) possible to save p2m
> and rebuild them with the new domain but that would only allow us to resume execution
> from where we stopped. If we want to execute new kernel we need to build the same
> kernel/initrd/bootstrap_pagetables/... structure we build to boot PV domain initially.
> That however would destroy the original domain's memory thus making kdump impossible.
> To make everything work additional support from kexec userspace/linux kernel is
> required and I'm not sure it makes sense to implement all this stuff in the light of
> PVH.
>
What would happen if you soft reset a PV guest? At the very least soft
reset should be explicitly forbidden in PV case.
> Original description:
>
> When a PVHVM linux guest performs kexec there are lots of things which
> require taking care of:
> - shared info, vcpu_info
> - grants
> - event channels
> - ...
Is there a complete list?
> Instead of taking care of all these things we can rebuild the domain
> performing kexec from scratch doing so-called soft-reboot.
>
> The idea was suggested by David Vrabel, Jan Beulich, and Konrad Rzeszutek Wilk.
>
As this approach requires the toolstack to do complex interaction with
the hypervisor and to preserve / throw away a bunch of state, I think the
whole procedure should be documented.
It would also be helpful if you link to previous discussions in your
cover letter.
Wei.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec
2014-12-04 11:55 ` Wei Liu
@ 2014-12-04 14:46 ` Vitaly Kuznetsov
2014-12-04 15:08 ` Wei Liu
0 siblings, 1 reply; 28+ messages in thread
From: Vitaly Kuznetsov @ 2014-12-04 14:46 UTC (permalink / raw)
To: Wei Liu
Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
Andrew Cooper, Ian Jackson, Tim Deegan, David Vrabel, Jan Beulich,
xen-devel
Wei Liu <wei.liu2@citrix.com> writes:
> On Wed, Dec 03, 2014 at 06:16:12PM +0100, Vitaly Kuznetsov wrote:
>> Changes from RFCv3:
>> This is the first non-RFC series as no major concerns were expressed. I'm trying
>> to address Jan's comments. Changes are:
>> - Move from XEN_DOMCTL_set_recipient to XEN_DOMCTL_devour (I don't really like
>> the name but nothing more appropriate came to my mind) which incorporates
>> former XEN_DOMCTL_set_recipient and XEN_DOMCTL_destroydomain to prevent
>> original domain from changing its allocations during transfer procedure.
>> - Check in free_domheap_pages() that assign_pages() succeeded.
>> - Change printk() in free_domheap_pages().
>> - DOMDYING_locked state was introduced to support XEN_DOMCTL_devour.
>> - xc_domain_soft_reset() got simplified a bit. Now we just wait for the original
>> domain to die or lose all its pages.
>> - rebased on top of current master branch.
>>
>> Changes from RFC/WIPv2:
>>
>> Here is a slightly different approach to memory reassignment. Instead of
>> introducing new (and very doubtful) XENMEM_transfer operation introduce
>> simple XEN_DOMCTL_set_recipient operation and do everything in free_domheap_pages()
>> handler utilizing normal domain destroy path. This is better because:
>> - The approach is general-enough
>> - All memory pages are usually being freed when the domain is destroyed
>> - No special grants handling required
>> - Better supportability
>>
>> With regards to PV:
>> Though XEN_DOMCTL_set_recipient works for both PV and HVM this patchset does not
>> bring PV kexec/kdump support. xc_domain_soft_reset() is limited to work with HVM
>> domains only. The main reason for that is: it is (in theory) possible to save p2m
>> and rebuild them with the new domain but that would only allow us to resume execution
>> from where we stopped. If we want to execute new kernel we need to build the same
>> kernel/initrd/bootstrap_pagetables/... structure we build to boot PV domain initially.
>> That however would destroy the original domain's memory thus making kdump impossible.
>> To make everything work additional support from kexec userspace/linux kernel is
>> required and I'm not sure it makes sense to implement all this stuff in the light of
>> PVH.
>>
>
> What would happen if you soft reset a PV guest? At the very least soft
> reset should be explicitly forbidden in PV case.
Well, nothing particularly bad happens from the hypervisor's point of
view. But when a PV guest dies its page tables are destroyed (and we
lose the knowledge anyway). It would be possible for the toolstack to
start from the beginning - build bootstrap tables, put kernel/initrd and
start everything. I however don't see any value in doing so: our memory
gets destroyed and it is going to be the same as a plain reboot.
Why do we need to forbid the call for PV? (xc_domain_soft_reset()
already forbids PV).
>
>> Original description:
>>
>> When a PVHVM linux guest performs kexec there are lots of things which
>> require taking care of:
>> - shared info, vcpu_info
>> - grants
>> - event channels
>> - ...
>
> Is there a complete list?
>
Konrad tried to assemble it here:
http://lists.xen.org/archives/html/xen-devel/2014-06/msg03889.html
>> Instead of taking care of all these things we can rebuild the domain
>> performing kexec from scratch doing so-called soft-reboot.
>>
>> The idea was suggested by David Vrabel, Jan Beulich, and Konrad Rzeszutek Wilk.
>>
>
> As this approach requires the toolstack to do complex interaction with
> the hypervisor and to preserve / throw away a bunch of state, I think the
> whole procedure should be documented.
Sure.. Where would you expect such doc to appear?
>
> It would also be helpful if you link to previous discussions in your
> cover letter.
Sure, will do next time. For now:
on resetting VCPU_info (and that's where 'rebuild everything with the
toolstack solution' was suggested):
http://lists.xen.org/archives/html/xen-devel/2014-08/msg01869.html
Previous versions:
http://lists.xen.org/archives/html/xen-devel/2014-08/msg01630.html
http://lists.xen.org/archives/html/xen-devel/2014-08/msg00603.html
EVTCHNOP_reset (got merged):
http://lists.xen.org/archives/html/xen-devel/2014-07/msg03979.html
Previous:
http://lists.xen.org/archives/html/xen-devel/2014-07/msg03925.html
http://lists.xen.org/archives/html/xen-devel/2014-07/msg03322.html
http://lists.xen.org/archives/html/xen-devel/2014-07/msg02500.html
This patch series:
http://lists.xen.org/archives/html/xen-devel/2014-10/msg00764.html
http://lists.xen.org/archives/html/xen-devel/2014-09/msg03623.html
http://lists.xen.org/archives/html/xen-devel/2014-08/msg02309.html
>
> Wei.
--
Vitaly
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec
2014-12-04 14:46 ` Vitaly Kuznetsov
@ 2014-12-04 15:08 ` Wei Liu
0 siblings, 0 replies; 28+ messages in thread
From: Wei Liu @ 2014-12-04 15:08 UTC (permalink / raw)
To: Vitaly Kuznetsov
Cc: Wei Liu, Andrew Jones, Keir Fraser, Ian Campbell,
Stefano Stabellini, Andrew Cooper, Ian Jackson, Tim Deegan,
David Vrabel, Jan Beulich, xen-devel
On Thu, Dec 04, 2014 at 03:46:59PM +0100, Vitaly Kuznetsov wrote:
> Wei Liu <wei.liu2@citrix.com> writes:
>
> > On Wed, Dec 03, 2014 at 06:16:12PM +0100, Vitaly Kuznetsov wrote:
> >> Changes from RFCv3:
> >> This is the first non-RFC series as no major concerns were expressed. I'm trying
> >> to address Jan's comments. Changes are:
> >> - Move from XEN_DOMCTL_set_recipient to XEN_DOMCTL_devour (I don't really like
> >> the name but nothing more appropriate came to my mind) which incorporates
> >> former XEN_DOMCTL_set_recipient and XEN_DOMCTL_destroydomain to prevent
> >> original domain from changing its allocations during transfer procedure.
> >> - Check in free_domheap_pages() that assign_pages() succeeded.
> >> - Change printk() in free_domheap_pages().
> >> - DOMDYING_locked state was introduced to support XEN_DOMCTL_devour.
> >> - xc_domain_soft_reset() got simplified a bit. Now we just wait for the original
> >> domain to die or lose all its pages.
> >> - rebased on top of current master branch.
> >>
> >> Changes from RFC/WIPv2:
> >>
> >> Here is a slightly different approach to memory reassignment. Instead of
> >> introducing new (and very doubtful) XENMEM_transfer operation introduce
> >> simple XEN_DOMCTL_set_recipient operation and do everything in free_domheap_pages()
> >> handler utilizing normal domain destroy path. This is better because:
> >> - The approach is general-enough
> >> - All memory pages are usually being freed when the domain is destroyed
> >> - No special grants handling required
> >> - Better supportability
> >>
> >> With regards to PV:
> >> Though XEN_DOMCTL_set_recipient works for both PV and HVM this patchset does not
> >> bring PV kexec/kdump support. xc_domain_soft_reset() is limited to work with HVM
> >> domains only. The main reason for that is: it is (in theory) possible to save p2m
> >> and rebuild them with the new domain but that would only allow us to resume execution
> >> from where we stopped. If we want to execute new kernel we need to build the same
> >> kernel/initrd/bootstrap_pagetables/... structure we build to boot PV domain initially.
> >> That however would destroy the original domain's memory thus making kdump impossible.
> >> To make everything work additional support from kexec userspace/linux kernel is
> >> required and I'm not sure it makes sense to implement all this stuff in the light of
> >> PVH.
> >>
> >
> > What would happen if you soft reset a PV guest? At the very least soft
> > reset should be explicitly forbidden in PV case.
>
> Well, nothing particularly bad happens from the hypervisor's point of
> view. But when a PV guest dies its page tables are destroyed (and we
> lose the knowledge anyway). It would be possible for the toolstack to
> start from the beginning - build bootstrap tables, put kernel/initrd and
> start everything. I however don't see any value in doing so: our memory
> gets destroyed and it is going to be the same as a plain reboot.
>
OK. As long as hypervisor is safe I'm OK with it. I was thinking
from a security PoV, i.e. there is no guest triggerable crash of
hypervisor, guest cannot escape by providing arbitrary new kernel, etc.
> Why do we need to forbid the call for PV? (xc_domain_soft_reset()
> already forbids PV).
>
> >
> >> Original description:
> >>
> >> When a PVHVM linux guest performs kexec there are lots of things which
> >> require taking care of:
> >> - shared info, vcpu_info
> >> - grants
> >> - event channels
> >> - ...
> >
> > Is there a complete list?
> >
>
> Konrad tried to assemble it here:
> http://lists.xen.org/archives/html/xen-devel/2014-06/msg03889.html
>
>
> >> Instead of taking care of all these things we can rebuild the domain
> >> performing kexec from scratch doing so-called soft-reboot.
> >>
> >> The idea was suggested by David Vrabel, Jan Beulich, and Konrad Rzeszutek Wilk.
> >>
> >
> > As this approach requires the toolstack to do complex interaction with
> > the hypervisor and to preserve / throw away a bunch of state, I think the
> > whole procedure should be documented.
>
> Sure.. Where would you expect such doc to appear?
>
I think libxl_internal.h is the right place. Just group the
documentation with other internal functions you introduced.
Re all the links, thanks, I will have a look.
Wei.
> >
> > It would also be helpful if you link to previous discussions in your
> > cover letter.
>
> Sure, will do next time. For now:
> on resetting VCPU_info (and that's where 'rebuild everything with the
> toolstack solution' was suggested):
> http://lists.xen.org/archives/html/xen-devel/2014-08/msg01869.html
> Previous versions:
> http://lists.xen.org/archives/html/xen-devel/2014-08/msg01630.html
> http://lists.xen.org/archives/html/xen-devel/2014-08/msg00603.html
>
> EVTCHNOP_reset (got merged):
> http://lists.xen.org/archives/html/xen-devel/2014-07/msg03979.html
> Previous:
> http://lists.xen.org/archives/html/xen-devel/2014-07/msg03925.html
> http://lists.xen.org/archives/html/xen-devel/2014-07/msg03322.html
> http://lists.xen.org/archives/html/xen-devel/2014-07/msg02500.html
>
> This patch series:
> http://lists.xen.org/archives/html/xen-devel/2014-10/msg00764.html
> http://lists.xen.org/archives/html/xen-devel/2014-09/msg03623.html
> http://lists.xen.org/archives/html/xen-devel/2014-08/msg02309.html
>
> >
> > Wei.
>
> --
> Vitaly
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec
2014-12-03 17:16 [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec Vitaly Kuznetsov
` (10 preceding siblings ...)
2014-12-04 11:55 ` Wei Liu
@ 2014-12-11 14:24 ` Olaf Hering
2014-12-11 15:24 ` David Vrabel
11 siblings, 1 reply; 28+ messages in thread
From: Olaf Hering @ 2014-12-11 14:24 UTC (permalink / raw)
To: Vitaly Kuznetsov
Cc: Wei Liu, Andrew Jones, Keir Fraser, Ian Campbell,
Stefano Stabellini, Andrew Cooper, Ian Jackson, Tim Deegan,
David Vrabel, Jan Beulich, xen-devel
On Wed, Dec 03, Vitaly Kuznetsov wrote:
> When a PVHVM linux guest performs kexec there are lots of things which
> require taking care of:
> - shared info, vcpu_info
> - grants
> - event channels
> - ...
> Instead of taking care of all these things we can rebuild the domain
> performing kexec from scratch doing so-called soft-reboot.
How does this approach handle ballooned pages?
From the guest's point of view they are always there, just claimed by the
balloon driver. The new kernel does not have that driver nor does its
driver have the knowledge which pages the old kernel gave back to Xen.
After a brief look none of the patches seem to deal with that.
Olaf
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec
2014-12-11 14:24 ` Olaf Hering
@ 2014-12-11 15:24 ` David Vrabel
2014-12-11 15:30 ` Olaf Hering
0 siblings, 1 reply; 28+ messages in thread
From: David Vrabel @ 2014-12-11 15:24 UTC (permalink / raw)
To: Olaf Hering, Vitaly Kuznetsov
Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
Andrew Cooper, Ian Jackson, Tim Deegan, David Vrabel, Jan Beulich,
xen-devel, Wei Liu
On 11/12/14 14:24, Olaf Hering wrote:
> On Wed, Dec 03, Vitaly Kuznetsov wrote:
>
>> When a PVHVM linux guest performs kexec there are lots of things which
>> require taking care of:
>> - shared info, vcpu_info
>> - grants
>> - event channels
>> - ...
>> Instead of taking care of all these things we can rebuild the domain
>> performing kexec from scratch doing so-called soft-reboot.
>
> How does this approach handle ballooned pages?
> From the guest's point of view they are always there, just claimed by the
> balloon driver. The new kernel does not have that driver nor does its
> driver have the knowledge which pages the old kernel gave back to Xen.
>
> After a brief look none of the patches seem to deal with that.
Nothing special needs to be done with ballooned pages. If frames are
not populated in the original domain, they will be unpopulated in the
new domain.
It's the responsibility of the guest to ensure it either doesn't kexec
when it is ballooned or that the kexec kernel can handle this (e.g., by
using a crash region that is never ballooned out).
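
The point about ballooned pages can be illustrated with a toy model. This is a sketch under stated assumptions, not the code in the series: struct model_p2m, MODEL_NR_GFNS and both functions are hypothetical names, and the real transfer happens page by page in the domain destroy path. It shows that frames which were not populated in the original domain stay unpopulated in the new one, and how a guest could check that a region it relies on (for example a crash region) has no holes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; not Xen data structures. */
#define MODEL_NR_GFNS 8

struct model_p2m {
    bool populated[MODEL_NR_GFNS];   /* true if the guest frame has backing RAM */
};

/*
 * Move every populated frame from src to dst. Holes left by the balloon
 * driver in src remain holes in dst. Returns the number of frames moved.
 */
size_t model_transfer_frames(struct model_p2m *src, struct model_p2m *dst)
{
    size_t moved = 0;

    for (size_t gfn = 0; gfn < MODEL_NR_GFNS; gfn++) {
        dst->populated[gfn] = src->populated[gfn];
        if (src->populated[gfn]) {
            src->populated[gfn] = false;   /* the frame now belongs to dst */
            moved++;
        }
    }
    return moved;
}

/* Check that [start, start + count) is fully backed, e.g. a crash region. */
bool model_range_populated(const struct model_p2m *p2m,
                           size_t start, size_t count)
{
    if (start > MODEL_NR_GFNS || count > MODEL_NR_GFNS - start)
        return false;

    for (size_t gfn = start; gfn < start + count; gfn++)
        if (!p2m->populated[gfn])
            return false;
    return true;
}
```

With frames 2 and 3 ballooned out before the reset, the new domain ends up with the same two holes, so a kernel that assumes all eight frames are usable would touch unbacked memory, while one confined to frames 4..7 would not.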
David
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec
2014-12-11 15:24 ` David Vrabel
@ 2014-12-11 15:30 ` Olaf Hering
0 siblings, 0 replies; 28+ messages in thread
From: Olaf Hering @ 2014-12-11 15:30 UTC (permalink / raw)
To: David Vrabel
Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
Andrew Cooper, Ian Jackson, Tim Deegan, Jan Beulich, xen-devel,
Wei Liu, Vitaly Kuznetsov
On Thu, Dec 11, David Vrabel wrote:
> Nothing special needs to be done with ballooned pages. If frames are
> not populated in the original domain, they will be unpopulated in the
> new domain.
>
> It's the responsibility of the guest to ensure it either doesn't kexec
> when it is ballooned or that the kexec kernel can handle this (e.g., by
> using a crash region that is never ballooned out).
There is a difference between kexec and kdump. The kdump kernel does not
care because there is code in /proc/vmcore to handle ballooned pages in
the crashed kernel gracefully.
But a kexec boot will likely access pages which are not backed by RAM.
Unfortunately there is no flag left to mark a page as ballooned.
So what you are saying means that kexec-tools needs to continue to
balloon up before doing the actual kexec. I had hoped this suggested
approach would get rid of that limitation.
Olaf
^ permalink raw reply [flat|nested] 28+ messages in thread