From: Wei Liu <wei.liu2@citrix.com>
To: Wen Congyang <wency@cn.fujitsu.com>
Cc: Lars Kurth <lars.kurth@citrix.com>,
Changlong Xie <xiecl.fnst@cn.fujitsu.com>,
Wei Liu <wei.liu2@citrix.com>,
Ian Campbell <ian.campbell@citrix.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
Jiang Yunhong <yunhong.jiang@intel.com>,
Ian Jackson <ian.jackson@eu.citrix.com>,
xen devel <xen-devel@lists.xen.org>,
Dong Eddie <eddie.dong@intel.com>,
Gui Jianfeng <guijianfeng@cn.fujitsu.com>,
Shriram Rajagopalan <rshriram@cs.ubc.ca>,
Yang Hongyang <hongyang.yang@easystack.cn>
Subject: Re: [PATCH v8 05/13] tools/libxc: support to resume uncooperative HVM guests
Date: Thu, 18 Feb 2016 12:13:36 +0000
Message-ID: <20160218121336.GG3723@citrix.com>
In-Reply-To: <1455763403-18641-6-git-send-email-wency@cn.fujitsu.com>

On Thu, Feb 18, 2016 at 10:43:15AM +0800, Wen Congyang wrote:
> Before this patch:
> 1. Suspend
>    a. PVHVM and PV: we use the same way to suspend the guest (send the suspend
>       request to the guest). If the guest doesn't support evtchn, the xenstore
>       variant will be used, suspending the guest via the XenBus control node.
>    b. pure HVM: we call xc_domain_shutdown(..., SHUTDOWN_suspend) to suspend
>       the guest.
>
> 2. Resume
>    a. fast path (fast=1)
>       Do not change the guest state. We call libxl__domain_resume(..., 1), which
>       calls xc_domain_resume(..., 1 /* fast=1 */) to resume the guest.
>       PV: modify the return code to 1, and then call the domctl:
>           XEN_DOMCTL_resumedomain
>       PVHVM: same as PV
>       pure HVM: do nothing in modify_returncode, and then call the domctl:
>           XEN_DOMCTL_resumedomain
>    b. slow path (fast=0)
>       Used when the guest's state has been changed. Will call
>       libxl__domain_resume(..., 0) to resume the guest.
>       PV: update start info, and reset all secondary CPU states. Then call
>           the domctl: XEN_DOMCTL_resumedomain
>       PVHVM: cannot be resumed. You will get the following error message:
>           "Cannot resume uncooperative HVM guests"
>       pure HVM: same as PVHVM
>
> After this patch:
> 1. Suspend
>    unchanged
>
> 2. Resume
>    a. fast path:
>       unchanged
>    b. slow path:
>       PV: unchanged
>       PVHVM: call XEN_DOMCTL_resumedomain to resume the guest. Because we
>           don't modify the return code, the PV driver will disconnect
>           and reconnect.
>           The guest ends up doing the XENMAPSPACE_shared_info
>           XENMEM_add_to_physmap hypercall and resetting all of its CPU
>           states to point to the shared_info (well, except the ones past 32).
>           That is, the Linux kernel does that regardless of whether
>           SCHEDOP_shutdown:SHUTDOWN_suspend returns 1 or not.
>       pure HVM: call XEN_DOMCTL_resumedomain to resume the guest.
>
> Under COLO, we will update the guest's state (modify memory, CPU registers,
> device status...). In this case, we cannot use the fast path to resume it.
> Keep the return code 0, and use the slow path to resume the guest. While
> resuming an HVM guest via the slow path is not currently supported, this
> patch makes the resume call not fail.
>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
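
To make the return-code part of the quoted commit message concrete, here is a
rough model of how I read the guest side. It is only a sketch: the enum and
function names are made up for illustration and are not Xen or Linux
identifiers, and it assumes that a return value of 1 from the suspend call
means "suspend was cancelled" while 0 means "resumed in a new context". That
assumption is exactly the part that still needs fact-checking.

```c
/*
 * Hypothetical model only; none of these names exist in Xen or Linux.
 * Assumption: the suspend call returns 1 when the toolstack modified the
 * return code (fast resume) and 0 when it left it alone (slow resume).
 */
enum guest_resume_action {
    GUEST_KEEP_CONNECTIONS,    /* treated as a cancelled suspend */
    GUEST_RECONNECT_PV_DRIVERS /* new context: PV drivers disconnect and reconnect */
};

/* Decide what a cooperative guest does once the suspend call returns. */
static enum guest_resume_action guest_after_suspend(int suspend_rc)
{
    if (suspend_rc == 1)
        return GUEST_KEEP_CONNECTIONS;
    return GUEST_RECONNECT_PV_DRIVERS;
}
```

Under COLO the return code stays 0, so in this model guest_after_suspend(0)
takes the reconnect branch, which matches the "disconnect and reconnect"
wording in the PVHVM slow path description.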

I proposed an alternative commit log in a previous reply:

===
Use XEN_DOMCTL_resumedomain to resume (PV)HVM guest in slow path

Previously it was not possible to resume a PVHVM or pure HVM guest in the
slow path because libxc didn't support that.

Using XEN_DOMCTL_resumedomain without modifying the guest return code to
resume a guest is considered to be always safe. Introduce a function to do
that for (PV)HVM guests in slow path resume.

This patch fixes a bug that denies (PV)HVM slow path resume. This will
enable COLO to work properly: COLO requires the HVM guest to start in the
new context that has been set up by COLO, hence slow path resume is
required.
===
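
For reference, the toolstack-side behaviour described in the original commit
message (before and after the patch) can be written down as a small decision
function. This is a model of the description, not libxc code: the enums, the
function name and the `patched` flag are all hypothetical, and only the
outcomes listed in the commit message are encoded.

```c
#include <stdbool.h>

/* Hypothetical model of the commit message; not libxc identifiers. */
enum guest_kind { GUEST_PV, GUEST_PVHVM, GUEST_PURE_HVM };

enum resume_outcome {
    RESUME_RC_SET_THEN_DOMCTL, /* return code set to 1, then XEN_DOMCTL_resumedomain */
    RESUME_DOMCTL_ONLY,        /* XEN_DOMCTL_resumedomain, return code untouched */
    RESUME_PV_REBUILD,         /* update start info, reset secondary CPUs, then domctl */
    RESUME_ERR_UNCOOPERATIVE   /* "Cannot resume uncooperative HVM guests" */
};

/* `patched` selects the behaviour after this patch; `fast` is the fast flag. */
static enum resume_outcome model_resume(enum guest_kind kind, bool fast,
                                        bool patched)
{
    if (fast) {
        /* Fast path is unchanged; pure HVM has no return code to modify. */
        if (kind == GUEST_PURE_HVM)
            return RESUME_DOMCTL_ONLY;
        return RESUME_RC_SET_THEN_DOMCTL;
    }

    /* Slow path: PV is unchanged by the patch. */
    if (kind == GUEST_PV)
        return RESUME_PV_REBUILD;

    /* Slow path for PVHVM and pure HVM: an error before, a plain domctl after. */
    return patched ? RESUME_DOMCTL_ONLY : RESUME_ERR_UNCOOPERATIVE;
}
```

The only rows that differ between patched and unpatched are the two slow-path
(PV)HVM cases, which is the whole point of the patch.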

Note that in this version I changed "guest state" to "guest return code"
in the second paragraph. And that sentence is a big, big assumption that I
don't know to be true or not -- it is reverse-engineered from the comment
before xc_domain_resume and from what Linux does.

But the more I think about it, the less sure I am that I'm writing the right
thing. I also can't judge what the right behaviour is on the Linux side.

Konrad, can you fact-check the commit message a bit? And maybe you can
help answer the following questions?

1. If we use fast=0 on a PVHVM guest, will it work?
2. If we use fast=0 on an HVM guest, will it work?

What is worse, when I say "work" I actually have no clear definition of
it. There doesn't seem to be a defined state that the guest needs to be in.

Wei.