xen-devel.lists.xenproject.org archive mirror
From: Wei Liu <wei.liu2@citrix.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Lars Kurth <lars.kurth@citrix.com>,
	Changlong Xie <xiecl.fnst@cn.fujitsu.com>,
	Wei Liu <wei.liu2@citrix.com>,
	Ian Campbell <ian.campbell@citrix.com>,
	Wen Congyang <wency@cn.fujitsu.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Jiang Yunhong <yunhong.jiang@intel.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	xen devel <xen-devel@lists.xen.org>,
	Dong Eddie <eddie.dong@intel.com>,
	Gui Jianfeng <guijianfeng@cn.fujitsu.com>,
	Shriram Rajagopalan <rshriram@cs.ubc.ca>,
	Yang Hongyang <hongyang.yang@easystack.cn>
Subject: Re: [PATCH v8 05/13] tools/libxc: support to resume uncooperative HVM guests
Date: Fri, 19 Feb 2016 16:42:20 +0000
Message-ID: <20160219164220.GW3723@citrix.com>
In-Reply-To: <20160219162007.GD31685@char.us.oracle.com>

On Fri, Feb 19, 2016 at 11:20:08AM -0500, Konrad Rzeszutek Wilk wrote:
[...]
> > > > > > ===
> > > > > > 
> > > > > > Note that in this version I changed "guest state" to "guest return
> > > > > > code" in the second paragraph. That sentence rests on a big
> > > > > > assumption that I can't confirm -- I reverse-engineered it from the
> > > > > > comment before xc_domain_resume and from what Linux does.
> > > > > > 
> > > > > > But the more I think the more I'm not sure if I'm writing the right
> > > > > > thing. I also can't judge what is the right behaviour on the Linux
> > > > > > side.
> > > > > > 
> > > > > > Konrad, can you fact-check the commit message a bit? And maybe you
> > > > > > can help answer the following questions?
> > > > > > 
> > > > > > 1. If we use fast=0 on PVHVM guest, will it work?
> > > > > 
> > > > > Yes.
> > > > > > 2. If we use fast=0 on HVM guest, will it work?
> > > > > 
> > > > > Yes.
> > > > > 
> > > > > > 
> > > > > > What is worse, when I say "work" I actually have no clear definition
> > > > > > of it. There doesn't seem to be a defined state that the guest needs
> > > > > > to be in.
> > > > > 
> > > > > For PVHVM guests, fast = 0 requires that the guest makes a
> > > > > SCHEDOP_shutdown(SHUTDOWN_suspend) hypercall. After the hypercall has
> > > > > completed (so Xen has suspended the guest, then later resumed it), it
> > > > > is the guest's responsibility to set up the Xen infrastructure again:
> > > > > retrieve the shared_info (XENMAPSPACE_shared_info), set up XenBus, etc.
> > > > > 
> > > > > For HVM guests, fast = 0 suspends the guest without the guest making
> > > > > any hypercalls. It is in effect the hypervisor injecting an S3 suspend.
> > > > > Afterwards the guest is resumed and continues as usual. There are no PV
> > > > > drivers, hence no need to re-establish the Xen PV infrastructure.
> > > > > 
> > > > 
> > > > Wait, isn't this function about resuming a guest? I'm confused because
> > > > you talk about HV injecting S3 suspend. I guess you wrote the wrong
> > > > thing?
> 
> I was writing the whole chain - suspend, and then resume. This patch is
> about resume - but to get to resume you need to suspend first.
> 

Yes, of course. I was thinking more about writing it down as a comment for
xc_domain_resume, so I wrote something from the perspective of resuming.

If you don't disagree with the extrapolation in my previous email, we don't
need to quibble about the wording anymore.

> > > > 
> > > > My guess is below, from the perspective of resuming a guest
> > > > 
> > > >   A PVHVM guest would have used SCHEDOP_shutdown(SHUTDOWN_suspend) to
> > > >   suspend. So when the toolstack uses fast=0, the guest resumes from the
> > > >   hypercall with the return code unmodified. The guest then sets up the
> > > >   Xen infrastructure again.
> > > 
> > > Who or what has torn down the existing infrastructure from the guest's life
> > > before the suspend in this case? AFAIR, a guest expects to return
> 
> The guest. Or it can ignore it and just re-init all its settings.
> 
> > > from SCHEDOP_shutdown(SHUTDOWN_suspend) with return code == 0 in a freshly
> > > minted new domain, but in the resume case it is actually resuming in the
> > > original domain, complete with any evtchn's and grant tables mappings etc
> > > still intact from before it slept.
> > > 
> > > Perhaps I'm misremembering and the guest is expected to deal with the
> > > possibility of resources already being in place when it re-sets up the
> > > infra?
> 
> Correct - albeit all of them are stale. Though on some off-chance they may
> be set correctly.
> 
> > > 
> > 
> > Sigh, this is the sort of thing that gets on my nerves. I should try to
> > write something down when we come to a conclusion.  I would be happy to
> > have any definite answer on the expected behaviour of the guest.
> > Extrapolation is not very helpful in the face of so many different
> > versions of Linux and the BSDs.
> > 
> > But, if the confusion is only about PVHVM guest with fast=0, we can
> > forbid that specific combination for now. That should be enough to move
> > COLO forward.
> 
> .. forbid what? PVHVM resuming with fast=0? Why?  Because the guest may
> fall on its face?

Yes, forbid resuming PVHVM with fast=0 if we have no clear definition of
how it works. It's not because the guest would fall, it's because we can't
tell which side (the guest or the toolstack) is buggy when the guest
falls.

But it looks like we (you ;-) ) have a clear idea of how it works, we
(you) just need to write it down.

Wei.

> > 
> > Wei.
> > 
> > > Ian.
> > > 
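
Editorial sketch (not part of the original message): the fast=0 behaviour described in the thread can be modelled as a small state machine. This is only an illustration under stated assumptions. `GuestState`, `resume_fast0`, the field names, and the event-channel numbers are hypothetical and do not correspond to any libxc or Linux API. It follows Konrad's description: a PVHVM guest returns from SCHEDOP_shutdown(SHUTDOWN_suspend) with the return code unmodified, treats whatever it still holds as stale, and sets up the Xen infrastructure again, while an HVM guest without PV drivers simply continues.

```python
from dataclasses import dataclass, field


@dataclass
class GuestState:
    """Hypothetical model of the Xen-facing state a guest kernel tracks."""
    has_pv_drivers: bool
    shared_info_mapped: bool = True
    xenbus_connected: bool = True
    # Handles left over from before the suspend (illustrative values only).
    event_channels: list = field(default_factory=lambda: [3, 4])
    log: list = field(default_factory=list)


def resume_fast0(guest: GuestState) -> GuestState:
    """Model what the guest does after the toolstack resumes it with fast=0."""
    if not guest.has_pv_drivers:
        # Plain HVM: the guest made no hypercall to suspend, so there is
        # no Xen PV infrastructure to re-establish.
        guest.log.append("continue as usual")
        return guest

    # PVHVM: resources from before the suspend may still be present,
    # but the guest must treat all of them as stale.
    guest.shared_info_mapped = False
    guest.xenbus_connected = False
    guest.event_channels.clear()
    guest.log.append("discard stale state")

    # Set the Xen infrastructure up again, in the order given in the thread.
    guest.shared_info_mapped = True
    guest.log.append("map shared_info (XENMAPSPACE_shared_info)")
    guest.xenbus_connected = True
    guest.log.append("set up XenBus")
    return guest
```

Calling `resume_fast0(GuestState(has_pv_drivers=True))` walks the PVHVM path; passing `has_pv_drivers=False` leaves the state untouched, which mirrors the "continues as usual" case for plain HVM guests.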

