From: Wen Congyang <wency@cn.fujitsu.com>
To: Paul Durrant <Paul.Durrant@citrix.com>,
	Andrew Cooper <Andrew.Cooper3@citrix.com>,
	Yang Hongyang <yanghy@cn.fujitsu.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	"guijianfeng@cn.fujitsu.com" <guijianfeng@cn.fujitsu.com>,
	"yunhong.jiang@intel.com" <yunhong.jiang@intel.com>,
	Eddie Dong <eddie.dong@intel.com>,
	"rshriram@cs.ubc.ca" <rshriram@cs.ubc.ca>,
	Ian Jackson <Ian.Jackson@citrix.com>
Subject: Re: [PATCH v2 COLOPre 03/13] libxc/restore: zero ioreq page only one time
Date: Wed, 10 Jun 2015 19:37:44 +0800
Message-ID: <55782188.8090306@cn.fujitsu.com>
In-Reply-To: <9AAE0902D5BC7E449B7C8E4E778ABCD02594332D@AMSPEX01CL01.citrite.net>

On 06/10/2015 06:58 PM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Wen Congyang [mailto:wency@cn.fujitsu.com]
>> Sent: 10 June 2015 11:55
>> To: Paul Durrant; Andrew Cooper; Yang Hongyang; xen-devel@lists.xen.org
>> Cc: Wei Liu; Ian Campbell; yunhong.jiang@intel.com; Eddie Dong;
>> guijianfeng@cn.fujitsu.com; rshriram@cs.ubc.ca; Ian Jackson
>> Subject: Re: [Xen-devel] [PATCH v2 COLOPre 03/13] libxc/restore: zero ioreq
>> page only one time
>>
>> On 06/10/2015 06:40 PM, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Wen Congyang [mailto:wency@cn.fujitsu.com]
>>>> Sent: 10 June 2015 10:06
>>>> To: Andrew Cooper; Yang Hongyang; xen-devel@lists.xen.org; Paul
>> Durrant
>>>> Cc: Wei Liu; Ian Campbell; yunhong.jiang@intel.com; Eddie Dong;
>>>> guijianfeng@cn.fujitsu.com; rshriram@cs.ubc.ca; Ian Jackson
>>>> Subject: Re: [Xen-devel] [PATCH v2 COLOPre 03/13] libxc/restore: zero
>> ioreq
>>>> page only one time
>>>>
>>>> Cc: Paul Durrant
>>>>
>>>> On 06/10/2015 03:44 PM, Andrew Cooper wrote:
>>>>> On 10/06/2015 06:26, Yang Hongyang wrote:
>>>>>>
>>>>>>
>>>>>> On 06/09/2015 03:30 PM, Andrew Cooper wrote:
>>>>>>> On 09/06/2015 01:59, Yang Hongyang wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 06/08/2015 06:15 PM, Andrew Cooper wrote:
>>>>>>>>> On 08/06/15 10:58, Yang Hongyang wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 06/08/2015 05:46 PM, Andrew Cooper wrote:
>>>>>>>>>>> On 08/06/15 04:43, Yang Hongyang wrote:
>>>>>>>>>>>> ioreq page contains evtchn which will be set when we resume the
>>>>>>>>>>>> secondary vm the first time. The hypervisor will check if the
>>>>>>>>>>>> evtchn is corrupted, so we cannot zero the ioreq page more
>>>>>>>>>>>> than one time.
>>>>>>>>>>>>
>>>>>>>>>>>> The ioreq->state is always STATE_IOREQ_NONE after the vm is
>>>>>>>>>>>> suspended, so it is OK if we only zero it one time.
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>>>>>>>>>>>> Signed-off-by: Wen congyang <wency@cn.fujitsu.com>
>>>>>>>>>>>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>>>>>>>>
>>>>>>>>>>> The issue here is that we are running the restore algorithm over a
>>>>>>>>>>> domain which has already been running in Xen for a while.  This is a
>>>>>>>>>>> brand new usecase, as far as I am aware.
>>>>>>>>>>
>>>>>>>>>> Exactly.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Does the qemu process associated with this domain get frozen
>>>>>>>>>>> while the secondary is being reset, or does the process get
>>>>>>>>>>> destroyed and recreated?
>>>>>>>>>>
>>>>>>>>>> What do you mean by reset? Do you mean the secondary is suspended at
>>>>>>>>>> checkpoint?
>>>>>>>>>
>>>>>>>>> Well - at the point that the buffered records are being processed, we
>>>>>>>>> are in the process of resetting the state of the secondary to match
>>>>>>>>> the primary.
>>>>>>>>
>>>>>>>> Yes, at this point, the qemu process associated with this domain is
>>>>>>>> frozen. The suspend callback will call libxl__qmp_stop() (vm_stop() in
>>>>>>>> qemu) to pause qemu. After we have processed all records, qemu will be
>>>>>>>> restored with the received state; that's why we add a
>>>>>>>> libxl__qmp_restore() (qemu_load_vmstate() in qemu) API to restore qemu
>>>>>>>> with the received state. Currently in libxl, qemu only starts with the
>>>>>>>> received state; there's no API to load received state after qemu has
>>>>>>>> been running for a while.
>>>>>>>
>>>>>>> Now I consider this more, it is absolutely wrong to not zero the page
>>>>>>> here.  The event channel in the page is not guaranteed to be the same
>>>>>>> between the primary and secondary,
>>>>>>
>>>>>> That's why we don't zero it on secondary.
>>>>>
>>>>> I think you missed my point.  Apologies for the double negative.   It
>>>>> must, under all circumstances, be zeroed at this point, for safety
>>>>> reasons.
>>>>>
>>>>> The page in question is subject to logdirty just like any other guest
>>>>> pages, which means that if the guest writes to it naturally (i.e. not a
>>>>> Xen or Qemu write, both of whom have magic mappings which are not
>>>>> subject to logdirty), it will be transmitted in the stream.  As the
>>>>> event channel could be different, the lack of zeroing it at this point
>>>>> means that the event channel would be wrong as opposed to simply
>>>>> missing.  This is a worse position to be in.
>>>>
>>>> The guest should not access this page. I am not sure if the guest can
>>>> access the ioreq page.
>>>>
>>>> But in the exceptional case where the ioreq page is dirtied and copied
>>>> to the secondary vm, the ioreq page will contain a wrong event channel.
>>>> The hypervisor will check it: if the event channel is wrong, the guest
>>>> will be crashed.
>>>>
>>>>>
>>>>>>
>>>>>>> and we don't want to unexpectedly
>>>>>>> find a pending/in-flight ioreq.
>>>>>>
>>>>>> ioreq->state is always STATE_IOREQ_NONE after the vm is suspended,
>>>>>> there should be no pending/in-flight ioreq at checkpoint.
>>>>>
>>>>> In the common case perhaps, but we must consider the exceptional case.
>>>>> The exceptional case here is some corruption which happens to appear as
>>>>> an in-flight ioreq.
>>>>
>>>> If the state is not STATE_IOREQ_NONE, it may be a hypervisor bug. If the
>>>> hypervisor has a bug, anything can happen. I think we should trust the
>>>> hypervisor.
>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>> Either qemu needs to take care of re-initialising the event channels
>>>>>>> back to appropriate values, or Xen should tolerate the channels
>>>>>>> disappearing.
>>>>>
>>>>> I still stand by this statement.  I believe it is the only safe way of
>>>>> solving the issue you have discovered.
>>>>
>>>> Add a new qemu monitor command to update ioreq page?
>>>>
>>>
>>> If you're attaching to a 'new' VM (i.e. one with an updated image) then I
>>> suspect you're going to have to destroy and re-create the ioreq server so
>>> that the shared page gets re-populated with the correct event channels.
>>> Either that or you're going to have to ensure that the page is not part of
>>> the restored image and sample the new one that Xen should have set up.
>>
>>
>> I agree with it. I will try to add a new qemu monitor command(or do it when
>> updating qemu's state) to destroy and re-create it.
> 
> The slightly tricky part of that is that you're going to have to cache and replay all the registrations that were done on the old instance, but you need to do that in any case as it's not state that is transferred in the VM save record.

Why do we have to cache and replay all the registrations that were done on the
old instance? We will set the guest to a new state, so the old state should be
dropped.
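
To make sure we are talking about the same thing, here is a rough sketch of
what I understand by "cache and replay". It is only a model, not libxc or qemu
code: the struct names, the helpers and the idea that a registration is an I/O
range are all assumptions for illustration. The device model keeps its own copy
of every registration, so that a freshly created ioreq server can be brought
back to the same set without relying on anything in the VM save record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one registration: an I/O range claimed by the
 * device model. Not a real Xen or qemu structure. */
struct io_range {
    int      is_mmio;   /* 1 = MMIO, 0 = port I/O */
    uint64_t start;
    uint64_t end;       /* inclusive */
};

#define MAX_RANGES 64

/* Cache owned by the device model; it outlives any one server instance. */
struct range_cache {
    struct io_range ranges[MAX_RANGES];
    size_t          count;
};

/* Hypothetical stand-in for the per-server registration state. */
struct ioreq_server {
    struct io_range ranges[MAX_RANGES];
    size_t          count;
};

/* Register a range with the server and remember it in the cache.
 * Returns 0 on success, -1 if either table is full. */
static int register_range(struct ioreq_server *srv,
                          struct range_cache *cache,
                          struct io_range r)
{
    if (srv->count >= MAX_RANGES || cache->count >= MAX_RANGES)
        return -1;
    srv->ranges[srv->count++] = r;
    cache->ranges[cache->count++] = r;
    return 0;
}

/* Replay every cached registration onto a freshly created (empty) server. */
static int replay_ranges(struct ioreq_server *fresh,
                         const struct range_cache *cache)
{
    assert(fresh->count == 0);   /* only meaningful on a new instance */
    for (size_t i = 0; i < cache->count; i++)
        fresh->ranges[fresh->count++] = cache->ranges[i];
    return 0;
}
```

In this model the cache is the only record of the registrations once the old
server is destroyed, which is what my question is about: whether those need to
survive when the guest is set to a new state.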

Thanks
Wen Congyang

> 
>   Paul
> 
>>
>> Thanks
>> Wen Congyang
>>
>>>
>>>   Paul
>>>
>>>
>>>> Thanks
>>>> Wen Congyang
>>>>
>>>>>
>>>>> ~Andrew
>>>>> .
>>>>>
>>>
>>> .
>>>
> 
> .
> 
