From mboxrd@z Thu Jan  1 00:00:00 1970
From: Yang Hongyang <yanghy@cn.fujitsu.com>
Subject: Re: [PATCH v2 COLOPre 03/13] libxc/restore: zero ioreq
 page only one time
Date: Tue, 9 Jun 2015 08:59:59 +0800
Message-ID: <55763A8F.6040608@cn.fujitsu.com>
References: <1433734997-26570-1-git-send-email-yanghy@cn.fujitsu.com>
	<1433734997-26570-4-git-send-email-yanghy@cn.fujitsu.com>
	<55756468.4090500@citrix.com> <55756757.7020900@cn.fujitsu.com>
	<55756B45.8020708@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <55756B45.8020708@citrix.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Andrew Cooper <andrew.cooper3@citrix.com>, xen-devel@lists.xen.org
Cc: wei.liu2@citrix.com, ian.campbell@citrix.com, wency@cn.fujitsu.com, guijianfeng@cn.fujitsu.com, yunhong.jiang@intel.com, eddie.dong@intel.com, rshriram@cs.ubc.ca, ian.jackson@eu.citrix.com
List-Id: xen-devel@lists.xenproject.org


On 06/08/2015 06:15 PM, Andrew Cooper wrote:
> On 08/06/15 10:58, Yang Hongyang wrote:
>>
>>
>> On 06/08/2015 05:46 PM, Andrew Cooper wrote:
>>> On 08/06/15 04:43, Yang Hongyang wrote:
>>>> ioreq page contains evtchn which will be set when we resume the
>>>> secondary vm the first time. The hypervisor will check if the
>>>> evtchn is corrupted, so we cannot zero the ioreq page more
>>>> than one time.
>>>>
>>>> The ioreq->state is always STATE_IOREQ_NONE after the vm is
>>>> suspended, so it is OK if we only zero it one time.
>>>>
>>>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>>>> Signed-off-by: Wen congyang <wency@cn.fujitsu.com>
>>>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>>>
>>> The issue here is that we are running the restore algorithm over a
>>> domain which has already been running in Xen for a while.  This is a
>>> brand new usecase, as far as I am aware.
>>
>> Exactly.
>>
>>>
>>> Does the qemu process associated with this domain get frozen while the
>>> secondary is being reset, or does the process get destroyed and
>>> recreated.
>>
>> What do you mean by reset? do you mean secondary is suspended at
>> checkpoint?
>
> Well - at the point that the buffered records are being processed, we
> are in the process of resetting the state of the secondary to match the
> primary.

Yes, at this point the qemu process associated with this domain is frozen.
The suspend callback calls libxl__qmp_stop() (vm_stop() in qemu) to pause
qemu. After we have processed all records, qemu is restored with the received
state; that is why we add a libxl__qmp_restore() API (qemu_load_vmstate() in
qemu). Currently in libxl, qemu can only be started with the received state;
there is no API to load received state into a qemu that has already been
running for a while.
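
To make the ordering concrete, here is a rough, self-contained model of the
sequence described above. It is only a sketch and not the libxl/libxc code:
struct secondary, qmp_stop(), qmp_load_state(), handle_ioreq_pfn(),
initial_restore() and checkpoint() are hypothetical stand-ins, the byte written
at ioreq_page[0] merely models the evtchn that is set on first resume, and the
point at which buffer_all_records becomes true is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model only; none of these names are real libxl/libxc APIs. */
enum qemu_state { QEMU_RUNNING, QEMU_STOPPED };

struct secondary {
    enum qemu_state qemu;
    bool buffer_all_records;        /* assumed true after the first restore */
    unsigned char ioreq_page[4096]; /* stands in for the ioreq page */
    int ioreq_clears;               /* how often the page was zeroed */
    int state_loads;                /* how often qemu state was loaded */
};

/* Stand-in for libxl__qmp_stop(): pause qemu. */
static void qmp_stop(struct secondary *s) { s->qemu = QEMU_STOPPED; }

/* Stand-in for libxl__qmp_restore(): load received state into paused qemu. */
static void qmp_load_state(struct secondary *s) { s->state_loads++; }

/* Mirrors the guard in the patch: zero the page only before buffering. */
static void handle_ioreq_pfn(struct secondary *s)
{
    if (!s->buffer_all_records) {
        memset(s->ioreq_page, 0, sizeof(s->ioreq_page));
        s->ioreq_clears++;
    }
}

/* First restore: the page is zeroed, then the evtchn is set on resume. */
static void initial_restore(struct secondary *s)
{
    handle_ioreq_pfn(s);
    s->ioreq_page[0] = 0x2a;        /* models the evtchn set on first resume */
    s->qemu = QEMU_RUNNING;
    s->buffer_all_records = true;
}

/* Later checkpoints: qemu is frozen while the records are processed. */
static void checkpoint(struct secondary *s)
{
    qmp_stop(s);
    assert(s->qemu == QEMU_STOPPED); /* records are applied with qemu paused */
    handle_ioreq_pfn(s);             /* skipped, so the evtchn survives */
    qmp_load_state(s);
}
```

In this model, calling initial_restore() once and then checkpoint() any number
of times zeroes the page exactly once, so the value modelling the evtchn is
never wiped by a later checkpoint.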

>
> ~Andrew
>
>>
>>>
>>> I have a gut feeling that it would be safer to clear all of the page
>>> other than the event channel, but that depends on exactly what else is
>>> going on.  We absolutely don't want to do is have an update to this page
>>> from the primary with an in-progress IOREQ.
>>>
>>> ~Andrew
>>>
>>>> ---
>>>>    tools/libxc/xc_sr_restore_x86_hvm.c | 3 ++-
>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/tools/libxc/xc_sr_restore_x86_hvm.c
>>>> b/tools/libxc/xc_sr_restore_x86_hvm.c
>>>> index 6f5af0e..06177e0 100644
>>>> --- a/tools/libxc/xc_sr_restore_x86_hvm.c
>>>> +++ b/tools/libxc/xc_sr_restore_x86_hvm.c
>>>> @@ -78,7 +78,8 @@ static int handle_hvm_params(struct xc_sr_context
>>>> *ctx,
>>>>                break;
>>>>            case HVM_PARAM_IOREQ_PFN:
>>>>            case HVM_PARAM_BUFIOREQ_PFN:
>>>> -            xc_clear_domain_page(xch, ctx->domid, entry->value);
>>>> +            if ( !ctx->restore.buffer_all_records )
>>>> +                xc_clear_domain_page(xch, ctx->domid, entry->value);
>>>>                break;
>>>>            }
>>>>

-- 
Thanks,
Yang.