From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: Re: [PATCH v2 COLOPre 03/13] libxc/restore: zero ioreq
 page only one time
Date: Mon, 8 Jun 2015 11:15:33 +0100
Message-ID: <55756B45.8020708@citrix.com>
References: <1433734997-26570-1-git-send-email-yanghy@cn.fujitsu.com>
	<1433734997-26570-4-git-send-email-yanghy@cn.fujitsu.com>
	<55756468.4090500@citrix.com> <55756757.7020900@cn.fujitsu.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <55756757.7020900@cn.fujitsu.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Yang Hongyang <yanghy@cn.fujitsu.com>, xen-devel@lists.xen.org
Cc: wei.liu2@citrix.com, ian.campbell@citrix.com, wency@cn.fujitsu.com, guijianfeng@cn.fujitsu.com, yunhong.jiang@intel.com, eddie.dong@intel.com, rshriram@cs.ubc.ca, ian.jackson@eu.citrix.com
List-Id: xen-devel@lists.xenproject.org

On 08/06/15 10:58, Yang Hongyang wrote:
>
>
> On 06/08/2015 05:46 PM, Andrew Cooper wrote:
>> On 08/06/15 04:43, Yang Hongyang wrote:
>>> The ioreq page contains the evtchn, which is set when we resume the
>>> secondary vm for the first time. The hypervisor checks whether the
>>> evtchn is corrupted, so we cannot zero the ioreq page more
>>> than once.
>>>
>>> The ioreq->state is always STATE_IOREQ_NONE after the vm is
>>> suspended, so it is OK if we only zero it one time.
>>>
>>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>>> Signed-off-by: Wen congyang <wency@cn.fujitsu.com>
>>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>>
>> The issue here is that we are running the restore algorithm over a
>> domain which has already been running in Xen for a while.  This is a
>> brand new use case, as far as I am aware.
>
> Exactly.
>
>>
>> Does the qemu process associated with this domain get frozen while the
>> secondary is being reset, or does the process get destroyed and
>> recreated?
>
> What do you mean by reset? Do you mean the secondary is suspended at
> the checkpoint?

Well - at the point that the buffered records are being processed, we
are in the process of resetting the state of the secondary to match the
primary.

~Andrew

>
>>
>> I have a gut feeling that it would be safer to clear all of the page
>> other than the event channel, but that depends on exactly what else is
>> going on.  What we absolutely don't want is an update to this page
>> from the primary with an in-progress IOREQ.
>>
>> ~Andrew
>>
>>> ---
>>>   tools/libxc/xc_sr_restore_x86_hvm.c | 3 ++-
>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tools/libxc/xc_sr_restore_x86_hvm.c
>>> b/tools/libxc/xc_sr_restore_x86_hvm.c
>>> index 6f5af0e..06177e0 100644
>>> --- a/tools/libxc/xc_sr_restore_x86_hvm.c
>>> +++ b/tools/libxc/xc_sr_restore_x86_hvm.c
>>> @@ -78,7 +78,8 @@ static int handle_hvm_params(struct xc_sr_context
>>> *ctx,
>>>               break;
>>>           case HVM_PARAM_IOREQ_PFN:
>>>           case HVM_PARAM_BUFIOREQ_PFN:
>>> -            xc_clear_domain_page(xch, ctx->domid, entry->value);
>>> +            if ( !ctx->restore.buffer_all_records )
>>> +                xc_clear_domain_page(xch, ctx->domid, entry->value);
>>>               break;
>>>           }
>>>
>>
>>
>
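
For illustration, the suggestion quoted above (clear all of the page other
than the event channel) could be sketched roughly as follows. This is a
minimal, self-contained model and not the libxc code: `struct mock_ioreq`,
`MOCK_PAGE_SIZE` and `mock_clear_ioreq_page_keep_evtchn()` are hypothetical
names, and the slot layout is a simplified stand-in for the real ioreq
structure. It assumes the page has already been mapped into the caller's
address space, and it only models the synchronous ioreq page; the buffered
ioreq page is laid out differently and is not covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOCK_PAGE_SIZE 4096u

/* Hypothetical, simplified stand-in for one per-vcpu ioreq slot. */
struct mock_ioreq {
    uint64_t addr;
    uint64_t data;
    uint32_t count;
    uint32_t size;
    uint32_t vp_eport;  /* event channel port: must survive the clear */
    uint16_t pad;
    uint8_t  state;     /* 0 stands in for STATE_IOREQ_NONE */
    uint8_t  type;
};

/*
 * Zero the whole page except the event channel port of the first
 * nr_vcpus slots.  'page' must point at MOCK_PAGE_SIZE bytes.
 */
static void mock_clear_ioreq_page_keep_evtchn(void *page,
                                              unsigned int nr_vcpus)
{
    struct mock_ioreq *slots = page;
    unsigned int i;

    /* All slots in use must fit inside the single page. */
    assert(nr_vcpus * sizeof(*slots) <= MOCK_PAGE_SIZE);

    for ( i = 0; i < nr_vcpus; i++ )
    {
        uint32_t port = slots[i].vp_eport;  /* save the port */

        memset(&slots[i], 0, sizeof(slots[i]));
        slots[i].vp_eport = port;           /* restore the port */
    }

    /* Clear whatever follows the last slot in use. */
    memset(&slots[nr_vcpus], 0,
           MOCK_PAGE_SIZE - nr_vcpus * sizeof(*slots));
}
```

Compared with skipping the clear entirely once records are being buffered,
this would also discard any stale request contents on each checkpoint while
leaving the port that was set on first resume untouched. Whether that is
actually needed depends on what else touches the page, as discussed above.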