From mboxrd@z Thu Jan  1 00:00:00 1970
From: Yang Hongyang <yanghy@cn.fujitsu.com>
Subject: Re: [PATCH Remus v5 2/2] libxc/restore: implement Remus
 checkpointed restore
Date: Fri, 15 May 2015 17:34:56 +0800
Message-ID: <5555BDC0.6010801@cn.fujitsu.com>
References: <1431597974-15624-1-git-send-email-yanghy@cn.fujitsu.com>			
	<1431597974-15624-3-git-send-email-yanghy@cn.fujitsu.com>		
	<1431608716.13579.78.camel@citrix.com>
	<55554CAD.6060206@cn.fujitsu.com>	
	<1431680998.8943.22.camel@citrix.com>
	<5555BA0A.9020602@cn.fujitsu.com>
	<1431682049.8943.35.camel@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <1431682049.8943.35.camel@citrix.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ian Campbell <ian.campbell@citrix.com>
Cc: wei.liu2@citrix.com, eddie.dong@intel.com, wency@cn.fujitsu.com, andrew.cooper3@citrix.com, yunhong.jiang@intel.com, ian.jackson@eu.citrix.com, xen-devel@lists.xen.org, guijianfeng@cn.fujitsu.com, rshriram@cs.ubc.ca
List-Id: xen-devel@lists.xenproject.org


On 05/15/2015 05:27 PM, Ian Campbell wrote:
> On Fri, 2015-05-15 at 17:19 +0800, Yang Hongyang wrote:
>>
>> On 05/15/2015 05:09 PM, Ian Campbell wrote:
>>> On Fri, 2015-05-15 at 09:32 +0800, Yang Hongyang wrote:
>>>>
>>>> On 05/14/2015 09:05 PM, Ian Campbell wrote:
>>>>> On Thu, 2015-05-14 at 18:06 +0800, Yang Hongyang wrote:
>>>>>> With Remus, the restore flow should be:
>>>>>> the first full migration stream -> { periodically restore stream }
>>>>>>
>>>>>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>>>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>>> CC: Ian Campbell <Ian.Campbell@citrix.com>
>>>>>> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
>>>>>> CC: Wei Liu <wei.liu2@citrix.com>
>>>>>> ---
>>>>>>     tools/libxc/xc_sr_common.h  |  14 ++++++
>>>>>>     tools/libxc/xc_sr_restore.c | 113 ++++++++++++++++++++++++++++++++++++++++----
>>>>>>     2 files changed, 117 insertions(+), 10 deletions(-)
>>>>>>
>>>>>> diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
>>>>>> index f8121e7..3bf27f1 100644
>>>>>> --- a/tools/libxc/xc_sr_common.h
>>>>>> +++ b/tools/libxc/xc_sr_common.h
>>>>>> @@ -208,6 +208,20 @@ struct xc_sr_context
>>>>>>                 /* Plain VM, or checkpoints over time. */
>>>>>>                 bool checkpointed;
>>>>>>
>>>>>> +            /* Currently buffering records between a checkpoint */
>>>>>> +            bool buffer_all_records;
>>>>>> +
>>>>>> +/*
>>>>>> + * With Remus, we buffer the records sent by the primary at checkpoint,
>>>>>> + * in case the primary will fail, we can recover from the last
>>>>>> + * checkpoint state.
>>>>>> + * This should be enough because primary only send dirty pages at
>>>>>> + * checkpoint.
>>>>>
>>>>> I'm not sure how it then follows that 1024 buffers is guaranteed to be
>>>>> enough, unless there is something on the sending side arranging it to be
>>>>> so?
>>>>
>>>> There are only a few records at every checkpoint in my test, mostly under 10,
>>>> probably because I don't do many operations in the guest. I thought this limit
>>>> could be adjusted later after further testing.
>>>
>>> For some reason I thought these buffers included the page data, is that
>>> not true? I was expecting the bulk of the records to be dirty page data.
>>
>> The page data is not stored in this buffer; only a pointer to it is stored
>> here (rec->data). This buffer is an array of struct xc_sr_record.
>
> OK, so there are (approximately) as many xc_sr_records as there are
> buffered dirty pages? I'd expect this would easily reach 1024 in some
> circumstances (e.g. run a fork bomb in the domain or something)

No, a record may contain up to 1024 pages, so the number of records is
smaller than the number of dirty pages.
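
To put a rough number on that, here is a minimal sketch of the arithmetic. The
helper names are hypothetical, and the 4096-byte page size used in the note
below is an assumption; only the 1024-pages-per-record figure comes from the
statement above.

```c
#include <stdint.h>

/* Pages per record, taken from the statement above. */
#define MAX_PAGES_PER_RECORD 1024u

/*
 * Fewest records that can carry `dirty_pages` pages (ceiling division).
 * This is a lower bound: records are not required to be full.
 */
static uint64_t min_records_for(uint64_t dirty_pages)
{
    return (dirty_pages + MAX_PAGES_PER_RECORD - 1) / MAX_PAGES_PER_RECORD;
}

/* Most page data that `records` completely full records can reference. */
static uint64_t max_bytes_for(uint64_t records, uint64_t page_size)
{
    return records * MAX_PAGES_PER_RECORD * page_size;
}
```

With 4096-byte pages, max_bytes_for(1024, 4096) is 2^32 bytes (4 GiB), but
that bound only holds if every buffered record is full, which nothing
guarantees.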

>
>>>> Since you and Andy both have doubts about this, I have to reconsider it;
>>>> perhaps there should be no limit. Even if the 1024 limit works for
>>>> most cases, there might be cases that exceed it. So I will
>>>> add another member, 'allocated_rec_num', to the context; when
>>>> 'buffered_rec_num' exceeds 'allocated_rec_num', I will reallocate the buffer.
>>>> The initial buffer size will be 1024 records, which will work for most cases.
>>>
>>> That seems easy enough to be worth doing even if I was wrong about paged
>>> data.
>>
>> done.
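
A minimal sketch of the grow-on-demand idea discussed above, not the actual
patch. 'allocated_rec_num' and 'buffered_rec_num' are the names from this
thread; struct record, struct record_buffer, buffer_record() and the
fixed growth step are hypothetical stand-ins for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Initial size and growth step, in records (assumed to be the same). */
#define DEFAULT_BUF_RECORDS 1024u

/* Simplified stand-in for struct xc_sr_record. */
struct record {
    uint32_t type;
    uint32_t length;
    void *data;          /* page data lives elsewhere; only the pointer is kept */
};

/* Hypothetical container for the buffered records. */
struct record_buffer {
    struct record *records;
    unsigned int allocated_rec_num;  /* slots currently allocated */
    unsigned int buffered_rec_num;   /* slots currently in use */
};

/* Append one record, growing the array when it is full. Returns 0 or -1. */
static int buffer_record(struct record_buffer *buf, const struct record *rec)
{
    assert(buf != NULL && rec != NULL);

    if (buf->buffered_rec_num >= buf->allocated_rec_num) {
        unsigned int new_num = buf->allocated_rec_num + DEFAULT_BUF_RECORDS;
        struct record *p;

        if (new_num < buf->allocated_rec_num)
            return -1;   /* counter overflow */

        /* realloc(NULL, n) behaves like malloc(n), so this covers first use. */
        p = realloc(buf->records, (size_t)new_num * sizeof(*p));
        if (p == NULL)
            return -1;   /* old buffer is still valid and owned by buf */

        buf->records = p;
        buf->allocated_rec_num = new_num;
    }

    buf->records[buf->buffered_rec_num++] = *rec;
    return 0;
}
```

Starting from a zeroed struct record_buffer, the first call allocates 1024
slots and the 1025th call grows the array to 2048. The caller frees
buf->records when the checkpoint has been processed.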

-- 
Thanks,
Yang.