From: Yang Hongyang
Subject: Re: [PATCH Remus v5 2/2] libxc/restore: implement Remus checkpointed restore
Date: Fri, 15 May 2015 17:34:56 +0800
Message-ID: <5555BDC0.6010801@cn.fujitsu.com>
In-Reply-To: <1431682049.8943.35.camel@citrix.com>
References: <1431597974-15624-1-git-send-email-yanghy@cn.fujitsu.com> <1431597974-15624-3-git-send-email-yanghy@cn.fujitsu.com> <1431608716.13579.78.camel@citrix.com> <55554CAD.6060206@cn.fujitsu.com> <1431680998.8943.22.camel@citrix.com> <5555BA0A.9020602@cn.fujitsu.com> <1431682049.8943.35.camel@citrix.com>
To: Ian Campbell
Cc: wei.liu2@citrix.com, eddie.dong@intel.com, wency@cn.fujitsu.com, andrew.cooper3@citrix.com, yunhong.jiang@intel.com, ian.jackson@eu.citrix.com, xen-devel@lists.xen.org, guijianfeng@cn.fujitsu.com, rshriram@cs.ubc.ca
List-Id: xen-devel@lists.xenproject.org

On 05/15/2015 05:27 PM, Ian Campbell wrote:
> On Fri, 2015-05-15 at 17:19 +0800, Yang Hongyang wrote:
>>
>> On 05/15/2015 05:09 PM, Ian Campbell wrote:
>>> On Fri, 2015-05-15 at 09:32 +0800, Yang Hongyang wrote:
>>>>
>>>> On 05/14/2015 09:05 PM, Ian Campbell wrote:
>>>>> On Thu, 2015-05-14 at 18:06 +0800, Yang Hongyang wrote:
>>>>>> With Remus, the restore flow should be:
>>>>>> the first full migration stream -> { periodically restore stream }
>>>>>>
>>>>>> Signed-off-by: Yang Hongyang
>>>>>> Signed-off-by: Andrew Cooper
>>>>>> CC: Ian Campbell
>>>>>> CC: Ian Jackson
>>>>>> CC: Wei Liu
>>>>>> ---
>>>>>>  tools/libxc/xc_sr_common.h  |  14 ++++++
>>>>>>  tools/libxc/xc_sr_restore.c | 113 ++++++++++++++++++++++++++++++++++++++++----
>>>>>>  2 files changed, 117 insertions(+), 10 deletions(-)
>>>>>>
>>>>>> diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
>>>>>> index f8121e7..3bf27f1 100644
>>>>>> --- a/tools/libxc/xc_sr_common.h
>>>>>> +++ b/tools/libxc/xc_sr_common.h
>>>>>> @@ -208,6 +208,20 @@ struct xc_sr_context
>>>>>>      /* Plain VM, or checkpoints over time. */
>>>>>>      bool checkpointed;
>>>>>>
>>>>>> +    /* Currently buffering records between a checkpoint */
>>>>>> +    bool buffer_all_records;
>>>>>> +
>>>>>> +/*
>>>>>> + * With Remus, we buffer the records sent by the primary at checkpoint,
>>>>>> + * in case the primary will fail, we can recover from the last
>>>>>> + * checkpoint state.
>>>>>> + * This should be enough because primary only send dirty pages at
>>>>>> + * checkpoint.
>>>>>
>>>>> I'm not sure how it then follows that 1024 buffers is guaranteed to be
>>>>> enough, unless there is something on the sending side arranging it to be
>>>>> so?
>>>>
>>>> There are only a few records at every checkpoint in my test, mostly under 10,
>>>> probably because I don't do many operations in the guest. I thought this limit
>>>> could be adjusted later by further testing.
>>>
>>> For some reason I thought these buffers included the page data, is that
>>> not true? I was expecting the bulk of the records to be dirty page data.
>>
>> The page data is not stored in this buffer; only its pointer is stored in
>> this buffer (rec->data). This buffer is the bulk of the struct xc_sr_record.
>
> OK, so there are (approximately) as many xc_sr_records as there are
> buffered dirty pages? I'd expect this would easily reach 1024 in some
> circumstances (e.g. run a fork bomb in the domain or something)

No, a record may contain up to 1024 pages, so the number of records is
smaller than the number of dirty pages.

>
>>>> Since you and Andy both have doubts about this, I have to reconsider it;
>>>> perhaps there should be no limit. Even if the 1024 limit works for
>>>> most cases, there might be cases that exceed the limit. So I will
>>>> add another member 'allocated_rec_num' to the context; when
>>>> 'buffered_rec_num' exceeds 'allocated_rec_num', I will reallocate the buffer.
>>>> The initial buffer size will be 1024 records, which will work for most cases.
>>>
>>> That seems easy enough to be worth doing even if I was wrong about paged
>>> data.
>>
>> done.

-- 
Thanks,
Yang.
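
For reference, the reallocation scheme agreed on above could look roughly
like the sketch below. This is only an illustration, not the code from the
patch: 'struct record' and 'struct record_buffer' are hypothetical stand-ins
for struct xc_sr_record and the relevant part of struct xc_sr_context, and
doubling the capacity on overflow is an assumption (the thread only says the
buffer starts at 1024 records and is reallocated when 'buffered_rec_num'
exceeds 'allocated_rec_num').

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define INITIAL_REC_NUM 1024  /* initial capacity mentioned in the thread */

/* Hypothetical stand-in for struct xc_sr_record. */
struct record {
    uint32_t type;
    uint32_t length;
    void *data;  /* page data lives elsewhere; only the pointer is buffered */
};

/* Hypothetical container for the two counters named in the thread. */
struct record_buffer {
    struct record *recs;
    size_t buffered_rec_num;   /* records currently held */
    size_t allocated_rec_num;  /* capacity of recs, in records */
};

/*
 * Append one record, growing the array when it is full.
 * Returns 0 on success, -1 on allocation failure (buffer left unchanged).
 * Doubling is an assumed growth policy, not something the thread specifies.
 */
static int buffer_record(struct record_buffer *buf, const struct record *rec)
{
    assert(buf != NULL && rec != NULL);

    if (buf->buffered_rec_num == buf->allocated_rec_num) {
        size_t new_num = buf->allocated_rec_num
                         ? buf->allocated_rec_num * 2
                         : INITIAL_REC_NUM;
        struct record *p;

        /* Guard against overflow in the size computation. */
        if (new_num > SIZE_MAX / sizeof(*p))
            return -1;

        p = realloc(buf->recs, new_num * sizeof(*p));
        if (p == NULL)
            return -1;

        buf->recs = p;
        buf->allocated_rec_num = new_num;
    }

    buf->recs[buf->buffered_rec_num++] = *rec;
    return 0;
}

/* Forget the buffered records once a checkpoint has been applied,
 * keeping the allocation so the next checkpoint can reuse it. */
static void buffer_reset(struct record_buffer *buf)
{
    buf->buffered_rec_num = 0;
}
```

With this shape, a checkpoint carrying more than 1024 records costs an
occasional realloc() instead of failing the restore, and the allocation is
reused across checkpoints.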