From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ian Campbell
Subject: Re: [PATCH Remus v5 2/2] libxc/restore: implement Remus checkpointed restore
Date: Fri, 15 May 2015 10:43:34 +0100
Message-ID: <1431683014.8943.48.camel@citrix.com>
References: <1431597974-15624-1-git-send-email-yanghy@cn.fujitsu.com>
 <1431597974-15624-3-git-send-email-yanghy@cn.fujitsu.com>
 <1431608716.13579.78.camel@citrix.com>
 <55554CAD.6060206@cn.fujitsu.com>
 <1431680998.8943.22.camel@citrix.com>
 <5555BA0A.9020602@cn.fujitsu.com>
 <1431682049.8943.35.camel@citrix.com>
 <5555BDC0.6010801@cn.fujitsu.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <5555BDC0.6010801@cn.fujitsu.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Yang Hongyang
Cc: wei.liu2@citrix.com, eddie.dong@intel.com, wency@cn.fujitsu.com,
 andrew.cooper3@citrix.com, yunhong.jiang@intel.com,
 ian.jackson@eu.citrix.com, xen-devel@lists.xen.org,
 guijianfeng@cn.fujitsu.com, rshriram@cs.ubc.ca
List-Id: xen-devel@lists.xenproject.org

On Fri, 2015-05-15 at 17:34 +0800, Yang Hongyang wrote:
>
> On 05/15/2015 05:27 PM, Ian Campbell wrote:
> > On Fri, 2015-05-15 at 17:19 +0800, Yang Hongyang wrote:
> >>
> >> On 05/15/2015 05:09 PM, Ian Campbell wrote:
> >>> On Fri, 2015-05-15 at 09:32 +0800, Yang Hongyang wrote:
> >>>>
> >>>> On 05/14/2015 09:05 PM, Ian Campbell wrote:
> >>>>> On Thu, 2015-05-14 at 18:06 +0800, Yang Hongyang wrote:
> >>>>>> With Remus, the restore flow should be:
> >>>>>> the first full migration stream -> { periodically restore stream }
> >>>>>>
> >>>>>> Signed-off-by: Yang Hongyang
> >>>>>> Signed-off-by: Andrew Cooper
> >>>>>> CC: Ian Campbell
> >>>>>> CC: Ian Jackson
> >>>>>> CC: Wei Liu
> >>>>>> ---
> >>>>>>  tools/libxc/xc_sr_common.h  |  14 ++++++
> >>>>>>  tools/libxc/xc_sr_restore.c | 113 ++++++++++++++++++++++++++++++++++++++++----
> >>>>>>  2 files changed, 117 insertions(+), 10 deletions(-)
> >>>>>>
> >>>>>> diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
> >>>>>> index f8121e7..3bf27f1 100644
> >>>>>> --- a/tools/libxc/xc_sr_common.h
> >>>>>> +++ b/tools/libxc/xc_sr_common.h
> >>>>>> @@ -208,6 +208,20 @@ struct xc_sr_context
> >>>>>>              /* Plain VM, or checkpoints over time. */
> >>>>>>              bool checkpointed;
> >>>>>>
> >>>>>> +            /* Currently buffering records between a checkpoint */
> >>>>>> +            bool buffer_all_records;
> >>>>>> +
> >>>>>> +/*
> >>>>>> + * With Remus, we buffer the records sent by the primary at checkpoint,
> >>>>>> + * in case the primary will fail, we can recover from the last
> >>>>>> + * checkpoint state.
> >>>>>> + * This should be enough because primary only send dirty pages at
> >>>>>> + * checkpoint.
> >>>>>
> >>>>> I'm not sure how it then follows that 1024 buffers is guaranteed to be
> >>>>> enough, unless there is something on the sending side arranging it to be
> >>>>> so?
> >>>>
> >>>> There are only few records at every checkpoint in my test, mostly under 10,
> >>>> probably because I don't do much operations in the Guest. I thought This limit
> >>>> can be adjusted later by further testing.
> >>>
> >>> For some reason I thought these buffers included the page data, is that
> >>> not true? I was expecting the bulk of the records to be dirty page data.
> >>
> >> The page data is not stored in this buffer, but it's pointer stored in
> >> this buffer(rec->data). This buffer is the bulk of the struct xc_sr_record.
> >
> > OK, so there are (approximately) as many xc_sr_records as there are
> > buffered dirty pages? I'd expect this would easily reach 1024 in some
> > circumstances (e..g run a fork bomb in the domain or something)
>
> No, a record may contain up to 1024 pages, so the record number is less
> than dirty page number.

OK so 1024 records equates to ...
.... around 4GB of actual data at most (but I suppose not all recs will
use the full 1024 pages).
I suppose a guest would be working quite hard to dirty that without
retriggering a checkpoint (even with COLO's more relaxed approach to
resync).

In any case, making the array grow is clearly a good thing to do and
you've already done it, so no need to keep thinking about this case ;-)
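
For reference, the 4GB figure and the "make the array grow" idea can be
sketched as below. This is only an illustration and is not taken from the
patch series: struct rec, struct rec_buf, worst_case_bytes() and
rec_buf_append() are hypothetical names, the 4KiB page size is an
assumption, and the doubling policy is just one reasonable choice. The
only number taken from the thread is the limit of 1024 pages per record.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE_BYTES   4096u  /* assumption: 4KiB pages */
#define MAX_PAGES_PER_REC 1024u  /* from the thread: up to 1024 pages per record */

/* Hypothetical stand-in for a buffered record; not the libxc definition. */
struct rec {
    uint32_t type;
    uint32_t length;
    void *data;              /* page data lives elsewhere, only the pointer is buffered */
};

/* Hypothetical growable buffer of records. */
struct rec_buf {
    struct rec *recs;
    size_t used;
    size_t allocated;
};

/* Worst-case page data referenced by nr_recs completely full records, in bytes. */
static uint64_t worst_case_bytes(uint64_t nr_recs)
{
    return nr_recs * MAX_PAGES_PER_REC * PAGE_SIZE_BYTES;
}

/*
 * Append one record, doubling the array when it is full.
 * Returns 0 on success, -1 on allocation failure (the existing
 * buffer is left intact in that case).
 */
static int rec_buf_append(struct rec_buf *b, const struct rec *r)
{
    assert(b != NULL && r != NULL);

    if (b->used == b->allocated) {
        size_t new_alloc = b->allocated ? b->allocated * 2 : 16;
        struct rec *p;

        /* Guard against overflow in the size computation. */
        if (new_alloc > SIZE_MAX / sizeof(*p))
            return -1;

        p = realloc(b->recs, new_alloc * sizeof(*p));
        if (!p)
            return -1;

        b->recs = p;
        b->allocated = new_alloc;
    }

    b->recs[b->used++] = *r;
    return 0;
}
```

With these assumptions worst_case_bytes(1024) is 1024 * 1024 * 4096 bytes,
i.e. 4GiB, and a caller would start from a zeroed struct rec_buf and call
rec_buf_append() once per buffered record, resetting used to 0 after each
checkpoint has been applied.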