From mboxrd@z Thu Jan 1 00:00:00 1970 From: Yang Hongyang Subject: Re: [PATCH Remus v5 1/2] libxc/save: implement Remus checkpointed save Date: Fri, 15 May 2015 17:40:15 +0800 Message-ID: <5555BEFF.106@cn.fujitsu.com> References: <1431597974-15624-1-git-send-email-yanghy@cn.fujitsu.com> <1431597974-15624-2-git-send-email-yanghy@cn.fujitsu.com> <1431607637.13579.73.camel@citrix.com> <55555601.90409@cn.fujitsu.com> <5555AE4C.5070908@citrix.com> <5555B031.8010306@cn.fujitsu.com> <5555BD00.6060501@citrix.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <5555BD00.6060501@citrix.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Andrew Cooper , Ian Campbell Cc: wei.liu2@citrix.com, wency@cn.fujitsu.com, ian.jackson@eu.citrix.com, yunhong.jiang@intel.com, eddie.dong@intel.com, xen-devel@lists.xen.org, guijianfeng@cn.fujitsu.com, rshriram@cs.ubc.ca List-Id: xen-devel@lists.xenproject.org On 05/15/2015 05:31 PM, Andrew Cooper wrote: > On 15/05/15 09:37, Yang Hongyang wrote: >> >> >> On 05/15/2015 04:29 PM, Andrew Cooper wrote: >>> On 15/05/2015 03:12, Yang Hongyang wrote: >>>> >>>>>> @@ -467,6 +477,14 @@ static int send_domain_memory_live(struct >>>>>> xc_sr_context *ctx) >>>>>> DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap, >>>>>> &ctx->save.dirty_bitmap_hbuf); >>>>>> >>>>>> + /* >>>>>> + * With Remus, we will enter checkpointed save after live >>>>>> migration. >>>>>> + * In checkpointed save loop, we skip the live part and pause >>>>>> straight >>>>>> + * away to send dirty pages between checkpoints. >>>>>> + */ >>>>>> + if ( !ctx->save.live ) >>>>>> + goto last_iter; >>>>> >>>>> Rather than use goto would it work to refactor everything from here to >>>>> the label into some sort of helper and just call that in the "actually >>>>> live" case? >>>>> >>>>> Or perhaps everything from the label to the end should be a helper >>>>> function which the caller can also use in thecheckpoint case >>>>> instead of >>>>> calling send_domain_memory_live (and which s_d_m_l also calls of >>>>> course). >>>> >>>> I'm going to refactor the send_domain_memory_live() as follows: >>>> >>>> split the send_domain_memory_live() into three helper function: >>>> - send_memory_live() do the actually live case >>>> - suspend_and_send_dirty() suspend the guest and send dirty pages >>>> - send_memory_verify() >>>> >>>> then: >>>> - send_domain_memory_live() combination of those three helper >>>> functions >>>> - send_domain_momory_checkpointed() calls >>>> suspend_and_send_dirty() and >>>> send_memory_verify() >>>> - send_domain_memory_nonlive() stay as it is >>>> Does it make sense? >>> >>> verify mode is only a debugging tool, and shouldn't be used in general. >>> (It is actually a lot less useful in retrospect than one would imagine.) >> >> Do you mean we do not need this in checkpointed stream? >> Currently, tt is conditioned under 'if ( ctx->save.debug )', both in >> live migration and checkpointed stream. > > Ah - you absolutely don't want it in a checkpointed stream. As soon as > that option has been set, the restoring side can no longer resume the > domain. So it needs to be conditioned under if ( ctx->save.debug && !ctx->save.checkpointed ) if it is a checkpointed stream, we just ignore this flag, is it OK? > > Basically, it tells the restore side to start doing memcmp() instead of > memcpy() on the incoming page data. The initial hope was to detect > memory corruption, but it turns out that there is natural memory > corruption to be had anyway when PV drivers are in use. > > (This is another area which is around for compatibility with legacy, > rather than actually of being much use in practice) > > ~Andrew > . > -- Thanks, Yang.