From mboxrd@z Thu Jan 1 00:00:00 1970 From: Hongyang Yang Subject: Re: [PATCH Remus v2 00/10] Remus support for Migration-v2 Date: Mon, 11 May 2015 18:48:38 +0800 Message-ID: <55508906.1010003@cn.fujitsu.com> References: <1431077610-3366-1-git-send-email-yanghy@cn.fujitsu.com> <554CFC90.1030301@citrix.com> <55504C20.8020009@cn.fujitsu.com> <55506F94.7070407@citrix.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <55506F94.7070407@citrix.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Andrew Cooper , xen-devel@lists.xen.org Cc: wei.liu2@citrix.com, ian.campbell@citrix.com, wency@cn.fujitsu.com, ian.jackson@eu.citrix.com, yunhong.jiang@intel.com, eddie.dong@intel.com, rshriram@cs.ubc.ca List-Id: xen-devel@lists.xenproject.org On 05/11/2015 05:00 PM, Andrew Cooper wrote: > On 11/05/15 07:28, Hongyang Yang wrote: >> On 05/09/2015 02:12 AM, Andrew Cooper wrote: >>> On 08/05/15 10:33, Yang Hongyang wrote: >>>> This patchset implement the Remus support for Migration v2 but without >>>> memory compressing. [...] > >> >> >> >>> end_of_checkpoint() >>> Checkpoint record >> >> ctx->save.callbacks->postcopy() >> this callback should not be omitted, it do some necessary work before >> resume >> primary (such as call Remus devices preresume callbacks to ensure the >> disk >> data is consistent) and then resume the primary guest. I think this >> callback should be renamed to ctx->save.callbacks->resume(). > > That looks to be a useful cleanup (and answers one of my questions of > what exactly postcopy was) > >> >>> ctx->save.callbacks->checkpoint() >>> libxl qemu record >> >> Maybe we should add another callback to send qemu record instead of >> using checkpoint callback. We can call it >> ctx->save.callbacks->save_qemu() > > This is another layering violation. libxc should not prescribe what > libxl might or might not do. One example we are experimenting with in > XenServer at the moment is support for multiple emulators attached to a > single domain, which would necessitate two LIBXL_EMULATOR records to be > sent per checkpoint. libxl might also want to send an updated json blob > or such. Ok, so we'd better not introduce save_qemu callback. > >> Then in checkpoint callback, we only call remus devices commit callbacks( >> which will release the network buffer etc...) then decide whether we >> need to >> do another checkpoint or quit checkpointed stream. >> With Remus, checkpoint callback only wait for 200ms(can be specified >> by -i) >> then return. >> With COLO, checkpoint callback will ask COLO proxy if we need to do a >> checkpoint, will return when COLO proxy module indicate a checkpoint >> is needed. > > That sounds like COLO wants a should_checkpoint() callback which > separates the decision to make a checkpoint from the logic of > implementing a checkpoint. We use checkpoint callback to do should_checkpoint() thing currently. libxc will check the return value of checkpoint callback. > >> >>> ... >>> libxl end-of-checkpoint record >>> ctx->save.callbacks->checkpoint() returns >>> start_of_checkpoint() >> >> ctx->save.callbacks->suspend() >> >>> >>> end_of_checkpoint() >>> Checkpoint record >>> etc... >>> >>> This will eventually allow both libxc and libxl to send checkpoint data >>> (and by the looks of it, remove the need for postcopy()). With this >>> libxc/remus work it is fine to use XG_LIBXL_HVM_COMPAT to cover the >>> current qemu situation, but I would prefer not to be also retrofitting >>> libxc checkpoint records when doing the libxl/migv2 work. >>> >>> Does this look plausible in for Remus (and eventually COLO) support? >> >> With comments above, I would suggest the save flow as below: >> >> libxc writes: libxl writes: >> >> live migration: >> Image Header >> Domain Header >> start_of_stream() >> start_of_checkpoint() >> >> ctx->save.callbacks->suspend() >> >> end_of_checkpoint() >> if ( checkpointd ) >> End of Checkpoint record >> /*If resotre side receives this record, input fd should be handed to >> libxl*/ >> else >> goto end >> >> loop of checkpointed stream: >> ctx->save.callbacks->resume() >> ctx->save.callbacks->save_qemu() >> libxl qemu record >> ... >> libxl end-of-checkpoint record >> /*If resotre side receives this record, input fd should be handed to >> libxc*/ >> ctx->save.callbacks->save_qemu() returns >> ctx->save.callbacks->checkpoint() >> start_of_checkpoint() >> ctx->save.callbacks->suspend() >> >> end_of_checkpoint() >> End of Checkpoint record >> goto 'loop of checkpointed stream' >> >> end: >> END record >> /*If resotre side receives this record, input fd should be handed to >> libxl*/ >> >> >> In order to keep it simple, we can keep the current >> ctx->save.callbacks->checkpoint() as it is, which do the save_qemu >> thing, call >> Remus devices commit callbacks and then decide whether we need a >> checkpoint. We >> can also combine the ctx->save.callbacks->resume() with >> ctx->save.callbacks->checkpoint(), with only one checkpoint() >> callback, we do >> the following things: >> - Call Remus devices preresume callbacks >> - Resume the primary >> - Save qemu records >> - Call Remus devices commit callbacks >> - Decide whether we need a checkpoint >> >> Overall, there are 3 options for the save flow: >> 1. keep the current callbacks, rename postcopy() to resume() >> 2. split the checkpoint() callback to save_qemu() and checkpoint() >> 3. combine the current postcopy() and checkpoint() >> Which one do you think is the best? > > I have a 4th alternative in mind, but would like your feedback from my > comments in this email first. So what's the 4th alternative? > > ~Andrew > . > -- Thanks, Yang.